Life in the Fast Lane: data.table Intro and Best Practices

Date: Monday, September 17, 2018 - 18:30
Source: New York Open Statistical Programming Meetup
Attendees: 130
City: New York
Price: 5.00

As a counterpoint to JD Long's dplyr talk (https://www.meetup.com/nyhackr/events/250182828/), Bill Gold will show the same functionality in data.table.

To learn more about the meetup and see videos of past presentations visit nyhackr.org.

About the Talk:

data.table, an extension of R's data.frame, is a popular package that provides fast data processing and lets users spend less time managing data and more time finding new insights.

Reasons to consider using data.table include: it’s fast, really fast; it works well with large datasets; and it offers concise, easy-to-read code.

We will show fast manipulation of 100GB+ of data and share best practices and sample code. Because data.table syntax can seem unintuitive at first, our goal is to flatten its learning curve by connecting data.table syntax with its SQL and dplyr equivalents.

About Bill:

Bill Gold’s career spans data science, management consulting and technology. He has delivered $500MM+ in ROI to financial services and healthcare clients. Across the customer lifecycle, he has developed and deployed hundreds of models (some patented), commercial software, and data science infrastructure handling trillions of transactions and hundreds of terabytes of data, all serving thousands of data scientists.

Bill guest lectures at Columbia and NYU. He has a BS in Electrical Engineering from Hofstra University.

Pizza (nyhackr.org/pizzapoll.html) begins at 6:30 and the talk starts at 7. Afterward, we head to the local bar.

AWS Loft

350 West Broadway