Bay Area Apache Spark Meetup @ Salesforce SF

Thursday, June 13, 2019 - 18:00
Bay Area Spark Meetup
San Francisco

Join us for an evening at the Bay Area Apache Spark Meetup, featuring tech talks on Apache Spark, machine learning, and Delta Lake at scale from Salesforce and Databricks.

This meetup is hosted and sponsored by Salesforce.


6:00 - 6:30 pm: Social Hour with Food & Drinks
6:30 - 6:35 pm: Introduction & Announcements
6:35 - 7:15 pm: Tech Talk from Salesforce Einstein
7:15 - 7:55 pm: Tech Talk from Databricks: Delta Lake
7:55 - 8:15 pm: Additional Networking, Q&A

Talk 1 Title: Apache Spark at Salesforce Einstein for Marketing Cloud

Presenters: Peter Krmpotic and Kexin Xie


Salesforce Einstein makes more than 6.5 billion predictions per day to deliver highly personalized customer experiences on behalf of some of the biggest brands in the world. Its engineering and product teams have an extensive background in using Apache Spark in production.

This talk will cover some of their biggest use cases, their successful transition from Apache Hadoop to Apache Spark, valuable insights about using Spark in production, and an architecture review of one of their core capabilities.

Bio: Peter Krmpotic
Peter Krmpotic is a director of product management for Salesforce Marketing Cloud Einstein and former product leader at Adobe Experience Cloud, BrightEdge and Boost Media. This career experience has given him a comprehensive view of what brands must develop to provide unified customer experiences at scale, the bedrock of any digitally transformed organization. He works with customers to instill customer-centricity to set them up as AI-first companies and leaders of the technology-based emergence of the fourth industrial revolution.

In his current role at Salesforce, Peter focuses on democratizing artificial intelligence, specifically deep learning and natural language processing, for the purpose of personalizing customer experiences at scale. He holds an MSc in Computer Science from Karlsruhe Institute of Technology (KIT) in Germany.

Bio: Kexin Xie

At Salesforce, Kexin Xie is responsible for researching and designing the core distributed data processing and machine learning architecture for Marketing Cloud Einstein and Salesforce DMP. He leads a team of data science engineers with a strong focus on continuously improving operational aspects such as performance, fault tolerance, scale, automation, and cost.

Before Salesforce, Kexin worked for Krux, BigCommerce, NICTA, Brandscreen, Freelancer, and Microsoft Research, building large-scale machine learning, data mining, real-time bidding, intelligent marketing, anti-fraud, and anti-money laundering software systems. Kexin also holds a Ph.D. in computer science.

Talk 2 Title: Open Source Reliability for Data Lake with Apache Spark
Presenter: Michael Armbrust
Abstract: Delta Lake is an open source storage layer that brings reliability to data lakes. It offers ACID transactions and scalable metadata handling, and it unifies streaming and batch data processing. It runs on top of your existing data lake and is fully compatible with Apache Spark APIs.

In this talk, we will cover:
* What data quality problems Delta helps address
* How to convert your existing application to Delta
* How the Delta transaction protocol works internally
* The Delta roadmap for the next few releases
* How to get involved!

Bio: Michael Armbrust is a committer and PMC member of Apache Spark and the original creator of Spark SQL. He currently leads the team at Databricks that designed and built Structured Streaming and the Delta Lake open source project. He received his Ph.D. from UC Berkeley in 2013 and was advised by Michael Franklin, David Patterson, and Armando Fox. His thesis focused on building systems that allow developers to rapidly build scalable interactive applications and specifically defined the notion of scale independence. His interests broadly include distributed systems, large-scale structured storage, and query optimization.

Salesforce East

350 Mission St