"Optimizing Distinct Counts on PostgreSQL with HLL" with Sai Srirampur

Date: Tuesday, October 16, 2018 - 18:30
Source: SF Bay Area PostgreSQL Meetup
Attendees: 38
City: San Francisco

Thanks to Citus Data for hosting!

Sai Srirampur, a solution architect for PostgreSQL and Citus databases, will present "Optimizing distinct counts on PostgreSQL with HLL".

In this talk, we will focus on the HyperLogLog (HLL) algorithm and its PostgreSQL extension, postgresql-hll. HLL can provide approximate answers to COUNT(DISTINCT) queries within mathematically provable error bounds. HLL is not only fast and memory-efficient but also has very interesting properties that especially shine in a distributed environment. During the talk, we’ll first look at the internals of HLL. Then we will look at why the HLL algorithm is useful for efficient pre-aggregations and distinct counts in a scalable way. Finally, we will look at how HLL can be used in a distributed Postgres database cluster with Citus.
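
To make the distributed property mentioned above concrete, here is a minimal Python sketch of a HyperLogLog counter. It is not the postgresql-hll implementation and is not taken from the talk. The class name, the register count (2^12), and the use of SHA-1 as the hash function are assumptions chosen for illustration. It shows that two sketches built independently can be merged with an element-wise maximum, giving the same registers as a single sketch built over all the data.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HLL sketch (hypothetical class, not postgresql-hll's code)."""

    def __init__(self, p=12):
        self.p = p                    # number of index bits (assumed default)
        self.m = 1 << p               # number of registers
        self.registers = [0] * self.m

    def _hash64(self, value):
        # Assumption: the first 8 bytes of SHA-1 serve as a 64-bit hash.
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, value):
        h = self._hash64(value)
        width = 64 - self.p
        idx = h >> width                      # top p bits pick a register
        rest = h & ((1 << width) - 1)         # remaining bits
        rank = width - rest.bit_length() + 1  # 1-based position of first 1-bit
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        # Union of two sketches: element-wise maximum of the registers.
        if self.p != other.p:
            raise ValueError("sketches must use the same precision")
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)      # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)    # small-range (linear counting) correction
        return raw


# Two "nodes" each sketch their own slice of overlapping user IDs.
node_a, node_b = HyperLogLog(), HyperLogLog()
for user_id in range(0, 20000):
    node_a.add(user_id)
for user_id in range(10000, 30000):
    node_b.add(user_id)

# Approximate distinct count across both nodes (true value: 30000).
print(round(node_a.merge(node_b).count()))
```

With 2^12 registers the expected relative standard error is about 1.04 / sqrt(4096), roughly 1.6%. Because merging only needs the registers, each node can ship a small fixed-size sketch instead of its raw values.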

RSVP NOTE: If your Meetup profile doesn't show your full name, please contact the organizers via Direct Message to give us your first and last name. We have to provide a list to the building for access. If you do not provide your full name, you will not be able to attend. Thanks!

Citus Data

599 3rd Street