Building data abstractions with streaming at Yelp
Hakampreet Singh Pandher, Software Engineer
Mar 8, 2024
Yelp relies heavily on streaming to synchronize enormous volumes of data in real time. This is facilitated by Yelp’s underlying data pipeline infrastructure, which manages the real-time flow of millions of messages originating from a plethora of services. This blog post covers how we leverage Yelp’s extensive streaming infrastructure to build robust data abstractions for our offline and streaming data consumers. We will use Yelp’s Business Properties ecosystem (explained in the upcoming sections) as an example.

Key terminology

Let’s start by covering certain key terms used throughout the post:

Offline systems - data warehousing platforms such as AWS Redshift or...