Supporting Spark as a First-Class Citizen in Yelp’s Computing Platform
- Jason Sleight, ML Platform Group Tech Lead; Huadong Liu, Software Engineer; and Stuart Elston, Software Engineer
- Mar 2, 2020
Yelp extensively utilizes distributed batch processing for a diverse set of problems and workflows. Some examples include:

- Computation over Yelp’s review corpus to identify restaurants that have great views
- Training ML models to predict personalized business collections for individual users
- Analytics to extract the most in-demand service offerings for Request a Quote projects
- On-demand workloads to investigate surges in bot traffic so we can quickly react to keep Yelp safe

Over the past two years, Yelp engineering has undertaken a series of projects to consolidate our batch processing technologies and standardize on Apache Spark. These projects aimed to simultaneously accelerate...