Engineering Blog

Lessons from A/B Testing on Bandit Subjects

Abstract: Compared to full-scale ML, a multi-armed bandit is a lighter-weight solution that can help teams quickly optimize their product features without major commitments. However, bandits need a candidate selection step when they have too many items to choose from. Using A/B testing to optimize the candidate selection step introduces two biases: new bandit bias and convergence selection bias. New bandit bias occurs when we compare new bandits with established ones in an experiment; convergence selection bias creeps in when we try to solve the new bandit bias by defining and selecting established bandits. We discuss our...


Spark Data Lineage

In this blog post, we introduce Spark-Lineage, an in-house product that tracks and visualizes how data at Yelp is processed, stored, and transferred among our services. What is Spark-Lineage? Spark and Spark-ETL: At Yelp, Spark is considered a first-class citizen, handling batch jobs in all corners, from crunching reviews to identify similar restaurants in the same area to running reporting analytics on optimizing local business search. Spark-ETL is our in-house wrapper around Spark, providing high-level APIs to run Spark batch jobs and abstracting away the complexity of Spark. Spark-ETL is used extensively at Yelp, helping save time that our engineers...


Android in Analytics Infra

At Yelp, we have a reasonably large Android community for a company of our size. These talented Android engineers work on Yelp’s client and business applications. We would like to share some of the unique challenges that we’ve experienced, along with our efforts to overcome them. Analytics Infra is a team at Yelp that works on experimentation and logging platforms and supports them across the entire Yelp ecosystem. Within the Analytics Infra team, we have an Android working group. You may consider our team as an infrastructure team - a team that implements end-user functionality -...


Writing Emails Using React

As part of our effort to connect users with great local businesses, Yelp sends out tens of millions of emails every month. To support the scale of those sends, we rely on third-party Email Service Providers (ESPs) as well as our internal email system, Mercury. Delivering the emails is just part of the challenge: we also need to give email developers a way to craft sophisticated templates that conform to our Yelp design guidelines. In the past, Yelp web and full stack engineers relied on our legacy template language, Cheetah, to write emails. However, as the Yelp design...


Migrating from Styleguidist to Storybook

One of the core tenets for our infrastructure and engineering effectiveness teams at Yelp is ensuring we have a best-in-class developer experience. Our React monorepo codebase has steadily grown as developers create new React components, but our existing React Styleguidist (Styleguidist, for short) development environment has failed to scale with it. By transitioning from Styleguidist to Storybook, we were able to offer a faster and more user-friendly development environment for React components, along with better alignment to developer and designer workflows. In this post, we’ll take a deep dive into how and why we migrated to Storybook. Background Status Quo...


A Simply, Ordinary Reduction

Experimentation has become standard practice for companies, and one of its most important aspects is evaluating the results to make ship/no-ship decisions. Have you run into experiments where you don’t have enough data for statistically significant results, or where the performance of your primary metric seemingly disagrees with that of your secondary metrics? If so, leveraging existing features to perform variance reduction may help you reach a conclusion. At Yelp, we have found that features typically used in ML modeling, in particular, can help measure treatment effects better than t-tests alone! Introduction Before deciding...


Data Sanitization with Vitess

Our community of users will always come first, which is why Yelp takes significant measures to protect sensitive user information. In this spirit, the Database Reliability Engineering team implemented a data sanitization process long ago to prevent any sensitive information from leaving the production environment. The data sanitization process still enables developers to test new features and asynchronous jobs against a complete, real-time dataset without complicated data imports. Innovations in MySQL and other open-source projects over the last decade have led us on a journey to Vitess, which is now responsible for over 1500 workflows across more than 100...


Beyond Matrix Factorization: Using hybrid features for user-business recommendations

Yelp’s mission is to connect people with great local businesses. On the Recommendations & Discovery team, we sift through billions of user-business interactions to learn user preferences. Our solutions power several products across Yelp, such as personalized push notifications, email engagement campaigns, the home feed, Collections, and more. Here we discuss the generalized user-to-business recommendation model, which is crucial to many of these applications. High-level overview of our recommendation system. Our previous approach for user-to-business recommendation was based on Spark’s Alternating Least Squares (ALS) algorithm, which factorized the user-business interaction matrix into user-vectors and...
