Engineering Blog

Fine-tuning AWS ASGs with Attribute Based Instance Selection

This is the next installment of our blog series on improving our autoscaling infrastructure. In the previous blog posts (Open-sourcing Clusterman, Recycling kubernetes nodes) we explained the architecture and inner workings of Clusterman. This time we discuss how attribute-based instance selection in the autoscaling group has helped us make our infrastructure more reliable and cost-effective while also decreasing operational overhead. This post also covers how these changes enabled us to migrate from Clusterman to Karpenter. (Spoiler alert: a Karpenter blog post is coming soon!)

Motivation

At Yelp we run most of our workload on AWS spot instances, and...

Moderating Inappropriate Video Content at Yelp

One of Yelp’s top priorities is the trust and safety of our users. Yelp’s platform is best known for its reviews, and its moderation practices have been recognised in academic research for mitigating misinformation and building consumer trust. In addition to reviews, Yelp’s Trust and Safety team takes significant measures to protect its users from inappropriate material posted through other content types. This blog post discusses how Yelp protects its users from inappropriate content in videos.

Videos at Yelp

Recently, Yelp revamped its review experience by giving users the ability to upload videos alongside their review text....

Phone Number Masking for Yelp Services Projects

In this blog post, we highlight how phone number masking helps build consumer trust in the services marketplace at Yelp, decreases friction in communication with service professionals, and allows for seamless switching between the Yelp app and a user’s phone. We present a high-level overview of our in-house phone masking system and dive into the details of the engineering challenge of optimizing the usage of proxy phone number resources at Yelp’s scale.

Background

Every year, millions of requests for quotes, consultations or other messages are sent to businesses on Yelp. Those users choose to use Yelp to connect...

CHAOS: Yelp's Unified Framework for Server-Driven UI

Yelp develops two major applications, Yelp & Yelp for Business, for Web (Desktop & Mobile), iOS, and Android platforms. That’s eight unique clients! Keeping a fresh, consistent UI on all these clients is a major challenge. Server-driven UI (SDUI) has become a standard industry technique for managing UI on multiple platforms. At Yelp, many product teams created their own SDUI frameworks for their features. Though successful, these frameworks were expensive to develop and maintain, and no single SDUI framework supported all our clients. In late 2021, we began building a unified SDUI framework called CHAOS, or “Content Hosting Architecture with Optimization Strategies”....

Keeping track of engineering-wide goals and migrations

What is Engineering Effectiveness Metrics (EE Metrics)?

EE Metrics was envisioned as a hub that helps teams manage their technical debt. EE Metrics provides every team with a detailed web page that contains information about technical debt that needs to be addressed. It also serves as a platform to highlight top engineering initiatives at the organization level. EE Metrics empowers infrastructure teams to surface important migrations or metrics that could improve the health of software projects. Organization-wide migrations of technologies can often be difficult to surface and keep track of.

General EE Metrics lifecycle

Figure 1: Diagram showing how EE...

Yelp’s AI pipeline for inappropriate language detection in reviews

Yelp’s mission is to connect consumers with great local businesses by giving them access to reliable and useful information. Consumer trust is one of our top priorities, which is why we make significant investments in technology and human moderation to protect the integrity and quality of content on Yelp. As a platform for user-generated content, we rely on our community of users and business owners to help report reviews that they believe may violate our Terms of Service and Content Guidelines. Our User Operations team investigates flagged content and, if it’s found to be in violation of our policies, may...

Building data abstractions with streaming at Yelp

Yelp relies heavily on streaming to synchronize enormous volumes of data in real time. This is facilitated by Yelp’s underlying data pipeline infrastructure, which manages the real-time flow of millions of messages originating from a plethora of services. This blog post covers how we leverage Yelp’s extensive streaming infrastructure to build robust data abstractions for our offline and streaming data consumers. We will use Yelp’s Business Properties ecosystem (explained in the upcoming sections) as an example.

Key terminology

Let’s start by covering certain key terms used throughout the post:

Offline systems - data warehousing platforms such as AWS Redshift or...

Coordinator - The Gateway For Nrtsearch

While we once used Elasticsearch at Yelp, we have since built a replacement called Nrtsearch. The benefits and motivations of this switch can be found in our blog post: Nrtsearch: Yelp’s Fast, Scalable and Cost Effective Search Engine. In this blog post, however, we discuss the motivations behind building Nrtsearch Coordinator - a gateway for Nrtsearch clusters. We also go over how Nrtsearch Coordinator adds sharding logic to Nrtsearch, handles scatter-gather queries, and adds support for dark/live launching cluster improvements.

Motivations

We traditionally used a gateway to call Elasticsearch, which provides metrics, isolation rate-limiting per client, and geo...
