Engineering Blog

Introducing Bento

Today we’re proud to introduce Bento, an open source framework for building modularized Android user interfaces, created here at Yelp. Over the past year, we’ve seen great developer productivity gains and product design flexibility from using Bento on our most critical screens. In this post we’ll explain a bit about how Bento works, why you might want to use it, and where we want to go next. What is Bento? We named this framework after the wonderfully compartmentalized Japanese lunch container. A Bento box is a container with dividers to separate different food items from each other. If you squint...


Jira & Ansible: Scaling Jira Server Administration for the Enterprise

In 2017, Yelp had over 40 Jira administrators to allow different teams across the organization to perform administrative tasks. With lots of admins came lots of changes, which led to our Jira environment accumulating hundreds of orphaned workflows, screens, and schemes. To solve this problem, we built a scalable solution that empowers our engineers to create Jira projects themselves using code and source control – ensuring 100% standardization across all engineering projects and making Jira easier to manage, simpler to use, and better performing. This blog post will cover Yelp's use of Ansible to manage Jira server project creation, updates,...


Tech Intersections Conference

This year Yelp sponsored the second annual Tech Intersections conference in Oakland, CA. It was a great opportunity to celebrate womxn of color in tech and to come together and learn from each other’s successes, challenges, and experiences. The conference, which featured ALL womxn speakers and attendees, highlighted topics ranging from tech entrepreneurship to self-care and career skills. [Photo: Kelly Greenia, Engineering Recruiter, with some Yelp swag] Two members of Yelp’s Awesome Women in Engineering (AWE) group attended the conference and below are some of their takeaways. This past weekend, we had the opportunity to attend the Tech Intersections conference at...


Autoscaling Mesos Clusters with Clusterman

Here at Yelp, we host a lot of servers in the cloud. In order to make our website more reliable—yet cost-efficient during periods of low utilization—we need to be able to autoscale clusters based on usage metrics. There are quite a few existing technologies for this purpose, but none of them really meet our needs of autoscaling extremely diverse workloads (microservices, machine learning jobs, etc.) at Yelp’s scale. In this post, we’ll describe our new in-house autoscaler called Clusterman (the “Cluster Manager”) and its magical ability to unify autoscaling resource requests for diverse workloads. We’ll also describe the Clusterman simulator,...


Yelp Dataset Challenge: Round 11 Winners

The eleventh round of the Yelp Dataset Challenge ran throughout the first half of 2018 and we received many impressive, original, and fascinating submissions. As usual, we were struck by the quality of the entries: keep up the good work, folks! Today, we are proud to announce the grand prize winner of the $5,000 award: “Generalized Latent Variable Recovery for Generative Adversarial Networks” by Nicholas Egan, Jeffrey Zhang, and Kevin Shen (from the Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science). The authors used a Deep Convolutional Generative Adversarial Network (DCGAN) to create photo-realistic pictures of food...


Migrating Kafka's Zookeeper With No Downtime

Here at Yelp we use Kafka extensively. In fact, we send billions of messages a day through our various clusters. Behind the scenes, Kafka uses Zookeeper for various distributed coordination tasks, such as deciding which Kafka broker is in charge of assigning partition leaders and storing metadata about the topics in its brokers. Kafka’s success within Yelp has also meant that our clusters have grown substantially from when they were first deployed. At the same time, our other heavy Zookeeper users (e.g., SmartStack and PaaSTA) have increased in scale, putting more load on our shared Zookeeper clusters. To alleviate this situation, we...


Joinery: A Tale of Un-Windowed Joins

Summary: At Yelp, we generate a wide array of high-throughput data streams spanning logs, business data, and application data. These streams need to be joined, filtered, aggregated, and sometimes even quickly transformed. To facilitate this process, the engineering team has invested a significant amount of time analyzing multiple stream processing frameworks, ultimately identifying Apache Flink as the best-suited option for these scenarios. We’ve now implemented a join algorithm using Flink, which we’re calling “Joinery.” It is capable of performing un-windowed one-to-one, one-to-many, and many-to-many inner joins across two or more keyed data streams. So, how does it work? In the...


TTL as a Service: Automatic Revocation of Stale Privileges

Security and usability are often at odds with one another, a fact that is best illustrated by access control. Deny everyone, and you’ll have a super secure system that no one can use; allow everyone, and you’ll maximize usability at the cost of security. The Principle of Least Privilege exists to balance both security and usability by giving users only the minimum amount of access they need to do their job. This reduces the attack surface by preventing attackers from leveraging a compromised user’s important, albeit unused, privileges for vertical/horizontal escalation. The Problem That said, there are a few key...
