Engineering Blog

Announcing the Yelp Dataset Challenge Round 7

What can you learn from a photo? Show us with the Yelp Dataset Challenge Round 7! The Challenge: The Yelp Dataset Challenge provides the academic community with a real-world dataset to which they can apply their research. We encourage students to take advantage of this wealth of data to develop and extend their own research in data science and machine learning. Students who submit their research are eligible for cash awards and incentives for publishing and presenting their findings. A new round of the Yelp Dataset Challenge (our seventh already!) opened on January 15, 2016, giving students access to reviews and...

Continue reading

Yelp Dataset Challenge Round 5 Winner

The fifth round of the Yelp Dataset Challenge ran throughout the first half of 2015, and we were quite impressed with the projects and concepts that came out of the challenge. Today, we are proud to announce the grand prize winner of the $5,000 award: “From Group to Individual Labels Using Deep Features” by Dimitrios Kotzias, Misha Denil, Nando De Freitas, and Padhraic Smyth (from the University of California, Irvine, the University of Oxford, and the Canadian Institute for Advanced Research). This paper proposes a novel approach to using group-level labels (e.g. the category...

Continue reading

Critical CSS Middleware: Inlining The Important CSS rules On-The-Fly

Website performance can be judged in a lot of ways, but perhaps the most important is user-perceived performance: the time between clicking a link and seeing the desired page rendered on the screen. A big part of keeping things feeling snappy is understanding which bits of content are blocking the “critical rendering path,” and coming up with ways to shorten or unblock them. At Yelp we focused on shortening the process of loading our CSS stylesheets. Before the browser can begin rendering the page, it needs to have its HTML markup and CSS rules. Usually,...

Continue reading

Introducing dumb-init, an init system for Docker containers

At Yelp we use Docker containers everywhere: we run tests in them, build tools around them, and even deploy them into production. In this post we introduce dumb-init, a simple init system written in C which we use inside our containers. Lightweight containers have made running a single process without normal init systems like systemd or sysvinit practical. However, omitting an init system often leads to incorrect handling of processes and signals, and can result in problems such as containers which can’t be gracefully stopped, or leaking containers which should have been destroyed. dumb-init is simple to use and solves...

Continue reading

Introducing the Yelp Restaurant Photo Classification Challenge

We’re excited to release our first image dataset with hundreds of thousands of user-submitted photos as part of a challenge to all data scientists, launching this week on Kaggle! Yelp’s users provide several kinds of “unstructured” data such as reviews, photos, and videos. They can also answer structured questions like, “Is the restaurant romantic?” These structured answers are incredibly useful to users who want a quick summary of important attributes of a business. We want to know: can you extract these attributes from our photos dataset, and what is the right way to approach this problem? If this type of...

Continue reading

How We Made Yelp Search Filters Data Driven

Yelp has an incredible amount of information about businesses to power local search experiences. Many of our users rely on our advanced search filters to explore and find exactly the place they are looking for. While most people have no trouble narrowing their searches by price, distance, and rating, it was harder for users to employ our more specialized filters such as “Outdoor Seating” or “Live Music”. We set off on a mission to make our advanced filters more approachable for casual users without hindering the experience for our advanced users. Before designing the new filters, we...

Continue reading

It’s The Holiday Season and We’re Giving You A Present: PaaSTA!

In case you missed it last month, we open sourced PaaSTA, our platform-as-a-service, which was met with a lot of excitement and support from the community. We want to bring our friends together to learn more, so we’re hosting our last Yelp Tech Talk of the year in conjunction with Mesosphere on December 17th, featuring PaaSTA + Mesos! Our lead engineer on PaaSTA, Kyle Anderson, will introduce the platform and speak to how it uses Mesos to power our SOA, while Sunil Shah, from Mesosphere, will speak on Mesosphere’s DCOS and how it enables PaaS tools like PaaSTA to exist. We’ll...

Continue reading

Introducing PaaSTA: An Open, Distributed, Platform as a Service

As an Operations engineer, my first priority is to keep the site up. A close second is enabling developers to quickly go from an idea to running code in production. Once an organization grows, the only sane way to ship code with any reasonable frequency is to split it up into microservices, also known as building a Service Oriented Architecture (SOA). We’ve previously talked about the philosophy behind services and why we build them. This blog post explains the tool we use to make those services available to developers: PaaSTA! What is PaaSTA? PaaSTA is Yelp’s platform-as-a-service. It allows developers...

Continue reading