Engineering Blog

Towards Building a High-Quality Workforce with Mechanical Turk

In addition to having written over 15 million reviews, Yelpers also contribute hundreds of thousands of business listing corrections each year.  Not all of these corrections are accurate, though, and there are quite a few jokers out there (e.g. suggesting the aquariums category for popular seafood restaurants… very funny!).  Yelp is serious about the correctness of business listings, so in order to efficiently validate each and every change, we’ve turned to Amazon’s Mechanical Turk (AMT) as well as other automated methods.  We recently published a research paper [1] at the NIPS 2010 Workshop on Computational Social Science and the Wisdom...


Yelp's Third Hackathon

Click your heels together and say the magic words three times: “there’s no place like Yelp…there’s no place like Yelp…there’s no place like Yelp.” Then perhaps you’ll be whisked away to the land of Yelp Engineering where our third Hackathon just wrapped up. Like the previous Hackathon and the one before that, this 48-hour period where engineers were set loose to work on “anything you wouldn’t normally be able to work on that might be useful, funny, or cool” (and ostensibly, related at least indirectly to Yelp) resulted in some awesome creations. One of the biggest milestones for our third...


mrjob: Distributed Computing for Everybody

Ever wonder how we power the People Who Viewed this Also Viewed… feature? How does Yelp know that people viewing Coit Tower might also be interested in the Filbert Steps and the Parrots of Telegraph Hill? It’s pretty much what you’d expect: we look at a few months of access logs, find sessions where people viewed more than one business, and collect statistics about pairs of businesses that were viewed in the same session. Now here’s the kicker: we generate on the order of 100GB of log data every day. How do we deal with terabytes of data in a...
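The pair-counting idea in the excerpt above can be sketched on a single machine. This is only an illustration, not Yelp's actual job and not mrjob code: the function name `count_coviewed_pairs`, the session format (a list of business IDs per session), and the sample IDs are all assumptions made for the example. At the log volumes the excerpt mentions, the same logic would have to be distributed across many machines.

```python
from collections import Counter
from itertools import combinations


def count_coviewed_pairs(sessions):
    """Count how often each pair of businesses appears in the same session.

    `sessions` is assumed to be an iterable of lists of business IDs,
    one list per browsing session (a hypothetical input format).
    """
    pair_counts = Counter()
    for viewed in sessions:
        # Deduplicate and sort so each pair has one canonical ordering.
        unique = sorted(set(viewed))
        if len(unique) < 2:
            # Sessions with a single business contribute no pairs.
            continue
        pair_counts.update(combinations(unique, 2))
    return pair_counts


# Made-up sessions for illustration only.
sessions = [
    ["coit-tower", "filbert-steps", "parrots-of-telegraph-hill"],
    ["coit-tower", "filbert-steps"],
    ["coit-tower"],
]
pair_counts = count_coviewed_pairs(sessions)
```

Sorting each session's businesses before pairing means ("coit-tower", "filbert-steps") and ("filbert-steps", "coit-tower") are counted as the same pair.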


Now Testify!

The Yelp code base has been under development for over six years. We push multiple times a day. We don’t have a dedicated QA team. Yelpers get very angry when their detailed account of how that waitress was totally into them doesn’t get saved. Effective automated testing is the only way we can stay sane. One of the great benefits of Python is its “batteries included” philosophy which gives us access to lots of great libraries including the built-in unittest module. However, as our code base grows in size and complexity, so do our testing needs. There are plenty of...


tron - Centralized Scheduling

As Yelp has grown over the years, we’ve amassed huge collections of data - the collective output of the actions of tens of millions of users. Analyzing this data helps us improve the user experience across the site, from ranking businesses and extracting “review highlights” to fighting spam and keeping the site secure. These tasks involve long-running batch processes that analyze large logs and database tables, with workflows sometimes composed of five or more dependent processes. Managing these workflows with traditional scheduling tools - most notably the cron family - eventually caused us to breach the “complexity comfort zone” that...
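To make "workflows composed of dependent processes" concrete, here is a minimal sketch of ordering jobs so that each one runs only after the jobs it depends on. It uses Python's standard-library `graphlib` (Python 3.9+) and is not tron's configuration format or API; the job names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs it depends on.
workflow = {
    "parse_logs": set(),
    "build_sessions": {"parse_logs"},
    "rank_businesses": {"build_sessions"},
    "extract_highlights": {"parse_logs"},
    "publish_report": {"rank_businesses", "extract_highlights"},
}

# static_order() yields every job after all of its dependencies.
run_order = list(TopologicalSorter(workflow).static_order())

for job in run_order:
    print("would run:", job)
```

A plain cron table has no notion of these dependencies, so each job's start time has to be guessed; an explicit dependency graph lets a scheduler derive a valid order instead.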


Push it!

At Yelp, we push new code live every day. Pushing daily allows us to quickly prototype new features and squash bugs in a proactive manner. Because we aim to deploy new code so often, we’re always looking for ways to make the process efficient and painless. There are four main stages to the Yelp push process: code review, integration, testing, and finally, live deployment. Each step is important, and there are ways to maximize the efficiency of all of them.

Code Review

This first stage of the push process happens before a push request is even made: all code destined...


What the Engineers-in-Training Work on at Yelp

This summer, we were fortunate to have 7 awesome interns join us for a 12-week internship at Yelp HQ. Coming from elite schools across the nation, these interns weren’t fetching us coffee or making copies. They were working side by side with our best and brightest on some of the most important and critical pieces of Yelp’s business. We had them coding away on everything from search infrastructure to user-visible features. We couldn’t be prouder of the work they did, and now we’d like to share their accomplishments with you!

Aditya M.: Bachelor's student at Carnegie Mellon

Aditya tackled...


Yelp Makes Beer More Fun

Here at Yelp, we enjoy a good brew. We also love our jobs as engineers! So for our recent Hackathon, it made perfect sense to build something that made drinking beer more…interactive. We call it KegMate. Being that you’ve found yourself on our Engineering blog, I bet you’re interested in how it works, eh?

Theory of Operation (An Overview)

Sensors attached to the keg feed data into an Arduino microcontroller, which in turn communicates directly with the iPad via a serial connection. The iPad processes that data and displays it in a snazzy manner along with a description of...
