Engineering Blog

mrjob: Distributed Computing for Everybody

Ever wonder how we power the “People Who Viewed this Also Viewed…” feature? How does Yelp know that people viewing Coit Tower might also be interested in the Filbert Steps and the Parrots of Telegraph Hill? It’s pretty much what you’d expect: we look at a few months of access logs, find sessions where people viewed more than one business, and collect statistics about pairs of businesses that were viewed in the same session. Now here’s the kicker: we generate on the order of 100 GB of log data every day. How do we deal with terabytes of data in a...
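
The pair-counting idea in the teaser above can be sketched in a few lines of plain Python. This is only an illustration, not Yelp's code and not an mrjob job: it runs in memory on one machine, and the function name `count_co_views`, the session format (a list of business IDs per session), and the sample IDs are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def count_co_views(sessions):
    """Count how often each unordered pair of businesses shares a session.

    `sessions` is assumed to be an iterable of lists of business IDs,
    one list per browsing session (a hypothetical format).
    """
    pair_counts = Counter()
    for viewed in sessions:
        # Deduplicate and sort so (a, b) and (b, a) count as the same pair.
        unique = sorted(set(viewed))
        if len(unique) < 2:
            # A session with a single business contributes no pairs.
            continue
        pair_counts.update(combinations(unique, 2))
    return pair_counts


# Hypothetical sample data, for illustration only.
sessions = [
    ["coit-tower", "filbert-steps", "parrots-of-telegraph-hill"],
    ["coit-tower", "filbert-steps"],
    ["coit-tower"],
]
counts = count_co_views(sessions)
for pair, n in counts.most_common():
    print(pair, n)
```

At 100 GB of logs per day the counts would not fit this single-process shape, which is presumably where a distributed framework comes in; the sketch only shows the per-session logic.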

Now Testify!

The Yelp code base has been under development for over six years. We push multiple times a day. We don’t have a dedicated QA team. Yelpers get very angry when their detailed account of how that waitress was totally into them doesn’t get saved. Effective automated testing is the only way we can stay sane. One of the great benefits of Python is its “batteries included” philosophy, which gives us access to lots of great libraries, including the built-in unittest module. However, as our code base grows in size and complexity, so do our testing needs. There are plenty of...

tron - Centralized Scheduling

As Yelp has grown over the years, we’ve amassed huge collections of data - the collective output of the actions of tens of millions of users. Analyzing this data helps us improve the user experience across the site, from ranking businesses and extracting “review highlights” to fighting spam and keeping the site secure. These tasks involve long-running batch processes that analyze large logs and database tables, with workflows sometimes composed of five or more dependent processes. Managing these workflows with traditional scheduling tools - most notably the cron family - eventually caused us to breach the “complexity comfort zone” that...

Push it!

At Yelp, we push new code live every day. Pushing daily allows us to quickly prototype new features and squash bugs in a proactive manner. Because we aim to deploy new code so often, we’re always looking for ways to make the process efficient and painless. There are four main stages to the Yelp push process: code review, integration, testing, and finally, live deployment. Each step is important, and there are ways to maximize the efficiency of all of them. Code Review: This first stage of the push process happens before a push request is even made: all code destined...

What the Engineers-in-Training Work on at Yelp

This summer, we were fortunate to have 7 awesome interns join us for a 12-week internship at Yelp HQ. Coming from elite schools across the nation, these interns weren’t fetching us coffee or making copies. They were working side by side with our best and brightest on some of the most important and critical pieces of Yelp’s business. We had them coding away on everything from search infrastructure to user-visible features. We couldn’t be prouder of the work they did, and now we’d like to share their accomplishments with you! Aditya M., Bachelor’s student at Carnegie Mellon: Aditya tackled...

Yelp Makes Beer More Fun

Here at Yelp, we enjoy a good brew. We also love our jobs as engineers! So for our recent Hackathon, it made perfect sense to build something that made drinking beer more…interactive. We call it KegMate. Being that you’ve found yourself on our Engineering blog, I bet you’re interested in how it works, eh? Theory of Operation (An Overview): Sensors attached to the keg feed data into an Arduino microcontroller, which in turn communicates directly with the iPad via a serial connection. The iPad processes that data and displays it in a snazzy manner along with a description of...

Yelp Engineers Discover Blogging

For six years we’ve been hammering away on this little website of ours, feeding the needs of a steadily growing user-base with awesome features, freaky-good SEO and an elegant, performant architecture. What started out as a couple of guys in a room together is now nearly forty engineers developing a diverse set of systems that serve over 35 million unique visitors per month. We’ve come a long way in that time and covered a lot of ground — but our source control log still has blame lines from PayPal co-founder and initial Yelp investor Max Levchin, so it’d be hard...
