Engineering Blog

MySQL Minutiae & InnoDB Internals

At Yelp, we store nearly all of our data in MySQL. At any given time we’re issuing tens of thousands of SQL queries per second to our database cluster, with some individual servers going above the 10k qps mark. The cluster holds billions of rows. In response to a lot of different problems we’ve had to optimize the snot out of our MySQL installation, and we’ve learned some interesting things along the way. A colleague and I recently gave a presentation to some of our coworkers, titled “MySQL Minutiae & InnoDB Internals.” The talk covered some good background knowledge that every developer should...


After Hours Project: Kinect Hacking

Here at Yelp, we’re passionate about building things; it’s at the core of our engineering philosophy. In fact, we enjoy it so much that many of us keep on building after we finish work. I recently found some spare time to work on an interesting project with the Microsoft Kinect. I think it’s a cool start and I’ve open sourced the code so that others can build something even cooler.

Easy Skeletal Tracking

If you’re reading this blog, you’re likely familiar with the Microsoft Kinect. It combines an RGB camera, an infrared laser projector, and an infrared camera to determine...


Upcoming Tech Events at Yelp

Over the next couple of weeks, Yelp is going to be hosting two open-to-the-public events for members of the software development community:

PyPy Just-in-Time Interpreters
March 3rd, 6pm

Armin Rigo of the PyPy project will be giving a presentation on achievements made by PyPy, the “fastest, most compatible, and most stable ‘alternative’ Python interpreter.” Special attention will be given to advancements in the area of dynamic (JIT) interpreters. You can find more information on SFPython’s Meetup page. If you plan to come, make sure to RSVP at least a day in advance so that security will allow you into the...


Weird iPhone Compiler/Architecture Bug

Alan on our iPhone team recently encountered a tricky bug that had to do with the Objective-C compiler and differences in architecture between ARM and x86.

The Problem

On the device (but not the simulator), the following code would crash on views that implemented drawInRect methods that returned a CGPoint (instead of having a void return type like other drawInRect methods).

    for (id view in _subviews) {
        [view drawInRect:[view frame]];
    }

To make matters weirder, the pointer ‘view’ before stepping into drawInRect was not the same as the pointer ‘self’ after stepping into drawInRect! However, by typecasting view to the...


Towards Building a High-Quality Workforce with Mechanical Turk

In addition to having written over 15 million reviews, Yelpers also contribute hundreds of thousands of business listing corrections each year. Not all of these corrections are accurate, though, and there are quite a few jokers out there (e.g. suggesting the aquariums category for popular seafood restaurants… very funny!). Yelp is serious about the correctness of business listings, so in order to efficiently validate each and every change, we’ve turned to Amazon’s Mechanical Turk (AMT) as well as other automated methods. We recently published a research paper [1] at the NIPS 2010 Workshop on Computational Social Science and the Wisdom...


Yelp's Third Hackathon

Click your heels together and say the magic words three times: “there’s no place like Yelp…there’s no place like Yelp…there’s no place like Yelp.” Then perhaps you’ll be whisked away to the land of Yelp Engineering where our third Hackathon just wrapped up. Like the previous Hackathon and the one before that, this 48-hour period, in which engineers were set loose to work on “anything you wouldn’t normally be able to work on that might be useful, funny, or cool” (and, ostensibly, at least indirectly related to Yelp), resulted in some awesome creations. One of the biggest milestones for our third...


mrjob: Distributed Computing for Everybody

Ever wonder how we power the People Who Viewed this Also Viewed… feature? How does Yelp know that people viewing Coit Tower might also be interested in the Filbert Steps and the Parrots of Telegraph Hill? It’s pretty much what you’d expect: we look at a few months of access logs, find sessions where people viewed more than one business, and collect statistics about pairs of businesses that were viewed in the same session. Now here’s the kicker: we generate on the order of 100GB of log data every day. How do we deal with terabytes of data in a...
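
To make the pair-counting idea concrete, here is a minimal single-machine sketch in plain Python. It is not the production job and does not use mrjob; the function names (`count_coviewed_pairs`, `also_viewed`), the business identifiers, and the assumption that the logs have already been grouped into per-session lists of viewed businesses are all made up for illustration.

```python
from collections import Counter
from itertools import combinations


def count_coviewed_pairs(sessions):
    """Count how often each unordered pair of businesses shares a session.

    `sessions` is an iterable of lists of business ids (a hypothetical,
    pre-grouped form of the access logs).
    """
    pair_counts = Counter()
    for viewed in sessions:
        # De-duplicate and sort so each pair has one canonical ordering.
        unique = sorted(set(viewed))
        if len(unique) < 2:
            continue  # sessions with a single business contribute no pairs
        pair_counts.update(combinations(unique, 2))
    return pair_counts


def also_viewed(business, pair_counts, top_n=3):
    """Return up to `top_n` (other_business, count) tuples, most co-viewed first."""
    related = Counter()
    for (a, b), count in pair_counts.items():
        if a == business:
            related[b] += count
        elif b == business:
            related[a] += count
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(related.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Illustrative sessions only; these are not real log data.
sessions = [
    ["coit-tower", "filbert-steps", "parrots-of-telegraph-hill"],
    ["coit-tower", "filbert-steps"],
    ["coit-tower"],
    ["filbert-steps", "coit-tower", "coit-tower"],
]
pair_counts = count_coviewed_pairs(sessions)
recommendations = also_viewed("coit-tower", pair_counts)
```

Sorting each session's businesses before pairing means (A, B) and (B, A) land on the same key, which is what lets the counts be summed independently per session and merged afterwards.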


Now Testify!

The Yelp code base has been under development for over six years. We push multiple times a day. We don’t have a dedicated QA team. Yelpers get very angry when their detailed account of how that waitress was totally into them doesn’t get saved. Effective automated testing is the only way we can stay sane. One of the great benefits of Python is its “batteries included” philosophy which gives us access to lots of great libraries including the built-in unittest module. However, as our code base grows in size and complexity, so do our testing needs. There are plenty of...
