Engineering Blog

Yelp Tech Talks: Mobile Testing 1, 2, 3 Wrap Up

Last week we held our second tech talk, focusing on mobile, in a new internal series we launched this year. The presenters covered two very important topics in mobile: wearable apps and testing. Building Our Apple Watch App: The evening started with a talk by Bill M., who led the efforts to build our Apple Watch app. We knew we had to build the app in order to provide our users with the best possible access and experience. Since the platform was brand new and kept changing, it came with its own challenges. Development needed to be planned carefully to...

Mycroft - Load Data into Redshift Automatically

Yelp generates terabytes of logs every day. Starting in 2010 with the release of mrjob, Yelp has relied heavily on Amazon Elastic MapReduce (EMR) and MapReduce jobs to analyze this data. While MapReduce works well for repeatedly answering the same question, it’s not a great tool for answering questions that are not well defined or that need to be answered only once. Consequently, we started using Redshift, Amazon’s Postgres-compatible column-oriented data warehouse, to explore our data. Yelp’s log data already lands on S3 every day, making it a convenient location to stage data for loading into Redshift. Unfortunately, most of...

Data Science Contest “Keeping it Fresh”: Predict Restaurant Health Scores

Yelp connects people with local businesses and along the way we’ve gathered rich data about customers’ experiences at those businesses via reviews, tips, check-ins and business attributes. We are constantly asking ourselves how the collective wisdom of Yelpers can be used to better inform cities in their efforts around protecting the health of our communities. In particular, could we use Yelp’s reviews and business information to make the process of sending Health Inspectors to restaurants more efficient? According to the Centers for Disease Control, more than 48 million Americans per year become sick from food, and an estimated 75% of...

CocoaPods (or How to Stop Worrying About Dependency Management)

Yelp has had an iOS app for as long as third-party iOS apps have existed. Maintaining a codebase with that much history is always interesting and sometimes challenging, and one of the biggest challenges is dependency management. For a long time, git submodules met most of our needs and caused relatively few headaches. However, the submodule approach made it difficult to understand what unanticipated or even breaking changes would be introduced when bumping a submodule by a commit – or several. A Git SHA has no concept of versioning. Additionally, adding a new library often required changes to the build...

Yelp Hackathon 16: What the hack is a Kühlkdebeulefotoapparat

A couple of weeks ago, the Yelp Product & Engineering teams put their creative minds together to work on our coolest, funniest and hardcore-est of ideas at the 16th edition of our internal Hackathon. As always, the food was plentiful: our kitchens were stacked with delicious catered food, fresh fruits and snacks, gourmet coffee, and, wait for it, ice-cream sandwiches! While all that deliciousness fueled our bodies, our minds were fueled by do-it-yourself Metal Earth 3D model kits and metal-etching workshops. If that wasn’t enough, we even got a chance to race these amazing Anki cars. Our VP of Engineering...

Analyzing the Web For the Price of a Sandwich

I geek out about the Common Crawl. It’s an open source crawl of huge parts of the Internet, accessible for anyone to use. You have full access to the HTML and text of billions of web pages. What’s more, you can scan the entire thing, tens of terabytes, for just a few bucks on Amazon EC2. These days they’re releasing a new dataset every month. It’s awesome. People frequently use mrjob to scan the Common Crawl, so it seems like a fitting tool for us to use. mrjob, if you’re not familiar, is a Python framework written by Yelp to...

March Events @ Yelp

This month we’re ramping up and preparing for an awesome time at PyCon. We’ll be there in full force next month, so look for us at booth 606! Be sure to catch a presentation by our own Soups R. on Friday, April 10 at 12:10, where he’ll be speaking on “Data Science in Advertising: Or a future when we love ads.” In the meantime, hopefully you aren’t too sleepy from daylight saving time to attend some great events this month: Wednesday, March 11, 2015 - 6:00PM - Tech talks and PyCon Startup Row Pitches (SF Python) Thursday, March 19,...

Reading Between the Lines: How We Make Sense of Users' Searches

The Problem: People expect a lot out of search queries on Yelp. Understanding exact intent from a relatively vague text input is challenging. A few months ago, the Search Quality team felt like we needed to take a step back and reassess how we were thinking about a user’s search so that we could return better results for a richer set of searches. Our main business search stack takes into account many kinds of features that can each be classified as being related to one of distance, quality and relevance. However, sometimes these signals encode related meaning, making the equivalence...
