The Challenge

The inaugural Yelp Dataset Challenge opened in March 2013 with the release of our latest academic dataset featuring reviews and businesses from the greater Phoenix metro area. The goal of the dataset was to encourage development of new techniques in data analysis and machine learning while providing the academic community with a rich dataset over which to train their models. Students who submitted their research related to the dataset were eligible for a cash reward and further incentives for publishing and presenting their findings.

The Winners of the First Yelp Dataset Challenge

The challenge was viewed by many thousands of people, and thousands of qualified applicants participated by downloading the dataset. From the completed entries we received, a team of our data mining engineers has selected four entries as grand prize winners (in alphabetical order by entry name):

Most of the entries used some aspect of machine learning, from inferring subtopics (Huang et al. and McAuley et al.) to predicting future reviews (Hood et al.). We also received entries that took other approaches, including one of the winners, Wang et al., which applied word clouds to increase the utility of a large number of reviews.

Opening the Next Yelp Dataset Challenge

We are happy to announce the next iteration of the Yelp Dataset Challenge. The challenge will be open to students in the US and Canada and will run from September 26, 2013 to February 10, 2014. See the website for the full terms and conditions.

This data can be used to train a myriad of models and extend research in many fields. We can’t wait to see what you come up with!
