February 2011

02/24/2011

Upcoming Tech Events at Yelp

Over the next couple of weeks, Yelp is going to be hosting two open-to-the-public events for members of the software development community:

PyPy Just-in-Time Interpreters

March 3rd, 6pm

Armin Rigo of the PyPy project will be giving a presentation on achievements made by PyPy, the "fastest, most compatible, and most stable 'alternative' Python interpreter." Special attention will be given to advancements in the area of dynamic (JIT) interpreters. You can find more information on SFPython's Meetup page. If you plan to come, make sure to RSVP at least a day in advance so that security will allow you into the building.

San Francisco Hadoop User Meetup

March 9th, 6pm

The third SF Hadoop meetup will be taking place at Yelp! The meetup is discussion-based, using an "unconference" format. Agenda and topics are determined at the beginning of the meeting (and anyone may volunteer a topic), but a proposed theme for the upcoming session is "integration." The session is expected to last approximately 2 hours, and more information is also available on the SF Hadoop Meetup page. Again, please make sure to RSVP at least a day in advance in order to be admitted past security.

02/18/2011

Towards Building a High-Quality Workforce with Mechanical Turk

In addition to having written over 15 million reviews, Yelpers also contribute hundreds of thousands of business listing corrections each year.  Not all of these corrections are accurate, though, and there are quite a few jokers out there (e.g. suggesting the aquariums category for popular seafood restaurants... very funny!).  Yelp is serious about the correctness of business listings, so in order to efficiently validate each and every change, we’ve turned to Amazon’s Mechanical Turk (AMT) as well as other automated methods.  We recently published a research paper [1] at the NIPS 2010 Workshop on Computational Social Science and the Wisdom of Crowds reporting on our experiences.

Vetting Workers On Test Tasks
Our experiences agree with several other studies in finding that the AMT workforce has many high-quality workers but also many spammers who don’t perform tasks reliably.  In particular, only 35.6% of workers passed our basic multiple-choice pre-screening test.  We used expert-labeled corrections in order to test worker performance and found that the variance of worker accuracies was very high:

[Figure: worker accuracies on the expert-labeled test corrections]

Please see our paper for a full discussion of our observations. Previous studies have proposed mechanisms to correct for the sort of worker biases we observed. However, these mechanisms correct results as a post-processing step, after workers have already been paid for completing all tasks. Given our experiences and financial goals, we find that a useful mechanism must vet workers online, as they complete tasks.
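To make the idea of online vetting a little more concrete, here is a minimal Python sketch. It is an illustration only and not the mechanism described in the paper: it assumes expert-labeled test tasks are mixed in with regular tasks, keeps a running accuracy per worker, and stops assigning work once a worker falls below a threshold. The class names, the `min_test_tasks` and `min_accuracy` parameters, and their default values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerRecord:
    """Running tally of one worker's answers on expert-labeled test tasks."""
    correct: int = 0
    attempted: int = 0

    @property
    def accuracy(self) -> float:
        # Accuracy is undefined before any test task; report 0.0 in that case.
        return self.correct / self.attempted if self.attempted else 0.0


@dataclass
class OnlineVetter:
    """Hypothetical online vetting rule based on interleaved test tasks."""
    min_test_tasks: int = 5    # assumed: test tasks required before judging
    min_accuracy: float = 0.8  # assumed: accuracy needed to keep working
    records: dict = field(default_factory=dict)

    def record_test_answer(self, worker_id, answer, expert_label):
        """Update a worker's tally after they answer an expert-labeled task."""
        rec = self.records.setdefault(worker_id, WorkerRecord())
        rec.attempted += 1
        rec.correct += int(answer == expert_label)

    def is_allowed(self, worker_id) -> bool:
        """Decide whether to keep assigning (and paying for) tasks."""
        rec = self.records.get(worker_id)
        if rec is None or rec.attempted < self.min_test_tasks:
            return True  # not enough evidence yet, so keep the worker
        return rec.accuracy >= self.min_accuracy
```

Calling `is_allowed` before each new assignment means an unreliable worker is cut off after a handful of test tasks, rather than being discovered in post-processing once every task has been paid for. A real system would also need to choose how often to interleave test tasks and how to account for the uncertainty of a small sample, which this sketch ignores.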
