Sudarshan G., the current lead engineer on mrjob at Yelp, writes this week about an exciting opportunity to contribute to one of our most popular open source libraries.
We are happy to announce that we are running an mrjob sprint at PyCon 2013. mrjob is an open source Python framework for running Python scripts that process large amounts of data, either on your own Hadoop cluster or on AWS using EMR. The sprint is aimed both at developers who are already familiar with mrjob and at new users who want to come up to speed on it. The sprints will be held on March 18th and March 19th, between 10 am and 6 pm, at:
Hyatt Regency Santa Clara
5101 Great America Parkway
Santa Clara, California USA 95054
Most of the maintainers of mrjob will be available on both days to help new users and developers get up to speed. The sprint is free to attend, and you do not need to register for the PyCon conference to attend it. We strongly encourage users at all levels of expertise with mrjob to register on the PyCon website and attend this sprint! An initial list of tickets to be tackled at the sprint is tracked on GitHub.
The mrjob project has been in development at Yelp for over three years. mrjob was built to give Yelp engineers a way to run log processing jobs on Yelp's Hadoop cluster while retaining all the benefits of working in Python and reusing the Yelp code base. Yelp released mrjob as an open source project in October 2010, and the open source release came with support for Amazon's EMR framework. Yelp has since retired its local Hadoop cluster and now uses mrjob exclusively with EMR to power all of its large data processing batches. Over 200 batches at Yelp are powered by mrjob.
Sign up and we'll see you there!