tron - Centralized Scheduling
As Yelp has grown over the years, we've amassed huge collections of data - the collective output of the actions of tens of millions of users. Analyzing this data helps us improve the user experience across the site, from ranking businesses and extracting "review highlights" to fighting spam and keeping the site secure. These tasks involve long-running batch processes that analyze large logs and database tables, with workflows sometimes composed of five or more dependent processes. Managing these workflows with traditional scheduling tools - most notably the cron family - eventually pushed us past the "complexity comfort zone" that surrounds any large engineering project. But then we had a vision, and today we are releasing that vision as tron.