Undebt: How We Refactored 3 Million Lines of Code
Evan H., Software Engineering Intern, Aug 23, 2016
Peter Seibel wrote that to maximize engineering effectiveness, “Let a thousand flowers bloom. Then rip 999 of them out by the roots.” For us, the flowers are code patterns: the myriad functions, classes, styles, and idioms that developers use when writing code. At first, new flowers are welcome; a new pattern may seem easier to use, more scalable, more efficient, or better suited to a particular task than the old one.
As a code base grows and the flowers proliferate, however, it becomes clear which patterns work and which don’t. Suddenly, code patterns that were once beautiful new flowers become technical debt in need of removal. When that happens, it’s time to start ripping. Otherwise, since developers learn by reading (and occasionally copying and pasting) existing code, the bad flowers and the technical debt that comes with them will continue to grow unchecked.
Ripping out flowers with Undebt
Removing a bad code pattern by hand, especially in a code base as large as ours, is a Herculean task that drains developer time that could be better spent building and shipping new features. That’s why other members of the Core Backend team and I built Undebt, an elegant, fast, reliable tool for performing large-scale, automated code refactoring.
Undebt works on any language and lets you define complex find-and-replace rules in standard, straightforward Python. Those rules can then be applied quickly to an entire code base with a single command.
Since its inception at our most recent hackathon, Undebt has become a key tool for refactoring code en masse. Together with our open-source debt tracker, it lets us efficiently monitor and remove technical debt before it becomes a serious problem.
The graph above, generated using our open-source debt tracker, shows the usage of a particular deprecated method across the 3 million lines of Python that make up Yelp’s codebase. The effect of Undebt can be seen near the end. We were able to completely remove two years of accumulated debt using a variant of the included method_to_function example. We have also made heavy use of the sqla_count example in refactoring our SQLAlchemy code to remove inefficient sub-queries.
How it works
To use Undebt, you define a custom pattern that specifies what code to look for and how to replace it. Undebt leverages the popular open-source tool pyparsing to make writing a pattern file as easy as writing plain Python. Undebt also comes loaded with tools and examples to make this process as painless as possible.
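To make the "what to look for, how to replace it" idea concrete, here is a minimal sketch of a method-to-function rewrite. It is not Undebt's actual API: to stay dependency-free, it uses the standard library's re module in place of pyparsing. The names GRAMMAR, replace, transform, old_method, and new_function are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical sketch, not Undebt's API: a "pattern" is modeled here as a
# grammar (what to find) plus a replace function (what to emit instead).

# What to look for: calls of the form <receiver>.old_method(<args>),
# where <receiver> is a dotted name such as `user` or `self.cache`.
# Limitation: [^()]* cannot match nested parentheses in the arguments,
# which is one reason a real parser is a better fit than a regex.
GRAMMAR = re.compile(
    r"(?P<receiver>[A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*)"
    r"\.old_method\((?P<args>[^()]*)\)"
)


def replace(match):
    """Rewrite receiver.old_method(args) as new_function(receiver, args)."""
    receiver = match.group("receiver")
    args = match.group("args").strip()
    new_args = receiver if not args else "{}, {}".format(receiver, args)
    return "new_function({})".format(new_args)


def transform(source):
    """Apply the rule to a string of source code; other text is untouched."""
    return GRAMMAR.sub(replace, source)


if __name__ == "__main__":
    print(transform("result = user.old_method(1, 2)"))
```

Keeping the matching rule and the replacement logic separate means the replacement can be ordinary Python with conditionals and string formatting, rather than a fixed substitution template.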
Ready to start gardening with Undebt? Check out Undebt’s GitHub repository for more information on how to get started.