Introducing MOE: Metric Optimization Engine, a new open source machine learning service for optimal experiment design
At Yelp we run a lot of A/B tests. By constantly trying new features and testing their impact, we are able to keep evolving our products and make them as useful as possible. However, running online A/B tests can be expensive (in opportunity cost, user experience, or revenue) and time-consuming (because each test has to run long enough to reach statistical significance).
Furthermore, many A/B tests boil down to parameter selection (more of an A/A’ test, where the feature stays the same and only its parameters change). Given a feature, we want to find the optimal values for its constants and hyperparameters as quickly as possible. For many systems, deriving that optimum analytically is impossible, so we need to treat them as black boxes where we can observe only the input and the output. We want some combination of metrics (the objective function) to go up or down, but each sample of this function, for each set of parameters, costs us an expensive, time-consuming experiment.
MOE, the Metric Optimization Engine, is an open source machine learning tool for solving these global, black box optimization problems with as few samples as possible. MOE implements several algorithms from the field of Bayesian Global Optimization. It uses Gaussian Processes to build and fit a model of the objective function from the historical samples, then finds and returns the point(s) of highest expected improvement: the points with the highest expected gain over the best historical sample seen so far. For more information see the documentation and examples.
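To make the idea concrete, here is a minimal sketch of one suggestion step for a single parameter, written with plain NumPy. It is not MOE's implementation or API: the kernel choice, the length scale, the function names (rbf_kernel, gp_posterior, expected_improvement) and the three sample values are all assumptions made for illustration. It fits a Gaussian Process to the historical samples, scores a grid of candidate points by expected improvement (for a metric we want to maximize), and picks the best-scoring candidate.

```python
import math
import numpy as np


def rbf_kernel(a, b, length_scale=0.2, signal_var=1.0):
    """Squared-exponential covariance between 1-D point arrays a and b."""
    diff = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_train, y_train, x_query, noise_var=1e-6):
    """Posterior mean and standard deviation of a GP at x_query.

    The observations are centered on their sample mean so that a
    zero-mean GP prior is a reasonable (assumed) model.
    """
    y_mean = y_train.mean()
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - y_mean))
    mean = y_mean + k_tq.T @ alpha
    v = np.linalg.solve(chol, k_tq)
    var = np.diag(rbf_kernel(x_query, x_query)) - np.sum(v * v, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))


def expected_improvement(mean, std, best_so_far):
    """Closed-form single-point expected improvement for maximization."""
    # Where the model has no uncertainty, the improvement is deterministic.
    ei = np.maximum(mean - best_so_far, 0.0)
    ok = std > 1e-12
    gain = mean[ok] - best_so_far
    z = gain / std[ok]
    cdf = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z])
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    ei[ok] = np.maximum(gain * cdf + std[ok] * pdf, 0.0)
    return ei


# Hypothetical history: three parameter values already tested, with the
# metric observed for each one (made-up numbers, for illustration only).
x_train = np.array([0.1, 0.4, 0.9])
y_train = np.array([0.2, 0.9, 0.3])

# Score a grid of candidate parameter values and pick the most promising one.
x_grid = np.linspace(0.0, 1.0, 101)
mean, std = gp_posterior(x_train, y_train, x_grid)
ei = expected_improvement(mean, std, best_so_far=y_train.max())
next_x = float(x_grid[np.argmax(ei)])
print(f"next point to sample: {next_x:.2f}")
```

A grid search over candidates only works for one or two parameters; it is used here to keep the sketch short.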
Here are some examples of when you could use MOE:
Optimizing a system's click-through rate (CTR). MOE is useful when evaluating CTR requires running an A/B test on real user traffic, and getting statistically significant results requires running this test for a substantial amount of time (hours, days, or even weeks). Examples include setting distance thresholds, ad unit properties, or internal configuration values.
Optimizing tunable parameters of a machine-learning prediction method. MOE can be used when calculating the prediction error for one choice of the parameters takes a long time, which might happen because the prediction method is complex and takes a long time to train, or because the data used to evaluate the error is huge. Examples include deep learning methods or hyperparameters of features in logistic regression.
Optimizing the design of an engineering system. MOE helps when evaluating a design requires running a complex physics-based numerical simulation on a supercomputer. Examples include designing and modeling airplanes, the traffic network of a city, a combustion engine, or a hospital.
Optimizing the parameters of a real-world experiment. MOE can help guide design when every experiment needs to be physically created in a lab or very few experiments can be run in parallel. Examples include chemistry, biology, or physics experiments or a drug trial.
We want to collect information about the system as efficiently as possible, finding the optimal set of parameters in as few attempts as possible. That means finding the best trade-off between gaining new information about the problem (exploration) and using the information we already have (exploitation). This is an application of optimal learning, and MOE uses techniques from this field to decide which parameters to sample next.
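One way to see this trade-off is in the standard closed form of expected improvement for a single candidate point. The formula below is the textbook version for a noise-free objective that we want to maximize, not necessarily the exact formulation MOE evaluates. Here \mu(x) and \sigma(x) are the Gaussian Process posterior mean and standard deviation at x (with \sigma(x) > 0), f^{*} is the best value sampled so far, and \Phi and \varphi are the standard normal CDF and PDF.

```latex
\mathrm{EI}(x)
  = \mathbb{E}\left[\max\bigl(f(x) - f^{*},\, 0\bigr)\right]
  = \underbrace{\bigl(\mu(x) - f^{*}\bigr)\,\Phi(z)}_{\text{exploitation}}
  + \underbrace{\sigma(x)\,\varphi(z)}_{\text{exploration}},
\qquad
z = \frac{\mu(x) - f^{*}}{\sigma(x)}
```

The first term is large where the model already predicts a value above the current best; the second is large where the model is uncertain. A point can score well on either count, which is how the two goals are balanced in a single number.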
MOE provides REST, Python, and C++ interfaces. A MOE server can be spun up within a Docker container in minutes. Because MOE treats the system as a black box, it can optimize many kinds of systems without any internal knowledge of or access to them. By using MOE to guide the parameter exploration of a time-consuming process, like running A/B tests, performing expensive batch simulations, or tuning costly models, you can find the next best set of parameters to sample for any objective function. MOE can also help find optimal parameters for heuristic thresholds and configuration values in any system. See the examples for more information.
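The black box workflow boils down to a loop: ask for the next parameters, run the experiment, record the result, repeat. The sketch below shows only the shape of that loop and does not call MOE. Every name in it is hypothetical, the objective is a toy function standing in for a real experiment, and suggest_next uses plain random search as a placeholder where a MOE-backed version would return the point of highest expected improvement.

```python
import random


def toy_objective(x):
    """Hypothetical stand-in for an expensive experiment (peak at x = 0.3)."""
    return 1.0 - (x - 0.3) ** 2


def suggest_next(history, bounds, rng):
    """Placeholder suggester: uniform random search within the bounds.

    It ignores the history; a model-based suggester would use it to pick
    the point of highest expected improvement instead.
    """
    low, high = bounds
    return rng.uniform(low, high)


def optimize(objective, bounds, n_rounds, seed=0):
    """Run the suggest / evaluate / record loop and return the best sample."""
    rng = random.Random(seed)
    history = []  # list of (parameter, observed value) pairs
    for _ in range(n_rounds):
        x = suggest_next(history, bounds, rng)
        value = objective(x)  # the only place the system is touched
        history.append((x, value))
    best_x, best_value = max(history, key=lambda pair: pair[1])
    return best_x, best_value, history


best_x, best_value, history = optimize(toy_objective, bounds=(0.0, 1.0), n_rounds=20)
print(f"best parameter found: {best_x:.3f} (value {best_value:.3f})")
```

The loop never looks inside the objective: it only sees the parameters it sent in and the value that came back, which is all a black box optimizer needs.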