Lessons from A/B Testing on Bandit Subjects
Shichao Ma, Applied Scientist
Dec 21, 2022
Abstract

Compared to full-scale ML, multi-armed bandits are a lighter-weight solution that can help teams quickly optimize their product features without major commitments. However, bandits need a candidate selection step when they have too many items to choose from. Using A/B testing to optimize the candidate selection step introduces new bandit bias and convergence selection bias. New bandit bias occurs when we compare new bandits with established ones in an experiment; convergence selection bias creeps in when we try to solve the new bandit bias by defining and selecting established bandits. We discuss our...
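To make the two-stage setup concrete, here is a minimal sketch (not from the post itself) of a Bernoulli Thompson-sampling bandit that runs on top of a candidate selection step. The names `BetaBandit` and `select_candidates`, the score-based pre-filter, and the simulated rewards are all illustrative assumptions; the candidate selection function is the piece an A/B test would be comparing.

```python
# Illustrative sketch only; names and the scoring heuristic are assumptions,
# not the method described in the post.
import random

class BetaBandit:
    """Bernoulli Thompson sampling over a fixed candidate set."""

    def __init__(self, candidate_ids):
        # One Beta(1, 1) prior per candidate arm.
        self.alpha = {i: 1.0 for i in candidate_ids}
        self.beta = {i: 1.0 for i in candidate_ids}

    def choose(self):
        # Sample a plausible reward rate per arm; play the highest draw.
        samples = {i: random.betavariate(self.alpha[i], self.beta[i])
                   for i in self.alpha}
        return max(samples, key=samples.get)

    def update(self, arm, reward):
        # reward is 1 (e.g., a click) or 0.
        self.alpha[arm] += reward
        self.beta[arm] += 1 - reward

def select_candidates(item_pool, k):
    # Placeholder candidate selection: narrow a large pool to k items.
    # In practice this is the stage the A/B test tries to optimize.
    return sorted(item_pool, key=lambda item: item["score"], reverse=True)[:k]

# Usage: pre-filter 10,000 items to 10 candidates, then let the bandit learn.
pool = [{"id": n, "score": random.random()} for n in range(10_000)]
candidates = [item["id"] for item in select_candidates(pool, k=10)]
bandit = BetaBandit(candidates)
for _ in range(1000):
    arm = bandit.choose()
    reward = 1 if random.random() < 0.05 else 0  # simulated user feedback
    bandit.update(arm, reward)
```

In this framing, each A/B treatment ships a different `select_candidates` and spins up a fresh bandit over its candidates, which is exactly the situation where a new, unconverged bandit gets compared against an established one.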