In our discussion this week, we review Daniel Russo's paper "Simple Bayesian Algorithms for Best Arm Identification". The paper presents very simple algorithms for sequential decision-making problems where the goal is to identify, as quickly as possible, the best out of k possible designs.
- Paper: Daniel Russo, "Simple Bayesian Algorithms for Best Arm Identification", 29th Annual Conference on Learning Theory (COLT 2016)
Why this method?
- The problem setup is very general and can model many practical problems
- A few examples
- A/B testing: designs A or B, which one is best?
- Clinical trials for treatment variants
- Website optimization: different webpage layouts
- As the paper's name suggests, the proposed algorithms are very simple
- They have a single parameter which can be tuned automatically based on the data gathered.
- Their implementation mainly requires access to samples from the posterior distribution.
Problem setup
- There are k possible designs
- Each design i has an unknown parameter θᵢ
- At each time-step:
- the experimenter chooses a design
- the experimenter observes a random outcome from that design
- The prior for the unknown parameters lies in a bounded hyperrectangle of ℝᵏ, with uniformly bounded density.
- Observed outcomes are sampled from a 1D exponential family where the log-partition function is twice differentiable with uniformly bounded first derivative.
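As a concrete, hypothetical instance of this setup (my illustration, not from the paper): Bernoulli outcomes with Beta priors are a special case of the 1D exponential family above, and they make posterior sampling trivial.

```python
import random

# Hypothetical success probabilities for k = 3 Bernoulli designs;
# design 1 (theta = 0.5) is the true best.
THETAS = [0.3, 0.5, 0.45]

def observe(design: int) -> int:
    """The experimenter chooses a design and observes a random outcome."""
    return 1 if random.random() < THETAS[design] else 0

# With an independent Beta(1, 1) prior on each design, the posterior after
# s successes and f failures is Beta(1 + s, 1 + f), so sampling from the
# posterior -- the main primitive the top-two algorithms need -- is one line:
def sample_posterior(s: int, f: int) -> float:
    return random.betavariate(1 + s, 1 + f)
```

The conjugate Bernoulli/Beta pair keeps the posterior in closed form; for other exponential families, approximate posterior samples serve the same role.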
The algorithms
- "Top-two" Bayesian algorithms with a single parameter β (a probability).
- At step t, given the posterior at time t for the parameters:
- the design selected for measurement is chosen randomly
- between the two designs most likely to be "optimal" at time t: the top one with probability β, the other with probability 1 − β
- The three algorithms presented (top-two probability sampling, top-two value sampling, and top-two Thompson sampling) each use a different notion of being "optimal" at time t.
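A minimal sketch of one of the three, top-two Thompson sampling, in a hypothetical Bernoulli/Beta setting (all names and the resampling cap below are my illustration; the paper covers general exponential families and resamples without a cap):

```python
import random

def ttts_select(successes, failures, beta=0.5, max_tries=50, rng=random):
    """One step of top-two Thompson sampling with Beta(1, 1) priors.

    successes[i] and failures[i] summarize outcomes for design i, so its
    posterior is Beta(1 + successes[i], 1 + failures[i]).
    """
    k = len(successes)

    def draw():
        # One posterior sample per design.
        return [rng.betavariate(1 + successes[i], 1 + failures[i])
                for i in range(k)]

    draws = draw()
    top = max(range(k), key=lambda i: draws[i])
    if rng.random() < beta:
        return top
    # Otherwise resample until a *different* design comes out on top:
    # that design is the second of the "top two". The cap is a practical
    # shortcut, since resampling slows down once the posterior concentrates.
    for _ in range(max_tries):
        draws = draw()
        challenger = max(range(k), key=lambda i: draws[i])
        if challenger != top:
            return challenger
    # Fallback: the runner-up of the last draw.
    return max((i for i in range(k) if i != top), key=lambda i: draws[i])

def run(thetas, steps=1000, beta=0.5, rng=random):
    """Run the experiment, updating counts with each observed outcome."""
    k = len(thetas)
    s, f = [0] * k, [0] * k
    for _ in range(steps):
        i = ttts_select(s, f, beta=beta, rng=rng)
        if rng.random() < thetas[i]:
            s[i] += 1
        else:
            f[i] += 1
    return s, f
```

With β = 1/2, roughly half the measurements go to the currently-leading design and the rest to its closest challengers, which is what drives the fast posterior convergence discussed below.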
No algorithm can do better than exponential convergence
- (convergence of the posterior to the true best design on a given problem)
- The optimal convergence rate Γ* captures how difficult it is to identify the best design on a particular problem
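The quantity that converges is the posterior probability that each design is the best one. Given posterior samples, it can be estimated by simple Monte Carlo; a sketch in the same hypothetical Bernoulli/Beta setting as above:

```python
import random

def prob_optimal(successes, failures, n_samples=10_000, rng=random):
    """Estimate, for each design, the posterior probability that it is
    the best one, assuming Bernoulli outcomes with Beta(1, 1) priors."""
    k = len(successes)
    wins = [0] * k
    for _ in range(n_samples):
        # Draw one parameter per design from its posterior and record
        # which design came out on top.
        draws = [rng.betavariate(1 + successes[i], 1 + failures[i])
                 for i in range(k)]
        wins[max(range(k), key=lambda i: draws[i])] += 1
    return [w / n_samples for w in wins]
```

Exponential convergence means the posterior mass left on suboptimal designs, 1 minus the largest of these probabilities, decays roughly like e^(−Γt) as observations accumulate.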
Top-two algorithms reach this optimal exponential rate with a properly tuned parameter β = β*
Top-two algorithms with an adaptive βₜ also attain this exponential rate
- βₜ need only be set so that it converges to β*
- The tuning is based only on observations (no knowledge of the true parameters required)
Top-two algorithms with a fixed β converge exponentially, with a rate "not too far" from the optimal rate
- In particular, with the default choice β = 1/2, the rate is guaranteed to be at least half of the optimal rate Γ*
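The three rate statements above can be summarized compactly (the notation is my paraphrase: Πₜ is the posterior probability assigned to the true best design at time t, Γ_β the exponent attained with parameter β, and Γ* the best achievable exponent):

```latex
1 - \Pi_t \doteq e^{-t\,\Gamma_\beta}, \qquad
\Gamma_\beta \le \Gamma^* \ \text{for all } \beta, \qquad
\Gamma_{\beta^*} = \Gamma^*, \qquad
\Gamma_{1/2} \ge \tfrac{1}{2}\,\Gamma^*.
```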
If you like applying these kinds of methods to practical ML problems, join our team.