Multi-variant AB testing vs multi-armed bandit

1.6% vs 2%

Payoff

Payoff from variation 1.6% vs 2%


As expected, epsilon-greedy has the highest payoff, at the price of reaching 95% confidence later than UCB1 did.
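For readers who want to reproduce this kind of run, here is a minimal sketch of an epsilon-greedy simulation over two Bernoulli variations. It is not the code behind the plots: the function name `epsilon_greedy_run`, the `epsilon=0.1` default and the fixed seed are all assumptions made for illustration.

```python
import random


def epsilon_greedy_run(rates, steps, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy over Bernoulli arms.

    rates   -- true conversion rate of each variation (assumed known only
               to the simulator, never to the policy)
    epsilon -- probability of showing a random variation (assumed value)
    Returns (shows, wins): per-variation display and conversion counts.
    """
    rng = random.Random(seed)
    shows = [0] * len(rates)
    wins = [0] * len(rates)

    def observed_rate(i):
        # Variations never shown yet are tried first.
        return wins[i] / shows[i] if shows[i] else float("inf")

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(rates))  # explore: random variation
        else:
            arm = max(range(len(rates)), key=observed_rate)  # exploit
        shows[arm] += 1
        if rng.random() < rates[arm]:  # simulated visitor converts
            wins[arm] += 1
    return shows, wins


if __name__ == "__main__":
    shows, wins = epsilon_greedy_run([0.016, 0.02], steps=100_000)
    print("shows:", shows, "conversions:", wins)
```

Because the policy exploits the best observed rate most of the time, most of the traffic goes to whichever variation currently looks better, which is what drives the payoff up and slows down the evidence gathered for the other variation.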

Certainty

Here AB never reaches 95% certainty, whereas UCB1 gets there the quickest. It is worth noting that AB at one point came close to 95%, though it later dropped to around 60%.
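The post does not say how the certainty figure was computed. One common choice, shown here purely as an assumption, is a one-sided two-proportion z-test: the confidence that variation B really converts better than variation A given the counts seen so far. The function name `confidence_b_beats_a` is hypothetical.

```python
import math


def confidence_b_beats_a(conv_a, n_a, conv_b, n_b):
    """One-sided confidence that B's true rate exceeds A's.

    Uses a pooled two-proportion z-test with a normal approximation.
    This is an assumed method, not necessarily the one behind the plots.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.5  # no conversions (or all conversions): no evidence either way
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


if __name__ == "__main__":
    # Same observed rates (1.6% vs 2%), different sample sizes.
    print(confidence_b_beats_a(16, 1_000, 20, 1_000))
    print(confidence_b_beats_a(160, 10_000, 200, 10_000))
```

With observed rates of exactly 1.6% and 2%, a thousand impressions per variation stays well under 95%, while ten thousand per variation clears it. Early in a run the observed rates fluctuate, which is one way a confidence curve can approach 95% and then fall back.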

Confidence of 1.6% vs 2%


Run behaviour

The following plots show the runtime behaviour of each model, that is, which version was shown to the user. Epsilon-greedy quickly converged to the variation with the highest payoff.

Variations displayed in AB model


Variations displayed in epsilon-greedy model


Variations displayed in UCB1 model


The UCB1 model sometimes alternates between the two versions, with a general preference for the 0.02 version. This is expected and natural: the two versions are so close to each other that reaching the confidence level requires much more probing.
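The alternation follows from how UCB1 picks a version. A minimal sketch of the textbook selection rule is below; the function name `ucb1_choose` and the count-list arguments are assumptions for illustration, and the exact variant used for the plots may differ.

```python
import math


def ucb1_choose(shows, wins):
    """Return the index of the variation UCB1 would display next.

    shows -- how many times each variation has been displayed
    wins  -- how many conversions each variation has produced
    """
    # Display every variation once before the formula applies.
    for arm, n in enumerate(shows):
        if n == 0:
            return arm

    total = sum(shows)

    def score(i):
        mean = wins[i] / shows[i]
        # Exploration bonus: shrinks as a variation is shown more often.
        bonus = math.sqrt(2 * math.log(total) / shows[i])
        return mean + bonus

    return max(range(len(shows)), key=score)


if __name__ == "__main__":
    print(ucb1_choose([1000, 1000], [16, 20]))  # equal exposure: better mean wins
    print(ucb1_choose([10, 1000], [0, 20]))     # under-shown version gets probed
```

When the observed means differ by only 0.4 percentage points, the exploration bonus dominates for a long time, so the less-shown version keeps overtaking the other and the displayed version flips back and forth.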

About charlesnagy

I'm, among many things, mostly an automation expert, database specialist, system engineer and software architect with a passion for data: searching it, analyzing it, learning from it. I learn by experimenting, and this blog is the result of these experiments and some other random thoughts I have from time to time.