You’ve just launched an A/B test. Version B is winning! Time to roll it out to everyone, right?

Not so fast.

The Sample Size Trap

Here’s a question: If version B is actually twice as good as version A, how often will your A/B test correctly identify it as the winner?

100% of the time? 90%? Surely at least 80%?

Try the interactive simulator below to find out. Spoiler alert: you might be shocked.

[Interactive simulator: A/B Testing Simulator. Discover why sample size matters in A/B testing. Inputs: probability of A success (default 50.0%), probability of B success (default 50.0%), trials per month (default 250), and number of months (default 6). Outputs: how often A wins, B wins, or the test ties, plus the total trials per test and the calculation method. Results are based on exact binomial probability calculations.]

What This Means for Your Business

When you run A/B tests with small sample sizes, as most startups and many established businesses do, you’re essentially flipping a weighted coin. Even when there’s a real difference, the test can point you in the wrong direction.

This is why:

Sample size matters more than test duration - Running a test for 3 months doesn’t help if you only get 50 trials per month
Statistical significance isn’t magic - A “95% confidence” threshold only limits false positives; it doesn’t guarantee you have enough data to detect a real difference
False winners cost real money - Every time you pick the wrong variant, you’re leaving money on the table

The Math (For the Curious)

The simulator uses binomial probability distributions to calculate the exact likelihood of each outcome. For larger sample sizes (>1000 trials), it switches to Monte Carlo simulation for performance.
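To make the exact calculation concrete, here is a minimal Python sketch of how such win probabilities could be computed. It is not the simulator's actual code. It assumes each variant gets the same number of trials `n`, that the "winner" is simply whichever variant records more successes, and that `n` is modest (up to about 1,000, beyond which the floating-point arithmetic here overflows). The function names are made up for illustration.

```python
from math import comb


def binomial_pmf(n: int, p: float) -> list[float]:
    """Return [P(X = k) for k in 0..n] where X ~ Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def win_probabilities(p_a: float, p_b: float, n: int) -> tuple[float, float, float]:
    """Exact P(A wins), P(B wins), P(tie) when each variant gets n trials.

    Assumption: the variant with more successes is declared the winner.
    """
    pmf_a = binomial_pmf(n, p_a)
    pmf_b = binomial_pmf(n, p_b)

    b_wins = 0.0
    ties = 0.0
    prob_a_below = 0.0  # running P(A < k)
    for k in range(n + 1):
        b_wins += pmf_b[k] * prob_a_below  # B scores k, A scores fewer
        ties += pmf_b[k] * pmf_a[k]        # both score exactly k
        prob_a_below += pmf_a[k]

    a_wins = 1.0 - b_wins - ties
    return a_wins, b_wins, ties


# Example: B converts at twice A's rate, 300 trials per variant.
a_wins, b_wins, ties = win_probabilities(0.01, 0.02, 300)
print(f"A wins: {a_wins:.1%}, B wins: {b_wins:.1%}, ties: {ties:.1%}")
```

Changing `n` or the two rates shows how quickly, or how slowly, the better variant pulls ahead.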

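The key insight: even with a 2x improvement (0.01 → 0.02 conversion rate), you need hundreds of trials per variant before the better version reliably comes out ahead. To detect the same difference at the usual 5% significance level with 80% power, the standard two-proportion approximation calls for roughly 2,300 trials per variant.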
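For a rough sense of how many trials a formal test needs, the sketch below uses the common normal-approximation formula for comparing two proportions. The function name and the defaults (two-sided 5% significance, 80% power) are assumptions chosen for illustration; a dedicated power calculator may give slightly different numbers.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(p_a: float, p_b: float,
                         alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate trials needed PER VARIANT to detect p_a vs p_b.

    Uses the normal approximation for a two-sided, two-proportion test.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p_a * (1 - p_a) + p_b * (1 - p_b)
    effect = p_a - p_b
    return ceil((z_alpha + z_beta) ** 2 * variance / effect**2)


# Example: the 2x improvement from the text, then a 10% relative lift.
print(required_sample_size(0.01, 0.02))
print(required_sample_size(0.01, 0.011))
```

The required sample grows with the inverse square of the difference, so a small lift needs far more traffic than a doubling.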

What You Should Do

Calculate required sample size before running your test
Don’t stop early just because one variant is winning
Consider Bayesian methods for ongoing optimization instead of fixed-duration tests
Be honest about power - if you don’t have enough traffic, you don’t have enough traffic
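As one illustration of the Bayesian suggestion above, the following sketch estimates the probability that B's true conversion rate exceeds A's, given the data so far. It assumes a uniform Beta(1, 1) prior on each rate and uses plain Monte Carlo sampling. The function name, the prior, and the number of draws are all illustrative choices, not a recommendation for any particular tool.

```python
import random


def prob_b_beats_a(successes_a: int, trials_a: int,
                   successes_b: int, trials_b: int,
                   draws: int = 20_000, seed: int = 0) -> float:
    """Estimate P(rate_B > rate_A) from Beta posteriors.

    Assumption: independent uniform Beta(1, 1) priors on both rates.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Posterior for each rate: Beta(1 + successes, 1 + failures)
        rate_a = rng.betavariate(1 + successes_a, 1 + trials_a - successes_a)
        rate_b = rng.betavariate(1 + successes_b, 1 + trials_b - successes_b)
        if rate_b > rate_a:
            wins += 1
    return wins / draws


# Example: 10 conversions out of 1,000 for A, 20 out of 1,000 for B.
print(f"P(B better than A) ~ {prob_b_beats_a(10, 1000, 20, 1000):.2f}")
```

A number like this states the remaining uncertainty directly, which can be easier to act on than a pass/fail significance result when traffic is limited.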

The simulator above lets you play with different scenarios. Try setting version B to be just 10% better than A, or reducing your monthly traffic. You’ll quickly see why so many A/B tests lead to wrong conclusions.

Remember: in A/B testing, like in life, the most dangerous mistake is the one you don’t know you’re making.