Question 1

What does statistical significance actually mean?

Accepted Answer

It means the difference you observed between two variants would be unlikely to arise from random chance alone. At 95% confidence, a significant result has a p-value below 0.05: if the two variants truly performed identically, there would be less than a 5% probability of seeing a gap at least this large. Significance does not tell you the difference is large or important, only that chance alone is an unlikely explanation for it. A tiny, significant uplift on huge traffic can be less valuable than a big, not-yet-significant uplift that you should keep testing.

Question 2

How does this calculator compute significance?

Accepted Answer

It uses a two-proportion z-test. It pools the visitors and conversions of both variants into a single conversion rate to estimate a shared standard error, computes a z-score by dividing the difference in rates by that standard error, and converts the z-score to a two-tailed p-value using the normal distribution. The result is significant if the p-value is below your chosen alpha (0.10, 0.05, or 0.01 for 90%, 95%, or 99% confidence). This is the standard approach for comparing two conversion rates and matches what most A/B testing tools report.
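For readers who want to check the arithmetic by hand, the steps described above can be sketched in Python. This is a minimal illustration based on the description in this answer, not the calculator's actual source; the function name `two_proportion_z_test`, its parameter names, and the sample numbers are assumptions made for the example.

```python
import math


def two_proportion_z_test(visitors_a, conversions_a, visitors_b, conversions_b):
    """Return (z, two-tailed p-value) for a pooled two-proportion z-test."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b

    # Pool both variants into one rate, as if they performed identically.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)

    # Shared standard error of the difference in rates under that assumption.
    se = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if se == 0:
        # No conversions at all (or everyone converted): nothing to compare.
        return 0.0, 1.0

    z = (rate_b - rate_a) / se

    # Two-tailed p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical data: 1,000 visitors per variant, 100 vs 130 conversions.
z, p = two_proportion_z_test(1000, 100, 1000, 130)
print(f"z = {z:.2f}, p = {p:.4f}, significant at alpha 0.05: {p < 0.05}")
```

Comparing `p` against 0.10, 0.05, or 0.01 corresponds to the 90%, 95%, and 99% confidence settings.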

Question 3

How much traffic do I need for a valid A/B test?

Accepted Answer

There's no fixed number — it depends on your baseline conversion rate and the size of the difference you want to detect. Smaller effects need far more traffic: detecting a 2% relative uplift can take tens of thousands of conversions, while a 30% uplift may be clear in a few hundred. As a rule of thumb, aim for at least a few hundred conversions per variant before trusting a result, and don't stop the test the moment it crosses significance — early peeking inflates false positives.
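To get a feel for how quickly the required traffic grows as the target uplift shrinks, here is a rough sketch using the common normal-approximation sample-size formula for two proportions. It assumes 80% statistical power, which this answer does not specify, and the function name `visitors_per_variant` and the 5% baseline rate are hypothetical choices for illustration.

```python
import math
from statistics import NormalDist


def visitors_per_variant(baseline_rate, relative_uplift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant to detect a relative uplift.

    Assumes a two-tailed test and the stated power (an assumption, not a
    setting taken from the calculator).
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_uplift)
    p_bar = (p1 + p2) / 2

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. about 1.96 for alpha 0.05
    z_beta = NormalDist().inv_cdf(power)           # e.g. about 0.84 for 80% power

    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


# Hypothetical 5% baseline conversion rate, two different target uplifts.
for uplift in (0.30, 0.02):
    n = visitors_per_variant(0.05, uplift)
    print(f"{uplift:.0%} relative uplift: about {n:,} visitors per variant")
```

Under these assumptions the 30% uplift needs a few thousand visitors per variant (a few hundred conversions), while the 2% uplift needs several hundred thousand visitors (tens of thousands of conversions), which is consistent with the ranges quoted above.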

Question 4

Why isn't my result significant even though B is winning?

Accepted Answer

Because the sample is too small to rule out chance. With low traffic, even a real difference can produce a p-value above your threshold, because the data can't yet distinguish a true effect from random variation. Keep the test running to gather more visitors and conversions: the estimate of the difference becomes more precise, and if the effect is real the p-value will tend to fall. If the result stays inconclusive after substantial traffic, the true difference between variants is probably too small to matter.
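A small numeric illustration may help here: the same observed rates (5% for A, 6% for B) tested at two traffic levels. The sketch below uses a pooled two-proportion z-test; the helper name `p_value` and all of the visitor and conversion counts are made up for the example.

```python
import math


def p_value(visitors_a, conversions_a, visitors_b, conversions_b):
    """Two-tailed p-value from a pooled two-proportion z-test."""
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (conversions_b / visitors_b - conversions_a / visitors_a) / se
    return math.erfc(abs(z) / math.sqrt(2))


# Same observed rates (5% vs 6%), very different sample sizes.
low_traffic = p_value(500, 25, 500, 30)
high_traffic = p_value(20_000, 1_000, 20_000, 1_200)

print(f"500 visitors per variant:    p = {low_traffic:.3f}")
print(f"20,000 visitors per variant: p = {high_traffic:.6f}")
```

B is "winning" by the same margin in both cases, but only the larger sample gives a p-value below 0.05.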

Question 5

Can I use this for ad creative tests, not just landing pages?

Accepted Answer

Yes. The math is identical for any two-variant test where you can count exposures and successes: ad creatives (impressions and clicks, or clicks and conversions), email subject lines (sends and opens), or landing pages (visitors and signups). Just map your two metrics to 'visitors' and 'conversions.' For ad creative, comparing clicks against impressions tests CTR; comparing conversions against clicks tests post-click performance.
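As a sketch of that mapping for ad creative, the example below runs the same pooled two-proportion z-test twice: once with impressions as 'visitors' and clicks as 'conversions' (CTR), and once with clicks as 'visitors' and conversions as 'conversions' (post-click performance). The `Creative` class, the `p_value` helper, and the numbers are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Creative:
    impressions: int
    clicks: int
    conversions: int


def p_value(exposures_a, successes_a, exposures_b, successes_b):
    """Two-tailed p-value from a pooled two-proportion z-test."""
    pooled = (successes_a + successes_b) / (exposures_a + exposures_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / exposures_a + 1 / exposures_b))
    z = (successes_b / exposures_b - successes_a / exposures_a) / se
    return math.erfc(abs(z) / math.sqrt(2))


# Made-up results for two ad creatives.
a = Creative(impressions=10_000, clicks=200, conversions=20)
b = Creative(impressions=10_000, clicks=260, conversions=22)

# CTR test: exposures are impressions, successes are clicks.
ctr_p = p_value(a.impressions, a.clicks, b.impressions, b.clicks)

# Post-click test: exposures are clicks, successes are conversions.
post_click_p = p_value(a.clicks, a.conversions, b.clicks, b.conversions)

print(f"CTR test:        p = {ctr_p:.4f}")
print(f"Post-click test: p = {post_click_p:.4f}")
```

With these invented numbers the CTR difference is significant at the 95% level while the post-click difference is not, which shows why the two comparisons are worth running separately.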

A/B Test Significance Calculator

Inputs

Results

What the numbers mean

Conversion rate & uplift

Z-score & p-value

Significance

Why “it's winning” isn't enough

How AdFlint tests creative for you
