The null hypothesis (H₀) is the default assumption in any statistical test: that there is no real difference between two groups being compared. In A/B testing, H₀ states that your control and variation produce the same outcome — any observed difference is due to chance. The purpose of a test is not to prove H₀ true, but to gather enough evidence to reject it in favor of the alternative hypothesis (H₁).
Every hypothesis test has two competing statements: the null hypothesis (H₀), which says there is no real effect, and the alternative hypothesis (H₁), which says there is one.
You analyze your test data and calculate a p-value — the probability of observing a difference at least as extreme as yours, assuming H₀ is true. If p < 0.05 (the conventional threshold), you reject H₀ and conclude that the variation likely caused a real effect.
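One way to make this definition concrete is to simulate it. The sketch below (all conversion counts are invented) pools both groups so that H₀ holds by construction, re-runs the experiment many times, and counts how often a simulated difference is at least as extreme as the one observed:

```python
import numpy as np

rng = np.random.default_rng(42)

def simulated_p_value(conv_a, n_a, conv_b, n_b, n_sims=100_000):
    """Estimate a two-tailed p-value by simulating a world where H0 is true."""
    observed_diff = conv_b / n_b - conv_a / n_a
    # Under H0 both groups share one conversion rate, so pool the data.
    pooled_rate = (conv_a + conv_b) / (n_a + n_b)
    # Re-run the experiment n_sims times with both groups drawn from that rate.
    sim_a = rng.binomial(n_a, pooled_rate, n_sims) / n_a
    sim_b = rng.binomial(n_b, pooled_rate, n_sims) / n_b
    # Fraction of simulated differences at least as extreme as the observed one.
    return np.mean(np.abs(sim_b - sim_a) >= abs(observed_diff))

# Invented data: 200/4,000 control conversions vs. 253/4,000 for the variation.
print(simulated_p_value(200, 4_000, 253, 4_000))  # ~0.01 < 0.05: reject H0
```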
Crucially, you never "accept" H₀. A non-significant result means you failed to reject it — the test was inconclusive, not that no difference exists.
For a two-sample proportion test (the most common A/B test setup):
H₀: p₁ = p₂ (the control's conversion rate p₁ equals the variation's rate p₂)
H₁: p₁ ≠ p₂ (two-tailed) or p₁ < p₂ (one-tailed, if you only care whether the variation beats the control)
The test statistic (a z-score) measures how many standard errors your observed difference lies from zero under H₀. Convert it to a p-value and compare against your significance level (α = 0.05 by convention), as in the sketch below.
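A minimal sketch of that calculation, using the standard pooled two-proportion z-test (the counts are the same invented ones as above):

```python
from math import sqrt
from scipy.stats import norm

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-tailed z-test of H0: p1 == p2; returns (z, p_value)."""
    p1, p2 = conv_a / n_a, conv_b / n_b
    # Under H0 both groups share one rate, so pool them for the standard error.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p2 - p1) / se
    p_value = 2 * norm.sf(abs(z))  # P(|Z| >= |z|) under the standard normal
    return z, p_value

z, p = two_proportion_ztest(200, 4_000, 253, 4_000)
print(f"z = {z:.2f}, p = {p:.4f}")  # z ≈ 2.56, p ≈ 0.010 < 0.05: reject H0
```

On the same invented counts, the analytic p-value agrees closely with the simulated one above.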
Booking.com famously runs on the order of 1,000 concurrent A/B tests. Imagine they test a new search-result card layout and want to know whether it changes the booking rate, as in the worked sketch below.
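All numbers here are invented purely for illustration; they are not real Booking.com data:

```python
from math import sqrt
from scipy.stats import norm

# Invented counts: the control converts 480 of 12,000 visitors (4.0%),
# the new card layout converts 540 of 12,000 (4.5%).
p1, p2 = 480 / 12_000, 540 / 12_000
p_pool = (480 + 540) / (12_000 + 12_000)  # pooled rate under H0
se = sqrt(p_pool * (1 - p_pool) * (2 / 12_000))
z = (p2 - p1) / se
p_value = 2 * norm.sf(abs(z))
print(f"z = {z:.2f}, p = {p_value:.3f}")
# z ≈ 1.92, p ≈ 0.055 > 0.05: fail to reject H0. The result is inconclusive,
# not evidence that the layout makes no difference.
```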
Two error types matter when working with the null hypothesis: a Type I error (false positive) rejects H₀ when it is actually true and occurs with probability α, while a Type II error (false negative) fails to reject H₀ when a real difference exists and occurs with probability β.
Underpowered tests (those with too small a sample size) frequently commit Type II errors: they miss real effects and lead teams to wrongly conclude the variation "didn't work." A quick power calculation before launch, sketched below, guards against this.
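This sketch uses the standard normal-approximation sample-size formula for a two-sided two-proportion test; the baseline and target rates are hypothetical:

```python
from math import ceil
from scipy.stats import norm

def sample_size_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate n per group for a two-sided two-proportion z-test."""
    z_alpha = norm.ppf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_beta = norm.ppf(power)           # 0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    # With probability `power`, a true lift from p1 to p2 reaches significance.
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

# Hypothetical: detecting a lift from 4.0% to 4.5% at 80% power
print(sample_size_per_group(0.04, 0.045))  # ≈ 25,549 visitors per group
```

By this estimate, the imagined 12,000-visitor-per-group test above was only about half the size it needed, which is one reason it came out inconclusive.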
Deepen your understanding with these related glossary terms: alternative hypothesis, p-value, statistical significance, type-1 error, type-2 error, sample size.