F-Distribution — Definition, Formula & Examples
The F-distribution is a right-skewed probability distribution that arises when you compare the ratio of two independent chi-squared random variables, each divided by their degrees of freedom. It is the basis for F-tests used in ANOVA, regression analysis, and comparing population variances.
If and are independent chi-squared random variables with and degrees of freedom respectively, then the random variable follows an F-distribution with parameters (numerator degrees of freedom) and (denominator degrees of freedom), written .
Key Formula
Where:
- = Value of the random variable (x ≥ 0)
- = Numerator degrees of freedom
- = Denominator degrees of freedom
- = Beta function, B(a,b) = Γ(a)Γ(b)/Γ(a+b)
How It Works
In practice, you compute an F-statistic by taking the ratio of two variance estimates. In one-way ANOVA, the numerator is the between-group mean square (variance among group means) and the denominator is the within-group mean square (variance within groups). A large F-value suggests the group means differ more than random chance would predict. You then compare your computed F-statistic to a critical value from the F-distribution table at your chosen significance level. If the statistic exceeds the critical value, you reject the null hypothesis that all group means are equal.
Worked Example
Problem: In a one-way ANOVA comparing 3 groups with 10 observations each, the between-group mean square is 48 and the within-group mean square is 12. Test whether the group means differ at the α = 0.05 significance level.
Identify degrees of freedom: Numerator df = number of groups minus 1. Denominator df = total observations minus number of groups.
Compute the F-statistic: Divide the between-group mean square by the within-group mean square.
Compare to critical value: From an F-distribution table, the critical value for F(2, 27) at α = 0.05 is approximately 3.35. Since 4.0 > 3.35, we reject the null hypothesis.
Answer: The F-statistic of 4.0 exceeds the critical value of 3.35, so we reject the null hypothesis and conclude that at least one group mean differs significantly from the others at the 5% level.
Why It Matters
The F-distribution is central to ANOVA, which appears throughout experimental science, psychology, and engineering whenever you compare means across multiple groups. It also underlies the overall significance test in multiple regression, where the F-test determines whether your model explains a statistically significant portion of variation in the response variable.
Common Mistakes
Mistake: Swapping the numerator and denominator degrees of freedom when looking up the critical value.
Correction: The order matters: d₁ (numerator) always corresponds to the variance estimate in the numerator of the F-ratio. F(2, 27) is not the same distribution as F(27, 2).
