Central Limit Theorem — Definition, Formula & Examples
The Central Limit Theorem (CLT) states that when you take sufficiently large random samples from a population with finite mean and variance, the distribution of the sample means will be approximately normal (bell-shaped), regardless of the shape of the population's original distribution.
Let $X_1, X_2, \ldots, X_n$ be independent, identically distributed random variables with finite mean $\mu$ and finite variance $\sigma^2$. As $n \to \infty$, the standardized sample mean converges in distribution to the standard normal distribution $N(0, 1)$.
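The convergence described above can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it draws from an Exponential(1) population (a choice made here, not one from the text) and measures the largest gap between the empirical CDF of the standardized sample means and the standard normal CDF. The function name `max_cdf_gap` and the trial counts are hypothetical.

```python
import random
from statistics import NormalDist


def max_cdf_gap(n, trials=5000, seed=0):
    """Largest gap between the empirical CDF of standardized sample means
    (samples of size n from an assumed Exponential(1) population) and the
    standard normal CDF."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(1)
    std_normal = NormalDist()
    # Standardize each sample mean: (mean - mu) / (sigma / sqrt(n)).
    z_values = sorted(
        (sum(rng.expovariate(1.0) for _ in range(n)) / n - mu) / (sigma / n ** 0.5)
        for _ in range(trials)
    )
    gap = 0.0
    for i, z in enumerate(z_values):
        phi = std_normal.cdf(z)
        # Compare the normal CDF with the empirical CDF just before and at z.
        gap = max(gap, abs((i + 1) / trials - phi), abs(i / trials - phi))
    return gap


distances = {n: max_cdf_gap(n) for n in (1, 10, 100)}
for n, d in distances.items():
    print(f"n = {n:>3}: max CDF gap = {d:.3f}")
```

Under these assumptions the gap should shrink as n grows, which is what convergence in distribution predicts. The remaining gap at large n partly reflects simulation noise from the finite number of trials.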
Key Formula
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
Where:
- $\bar{X}$ = Sample mean
- $\mu$ = Population mean
- $\sigma$ = Population standard deviation
- $n$ = Sample size
- $Z$ = Standard normal score (z-score)
How It Works
Start by identifying the population mean $\mu$, population standard deviation $\sigma$, and sample size $n$. As a rule of thumb, the CLT applies well when $n \geq 30$, though for populations that are already nearly symmetric, smaller samples suffice. The sampling distribution of $\bar{X}$ will be approximately normal with mean $\mu$ and standard error $\sigma / \sqrt{n}$. You can then convert any sample mean to a $z$-score and use the standard normal table to find probabilities.
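The procedure above can be sketched as a small helper. This is a minimal sketch rather than a definitive implementation: the function name `sample_mean_tail_probability` and the input values are hypothetical, and `statistics.NormalDist` from the Python standard library stands in for the standard normal table.

```python
from math import sqrt
from statistics import NormalDist


def sample_mean_tail_probability(threshold, mu, sigma, n):
    """Approximate P(sample mean > threshold) with the CLT normal approximation.

    Returns the z-score and the upper-tail probability.
    """
    standard_error = sigma / sqrt(n)        # sigma / sqrt(n)
    z = (threshold - mu) / standard_error   # standardize the sample mean
    return z, 1.0 - NormalDist().cdf(z)     # upper tail of N(0, 1)


# Hypothetical inputs, chosen only for illustration.
z, p = sample_mean_tail_probability(threshold=102.0, mu=100.0, sigma=15.0, n=36)
print(f"z = {z:.2f}, P(sample mean > 102) = {p:.4f}")
```

The result is only as good as the normal approximation, so for small or heavily skewed samples it should be read as a rough estimate.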
Worked Example
Problem: A factory produces bolts with a mean length of 50 mm and a standard deviation of 4 mm. The distribution of individual bolt lengths is skewed. What is the probability that the mean length of a random sample of 64 bolts is greater than 51 mm?
Find the standard error: Divide the population standard deviation by the square root of the sample size: $SE = \sigma / \sqrt{n} = 4 / \sqrt{64} = 0.5$ mm.
Compute the z-score: Standardize the sample mean of 51 mm using the population mean and standard error: $z = (51 - 50) / 0.5 = 2$.
Find the probability: Using the standard normal table, $P(Z > 2) = 1 - 0.9772 = 0.0228$.
Answer: There is approximately a 2.28% chance that the sample mean exceeds 51 mm.
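One way to sanity-check this answer is a Monte Carlo simulation. The problem says only that bolt lengths are skewed, so the sketch below assumes a specific population purely for illustration: 46 mm plus an exponential variable with mean 4 mm, which has mean 50 mm and standard deviation 4 mm. The function name `simulate_tail` and the trial count are likewise hypothetical.

```python
import random


def simulate_tail(trials=20000, n=64, seed=1):
    """Estimate P(sample mean > 51 mm) by simulation under an assumed
    skewed population: 46 + Exponential(mean 4), so mean 50 and sd 4."""
    rng = random.Random(seed)
    shift, scale = 46.0, 4.0
    exceed = 0
    for _ in range(trials):
        # expovariate takes the rate, so rate 1/scale gives mean `scale`.
        sample_mean = shift + sum(rng.expovariate(1.0 / scale) for _ in range(n)) / n
        if sample_mean > 51.0:
            exceed += 1
    return exceed / trials


estimate = simulate_tail()
print(f"Simulated P(sample mean > 51 mm) = {estimate:.4f}")
```

The estimate should land in the same neighborhood as the CLT answer. With a strongly right-skewed population and n = 64, it may come out somewhat above 2.28%, since the normal approximation is not exact at finite sample sizes.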
Why It Matters
The CLT is the reason much of inferential statistics works: confidence intervals and hypothesis tests for means rely on the sampling distribution being approximately normal. In fields from quality control to polling, it allows analysts to make probability statements about sample means even when the underlying data are not normally distributed.
Common Mistakes
Mistake: Applying the CLT to individual observations instead of sample means.
Correction: The CLT describes the distribution of the sample mean $\bar{X}$, not individual values. A single observation drawn from a skewed population is still skewed; normality emerges only when you average many observations.
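The distinction can be made concrete by comparing the skewness of individual observations with the skewness of sample means. The sketch below assumes an Exponential(1) population and a sample size of 50, both chosen here only for illustration; the `skewness` helper is a hypothetical name for the usual standardized third moment.

```python
import random
from statistics import fmean, pstdev


def skewness(values):
    """Standardized third moment: about 0 for symmetric data."""
    m = fmean(values)
    s = pstdev(values, mu=m)
    return fmean(((v - m) / s) ** 3 for v in values)


rng = random.Random(2)

# Individual observations from the assumed skewed population.
individuals = [rng.expovariate(1.0) for _ in range(10000)]

# Means of samples of size 50 drawn from the same population.
means = [fmean(rng.expovariate(1.0) for _ in range(50)) for _ in range(10000)]

print(f"Skewness of individual values: {skewness(individuals):.2f}")
print(f"Skewness of sample means:      {skewness(means):.2f}")
```

Under these assumptions the individual values stay strongly right-skewed, while the sample means are much closer to symmetric.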
