Estimator Bias — Definition, Formula & Examples
Estimator bias is the difference between an estimator's expected value and the true value of the parameter it estimates. An estimator with zero bias is called unbiased, meaning it hits the correct value on average across many samples.
The bias of an estimator $\hat{\theta}$ for a parameter $\theta$ is defined as $\mathrm{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta$. An estimator is unbiased if and only if $E[\hat{\theta}] = \theta$, so that its bias equals zero.
Key Formula
$$\mathrm{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta$$
Where:
- $\hat{\theta}$ = The estimator (a statistic computed from sample data)
- $E[\hat{\theta}]$ = The expected value of the estimator across all possible samples
- $\theta$ = The true population parameter being estimated
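The formula above translates directly into code: approximate $E[\hat{\theta}]$ by averaging the estimator over many simulated samples, then subtract the true parameter. A minimal sketch in Python (the population, sample size, and repetition count here are illustrative, not from the text):

```python
import random

def estimate_bias(estimator, population, n, true_value, reps=100_000):
    """Monte Carlo approximation of Bias = E[estimator] - true_value."""
    total = 0.0
    for _ in range(reps):
        # Draw a sample of size n with replacement and apply the estimator.
        sample = [random.choice(population) for _ in range(n)]
        total += estimator(sample)
    return total / reps - true_value

# Sample mean as an estimator of the population mean: bias should be near 0.
pop = [1, 2, 3, 4, 5]
mu = sum(pop) / len(pop)
sample_mean = lambda s: sum(s) / len(s)
print(estimate_bias(sample_mean, pop, n=10, true_value=mu))  # close to 0
```

With 100,000 repetitions, the Monte Carlo estimate of the bias lands very close to the true value of zero; the small residual is simulation noise, not bias.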
How It Works
To assess whether an estimator is biased, you compare its expected value (the long-run average over all possible samples) to the true parameter. If the expected value is consistently too high, the bias is positive; if consistently too low, the bias is negative. In practice, you often prove unbiasedness algebraically rather than through simulation. For example, the sample mean $\bar{X}$ is an unbiased estimator of the population mean $\mu$ because $E[\bar{X}] = \mu$. In contrast, dividing by $n$ instead of $n-1$ when computing sample variance produces a biased estimator that systematically underestimates the population variance.
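The contrast between the two variance estimators can be seen by simulation. This sketch assumes an illustrative normal population with variance 100 and small samples of size 5 (values chosen here for clarity, not taken from the text):

```python
import random

random.seed(0)
pop_mean, pop_sd = 0.0, 10.0     # population variance = 100 (illustrative)
n, reps = 5, 200_000

sum_biased = sum_unbiased = 0.0
for _ in range(reps):
    sample = [random.gauss(pop_mean, pop_sd) for _ in range(n)]
    m = sum(sample) / n
    ss = sum((x - m) ** 2 for x in sample)
    sum_biased += ss / n          # divides by n: biased low
    sum_unbiased += ss / (n - 1)  # divides by n - 1: unbiased

print(sum_biased / reps)    # ≈ (n-1)/n * 100 = 80, well below 100
print(sum_unbiased / reps)  # ≈ 100
```

The $n$-divisor version averages about 80 rather than 100, matching the known result $E[\hat{\sigma}^2_n] = \frac{n-1}{n}\sigma^2$, while the $n-1$ divisor centers on the true variance.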
Worked Example
Problem: A population has variance $\sigma^2 = 100$. You draw samples of size $n = 25$ and compute $\hat{\sigma}^2_n = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$, which divides by $n$ rather than $n-1$. Find the bias of this estimator.
Step 1: It is a known result that the expected value of this estimator is: $E[\hat{\sigma}^2_n] = \frac{n-1}{n}\sigma^2$
Step 2: Substitute $n = 25$ and $\sigma^2 = 100$: $E[\hat{\sigma}^2_n] = \frac{24}{25}(100) = 96$
Step 3: Apply the bias formula: $\mathrm{Bias}(\hat{\sigma}^2_n) = E[\hat{\sigma}^2_n] - \sigma^2 = 96 - 100 = -4$
Answer: The bias is $-4$, meaning this estimator systematically underestimates the population variance by 4 on average. This is exactly why the corrected sample variance divides by $n-1$ instead.
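A simulation can confirm a bias of this size. The values $\sigma^2 = 100$ and $n = 25$ below are illustrative choices consistent with the stated bias of $-4$ on average (since $-\sigma^2/n = -100/25 = -4$):

```python
import random

random.seed(1)
sigma2, n, reps = 100.0, 25, 100_000  # assumed values; -sigma2/n = -4

total = 0.0
for _ in range(reps):
    sample = [random.gauss(0, sigma2 ** 0.5) for _ in range(n)]
    m = sum(sample) / n
    total += sum((x - m) ** 2 for x in sample) / n  # divide by n, not n-1

bias = total / reps - sigma2
print(bias)  # close to -4
```

The simulated bias settles near $-4$, agreeing with the algebraic answer from Step 3.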
Why It Matters
Estimator bias shows up directly on the AP Statistics exam when comparing estimators or explaining why the sample variance $s^2$ uses $n-1$. In data science and econometrics, choosing between biased and unbiased estimators (or accepting some bias for lower variance, as in ridge regression) is a core modeling decision.
Common Mistakes
Mistake: Confusing bias with variability. Students assume a biased estimator always gives wrong answers for any single sample.
Correction: Bias describes the long-run average error, not individual sample error. A biased estimator can still land on the true value in a given sample — it just won't center on it across repeated sampling.
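The distinction between long-run average error and single-sample error is easy to demonstrate. This sketch (with an illustrative normal population of variance 100 and samples of size 5) counts how often the negatively biased $n$-divisor variance estimator actually overshoots the true value in individual samples:

```python
import random

random.seed(2)
true_var, n = 100.0, 5

overs = 0
for _ in range(10_000):
    sample = [random.gauss(0, true_var ** 0.5) for _ in range(n)]
    m = sum(sample) / n
    est = sum((x - m) ** 2 for x in sample) / n  # biased low on average
    if est > true_var:
        overs += 1

# Despite the negative bias, a sizable fraction of individual
# samples land above the true variance.
print(overs, "of 10,000 samples overshot the true value")
```

A biased estimator is wrong on average, not on every draw: here thousands of individual estimates exceed the true variance even though the estimator's expected value sits below it.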
