Bias — Definition, Formula & Examples

Bias is a systematic error in data collection, sampling, or analysis that causes results to consistently lean in one direction away from the true value. A biased sample or survey does not accurately represent the population it claims to describe.

In statistics, bias refers to any systematic tendency in the design, collection, or interpretation of data that produces results that differ from the true population parameter in a predictable direction. Bias is distinct from random error, which varies unpredictably and tends to cancel out over many observations.
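The definition above can be written as a formula. The notation below is standard in statistics but is introduced here as an assumption, since the text does not define any symbols: \theta is the true population parameter, \hat{\theta} is the estimate a given method produces, and E[\hat{\theta}] is the average of that estimate over many repetitions of the same method.

```latex
\[
\mathrm{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta
\]
```

A method is unbiased when this quantity is zero. A positive value means the method overestimates on average, and a negative value means it underestimates.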

How It Works

A common way bias enters a study is when certain members of a population are more likely to be included or excluded than others. For example, surveying students only during lunchtime in the cafeteria would miss students who eat elsewhere or leave campus. The key feature of bias is that repeating the same flawed method produces the same skewed result; it does not average out with more data. To reduce bias, statisticians use random sampling, careful survey design, and blind study protocols.
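The claim that a flawed method keeps producing the same skewed result can be checked with a small simulation. The sketch below is illustrative only: the population is simulated rather than real, and the function names, the 7-hour cutoff, and the assumed flaw (only students who sleep at least 7 hours can be reached) are hypothetical choices made for this example.

```python
import random
import statistics

def make_population(seed=0, size=800):
    """Hypothetical population: simulated nightly sleep hours for `size` students."""
    rng = random.Random(seed)
    return [rng.gauss(7.2, 1.0) for _ in range(size)]

def random_sample_mean(population, n, rng):
    """Simple random sample: every student is equally likely to be chosen."""
    return statistics.mean(rng.sample(population, n))

def biased_sample_mean(population, n, rng):
    """Assumed flaw: only students sleeping at least 7 hours can be chosen."""
    reachable = [hours for hours in population if hours >= 7.0]
    return statistics.mean(rng.sample(reachable, n))

def repeat(method, population, n=50, trials=200, seed=1):
    """Run the same sampling method many times and collect each sample mean."""
    rng = random.Random(seed)
    return [method(population, n, rng) for _ in range(trials)]

population = make_population()
true_mean = statistics.mean(population)
random_means = repeat(random_sample_mean, population)
biased_means = repeat(biased_sample_mean, population)

print(f"true mean:                  {true_mean:.2f}")
print(f"average of random samples:  {statistics.mean(random_means):.2f}")
print(f"average of biased samples:  {statistics.mean(biased_means):.2f}")
```

Under these assumptions, the random-sample means scatter on both sides of the true mean, while the biased-sample means all land on the same side of it. Repeating the flawed method does not move them closer.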

Example

Problem: A school wants to find the average number of hours all 800 students sleep per night. The principal surveys 50 students who attend a Monday morning study hall. Those students report an average of 8.5 hours. A later random sample of 50 students from the entire school finds an average of 7.2 hours. Identify the bias.

Identify the sampling method: The first sample only includes students who attend Monday morning study hall. Students who are chronically sleep-deprived may skip early study hall, so they are underrepresented.

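Compare to the random sample: The study-hall sample gave 8.5 hours, while the random sample gave 7.2 hours. The difference of 1.3 hours is an estimate of the bias introduced by the non-random method. It is only an estimate because the random sample's average is itself subject to random error.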

8.5 - 7.2 = 1.3 hours of overestimation

Classify the bias: This is selection bias (also called sampling bias) because the method systematically favored well-rested students over the broader population.

Answer: The principal's survey was biased. By sampling only students who show up to early study hall, it systematically overestimated average sleep by about 1.3 hours.

Why It Matters

Recognizing bias is essential in AP Statistics and any research-based career, from public health to market research. A biased dataset can lead to wrong conclusions no matter how large the sample is, because collecting more biased data does not fix the underlying flaw. Understanding bias helps you critically evaluate polls, studies, and news reports that cite statistics.

Common Mistakes

Mistake: Believing that a larger sample size eliminates bias.

Correction: A larger sample reduces random error but does not fix systematic bias. Surveying 1,000 people with a flawed method still produces skewed results. The sampling method itself must be corrected.
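The correction can be illustrated by growing the sample while keeping the flawed method fixed. This is a minimal sketch under stated assumptions: the population is simulated, and the flaw (the survey reaches only people who sleep at least 7 hours) is a hypothetical stand-in for any non-random sampling method.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population of nightly sleep hours (simulated, not real survey data).
population = [rng.gauss(7.2, 1.0) for _ in range(100_000)]
true_mean = statistics.mean(population)

# Assumed flaw, for illustration only: the survey can reach only people
# who sleep at least 7 hours.
reachable = [hours for hours in population if hours >= 7.0]

errors = {}
for n in (50, 1_000, 20_000):
    # Error = sample mean minus the true population mean.
    fair_error = statistics.mean(rng.sample(population, n)) - true_mean
    flawed_error = statistics.mean(rng.sample(reachable, n)) - true_mean
    errors[n] = (fair_error, flawed_error)
    print(f"n={n:>6}: random-sample error {fair_error:+.2f}, "
          f"flawed-sample error {flawed_error:+.2f}")
```

In this setup the random-sample error shrinks toward zero as n grows, while the flawed-sample error stays roughly the same size. A larger sample only makes the skewed estimate more precise.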