Sampling Bias
Sampling bias is a systematic error that occurs when the way a sample is collected causes some members of the population to be less likely to be included than others. This means the sample does not fairly represent the whole population, which can lead to misleading conclusions.
Sampling bias refers to a systematic tendency in a sampling method that favors certain outcomes over others, producing a sample that is not representative of the population of interest. It arises when the selection mechanism is related to the variable being studied, causing sample statistics to consistently overestimate or consistently underestimate the corresponding population parameters. Unlike random sampling error, sampling bias does not shrink as the sample size grows: a biased method applied to a million people is still biased.
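One way to make the point about sample size precise is the standard decomposition of mean squared error. As a sketch only, assume the n observations are drawn independently from whatever group the flawed selection method actually reaches, with mean \mu_s and variance \sigma_s^2, while the true population mean is \mu. These symbols are introduced here for illustration and are not part of the definition above.

```latex
\mathbb{E}\big[(\bar{x} - \mu)^2\big]
  = \underbrace{\frac{\sigma_s^2}{n}}_{\text{random sampling error}}
  + \underbrace{(\mu_s - \mu)^2}_{\text{bias}^2}
```

Under these assumptions, the first term shrinks toward zero as n grows, while the second term does not depend on n at all.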
Example
Problem: A school wants to estimate the average number of hours all 800 students spend on homework per week. They survey 50 students who stayed after school in the library on a Tuesday afternoon. The survey finds a mean of 14 hours per week. Identify the sampling bias and explain its effect.
Step 1: Identify the population. The population is all 800 students at the school.
Step 2: Identify the sample and how it was collected. The sample consists of 50 students found in the library after school. These students were not randomly selected from the full population.
Step 3: Determine who is systematically excluded or underrepresented. Students who go home immediately, attend sports practice, or have jobs after school are unlikely to be in the library. These students may spend fewer hours on homework.
Step 4: State the direction of the bias. Because library students likely spend more time on homework than typical students, the sample mean of 14 hours per week probably overestimates the true population mean.
Answer: The sampling method is biased because it systematically overrepresents students who study more. The reported mean of 14 hours per week likely overestimates the true average homework time for all 800 students.
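The reasoning in this example can be checked with a small simulation. The sketch below is illustrative only: the split into 200 library users and 600 other students, and the average hours assumed for each group, are invented numbers and do not come from the problem. It compares the mean from a library-only sample with the mean from a simple random sample of the same size.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 800 students (all numbers invented for illustration).
# Assume 200 students regularly use the library and average about 14 hours,
# while the other 600 average about 8 hours of homework per week.
library_users = [max(0.0, random.gauss(14, 3)) for _ in range(200)]
other_students = [max(0.0, random.gauss(8, 3)) for _ in range(600)]
population = library_users + other_students

population_mean = statistics.mean(population)

# Biased method: survey 50 students found in the library.
library_sample = random.sample(library_users, 50)
biased_mean = statistics.mean(library_sample)

# Simple random sample: each of the 800 students is equally likely to be chosen.
random_sample = random.sample(population, 50)
srs_mean = statistics.mean(random_sample)

print(f"Population mean:      {population_mean:.1f} hours")
print(f"Library-sample mean:  {biased_mean:.1f} hours")
print(f"Random-sample mean:   {srs_mean:.1f} hours")
```

With these assumed numbers, the library-only mean lands well above the population mean, while the simple random sample lands close to it. The size of the gap depends entirely on the invented group averages; only its direction matches the argument in Step 4.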
Why It Matters
Sampling bias can make survey results, scientific studies, and polls deeply unreliable, even when the sample is large. A famous example is the 1936 Literary Digest poll that predicted Alf Landon would defeat Franklin Roosevelt in a landslide. The magazine mailed about 10 million ballots and received more than 2 million responses, but its mailing lists, drawn largely from telephone directories and automobile registrations, favored wealthier Americans, producing a spectacularly wrong prediction: Roosevelt won easily. Recognizing and avoiding sampling bias is essential in AP Statistics and in evaluating any data-driven claim you encounter.
Common Mistakes
Mistake: Believing that a larger sample size fixes sampling bias
Correction: Increasing sample size reduces random sampling error, but it does not correct bias. If the method of selection systematically excludes part of the population, surveying more people with the same flawed method just gives you a larger biased sample.
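A short simulation can illustrate this correction. The sketch below uses a made-up population of 100,000 people in which an "easy to reach" 30% has a higher average than everyone else; the group sizes, averages, and the helper name sampling_errors are all assumptions for illustration. It reports how far each sample mean falls from the true mean as the sample size grows.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population (all values invented for illustration):
# 30,000 "easy to reach" people averaging 14, and 70,000 others averaging 8.
reachable = [random.gauss(14, 3) for _ in range(30_000)]
unreachable = [random.gauss(8, 3) for _ in range(70_000)]
population = reachable + unreachable
true_mean = statistics.mean(population)

def sampling_errors(n):
    """Return (biased-sample error, random-sample error) for samples of size n."""
    # Flawed method: only the easy-to-reach group can be selected.
    biased = statistics.mean(random.sample(reachable, n))
    # Simple random sample: everyone in the population is equally likely.
    fair = statistics.mean(random.sample(population, n))
    return biased - true_mean, fair - true_mean

results = {n: sampling_errors(n) for n in (100, 1_000, 10_000)}
for n, (biased_err, fair_err) in results.items():
    print(f"n={n:>6}: biased error {biased_err:+.2f}, random-sample error {fair_err:+.2f}")
```

Under these assumptions, the random-sample error tends to shrink as n increases, while the biased error stays roughly the same size at every n, because the flawed method keeps drawing from the same unrepresentative group.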
Mistake: Confusing sampling bias with response bias
Correction: Sampling bias is about who gets selected to participate. Response bias is about how people answer once they are in the sample (e.g., lying on a sensitive question). Both are sources of error, but they have different causes and different solutions.
