Stratified Sampling — Definition, Formula & Examples
Stratified sampling is a method of collecting data where you divide the entire population into distinct subgroups called strata, then take a separate random sample from each stratum. This ensures every subgroup is represented in the final sample, which typically produces more precise estimates than simple random sampling alone.
Given a population of size $N$ partitioned into $k$ non-overlapping strata of sizes $N_1, N_2, \dots, N_k$ where $N_1 + N_2 + \dots + N_k = N$, a stratified sample selects an independent simple random sample of size $n_i$ from each stratum $i$. Under proportional allocation with total sample size $n$, $n_i = n \cdot N_i / N$, so each stratum's share of the sample matches its share of the population.
Key Formula
$$n_i = n \cdot \frac{N_i}{N}$$
Where:
- $n_i$ = Number of individuals sampled from stratum $i$
- $n$ = Total desired sample size
- $N_i$ = Number of individuals in stratum $i$ in the population
- $N$ = Total population size
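The proportional allocation rule (each stratum's sample size is the total sample size times the stratum's share of the population) can be sketched in Python. The function name `proportional_allocation` and the largest-remainder rounding step are assumptions added for illustration; the text itself only covers cases where the allocations come out as whole numbers.

```python
def proportional_allocation(stratum_sizes, total_sample_size):
    """Allocate total_sample_size across strata in proportion to stratum size.

    stratum_sizes -- dict of stratum label -> number of population members
    Rounding uses the largest-remainder rule (an assumption, not from the
    text) so the allocations always sum to total_sample_size.
    """
    population_size = sum(stratum_sizes.values())
    allocation, remainders = {}, {}
    for name, size in stratum_sizes.items():
        # Integer arithmetic: n * N_i = quotient * N + remainder.
        allocation[name], remainders[name] = divmod(
            total_sample_size * size, population_size
        )
    # Hand any leftover units to the strata with the largest remainders.
    shortfall = total_sample_size - sum(allocation.values())
    for name in sorted(remainders, key=remainders.get, reverse=True)[:shortfall]:
        allocation[name] += 1
    return allocation


grades = {"freshmen": 400, "sophomores": 300, "juniors": 200, "seniors": 100}
print(proportional_allocation(grades, 100))
# {'freshmen': 40, 'sophomores': 30, 'juniors': 20, 'seniors': 10}
```

When every product of total sample size and stratum size divides evenly by the population size, the rounding step does nothing and the result is exactly the proportional allocation.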
How It Works
First, identify a variable that divides your population into meaningful, non-overlapping groups, such as grade level, income bracket, or region. Next, classify every member of the population into exactly one stratum. Then draw a simple random sample independently within each stratum. Finally, combine the stratum-level results to estimate population parameters, weighting each stratum's result by its share of the population (under proportional allocation, the pooled sample already carries these weights). Because every subgroup is guaranteed representation, stratified sampling usually reduces the variability of your estimates compared to drawing one large simple random sample, especially when the strata differ noticeably from each other on the variable you are measuring.
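The classify-then-sample steps described above can be sketched with Python's standard `random` module. The function `stratified_sample`, its parameters, and the roster of student IDs are hypothetical names and data invented for this illustration; the stratum sizes mirror a school with 400 freshmen, 300 sophomores, 200 juniors, and 100 seniors.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, sample_sizes, seed=None):
    """Draw an independent simple random sample from each stratum.

    population   -- iterable of population members
    stratum_of   -- function mapping a member to its stratum label
    sample_sizes -- dict of stratum label -> number of members to draw
    """
    rng = random.Random(seed)
    # Classify every member into exactly one stratum.
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    # Sample without replacement, separately within each stratum.
    return {label: rng.sample(strata[label], size)
            for label, size in sample_sizes.items()}


# Hypothetical roster of (student_id, grade) pairs.
grade_sizes = {"freshman": 400, "sophomore": 300, "junior": 200, "senior": 100}
roster = [(f"{grade}-{i}", grade)
          for grade, size in grade_sizes.items() for i in range(size)]

sample = stratified_sample(
    roster,
    stratum_of=lambda student: student[1],
    sample_sizes={"freshman": 40, "sophomore": 30, "junior": 20, "senior": 10},
    seed=1,
)
print({grade: len(members) for grade, members in sample.items()})
# {'freshman': 40, 'sophomore': 30, 'junior': 20, 'senior': 10}
```

The seed is fixed only to make the sketch repeatable; a real study would use a documented random procedure rather than a hard-coded value.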
Worked Example
Problem: A high school has 1,000 students: 400 freshmen, 300 sophomores, 200 juniors, and 100 seniors. You want a stratified sample of 100 students using proportional allocation. How many students should you sample from each grade?
Step 1: Identify the strata and their sizes. The four grades form the strata with sizes 400, 300, 200, and 100.
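Step 2: Apply proportional allocation for freshmen.
$$n_{\text{freshmen}} = 100 \cdot \frac{400}{1000} = 40$$
Step 3: Repeat for the remaining strata.
$$n_{\text{sophomores}} = 100 \cdot \frac{300}{1000} = 30, \quad n_{\text{juniors}} = 100 \cdot \frac{200}{1000} = 20, \quad n_{\text{seniors}} = 100 \cdot \frac{100}{1000} = 10$$
Step 4: Verify the stratum samples sum to the total sample size.
$$40 + 30 + 20 + 10 = 100$$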
Answer: Sample 40 freshmen, 30 sophomores, 20 juniors, and 10 seniors, selecting each group by simple random sampling within that grade.
Another Example
Problem: A city has 6,000 households: 2,000 in the urban core, 2,500 in the suburbs, and 1,500 in rural areas. A researcher wants a proportional stratified sample of 60 households to study water usage. How many should come from each area?
Step 1: Define the strata: Urban (2,000), Suburban (2,500), Rural (1,500).
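Step 2: Compute the sample size for each stratum.
$$n_{\text{urban}} = 60 \cdot \frac{2000}{6000} = 20, \quad n_{\text{suburban}} = 60 \cdot \frac{2500}{6000} = 25, \quad n_{\text{rural}} = 60 \cdot \frac{1500}{6000} = 15$$
Step 3: Check the total.
$$20 + 25 + 15 = 60$$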
Answer: The researcher randomly selects 20 urban, 25 suburban, and 15 rural households.
Why It Matters
Stratified sampling is a core topic on the AP Statistics exam, appearing in both multiple-choice and free-response questions about study design. Polling firms, medical researchers, and government agencies such as the U.S. Census Bureau use it to ensure minority or hard-to-reach subgroups are adequately represented. Mastering this method also builds the foundation for understanding more advanced designs like stratified multistage sampling used in national surveys.
Common Mistakes
Mistake: Confusing strata with clusters. Students sometimes describe stratified sampling as picking a few groups and surveying everyone in them.
Correction: Remember: strata = sample from ALL groups; clusters = sample WHOLE groups. In stratified sampling every stratum contributes members to the sample.
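A short Python contrast may make the distinction concrete. The four groups of 25 members below are hypothetical data, and the sample sizes (5 per group, 2 whole groups) are arbitrary choices for illustration.

```python
import random

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# Hypothetical population: 4 groups of 25 members each.
groups = {g: [f"{g}{i}" for i in range(25)] for g in "ABCD"}

# Stratified: every group contributes a random subset of its members.
stratified = {g: rng.sample(members, 5) for g, members in groups.items()}

# Cluster: choose a few whole groups at random and take everyone in them.
chosen = rng.sample(sorted(groups), 2)
cluster = {g: list(groups[g]) for g in chosen}

print(len(stratified), "groups represented,",
      sum(len(m) for m in stratified.values()), "members")
# 4 groups represented, 20 members
print(len(cluster), "groups represented,",
      sum(len(m) for m in cluster.values()), "members")
# 2 groups represented, 50 members
```

In the stratified draw all four groups appear in the sample; in the cluster draw only the two chosen groups appear, but every one of their members is included.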
Mistake: Allowing individuals to belong to more than one stratum.
Correction: Strata must be mutually exclusive and exhaustive. Every population member falls into exactly one stratum, so no one is double-counted or left out.
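One way to catch this mistake before sampling is to check that a proposed stratification really is a partition. The helper `check_strata` and the household labels below are hypothetical, written only to illustrate the mutually-exclusive-and-exhaustive requirement.

```python
from collections import Counter


def check_strata(population, strata):
    """Return (missing, duplicated) members for a proposed stratification.

    population -- list of all population members
    strata     -- dict of stratum label -> collection of members
    """
    counts = Counter(member for members in strata.values() for member in members)
    # Members assigned to no stratum: the strata are not exhaustive.
    missing = [m for m in population if counts[m] == 0]
    # Members assigned more than once: the strata are not mutually exclusive.
    duplicated = [m for m in population if counts[m] > 1]
    return missing, duplicated


# Hypothetical households; "h3" is placed in two strata and "h5" in none.
households = ["h1", "h2", "h3", "h4", "h5"]
proposed = {"urban": ["h1", "h3"], "suburban": ["h2", "h3"], "rural": ["h4"]}
print(check_strata(households, proposed))
# (['h5'], ['h3'])
```

A valid stratification returns two empty lists.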
