Mathwords

Stratified Sampling — Definition, Formula & Examples

Stratified sampling is a method of collecting data where you divide the entire population into distinct subgroups called strata, then take a separate random sample from each stratum. This ensures every subgroup is represented in the final sample, which typically produces more precise estimates than simple random sampling alone.

Given a population of size $N$ partitioned into $k$ non-overlapping strata of sizes $N_1, N_2, \ldots, N_k$ where $\sum N_i = N$, a stratified sample selects an independent simple random sample of size $n_i$ from each stratum $i$. Under proportional allocation, $n_i = n \cdot \dfrac{N_i}{N}$, so each stratum's share of the sample matches its share of the population.

Key Formula

$$n_i = n \cdot \frac{N_i}{N}$$
Where:
  • $n_i$ = Number of individuals sampled from stratum $i$
  • $n$ = Total desired sample size
  • $N_i$ = Number of individuals in stratum $i$ in the population
  • $N$ = Total population size
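
The formula can be sketched in a few lines of Python. This is only an illustration: the function name `proportional_allocation` and the three strata `A`, `B`, and `C` with their sizes are hypothetical, not taken from the text. Exact fractions are used so that a non-whole allocation stays visible instead of being silently rounded.

```python
from fractions import Fraction


def proportional_allocation(stratum_sizes, n):
    """Return n_i = n * N_i / N for each stratum, as exact fractions.

    stratum_sizes : dict mapping a stratum label to its population size N_i
    n             : total desired sample size
    """
    N = sum(stratum_sizes.values())
    if N <= 0:
        raise ValueError("the population must not be empty")
    return {label: Fraction(n * N_i, N) for label, N_i in stratum_sizes.items()}


# Hypothetical population of 1,000 split into three strata.
sizes = {"A": 500, "B": 300, "C": 200}
alloc = proportional_allocation(sizes, 50)

for label, n_i in alloc.items():
    print(label, n_i)  # A 25, B 15, C 10
```

Each stratum's share of the sample (for example 25/50 for `A`) equals its share of the population (500/1000), which is what proportional allocation requires.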

How It Works

First, identify a variable that divides your population into meaningful, non-overlapping groups, such as grade level, income bracket, or region. Next, classify every member of the population into exactly one stratum. Then draw a simple random sample independently within each stratum. Finally, combine the stratum-level results, weighting each stratum by its share of the population, to estimate population parameters. Because you guarantee representation from every subgroup, stratified sampling typically reduces variability in your estimates compared to drawing one big simple random sample, especially when the strata differ noticeably from each other on the variable you are measuring.
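
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names, the `north`/`south` strata, and the made-up values are all hypothetical. The combining step estimates a population mean by weighting each stratum's sample mean by $N_i / N$, and it assumes every stratum has been sampled.

```python
import random
from collections import defaultdict
from statistics import mean


def draw_stratified_sample(population, stratum_of, sample_sizes, seed=None):
    """Classify records into strata, then sample independently within each.

    Returns (samples, stratum_sizes), both keyed by stratum label.
    """
    rng = random.Random(seed)

    # Classify every record into exactly one stratum.
    strata = defaultdict(list)
    for record in population:
        strata[stratum_of(record)].append(record)

    # Independent simple random sample (without replacement) per stratum.
    samples = {label: rng.sample(strata[label], n_i)
               for label, n_i in sample_sizes.items()}
    stratum_sizes = {label: len(members) for label, members in strata.items()}
    return samples, stratum_sizes


def stratified_mean(samples, stratum_sizes, value_of):
    """Combine stratum sample means, weighting each by N_i / N.

    Assumes every stratum in stratum_sizes appears in samples.
    """
    N = sum(stratum_sizes.values())
    return sum(stratum_sizes[label] / N * mean([value_of(r) for r in records])
               for label, records in samples.items())


# Hypothetical population: 60 "north" records with values 10-14 and
# 40 "south" records with values 50-54.
population = ([("north", 10 + i % 5) for i in range(60)]
              + [("south", 50 + i % 5) for i in range(40)])

samples, sizes = draw_stratified_sample(
    population,
    stratum_of=lambda record: record[0],
    sample_sizes={"north": 6, "south": 4},  # proportional: 10 * 60/100, 10 * 40/100
    seed=1,
)
estimate = stratified_mean(samples, sizes, value_of=lambda record: record[1])
print(sizes, estimate)
```

Because the two strata differ sharply, every possible stratified sample here gives an estimate between 26 and 30, whereas a simple random sample of 10 could by chance contain mostly `north` or mostly `south` records.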

Worked Example

Problem: A high school has 1,000 students: 400 freshmen, 300 sophomores, 200 juniors, and 100 seniors. You want a stratified sample of 100 students using proportional allocation. How many students should you sample from each grade?
Step 1: Identify the strata and their sizes. The four grades form the strata with sizes 400, 300, 200, and 100.
$$N_1 = 400,\; N_2 = 300,\; N_3 = 200,\; N_4 = 100$$
Step 2: Apply proportional allocation for freshmen.
$$n_1 = 100 \cdot \frac{400}{1000} = 40$$
Step 3: Repeat for the remaining strata.
$$n_2 = 100 \cdot \frac{300}{1000} = 30,\quad n_3 = 100 \cdot \frac{200}{1000} = 20,\quad n_4 = 100 \cdot \frac{100}{1000} = 10$$
Step 4: Verify the stratum samples sum to the total sample size.
$$40 + 30 + 20 + 10 = 100 \; \checkmark$$
Answer: Sample 40 freshmen, 30 sophomores, 20 juniors, and 10 seniors, selecting each group by simple random sampling within that grade.
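
As a rough check, the same calculation can be reproduced in Python and followed by the actual random selection within each grade. The roster IDs such as `freshmen-0001` and the seed value are hypothetical placeholders; a real study would use the school's enrollment list.

```python
import random

grade_sizes = {"freshmen": 400, "sophomores": 300, "juniors": 200, "seniors": 100}
n = 100
N = sum(grade_sizes.values())  # 1000

# Proportional allocation n_i = n * N_i / N; stop if a result is not whole.
allocation = {}
for grade, size in grade_sizes.items():
    n_i, remainder = divmod(n * size, N)
    if remainder:
        raise ValueError(f"{grade}: allocation is not a whole number")
    allocation[grade] = n_i

print(allocation)
# {'freshmen': 40, 'sophomores': 30, 'juniors': 20, 'seniors': 10}

# Hypothetical rosters, one list of student IDs per grade.
rosters = {grade: [f"{grade}-{i:04d}" for i in range(1, size + 1)]
           for grade, size in grade_sizes.items()}

# Simple random sample (without replacement) within each grade.
rng = random.Random(2024)
selected = {grade: rng.sample(rosters[grade], allocation[grade])
            for grade in grade_sizes}
```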

Another Example

Problem: A city has 6,000 households: 2,000 in the urban core, 2,500 in the suburbs, and 1,500 in rural areas. A researcher wants a proportional stratified sample of 60 households to study water usage. How many should come from each area?
Step 1: Define the strata: Urban (2,000), Suburban (2,500), Rural (1,500).
$$N = 6{,}000,\; n = 60$$
Step 2: Compute the sample size for each stratum.
$$n_{\text{urban}} = 60 \cdot \frac{2000}{6000} = 20, \quad n_{\text{suburb}} = 60 \cdot \frac{2500}{6000} = 25, \quad n_{\text{rural}} = 60 \cdot \frac{1500}{6000} = 15$$
Step 3: Check the total.
$$20 + 25 + 15 = 60 \; \checkmark$$
Answer: The researcher randomly selects 20 urban, 25 suburban, and 15 rural households.
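
Both examples were chosen so that every $n_i$ is a whole number. When $n \cdot N_i / N$ is fractional, the allocations must be rounded in a way that still sums to $n$. The sketch below uses largest-remainder rounding, which is one common choice and is an assumption here rather than something the examples above prescribe; the function name is hypothetical.

```python
from fractions import Fraction


def allocate_largest_remainder(stratum_sizes, n):
    """Proportional allocation rounded to whole numbers that sum to n."""
    N = sum(stratum_sizes.values())
    exact = {s: Fraction(n * N_s, N) for s, N_s in stratum_sizes.items()}

    # Start from the floor of each exact allocation.
    alloc = {s: q.numerator // q.denominator for s, q in exact.items()}

    # Give the leftover units to the strata with the largest fractional parts.
    leftover = n - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc


households = {"urban": 2000, "suburban": 2500, "rural": 1500}

print(allocate_largest_remainder(households, 60))
# {'urban': 20, 'suburban': 25, 'rural': 15}

print(allocate_largest_remainder(households, 50))
# {'urban': 17, 'suburban': 21, 'rural': 12}
```

With $n = 60$ the result matches the example exactly. With $n = 50$ the exact values are about 16.67, 20.83, and 12.5, so the two leftover units go to the suburban and urban strata.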

Why It Matters

Stratified sampling is a core topic on the AP Statistics exam, appearing in both multiple-choice and free-response questions about study design. Polling firms, medical researchers, and government agencies such as the U.S. Census Bureau use it to ensure minority or hard-to-reach subgroups are adequately represented. Mastering this method also builds the foundation for understanding more advanced designs like stratified multistage sampling used in national surveys.

Common Mistakes

Mistake: Confusing strata with clusters. Students sometimes describe stratified sampling as picking a few groups and surveying everyone in them.
Correction: Remember: strata = sample from ALL groups; clusters = sample WHOLE groups. In stratified sampling every stratum contributes members to the sample.
Mistake: Allowing individuals to belong to more than one stratum.
Correction: Strata must be mutually exclusive and exhaustive. Every population member falls into exactly one stratum, so no one is double-counted or left out.
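
Both corrections can be illustrated with a short Python sketch. The homeroom data and all function names are hypothetical. The first function checks that a proposed set of strata is mutually exclusive and exhaustive; the other two contrast stratified sampling (some members from every group) with cluster sampling (every member of a few groups).

```python
import random


def is_valid_stratification(population, strata):
    """True when every population member is in exactly one stratum."""
    members = [x for group in strata.values() for x in group]
    no_overlap = len(members) == len(set(members))  # mutually exclusive
    covers_all = set(members) == set(population)    # exhaustive
    return no_overlap and covers_all


def stratified_sample(groups, per_group, rng):
    """Sample some members from EVERY group."""
    return [x for g in groups.values() for x in rng.sample(g, per_group)]


def cluster_sample(groups, n_groups, rng):
    """Pick a few groups at random and take EVERY member of each."""
    chosen = rng.sample(sorted(groups), n_groups)
    return [x for name in chosen for x in groups[name]]


# Hypothetical school: four homerooms of five students each.
groups = {f"room{r}": [f"room{r}-s{i}" for i in range(5)] for r in "ABCD"}
population = [x for g in groups.values() for x in g]

rng = random.Random(0)
strat = stratified_sample(groups, 2, rng)  # 8 students, 2 from each room
clust = cluster_sample(groups, 2, rng)     # 10 students, all from 2 rooms

rooms_in = lambda sample: {s.split("-")[0] for s in sample}
print(len(rooms_in(strat)), len(rooms_in(clust)))  # 4 2

# A student listed in two strata makes the stratification invalid.
overlapping = dict(groups, roomA=groups["roomA"] + ["roomB-s0"])
print(is_valid_stratification(population, groups))       # True
print(is_valid_stratification(population, overlapping))  # False
```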