Cohort — Definition, Formula & Examples
A cohort is a group of individuals who share a common characteristic or experience within a defined time period. In statistics, cohorts are tracked over time so researchers can compare outcomes across groups.
A cohort is a subset of a population defined by a shared attribute—such as birth year, enrollment date, or exposure to a treatment—whose data are collected and analyzed longitudinally or cross-sectionally to identify patterns, trends, or causal relationships.
How It Works
To use cohorts in a study, you first identify a defining characteristic (e.g., graduation year or treatment group). You then collect data on each cohort at one or more points in time. By comparing statistics like means, percentiles, or proportions across cohorts, you can detect differences or trends. For example, comparing the median test scores of the Class of 2022 cohort to the Class of 2023 cohort reveals whether performance shifted between years.
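The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the `records` list, its scores, and the helper name `median_by_cohort` are all hypothetical and are not taken from any real dataset.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (graduation_year, test_score). Values are made up.
records = [
    (2022, 1010), (2022, 1050), (2022, 1120),
    (2023, 1040), (2023, 1090), (2023, 1150),
]

def median_by_cohort(rows):
    """Group scores by the cohort-defining attribute and return each cohort's median."""
    groups = defaultdict(list)
    for cohort, score in rows:
        groups[cohort].append(score)
    return {cohort: median(scores) for cohort, scores in sorted(groups.items())}

cohort_medians = median_by_cohort(records)
print(cohort_medians)  # {2022: 1050, 2023: 1090}
```

Here the defining characteristic is the graduation year, and the statistic compared across cohorts is the median score. Swapping `median` for `mean` or a proportion follows the same pattern.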
Worked Example
Problem: A school tracks the average SAT scores of three graduating cohorts: Class of 2022 (n = 200, mean = 1050), Class of 2023 (n = 210, mean = 1080), and Class of 2024 (n = 195, mean = 1100). Describe the trend across cohorts.
Identify the cohorts: Each graduating class is a cohort defined by the year students graduated.
Compare the statistic: List the mean SAT scores in order: 1050, 1080, 1100. Each successive cohort has a higher mean.
Calculate the overall change: Find the increase from the first cohort to the last: 1100 − 1050 = 50 points.
Answer: The mean SAT score increased by 50 points across the three cohorts, suggesting an upward trend in performance from 2022 to 2024.
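As a quick check of the arithmetic, the same steps can be written as a short Python sketch. The means come from the problem statement; the variable names are illustrative only.

```python
# Mean SAT score for each graduating cohort, from the problem statement.
cohort_means = {2022: 1050, 2023: 1080, 2024: 1100}

years = sorted(cohort_means)
means = [cohort_means[year] for year in years]

# Change between each pair of consecutive cohorts.
step_changes = [later - earlier for earlier, later in zip(means, means[1:])]

# Overall change from the first cohort to the last.
overall_change = means[-1] - means[0]

print(step_changes)    # [30, 20]
print(overall_change)  # 50
```

Every step change is positive, which matches the observation that each successive cohort has a higher mean.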
Why It Matters
Cohort analysis appears in AP Statistics when discussing observational studies and experimental design. Public health researchers use cohorts to track disease outcomes, and businesses use them to measure customer retention over time. Understanding cohorts helps you judge whether differences in data stem from group-level factors or from individual variation.
Common Mistakes
Mistake: Confusing a cohort with a random sample
Correction: A cohort is defined by a shared characteristic (like birth year), not by random selection. A random sample could be drawn from within a cohort, but the two concepts serve different purposes.
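The distinction can be made concrete with a small Python sketch. The population below is invented for illustration, and the fixed seed is only there to make the example repeatable.

```python
import random

# Hypothetical population: (person_id, birth_year). Values are made up.
population = [(person_id, 1990 + person_id % 5) for person_id in range(100)]

# A cohort: everyone who shares the defining characteristic (born in 1992).
cohort_1992 = [person for person in population if person[1] == 1992]

# A random sample: members chosen by chance, here drawn from within the cohort.
rng = random.Random(0)  # fixed seed is an assumption, for repeatability only
sample = rng.sample(cohort_1992, k=5)

print(len(cohort_1992))  # 20
print(len(sample))       # 5
```

The cohort is selected by a rule (the shared birth year), while the sample is selected by chance from whoever satisfies that rule.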
