Grouped Frequency Distribution — Definition, Formula & Examples
A grouped frequency distribution is a table that organizes data into non-overlapping intervals (called classes) and records how many data values fall into each interval. It simplifies large data sets by summarizing them into a manageable number of groups.
A grouped frequency distribution partitions the range of a quantitative data set into mutually exclusive, exhaustive class intervals of equal width and tallies the absolute frequency $f_i$ of observations within each interval $C_i$, where $\sum_i f_i = n$ and $n$ is the total number of observations.
Key Formula
$$\text{Class width} = \frac{\text{Max} - \text{Min}}{k}$$
Where:
- $\text{Max}$ = Largest value in the data set
- $\text{Min}$ = Smallest value in the data set
- $k$ = Desired number of classes
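As a rough illustration of the class-width calculation, the sketch below divides the range by the number of classes and rounds up to the next whole number. The function name `class_width` is hypothetical, and rounding to an integer is only one possible choice of "convenient value". The scores are the twenty test scores from this article's worked example.

```python
import math

def class_width(data, num_classes):
    """Return (raw width, width rounded up to a whole number)."""
    raw = (max(data) - min(data)) / num_classes
    # Assumption: "convenient value" is taken to mean the next integer.
    return raw, math.ceil(raw)

scores = [52, 55, 58, 61, 63, 65, 67, 70, 72, 74,
          76, 78, 80, 82, 84, 87, 90, 92, 95, 98]

raw, rounded = class_width(scores, 5)
print(raw, rounded)  # 9.2 10
```

If the range divides evenly by the number of classes, the rounded width equals the raw width, and the classes must start low enough (or the width be increased) for the largest value to still fall inside the last class.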
How It Works
Start by finding the range of your data (maximum minus minimum). Choose the number of classes — typically between 5 and 20 — and divide the range by that number to get the class width, rounding up to a convenient value. Set up intervals so they cover the entire data set without overlapping. Then tally each data value into its interval and record the frequency. The resulting table is the foundation for drawing histograms and computing cumulative or relative frequencies.
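The tallying step can be sketched in a few lines of Python. This is a minimal illustration, not a standard routine: the function `grouped_frequency`, its parameters, and the sample values are all made up for the example, and each class is treated as a half-open interval that includes its lower limit and excludes its upper limit.

```python
def grouped_frequency(data, start, width, num_classes):
    """Tally data into num_classes half-open intervals [lower, upper)."""
    counts = [0] * num_classes
    for value in data:
        # Floor division gives the index of the class containing the value.
        index = int((value - start) // width)
        if not 0 <= index < num_classes:
            raise ValueError(f"{value} falls outside the classes")
        counts[index] += 1
    return [(start + i * width, start + (i + 1) * width, counts[i])
            for i in range(num_classes)]

values = [3, 7, 8, 12, 14, 15, 19, 21, 22, 28]  # made-up sample data

for lower, upper, freq in grouped_frequency(values, start=0, width=10, num_classes=3):
    print(f"[{lower}, {upper}): {freq}")
```

Raising an error for values outside the classes is a design choice here: it makes a table that fails to cover the whole data set visible immediately instead of silently dropping observations.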
Worked Example
Problem: Twenty students scored the following on a test: 52, 55, 58, 61, 63, 65, 67, 70, 72, 74, 76, 78, 80, 82, 84, 87, 90, 92, 95, 98. Create a grouped frequency distribution with 5 classes.
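Find the range: Subtract the minimum from the maximum.
$$98 - 52 = 46$$
Determine class width: Divide the range by the number of classes and round up.
$$\frac{46}{5} = 9.2, \text{ rounded up to } 10$$
Build the table: Starting at 50, a convenient value just below the minimum of 52, create intervals of width 10 and count values in each.
- 50–59: 52, 55, 58 (frequency 3)
- 60–69: 61, 63, 65, 67 (frequency 4)
- 70–79: 70, 72, 74, 76, 78 (frequency 5)
- 80–89: 80, 82, 84, 87 (frequency 4)
- 90–99: 90, 92, 95, 98 (frequency 4)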
Answer: The grouped frequency distribution has five classes (50–59, 60–69, 70–79, 80–89, 90–99) with frequencies 3, 4, 5, 4, and 4 respectively, summing to 20.
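Relative and cumulative frequencies follow directly from the class frequencies. The short sketch below uses the five frequencies from this example; the variable names are illustrative only.

```python
from itertools import accumulate

classes = ["50–59", "60–69", "70–79", "80–89", "90–99"]
frequencies = [3, 4, 5, 4, 4]

total = sum(frequencies)                        # number of observations
relative = [f / total for f in frequencies]     # share of observations per class
cumulative = list(accumulate(frequencies))      # running total of frequencies

for label, f, r, c in zip(classes, frequencies, relative, cumulative):
    print(f"{label}  f={f}  rel={r:.2f}  cum={c}")
```

The last cumulative frequency should equal the total number of observations (20 here), which is a quick check that no value was missed or counted twice.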
Why It Matters
Grouped frequency distributions are essential for summarizing large data sets in AP Statistics and introductory college courses. They are the direct basis for constructing histograms, which reveal the shape, center, and spread of a distribution at a glance.
Common Mistakes
Mistake: Creating overlapping intervals such as 50–60 and 60–70, which makes it unclear where a boundary value like 60 belongs.
Correction: Use non-overlapping intervals like 50–59 and 60–69, or use notation such as [50, 60) and [60, 70) to clearly include the left endpoint and exclude the right.
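To see how the half-open convention removes the ambiguity, here is a small sketch that looks up the class of a value from a list of class boundaries using the standard-library `bisect` module. The `edges` list and the helper `class_of` are assumptions made for this illustration.

```python
from bisect import bisect_right

# Boundaries for the classes [50, 60), [60, 70), ..., [90, 100).
edges = [50, 60, 70, 80, 90, 100]

def class_of(value):
    """Return the (lower, upper) limits of the half-open class containing value."""
    # bisect_right places a boundary value in the class that starts at it.
    i = bisect_right(edges, value) - 1
    if not 0 <= i < len(edges) - 1:
        raise ValueError(f"{value} is outside all classes")
    return edges[i], edges[i + 1]

print(class_of(59))  # (50, 60)
print(class_of(60))  # (60, 70)
```

Because every boundary value belongs to exactly one class, a score of 60 is counted once, in the class that begins at 60.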
