Bimodal Distribution — Definition, Formula & Examples
A bimodal distribution is a data set or probability distribution that has two distinct peaks, meaning two values (or ranges of values) occur more frequently than the values around them. The two peaks are separated by a valley where frequencies dip.
A distribution is bimodal if its frequency function or probability density function exhibits exactly two local maxima. Each local maximum corresponds to a mode, and the two modes need not have equal frequency.
How It Works
To spot a bimodal distribution, create a histogram or frequency plot of your data and look for two separate humps. Each hump represents a cluster of data points around a frequently occurring value. Bimodal distributions often arise when data comes from two different groups mixed together — for example, heights of adult men and women combined into one data set. The mean of a bimodal distribution can be misleading because it may fall in the valley between the two peaks, where very few data points actually lie.
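One common way to write down the "two groups mixed together" situation is a two-component mixture density. This is offered as an illustrative model, not as part of the definition; the symbols p, f_1, f_2, \mu_i and \sigma_i are introduced here for the sketch only.

```latex
f(x) = p\, f_1(x) + (1 - p)\, f_2(x), \qquad 0 < p < 1,
\qquad
f_i(x) = \frac{1}{\sigma_i \sqrt{2\pi}}
         \exp\!\left( -\frac{(x - \mu_i)^2}{2\sigma_i^2} \right)
```

A mixture like this is not automatically bimodal. In the simplest case of equal weights (p = 1/2) and a shared standard deviation \sigma, two peaks appear only when the group means are far enough apart, namely |\mu_1 - \mu_2| > 2\sigma; otherwise the two humps merge into one.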
Worked Example
Problem: A teacher records the test scores of 20 students: 55, 58, 60, 62, 63, 64, 65, 66, 82, 84, 85, 86, 87, 88, 88, 89, 90, 91, 92, 93. Determine whether this distribution is bimodal.
Group into intervals: Create frequency bins: 50–69, 70–79, and 80–99.
Count frequencies: Scores 50–69: 8 students. Scores 70–79: 0 students. Scores 80–99: 12 students.
Identify peaks: There is one cluster centered around the low 60s and another centered around the high 80s, with a gap in the 70s where no scores appear.
Answer: The distribution is bimodal, with one peak near 63 and another near 88. The empty 70s range forms the valley between the two modes.
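The grouping and peak check above can be sketched in Python. This is a minimal illustration, assuming equal-width bins of 10 points (finer than the bins in the worked example) and a simple "taller than both neighbors" rule; the helper names bin_counts and peak_bins are made up for this example.

```python
scores = [55, 58, 60, 62, 63, 64, 65, 66, 82, 84, 85, 86,
          87, 88, 88, 89, 90, 91, 92, 93]

def bin_counts(data, start, stop, width):
    """Count values in equal-width bins [start, start+width), ..., up to stop."""
    n_bins = (stop - start) // width
    counts = [0] * n_bins
    for x in data:
        idx = (x - start) // width
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts

def peak_bins(counts):
    """Indices of bins strictly taller than both neighbors (edges compare to 0)."""
    padded = [0] + counts + [0]
    return [i - 1 for i in range(1, len(padded) - 1)
            if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]]

counts = bin_counts(scores, start=50, stop=100, width=10)
peaks = peak_bins(counts)

# Text histogram: one '#' per student in each bin.
for i, c in enumerate(counts):
    lo = 50 + 10 * i
    print(f"{lo}-{lo + 9}: {'#' * c} ({c})")

print("peak bins:", [f"{50 + 10 * i}-{59 + 10 * i}" for i in peaks])
# peak bins: ['60-69', '80-89']
```

The result depends on the bin width: bins that are too narrow turn noise into spurious peaks, and bins that are too wide can merge two real peaks into one, so it is worth trying more than one width before calling a distribution bimodal.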
Why It Matters
In AP Statistics and introductory college courses, recognizing a bimodal distribution tells you that your data likely contains two distinct subgroups. Reporting only the mean and standard deviation hides this structure. Identifying bimodality can prompt you to split the data and analyze each group separately for more meaningful results.
Common Mistakes
Mistake: Assuming the mean is a good measure of center for bimodal data.
Correction: The mean often falls in the valley between the two peaks, representing almost no actual data points. Report the two modes instead, or split the data into subgroups before summarizing.
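To make the correction concrete, here is a short Python sketch using the test scores from the worked example. The cutoff of 75 is an assumption chosen by eye from the empty 70s range, not a general rule for splitting bimodal data.

```python
from statistics import mean

scores = [55, 58, 60, 62, 63, 64, 65, 66, 82, 84, 85, 86,
          87, 88, 88, 89, 90, 91, 92, 93]

overall = mean(scores)

# How many students actually scored in the 70s, where the mean lands?
in_valley = [s for s in scores if 70 <= s < 80]

# Split at an assumed cutoff in the valley and summarize each group.
cutoff = 75
low = [s for s in scores if s < cutoff]
high = [s for s in scores if s >= cutoff]

print(f"overall mean: {overall:.1f}")               # overall mean: 77.4
print(f"scores in the 70s: {len(in_valley)}")       # scores in the 70s: 0
print(f"low group:  n={len(low)}, mean={mean(low):.1f}")
print(f"high group: n={len(high)}, mean={mean(high):.1f}")
```

The overall mean of 77.4 sits in a range where no student scored, while the two group means (about 61.6 and 87.9) each describe a real cluster.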
