Chebyshev's Inequality — Definition, Formula & Examples
Chebyshev's Inequality states that for any dataset (regardless of its shape or distribution), at least $1 - \frac{1}{k^2}$ of the data values fall within $k$ standard deviations of the mean, where $k > 1$.
For any random variable $X$ with finite mean $\mu$ and finite, positive variance $\sigma^2$, and for any real number $k > 0$, the inequality $P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}$ holds. Equivalently, at least a fraction $1 - \frac{1}{k^2}$ of the probability mass lies within $k$ standard deviations of the mean.
Key Formula
$$P(|X - \mu| < k\sigma) \ge 1 - \frac{1}{k^2}$$
Where:
- $X$ = A random variable (or data value)
- $\mu$ = The mean of the distribution or dataset
- $\sigma$ = The standard deviation
- $k$ = Number of standard deviations from the mean (the inequality holds for any $k > 0$, but the bound is informative only when $k > 1$)
How It Works
Choose a value of $k$ representing the number of standard deviations from the mean. Plug $k$ into $1 - \frac{1}{k^2}$ to find the minimum proportion of data guaranteed to lie in the interval $(\mu - k\sigma, \mu + k\sigma)$. The result is a lower bound: the actual proportion is often higher, especially for symmetric or bell-shaped distributions. The power of this inequality is that it applies to every distribution with a finite mean and variance, making no assumptions about shape.
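The procedure above can be sketched in Python and compared against observed data. This is only an illustration: the function names `chebyshev_lower_bound` and `fraction_within` are made up for this example, and the skewed sample (exponential draws with a fixed seed) is an arbitrary choice, not data from this article. The population standard deviation is used so that the guarantee applies exactly to the sample treated as a dataset.

```python
import random
import statistics


def chebyshev_lower_bound(k: float) -> float:
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 0:
        raise ValueError("k must be positive")
    # For k <= 1 the expression is zero or negative, so the bound is vacuous;
    # clamp it at 0 because a proportion cannot be negative.
    return max(0.0, 1.0 - 1.0 / k**2)


def fraction_within(data, k: float) -> float:
    """Observed fraction of values strictly within k standard deviations."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)  # population standard deviation
    return sum(abs(x - mu) < k * sigma for x in data) / len(data)


# Hypothetical skewed dataset: exponential draws (an assumption for illustration).
random.seed(0)
sample = [random.expovariate(1.0) for _ in range(10_000)]

for k in (1.5, 2, 3):
    bound = chebyshev_lower_bound(k)
    observed = fraction_within(sample, k)
    print(f"k={k}: guaranteed >= {bound:.3f}, observed = {observed:.3f}")
```

For each value of k, the observed fraction should be at least as large as the guaranteed one, and typically well above it, which is what "lower bound" means here.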
Worked Example
Problem: A dataset has a mean of 50 and a standard deviation of 5. Using Chebyshev's Inequality, find the minimum percentage of data values that lie between 35 and 65.
Find k: The interval from 35 to 65 spans 15 units on each side of the mean. Divide by the standard deviation to get k: $k = \frac{15}{5} = 3$.
Apply the formula: Substitute k = 3 into Chebyshev's Inequality: $1 - \frac{1}{3^2} = 1 - \frac{1}{9} = \frac{8}{9}$.
Interpret: Convert the fraction to a percentage: $\frac{8}{9} \approx 0.889 = 88.9\%$.
Answer: At least 88.9% of the data values must lie between 35 and 65.
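The arithmetic of this example can be checked with a short script. This is a minimal sketch: the helper name `chebyshev_interval_bound` is hypothetical, and it assumes the interval is symmetric about the mean, as it is in this problem. Exact fractions are used so that 8/9 appears without rounding.

```python
from fractions import Fraction


def chebyshev_interval_bound(mean, std_dev, lower, upper):
    """Return (k, bound) for an interval symmetric about the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    if upper <= mean or mean - lower != upper - mean:
        raise ValueError("interval must be symmetric about the mean")
    # k is the half-width of the interval measured in standard deviations.
    k = Fraction(upper - mean) / Fraction(std_dev)
    bound = 1 - 1 / k**2
    return k, bound


k, bound = chebyshev_interval_bound(mean=50, std_dev=5, lower=35, upper=65)
print(f"k = {k}")
print(f"minimum proportion = {bound} (about {float(bound):.1%})")
```

With the values from the problem, this reproduces k = 3 and a minimum proportion of 8/9, or about 88.9%.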
Why It Matters
Chebyshev's Inequality is essential in introductory probability and statistics courses because it provides guaranteed bounds without assuming normality. It is used in quality control and finance to estimate the spread of data when the underlying distribution is unknown or skewed.
Common Mistakes
Mistake: Using k = 1 and expecting a useful bound.
Correction: At k = 1 the formula gives $1 - \frac{1}{1^2} = 0$, which tells you nothing. Chebyshev's Inequality only provides meaningful bounds when k > 1.
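To see why small values of k are uninformative, the expression can be evaluated directly. The sketch below uses a hypothetical helper, `raw_chebyshev_expression`, and a few arbitrarily chosen values of k. A result of zero or less says only that "at least 0%" of the data lies in the interval, which is true of any dataset.

```python
def raw_chebyshev_expression(k: float) -> float:
    """Evaluate 1 - 1/k^2 without clamping, to expose vacuous cases."""
    return 1.0 - 1.0 / k**2


for k in (0.5, 1.0, 1.1, 2.0):
    value = raw_chebyshev_expression(k)
    # The bound carries information only when it is strictly positive.
    verdict = "informative" if value > 0 else "vacuous"
    print(f"k={k}: 1 - 1/k^2 = {value:.3f} ({verdict})")
```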
