Skewness — Definition, Formula & Examples
Skewness is a measure of how lopsided a distribution is compared to a perfectly symmetric one. A distribution can be skewed left (tail stretches toward smaller values), skewed right (tail stretches toward larger values), or have zero skew (as any symmetric distribution does).
Skewness is a dimensionless measure of the asymmetry of a probability distribution about its mean. For a sample of observations, the sample skewness (Fisher's formula) is computed as a scaled ratio of the third central moment to the cube of the standard deviation. A positive value indicates a longer right tail, and a negative value indicates a longer left tail. A symmetric distribution has a skewness of zero, although a value of zero does not by itself guarantee symmetry.
Key Formula
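$$G_1 = \frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^3$$
Where:
- $G_1$ = Sample skewness coefficient
- $n$ = Number of observations
- $x_i$ = Each individual data value
- $\bar{x}$ = Sample mean
- $s$ = Sample standard deviation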
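As a rough illustration, the coefficient can be computed directly from its ingredients. The sketch below assumes the adjusted Fisher-Pearson form, n / ((n - 1)(n - 2)) times the sum of the cubed standardized deviations ((x - mean) / s), with s taken as the sample standard deviation (n - 1 divisor). The function name `sample_skewness` and both small data sets are hypothetical, chosen only for demonstration.

```python
import math


def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (assumed form, see lead-in)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(values) / n
    # Sample standard deviation uses the n - 1 divisor.
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    # Scale the sum of cubed standardized deviations by n / ((n - 1)(n - 2)).
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)


# Hypothetical data sets, for illustration only.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]
symmetric = [1, 2, 3, 4, 5]

print(sample_skewness(right_tailed))  # positive: long right tail
print(sample_skewness(symmetric))     # approximately zero
```

Mirroring a data set around any point flips the sign of the result, which is a quick sanity check on an implementation like this one.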
How It Works
To assess skewness, look at a histogram or dotplot and compare the two tails. If the right tail is longer, the distribution is right-skewed (positively skewed), and the mean is typically greater than the median. If the left tail is longer, it is left-skewed (negatively skewed), and the mean is typically less than the median. On the AP Statistics exam, you usually describe shape qualitatively, stating whether a distribution is approximately symmetric, skewed left, or skewed right, rather than calculating a numerical value. When a numerical measure is needed, the Key Formula above gives a precise coefficient that lets you compare the degree of skew across different data sets.
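When no graphing tool is at hand, a text dotplot is often enough to compare the tails. The following is a minimal sketch, assuming integer-valued data. The helper name `text_dotplot` is hypothetical, and the ten values are the ones used in the Worked Example.

```python
from collections import Counter


def text_dotplot(values):
    """Return one line per integer from min to max, one '*' per observation."""
    counts = Counter(values)
    lines = []
    for v in range(min(values), max(values) + 1):
        # Right-align the value so the bars line up in a column.
        lines.append(f"{v:>3} | " + "*" * counts[v])
    return "\n".join(lines)


data = [2, 3, 4, 4, 5, 5, 5, 6, 6, 15]
print(text_dotplot(data))
```

In the printed plot, the rows between 6 and 15 are empty, which makes the stretched right tail easy to see.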
Worked Example
Problem: A data set contains the values 2, 3, 4, 4, 5, 5, 5, 6, 6, 15. Determine whether the distribution is skewed and in which direction.
Step 1: Find the mean: Add all values and divide by 10: (2 + 3 + 4 + 4 + 5 + 5 + 5 + 6 + 6 + 15) / 10 = 55 / 10 = 5.5.
Step 2: Find the median: With 10 values sorted, the median is the average of the 5th and 6th values: (5 + 5) / 2 = 5.
Step 3: Compare mean and median: The mean (5.5) is greater than the median (5). The value 15 is an outlier that pulls the mean to the right.
Step 4: State the skewness: Because the mean exceeds the median and the right tail is stretched by the outlier at 15, the distribution is skewed right (positively skewed).
Answer: The distribution is skewed right. The long tail toward 15 pulls the mean above the median, producing positive skewness.
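The arithmetic in this example can be double-checked with Python's standard `statistics` module. The helper `tail_direction_hint` is a hypothetical name, and comparing the mean with the median is only a heuristic for the direction of skew, not a definition of it.

```python
from statistics import mean, median


def tail_direction_hint(values):
    """Heuristic only: guess the skew direction from mean versus median."""
    m, med = mean(values), median(values)
    if m > med:
        return "skewed right"
    if m < med:
        return "skewed left"
    return "no direction suggested"


data = [2, 3, 4, 4, 5, 5, 5, 6, 6, 15]
print(mean(data), median(data))   # mean 5.5, median 5.0
print(tail_direction_hint(data))  # skewed right
```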
Another Example
Problem: Given a small data set: 1, 8, 9, 9, 10, 10, 10, 11, 11, 12. Is it skewed left, skewed right, or approximately symmetric?
Step 1: Find the mean: Sum the values and divide by 10: (1 + 8 + 9 + 9 + 10 + 10 + 10 + 11 + 11 + 12) / 10 = 91 / 10 = 9.1.
Step 2: Find the median: Average the 5th and 6th values in the sorted list: (10 + 10) / 2 = 10.
Step 3: Interpret: The mean (9.1) is less than the median (10). The unusually low value of 1 stretches the left tail, so the distribution is skewed left (negatively skewed).
Answer: The distribution is skewed left. The outlier at 1 pulls the mean below the median.
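One way to see how much the single low value matters is to recompute the mean and median without it. This is an illustrative sketch using the standard `statistics` module. Dropping the value is done here only to show its influence, not as a recommendation to discard outliers.

```python
from statistics import mean, median

data = [1, 8, 9, 9, 10, 10, 10, 11, 11, 12]
without_low_value = [x for x in data if x != 1]  # remove the outlier at 1

print(mean(data), median(data))                            # mean 9.1, median 10.0
print(mean(without_low_value), median(without_low_value))  # mean 10, median 10
```

With the value 1 removed, the mean and median coincide at 10, so the gap between them in the full data set comes entirely from that one observation.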
Why It Matters
Skewness shows up throughout AP Statistics, especially when you describe distributions on free-response questions or decide whether the mean or median better represents the center. In real-world applications, income data is famously right-skewed, which is why economists often report median household income rather than the mean. Understanding skew also matters in inferential statistics: many inference procedures, such as t-tests on small samples, assume the data come from a roughly symmetric, approximately normal population, so recognizing strong skew tells you when to use alternative methods.
Common Mistakes
Mistake: Confusing the direction of skew with where most data are clustered.
Correction: Skewness is named for the direction of the long tail, not the peak. A right-skewed distribution has most values on the left with a tail stretching right.
Mistake: Assuming that comparing the mean and median always reveals the direction of skew.
Correction: Mean > median is typical for right skew, and mean < median for left skew, but this rule is not guaranteed for every distribution. Always check the shape of the data rather than relying solely on this shortcut.
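A small constructed data set shows how the shortcut can fail. The values below are hypothetical, chosen so that the mean falls below the median even though the third central moment (whose sign determines the sign of the skewness coefficient) is positive.

```python
from statistics import mean, median

# Hypothetical data: a cluster at 1, a larger cluster at 4, one high value at 9.
data = [1, 1, 1, 1, 4, 4, 4, 4, 4, 9]

m = mean(data)
med = median(data)
# The sign of the third central moment matches the sign of the skewness.
third_central_moment = sum((x - m) ** 3 for x in data) / len(data)

print(m, med)                # mean 3.3 is below the median 4.0
print(third_central_moment)  # positive, so the skewness is positive
```

Here the mean-versus-median shortcut would suggest left skew, while the coefficient is positive because the single value at 9 dominates the cubed deviations.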
