Spearman Rank Correlation Coefficient — Definition, Formula & Examples
The Spearman rank correlation coefficient is a number between −1 and +1 that measures how well the relationship between two variables can be described by a monotonic function, using the ranks of the data rather than the actual values.
Denoted $r_s$ (or $\rho$), the Spearman rank correlation coefficient is a nonparametric measure of rank correlation computed by applying the Pearson correlation formula to the ranked values of two variables. When there are no tied ranks, it simplifies to $r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$, where $d_i$ is the difference between the two ranks for each observation and $n$ is the number of observations.
Key Formula
$$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$
Where:
- $r_s$ = Spearman rank correlation coefficient
- $d_i$ = Difference between the rank of $x_i$ and the rank of $y_i$ for the $i$-th observation
- $n$ = Number of paired observations
How It Works
To compute $r_s$, first rank each variable's values separately from smallest to largest. Then find the difference between the two ranks for each pair. Square those differences, sum them, and substitute into the formula. A value near +1 indicates a strong increasing monotonic relationship, a value near −1 indicates a strong decreasing monotonic relationship, and a value near 0 suggests no monotonic association. Unlike Pearson's $r$, Spearman's coefficient does not assume linearity or normally distributed data, making it robust to outliers and suitable for ordinal data.
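The ranking and substitution steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `ranks_no_ties` and `spearman_shortcut` and the sample data are hypothetical, and the code assumes there are no tied values, since the shortcut formula only applies in that case.

```python
def ranks_no_ties(values):
    """Return 1-based ranks (smallest value gets rank 1).

    Assumes all values are distinct; raises if a tie is found.
    """
    if len(set(values)) != len(values):
        raise ValueError("tied values: the shortcut formula does not apply")
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def spearman_shortcut(x, y):
    """Spearman's coefficient via 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("need at least two paired observations")
    rx, ry = ranks_no_ties(x), ranks_no_ties(y)
    # Sum of squared rank differences.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Hypothetical data: y grows with x, but not linearly (y = (x / 10) ** 3).
x = [10, 20, 30, 40, 50]
y = [1, 8, 27, 64, 125]
print(spearman_shortcut(x, y))  # 1.0: the ranks agree exactly
```

Because only the ranks enter the calculation, the strongly nonlinear data above still gives a coefficient of exactly 1, which is the sense in which the measure captures monotonic rather than linear association.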
Worked Example
Problem: Five students are ranked by their math score and their science score. Math ranks: 1, 2, 3, 4, 5. Science ranks: 2, 1, 3, 5, 4. Find the Spearman rank correlation coefficient.
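Step 1: Compute each rank difference (Math rank minus Science rank): $d_i = -1, 1, 0, -1, 1$.
Step 2: Square each difference and sum them: $\sum d_i^2 = 1 + 1 + 0 + 1 + 1 = 4$.
Step 3: Substitute into the formula with $n = 5$: $r_s = 1 - \frac{6(4)}{5(5^2 - 1)} = 1 - \frac{24}{120} = 0.8$.
Answer: $r_s = 0.8$, indicating a strong positive monotonic association between math and science ranks.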
Why It Matters
Spearman's coefficient is a standard tool in introductory statistics courses whenever data are ordinal or the assumptions of Pearson's $r$ (linearity, normality) are not met. Researchers in psychology, ecology, and market research frequently use it to quantify monotonic associations in survey data or ranked outcomes.
Common Mistakes
Mistake: Using the simplified formula when there are tied ranks.
Correction: The shortcut formula assumes no ties. When ties exist, assign each tied value the average of the ranks it would otherwise occupy, then compute $r_s$ using the general Pearson correlation formula applied to the ranks.
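One way to handle ties is sketched below, following the correction above: average ranks first, then Pearson's formula on those ranks. The helper names `average_ranks`, `pearson`, and `spearman`, as well as the sample scores, are hypothetical and chosen only for illustration.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i+1 through j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_a * var_b)


def spearman(x, y):
    """Spearman's coefficient as Pearson's correlation of the average ranks."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores: the two 85s tie and share rank (2 + 3) / 2 = 2.5.
scores_a = [70, 85, 85, 90]
scores_b = [1, 2, 3, 4]
print(average_ranks(scores_a))  # [1.0, 2.5, 2.5, 4.0]
print(round(spearman(scores_a, scores_b), 4))  # 0.9487
```

With no ties, this general route gives the same value as the shortcut formula. For example, `spearman([1, 2, 3, 4, 5], [2, 1, 3, 5, 4])` evaluates to 0.8 up to floating-point rounding.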
