Mathwords

Covariance — Definition, Formula & Examples

Covariance is a measure of how two variables move together. A positive covariance means they tend to increase together, while a negative covariance means one tends to decrease when the other increases.

For two random variables $X$ and $Y$, the covariance is defined as the expected value of the product of their deviations from their respective means: $\text{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$. For a sample of $n$ paired observations, the sample covariance uses $n - 1$ in the denominator as a degrees-of-freedom correction.
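
As a short aside, the definition can be expanded into an equivalent shortcut form. The steps below assume only linearity of expectation, with $\mu_X = E[X]$ and $\mu_Y = E[Y]$:

$$
\begin{aligned}
\text{Cov}(X, Y) &= E[XY - \mu_X Y - \mu_Y X + \mu_X \mu_Y] \\
&= E[XY] - \mu_X E[Y] - \mu_Y E[X] + \mu_X \mu_Y \\
&= E[XY] - \mu_X \mu_Y
\end{aligned}
$$

Setting $Y = X$ in the definition gives $\text{Cov}(X, X) = E[(X - \mu_X)^2]$, which is the variance of $X$.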

Key Formula

$$\text{Cov}(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$$
Where:
  • $x_i, y_i$ = The $i$-th paired observations of variables $X$ and $Y$
  • $\bar{x}, \bar{y}$ = The sample means of $X$ and $Y$
  • $n$ = The number of paired observations
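
The formula translates almost directly into code. The following is a minimal sketch in plain Python. The function name `sample_covariance` and the small input lists are illustrative choices, not part of the original article.

```python
def sample_covariance(xs, ys):
    """Sample covariance of paired observations, using n - 1 in the denominator."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two paired observations are required")

    # Sample means of each variable.
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Sum of the products of paired deviations, then the n - 1 correction.
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total / (n - 1)


print(sample_covariance([1, 2, 3], [2, 4, 6]))  # 2.0
```

The length checks are a design choice for this sketch: the formula is undefined for unpaired data or for a single observation, so failing early seems safer than returning a misleading number.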

How It Works

To compute covariance, you find how far each data point deviates from its variable's mean, multiply the paired deviations together, and average the results (for a sample, sum the products and divide by $n - 1$ rather than $n$). If large values of $X$ tend to appear alongside large values of $Y$, most products will be positive, yielding a positive covariance. If large $X$ values pair with small $Y$ values, most products will be negative, yielding a negative covariance. The magnitude of covariance depends on the units and scales of $X$ and $Y$, which is why correlation (covariance divided by the product of the standard deviations) is often preferred for comparison.
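
To see where the sign comes from, it can help to look at the individual products of deviations. The sketch below uses made-up data (the lists `x`, `rising`, and `falling` are hypothetical) and a helper named `deviation_products` introduced only for illustration.

```python
def deviation_products(xs, ys):
    """Return the product of paired deviations from the means, one per observation."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    return [(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)]


x = [1, 2, 3, 4]
rising = [10, 20, 30, 40]   # large x pairs with large y
falling = [40, 30, 20, 10]  # large x pairs with small y

print(deviation_products(x, rising))   # [22.5, 2.5, 2.5, 22.5]
print(deviation_products(x, falling))  # [-22.5, -2.5, -2.5, -22.5]

# Dividing each sum by n - 1 gives sample covariances of opposite sign.
for ys in (rising, falling):
    print(round(sum(deviation_products(x, ys)) / (len(x) - 1), 3))
```

Every product is positive in the first case and negative in the second, so the two covariances have equal size and opposite sign.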

Worked Example

Problem: Five students' hours studied ($X$) and exam scores ($Y$) are: (2, 60), (4, 70), (6, 80), (8, 85), (10, 95). Find the sample covariance.
Find the means: Compute the mean of each variable.
$$\bar{x} = \frac{2+4+6+8+10}{5} = 6, \quad \bar{y} = \frac{60+70+80+85+95}{5} = 78$$
Compute each product of deviations: For each pair, multiply $(x_i - \bar{x})$ by $(y_i - \bar{y})$.
$$(-4)(-18),\; (-2)(-8),\; (0)(2),\; (2)(7),\; (4)(17) = 72,\; 16,\; 0,\; 14,\; 68$$
Sum and divide by n − 1: Add the products and divide by 4.
$$\text{Cov}(X,Y) = \frac{72 + 16 + 0 + 14 + 68}{4} = \frac{170}{4} = 42.5$$
Answer: The sample covariance is $42.5$, indicating a positive association: students who studied more hours tended to earn higher exam scores.
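
The same three steps can be checked numerically. This is a small sketch in plain Python that mirrors the worked example. The variable names (`pairs`, `products`, `cov_xy`) are illustrative.

```python
# Paired observations from the worked example: (hours studied, exam score).
pairs = [(2, 60), (4, 70), (6, 80), (8, 85), (10, 95)]
n = len(pairs)

# Step 1: means of each variable.
x_bar = sum(x for x, _ in pairs) / n
y_bar = sum(y for _, y in pairs) / n

# Step 2: product of deviations for each pair.
products = [(x - x_bar) * (y - y_bar) for x, y in pairs]

# Step 3: sum the products and divide by n - 1.
cov_xy = sum(products) / (n - 1)

print(x_bar, y_bar)  # 6.0 78.0
print(products)      # [72.0, 16.0, 0.0, 14.0, 68.0]
print(cov_xy)        # 42.5
```

On Python 3.10 or later, `statistics.covariance` from the standard library should return the same value for these data, since it also computes the sample covariance.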

Why It Matters

Covariance is the building block of the Pearson correlation coefficient and appears throughout regression analysis, portfolio theory in finance, and multivariate probability. In AP Statistics and college-level courses, understanding covariance is essential for interpreting how variables relate before moving to linear models.

Common Mistakes

Mistake: Interpreting a large covariance as a strong relationship.
Correction: Covariance is not standardized: its magnitude depends on the units of the variables. Divide by the product of the standard deviations to get the correlation coefficient, which ranges from $-1$ to $1$ and allows meaningful comparison of strength.
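
One way to see this is to change the units of a variable and recompute both quantities. The sketch below reuses the study-hours data from the worked example and converts hours to minutes. The helper names `covariance` and `correlation` are illustrative, and `stdev` is the sample standard deviation from Python's standard `statistics` module.

```python
from statistics import stdev


def covariance(xs, ys):
    """Sample covariance with n - 1 in the denominator."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the sample standard deviations."""
    return covariance(xs, ys) / (stdev(xs) * stdev(ys))


hours = [2, 4, 6, 8, 10]
scores = [60, 70, 80, 85, 95]
minutes = [60 * h for h in hours]  # same study time, different unit

print(covariance(hours, scores))    # 42.5
print(covariance(minutes, scores))  # 2550.0
print(f"{correlation(hours, scores):.3f}")    # 0.995
print(f"{correlation(minutes, scores):.3f}")  # 0.995
```

The covariance grows by a factor of 60 even though the relationship between study time and scores has not changed, while the correlation stays the same.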