Covariance — Definition, Formula & Examples

Covariance is a measure of how two variables move together. A positive covariance means they tend to increase together, while a negative covariance means one tends to decrease when the other increases.

For two random variables $X$ and $Y$, the covariance is defined as the expected value of the product of their deviations from their respective means: $\text{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$. For a sample of $n$ paired observations, the sample covariance uses $n - 1$ in the denominator as a degrees-of-freedom correction.

Key Formula

\text{Cov}(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})

Where:

$x_i, y_i$ = The $i$-th paired observations of variables $X$ and $Y$
$\bar{x}, \bar{y}$ = The sample means of $X$ and $Y$
$n$ = The number of paired observations
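
The formula translates almost term for term into code. The following is a minimal sketch in plain Python, not a reference implementation; the function name `sample_covariance` and the three-point data set are assumptions made purely for illustration.

```python
def sample_covariance(xs, ys):
    """Sample covariance of paired observations, with n - 1 in the denominator."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")

    # Sample means of each variable
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Sum of products of paired deviations from the means
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total / (n - 1)


# Hypothetical data, chosen only to keep the arithmetic easy to follow
print(sample_covariance([1, 2, 3], [2, 4, 6]))  # 2.0
```

The length check matters because the formula is only defined for paired observations, and the $n - 1$ denominator requires at least two pairs.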

How It Works

To compute covariance, you find how far each data point deviates from its variable's mean, multiply the paired deviations together, and average the results (dividing by $n - 1$ rather than $n$ for a sample). If large values of $X$ tend to appear alongside large values of $Y$, most products will be positive, yielding a positive covariance. If large $X$ values pair with small $Y$ values, most products will be negative, yielding a negative covariance. The magnitude of covariance depends on the units and scales of $X$ and $Y$, which is why correlation (covariance divided by the product of the standard deviations) is often preferred for comparison.
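
To see how the signs of the individual products drive the sign of the covariance, it can help to print the products themselves. This is a small sketch under assumed data: the helper `deviation_products` and the `rising` and `falling` lists are hypothetical and exist only for this illustration.

```python
def deviation_products(xs, ys):
    """Return (x_i - x_bar) * (y_i - y_bar) for each pair of observations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return [(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)]


xs = [1, 2, 3, 4, 5]
rising = [10, 20, 30, 40, 50]   # large x pairs with large y
falling = [50, 40, 30, 20, 10]  # large x pairs with small y

print(deviation_products(xs, rising))   # [40.0, 10.0, 0.0, 10.0, 40.0]
print(deviation_products(xs, falling))  # [-40.0, -10.0, 0.0, -10.0, -40.0]
```

Every nonzero product is positive in the first case and negative in the second, so the sums, and therefore the covariances, take those signs.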

Worked Example

Problem: Five students' hours studied ($X$) and exam scores ($Y$) are: (2, 60), (4, 70), (6, 80), (8, 85), (10, 95). Find the sample covariance.

Find the means: Compute the mean of each variable.

\bar{x} = \frac{2+4+6+8+10}{5} = 6, \quad \bar{y} = \frac{60+70+80+85+95}{5} = 78

Compute each product of deviations: For each pair, multiply $(x_i - \bar{x})$ by $(y_i - \bar{y})$.

(-4)(-18) = 72,\; (-2)(-8) = 16,\; (0)(2) = 0,\; (2)(7) = 14,\; (4)(17) = 68

Sum and divide by n − 1: Add the products and divide by 4.

\text{Cov}(X,Y) = \frac{72 + 16 + 0 + 14 + 68}{4} = \frac{170}{4} = 42.5

Answer: The sample covariance is $42.5$, indicating a positive association: more hours studied tends to go with higher exam scores.
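
The hand calculation can be checked with a few lines of Python that follow the same three steps. This is only a sketch; the variable names (`hours`, `scores`, `products`, `cov`) are assumptions chosen for readability.

```python
# Data from the worked example: hours studied and exam scores
hours = [2, 4, 6, 8, 10]
scores = [60, 70, 80, 85, 95]
n = len(hours)

# Step 1: find the means
x_bar = sum(hours) / n   # 6.0
y_bar = sum(scores) / n  # 78.0

# Step 2: compute each product of deviations
products = [(x - x_bar) * (y - y_bar) for x, y in zip(hours, scores)]
print(products)  # [72.0, 16.0, 0.0, 14.0, 68.0]

# Step 3: sum and divide by n - 1
cov = sum(products) / (n - 1)
print(cov)  # 42.5
```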

Why It Matters

Covariance is the building block of the Pearson correlation coefficient and appears throughout regression analysis, portfolio theory in finance, and multivariate probability. In AP Statistics and college-level courses, understanding covariance is essential for interpreting how variables relate before moving to linear models.

Common Mistakes

Mistake: Interpreting a large covariance as a strong relationship.

Correction: Covariance is not standardized; its magnitude depends on the units of the variables. Divide by the product of the standard deviations to get the correlation coefficient, which ranges from $-1$ to $1$ and allows meaningful comparison of strength.
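
One way to see this is to change the units of one variable and compare. The sketch below reuses the study-hours data and converts hours to minutes; the helper names `sample_covariance` and `correlation` are assumptions for this illustration, and the sample standard deviation is taken as the square root of a variable's covariance with itself.

```python
import math


def sample_covariance(xs, ys):
    """Sample covariance with n - 1 in the denominator."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the sample standard deviations."""
    s_x = math.sqrt(sample_covariance(xs, xs))
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)


hours = [2, 4, 6, 8, 10]
scores = [60, 70, 80, 85, 95]
minutes = [60 * h for h in hours]  # same study time, different units

cov_hours = sample_covariance(hours, scores)
cov_minutes = sample_covariance(minutes, scores)
print(cov_hours, cov_minutes)  # 42.5 2550.0

r_hours = correlation(hours, scores)
r_minutes = correlation(minutes, scores)
print(round(r_hours, 3), round(r_minutes, 3))  # 0.995 0.995
```

The covariance grows by a factor of 60 when hours become minutes, even though the relationship between study time and score has not changed; the correlation stays the same.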