Mathwords logoMathwords

Bivariate Data — Definition, Examples & Meaning

Bivariate data is data where two different variables are recorded for each individual or observation. For example, measuring both the height and weight of each student in a class gives you a bivariate data set.

Bivariate data consists of paired observations (xi,yi)(x_i, y_i) where each data point records values of two variables measured on the same subject or unit. The two variables are often analyzed to determine whether a relationship exists between them, using tools such as scatterplots, correlation coefficients, and regression models. In contrast to univariate data, which examines a single variable in isolation, bivariate data allows investigation of association, trend, and prediction.

Key Formula

r=1n1i=1n(xixˉsx)(yiyˉsy)r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)
Where:
  • rr = the correlation coefficient, measuring the strength and direction of the linear relationship
  • nn = the number of paired observations
  • xi,yix_i, y_i = the individual data values for the two variables
  • xˉ,yˉ\bar{x}, \bar{y} = the sample means of the x and y variables
  • sx,sys_x, s_y = the sample standard deviations of the x and y variables

Worked Example

Problem: Five students were each measured for hours of study per week and their exam score (out of 100). The data are: (2, 55), (4, 65), (5, 72), (8, 85), (10, 90). Describe the data and find the mean of each variable.
Step 1: Identify the two variables. Here, x = hours of study per week and y = exam score. Each student provides one paired observation.
Step 2: Calculate the mean of the x-values (hours studied).
xˉ=2+4+5+8+105=295=5.8\bar{x} = \frac{2 + 4 + 5 + 8 + 10}{5} = \frac{29}{5} = 5.8
Step 3: Calculate the mean of the y-values (exam scores).
yˉ=55+65+72+85+905=3675=73.4\bar{y} = \frac{55 + 65 + 72 + 85 + 90}{5} = \frac{367}{5} = 73.4
Step 4: Describe the association. As study hours increase, exam scores tend to increase as well. This suggests a positive association between the two variables. You would plot the data on a scatterplot to visualize this relationship.
Answer: The bivariate data has means xˉ=5.8\bar{x} = 5.8 hours and yˉ=73.4\bar{y} = 73.4 points, and the two variables show a positive association.

Visualization

Why It Matters

Bivariate data is central to statistics because most real questions involve relationships: Does more exercise lower blood pressure? Do advertising budgets predict sales? In AP Statistics, you will use bivariate data to build scatterplots, compute correlation, and fit least-squares regression lines — skills that form the basis of data-driven decision making in science, business, and social research.

Common Mistakes

Mistake: Analyzing the two variables separately instead of as pairs.
Correction: The whole point of bivariate data is that each observation links two values together. If you break the pairing — for instance by sorting one column independently — you destroy the relationship you are trying to study.
Mistake: Assuming a strong correlation means one variable causes the other.
Correction: Correlation measures association, not causation. Two variables can move together because of a lurking variable or coincidence. Always consider the study design before making causal claims.