Residual Plot

A residual plot is a scatterplot that displays the residuals (the differences between observed and predicted values) on the vertical axis against the predicted values (or the explanatory variable) on the horizontal axis. It helps you judge whether a linear regression model is appropriate for your data.

A residual plot graphs the residuals $e_i = y_i - \hat{y}_i$ on the $y$-axis against either the fitted values $\hat{y}_i$ or the explanatory variable $x_i$ on the $x$-axis. If the regression model is a good fit, the residuals should appear randomly scattered around the horizontal line $e = 0$, with no obvious pattern, curvature, or systematic change in spread. Patterns in a residual plot signal that the chosen model does not adequately capture the relationship in the data.

Key Formula

$e_i = y_i - \hat{y}_i$
Where:
  • $e_i$ = the residual for the $i$th observation
  • $y_i$ = the observed value of the response variable
  • $\hat{y}_i$ = the predicted value from the regression model
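
As a minimal sketch, the formula can be applied to paired lists of observed and predicted values. The function name `residuals` and the sample numbers are hypothetical, chosen only for illustration.

```python
def residuals(observed, predicted):
    """Return e_i = y_i - y_hat_i for each paired observation."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return [y - y_hat for y, y_hat in zip(observed, predicted)]


# Hypothetical values, for illustration only.
example = residuals([10, 12, 15], [11, 12, 13])
print(example)  # [-1, 0, 2]
```

A positive residual means the model underpredicted that observation; a negative residual means it overpredicted.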

Worked Example

Problem: A simple linear regression predicts test scores from hours of study. The data for five students give observed scores $y$ and predicted scores $\hat{y}$. Compute the residuals and describe what to look for in the residual plot.

| Student | Hours ($x$) | Observed ($y$) | Predicted ($\hat{y}$) |
|---------|-------------|----------------|-----------------------|
| 1 | 1 | 52 | 50 |
| 2 | 2 | 58 | 60 |
| 3 | 3 | 72 | 70 |
| 4 | 4 | 78 | 80 |
| 5 | 5 | 92 | 90 |

Step 1: Calculate each residual using $e_i = y_i - \hat{y}_i$.
$e_1 = 52 - 50 = 2$, $e_2 = 58 - 60 = -2$, $e_3 = 72 - 70 = 2$, $e_4 = 78 - 80 = -2$, $e_5 = 92 - 90 = 2$
Step 2: Plot each predicted value $\hat{y}$ on the horizontal axis and the corresponding residual $e$ on the vertical axis. The five points are $(50, 2)$, $(60, -2)$, $(70, 2)$, $(80, -2)$, $(90, 2)$.
Step 3: Examine the plot for patterns. Here the residuals alternate between $2$ and $-2$, staying close to zero with no curvature or fan shape.
Step 4: Because the residuals are small and show no systematic pattern, the linear model appears to be a reasonable fit for these data.
Answer: The residuals are $2, -2, 2, -2, 2$. The residual plot shows points scattered closely around the line $e = 0$ with no clear pattern, suggesting the linear model fits well.
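
The arithmetic in the worked example can be checked with a short Python sketch. The observed and predicted values are copied from the table and taken as given (the line is not refit here); the variable names are illustrative.

```python
# Data copied from the worked example's table.
observed = [52, 58, 72, 78, 92]
predicted = [50, 60, 70, 80, 90]  # taken as given, not refit here

# Residual = observed - predicted.
resid = [y - y_hat for y, y_hat in zip(observed, predicted)]

# Points of the residual plot: (predicted value, residual).
points = list(zip(predicted, resid))

print(resid)   # [2, -2, 2, -2, 2]
print(points)  # [(50, 2), (60, -2), (70, 2), (80, -2), (90, 2)]
```

With a plotting library, `points` would be drawn as a scatterplot with a horizontal reference line at zero.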

Why It Matters

In AP Statistics, you cannot reliably judge whether a linear model is appropriate from the original scatterplot or the correlation coefficient alone. A residual plot is the standard diagnostic tool: a curved pattern tells you a nonlinear model might work better, while a fan or funnel shape warns that the variability in your response is not constant. Checking residual plots is a required step whenever you perform regression analysis on the AP exam.

Common Mistakes

Mistake: Concluding that the linear model is fine because $r$ is high, even though the residual plot shows a clear curved pattern.
Correction: A high correlation coefficient does not guarantee linearity. If the residual plot shows curvature, the linear model is not appropriate regardless of $r$.
Mistake: Expecting the residual plot to show a linear trend when the model is a good fit.
Correction: A good residual plot looks like a random cloud of points centered on zero. Any visible pattern — linear, curved, or fan-shaped — indicates a problem with the model.
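
To illustrate the first mistake, here is a minimal sketch that fits a least-squares line to hypothetical data generated from $y = x^2$. The data and variable names are assumptions made for illustration; they do not come from the worked example. The correlation is high, yet the residuals follow a U shape (positive, negative, positive), which is the curvature a residual plot would reveal.

```python
import math

# Hypothetical data following y = x^2, a clearly nonlinear relationship.
x = [0, 1, 2, 3, 4]
y = [xi ** 2 for xi in x]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products for the least-squares line.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
r = s_xy / math.sqrt(s_xx * s_yy)  # correlation coefficient

fitted = [intercept + slope * xi for xi in x]
curve_resid = [yi - y_hat for yi, y_hat in zip(y, fitted)]

print(round(r, 3))   # 0.959
print(curve_resid)   # [2.0, -1.0, -2.0, -1.0, 2.0]
```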

Related Terms

  • Residual: the individual values plotted in this chart
  • Regression: the model whose fit this plot assesses
  • Scatterplot: the graph type a residual plot is based on