Coefficient of Determination (R²)

The coefficient of determination, written as R², is a number between 0 and 1 that tells you what proportion of the variation in your dependent variable is explained by your regression model. An R² of 0.85, for instance, means 85% of the variability in the response variable can be accounted for by the independent variable(s).

The coefficient of determination, denoted $R^2$, quantifies the fraction of the total variability in the response variable $y$ that is captured by a regression model. It is computed as $1$ minus the ratio of the residual sum of squares ($SS_{\text{res}}$) to the total sum of squares ($SS_{\text{tot}}$). Values range from $0$ (the model explains none of the variability) to $1$ (the model explains all of the variability). In simple linear regression, $R^2$ equals the square of the Pearson correlation coefficient $r$.

Key Formula

$$R^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}} = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}$$
Where:
  • $R^2$ = the coefficient of determination
  • $y_i$ = each observed value of the dependent variable
  • $\hat{y}_i$ = the predicted value from the regression model
  • $\bar{y}$ = the mean of all observed y values
  • $SS_{\text{res}}$ = the residual (error) sum of squares
  • $SS_{\text{tot}}$ = the total sum of squares
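
The formula can be computed directly from paired observed and predicted values. The following is a minimal sketch in plain Python; the function name `r_squared` and the sample numbers are illustrative assumptions, not part of the original definition, and the predictions are simply supplied rather than produced by a fitted model.

```python
def r_squared(y_obs, y_pred):
    """Return 1 - SS_res / SS_tot for paired observed and predicted values."""
    if len(y_obs) != len(y_pred):
        raise ValueError("y_obs and y_pred must have the same length")
    y_bar = sum(y_obs) / len(y_obs)  # mean of the observed values
    # Residual sum of squares: squared gaps between observed and predicted.
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_obs, y_pred))
    # Total sum of squares: squared gaps between observed values and their mean.
    ss_tot = sum((y - y_bar) ** 2 for y in y_obs)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical observed values and predictions (not taken from the article).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

# Here SS_res = 1 and SS_tot = 20, so the result is approximately 0.95.
print(r_squared(observed, predicted))
```

Predicting every value perfectly gives 1, and predicting the mean for every observation gives 0, which matches the two endpoints described above.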

Worked Example

Problem: Five students' hours of study (x) and test scores (y) are recorded. After fitting a regression line, you find that the correlation coefficient is r = 0.90. Find and interpret R².
Step 1: Square the correlation coefficient to get R².
$$R^2 = r^2 = (0.90)^2 = 0.81$$
Step 2: Convert to a percentage for interpretation.
$$0.81 \times 100\% = 81\%$$
Step 3: Write the interpretation in context. About 81% of the variation in test scores can be explained by the linear relationship with hours of study.
Answer: R² = 0.81. Approximately 81% of the variability in test scores is explained by hours of study.
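
The shortcut used in Step 1 relies on the fact that, in simple linear regression, squaring r gives the same number as the sum-of-squares formula. The sketch below checks this numerically. The data are hypothetical, since the problem does not list the five students' actual values, so the result will not match r = 0.90; all variable names are illustrative.

```python
from math import sqrt

# Hypothetical data for five students (assumed values, not from the problem).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [55.0, 62.0, 60.0, 74.0, 79.0]

n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(scores) / n

# Sums of cross-products and squares about the means.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))
s_xx = sum((x - x_bar) ** 2 for x in hours)
s_yy = sum((y - y_bar) ** 2 for y in scores)  # this is SS_tot

# Pearson correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)

# Least-squares line: y_hat = intercept + slope * x.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
predicted = [intercept + slope * x for x in hours]

# R^2 from the definition 1 - SS_res / SS_tot.
ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(scores, predicted))
r_squared = 1 - ss_res / s_yy

print(f"r = {r:.4f}, r^2 = {r * r:.4f}, R^2 = {r_squared:.4f}")
```

The two printed values for r² and R² should agree up to floating-point rounding. This equality holds for a least-squares line with one predictor; it is not a general property of every regression model.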

Why It Matters

R² is one of the first things researchers and analysts check when evaluating a model. In AP Statistics, you are expected to calculate R² and — more importantly — interpret it in the context of the data. Beyond the classroom, R² helps scientists decide whether a model is useful for prediction, such as determining how well advertising spending predicts sales revenue or how well temperature explains ice cream demand.

Common Mistakes

Mistake: Interpreting R² as proving causation between x and y.
Correction: R² measures the strength of a statistical association, not a cause-and-effect relationship. A high R² does not mean x causes y; other variables or coincidence may be involved.
Mistake: Saying "R² = 0.81 means 81% of the data points fall on the regression line."
Correction: R² describes the proportion of variance explained, not the percentage of points on the line. The correct phrasing is: 81% of the variability in the response variable is explained by the model.
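
To see why the second mistake is wrong, it may help to look at a small data set where R² is high even though no point lies on the fitted line. The following is a sketch with hypothetical values chosen for illustration; the tolerance used to decide whether a point is "on the line" is also an assumption.

```python
# Hypothetical data chosen so that every point misses the fitted line.
x_vals = [1.0, 2.0, 3.0, 4.0]
y_vals = [2.5, 3.5, 6.5, 7.5]

n = len(x_vals)
x_bar = sum(x_vals) / n
y_bar = sum(y_vals) / n

# Least-squares slope and intercept.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_vals, y_vals))
s_xx = sum((x - x_bar) ** 2 for x in x_vals)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Residuals: observed minus predicted.
residuals = [y - (intercept + slope * x) for x, y in zip(x_vals, y_vals)]

ss_res = sum(e ** 2 for e in residuals)
ss_tot = sum((y - y_bar) ** 2 for y in y_vals)
r_squared = 1 - ss_res / ss_tot

# Count points whose residual is zero up to a small tolerance (assumed 1e-9).
points_on_line = sum(1 for e in residuals if abs(e) < 1e-9)

print(f"R^2 = {r_squared:.3f}, points on the line = {points_on_line} of {n}")
```

With these values R² is above 0.95 while zero of the four points fall on the line, so R² cannot be read as a percentage of points on the line.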