Correlation vs. Causation

Correlation means two variables tend to change together: when one goes up, the other tends to go up (or down). Causation means that a change in one variable produces a change in the other. The critical distinction is that correlation does not imply causation. Ice cream sales and drowning deaths are correlated (both increase in summer), but ice cream does not cause drowning; a third variable, warm weather, drives both.
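The ice cream example can be sketched as a small simulation. Everything here is an illustrative assumption rather than real data: the temperature range, the coefficients, the noise levels, and the helper name `pearson_r` are all made up. In the model, temperature drives both quantities and ice cream has no effect on drowning at all.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical model: daily temperature is the only driver of both series.
temperature = [rng.uniform(0, 35) for _ in range(20000)]
ice_cream = [20 + 3.0 * t + rng.gauss(0, 10) for t in temperature]
drowning_index = [1 + 0.2 * t + rng.gauss(0, 1) for t in temperature]

# Across all days the two series are strongly correlated.
r_all = pearson_r(ice_cream, drowning_index)

# Hold the confounder roughly fixed: keep only days between 24 and 26 degrees.
band = [(i, d) for t, i, d in zip(temperature, ice_cream, drowning_index)
        if 24 <= t <= 26]
r_band = pearson_r([i for i, _ in band], [d for _, d in band])

print(f"all days: r = {r_all:.2f}")
print(f"days at similar temperature: r = {r_band:.2f}")
```

Under these assumptions the correlation is strong across all days but close to zero once temperature is held nearly constant, which is the pattern a confounder produces.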

| | Correlation | Causation |
|---|---|---|
| Definition | Two variables move together in a pattern | One variable directly produces a change in the other |
| Direction | Can be positive, negative, or zero | Always one-directional: cause → effect |
| Measured by | Correlation coefficient r (−1 to +1) | Controlled experiments, not just observation |
| Third variables? | May be driven by confounding variables | Must rule out confounders to establish |
| Example | Cities with more firefighters have more fires (both caused by city size) | Smoking causes lung cancer (established through decades of converging observational and laboratory evidence) |
| Proves? | Association, not explanation | A mechanism linking cause to effect |
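As a rough illustration of the correlation coefficient r in the table, the sketch below computes Pearson's r directly from its definition (the sum of products of deviations, divided by the square root of the product of the summed squared deviations). The function name `pearson_r` and the three tiny data sets are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariation of x and y scaled to the range -1 to +1.

    Assumes both sequences have the same length and neither is constant.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]

r_positive = pearson_r(x, [2, 4, 6, 8, 10])   # y rises with x
r_negative = pearson_r(x, [10, 8, 6, 4, 2])   # y falls as x rises
r_zero = pearson_r(x, [3, 1, 2, 1, 3])        # no linear trend

print(r_positive)  # 1.0
print(r_negative)  # -1.0
print(r_zero)      # 0.0
```

Note that r measures only linear association: the third data set has r = 0 even though y follows a visible pattern in x.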

When to Use Each

Use Correlation when...

  • Describing relationships in observational data
  • Exploring whether variables are related before investigating why
  • Building prediction models (regression) where mechanism isn't the goal
  • Reporting statistical associations in research papers

Use Causation when...

  • Making policy decisions (banning a substance, recommending a treatment)
  • Understanding WHY something happens, not just that it happens
  • Drawing conclusions from randomized controlled experiments
  • Establishing scientific mechanisms

Examples

Spurious correlation
Per capita cheese consumption correlates with the number of people who die tangled in bedsheets (r ≈ 0.95). This is pure coincidence: no mechanism connects them. Coincidences like this are one reason a correlation on its own cannot establish causation.
Confounding variable
Students who eat breakfast get better grades. Does breakfast cause better grades? Not necessarily — families that provide breakfast may also provide more academic support, better sleep routines, and other advantages.
Established causation
Randomized clinical trials show that a specific drug reduces blood pressure. Because the experiment controls for confounders (placebo group, random assignment), we can conclude the drug causes the reduction.
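For the spurious-correlation example above, one common way such coincidences arise is that two unrelated quantities both happen to drift in the same direction over time. The sketch below uses made-up series, not the real cheese or bedsheet figures; the slopes, noise levels, and the helper `pearson_r` are assumptions chosen for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1)  # fixed seed so the run is repeatable
steps = range(400)

# Two made-up series with independent noise; each simply trends upward.
series_a = [0.020 * k + rng.gauss(0, 1) for k in steps]
series_b = [0.015 * k + rng.gauss(0, 1) for k in steps]

# The raw levels look strongly related because of the shared upward drift.
r_levels = pearson_r(series_a, series_b)

# Step-to-step changes remove the drift, and the relationship disappears.
change_a = [b - a for a, b in zip(series_a, series_a[1:])]
change_b = [b - a for a, b in zip(series_b, series_b[1:])]
r_changes = pearson_r(change_a, change_b)

print(f"levels:  r = {r_levels:.2f}")
print(f"changes: r = {r_changes:.2f}")
```

A high r on the levels and a near-zero r on the changes is a hint that the trend, not any link between the two quantities, is producing the correlation.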

Common Confusion Points

One of the most common errors in statistics and media reporting is treating a correlation as proof of causation. Headlines like 'Study finds coffee drinkers live longer' imply causation, but the underlying study may show only a correlation.
Reverse causation is another pitfall: A and B are correlated, but B causes A (not A causes B). For example, successful people read more books — but does reading cause success, or do already-successful people have more leisure time to read?
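The reverse-causation problem can be sketched with two hypothetical worlds: in one, reading raises success; in the other, success raises reading. The coefficients, noise levels, and the helper `pearson_r` are assumptions made up for the example. Both worlds produce essentially the same correlation, so r by itself cannot reveal which direction the arrow points.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(2)  # fixed seed so the run is repeatable
n = 5000

# World A (hypothetical): reading causes success.
reading_a = [rng.gauss(0, 1) for _ in range(n)]
success_a = [0.8 * r + rng.gauss(0, 0.6) for r in reading_a]

# World B (hypothetical): success causes reading.
success_b = [rng.gauss(0, 1) for _ in range(n)]
reading_b = [0.8 * s + rng.gauss(0, 0.6) for s in success_b]

r_a = pearson_r(reading_a, success_a)
r_b = pearson_r(reading_b, success_b)

# Correlation is also symmetric in its arguments: swapping x and y changes nothing.
r_a_swapped = pearson_r(success_a, reading_a)

print(f"world A: r = {r_a:.2f}")
print(f"world B: r = {r_b:.2f}")
```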

Frequently Asked Questions

Does correlation ever imply causation?
Correlation alone never proves causation. However, strong correlation combined with a plausible mechanism, dose-response relationship, temporal ordering (cause precedes effect), and consistency across studies can build a strong case for causation.
How do you prove causation?
The gold standard is a randomized controlled trial (RCT): randomly assign subjects to treatment and control groups, then compare the outcomes. Random assignment balances confounding variables, known and unknown, between the groups on average, so a difference in outcomes can be attributed to the treatment.
What is a confounding variable?
A confounding variable (confounder) is a third variable that influences both the supposed cause and the supposed effect, creating a spurious correlation. For example, age can confound the relationship between exercise and heart health.
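The answers above on randomized experiments and confounders can be illustrated together with a simulation. The sketch assumes a hypothetical drug that lowers blood pressure by 10 mmHg, with age as the confounder: blood pressure rises with age, and in the observational scenario older patients are more likely to take the drug. All numbers and names (`TRUE_EFFECT`, `blood_pressure`, `diff_in_means`) are invented for the example.

```python
import random

rng = random.Random(3)  # fixed seed so the run is repeatable
TRUE_EFFECT = -10.0     # hypothetical drug effect in mmHg

def blood_pressure(age, treated):
    """Made-up outcome model: pressure rises with age, the drug lowers it."""
    effect = TRUE_EFFECT if treated else 0.0
    return 100 + 0.6 * age + effect + rng.gauss(0, 5)

def diff_in_means(rows):
    """Mean outcome of treated subjects minus mean outcome of controls."""
    treated = [bp for t, bp in rows if t]
    control = [bp for t, bp in rows if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)

ages = [rng.uniform(30, 80) for _ in range(20000)]

# Observational data: the chance of taking the drug grows with age,
# so age influences both the "cause" and the "effect".
observational = []
for age in ages:
    treated = rng.random() < (age - 30) / 50
    observational.append((treated, blood_pressure(age, treated)))

# Randomized experiment: a coin flip decides treatment, independent of age.
randomized = []
for age in ages:
    treated = rng.random() < 0.5
    randomized.append((treated, blood_pressure(age, treated)))

obs_estimate = diff_in_means(observational)
rct_estimate = diff_in_means(randomized)

print(f"observational estimate: {obs_estimate:+.1f} mmHg")
print(f"randomized estimate:    {rct_estimate:+.1f} mmHg")
```

With these assumed numbers the observational comparison makes the drug look nearly useless, because the treated group is older, while the randomized comparison recovers something close to the true effect.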
