Negatively Associated Data
A relationship in paired data in which one variable's values tend to increase when the other decreases, and vice-versa. In a scatterplot, negatively associated data tend to follow a pattern from the upper left to the lower right. Negatively associated data have a negative correlation coefficient.

Key Formula
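$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$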
Where:
- r = Correlation coefficient; a value between −1 and 1. For negatively associated data, r < 0.
- n = Number of data pairs
- x = Values of the first (independent) variable
- y = Values of the second (dependent) variable
- ∑xy = Sum of the products of each paired x and y value
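As a minimal sketch, the formula can be written as a short Python function. The name `correlation_coefficient` and the three-point sample data are illustrative assumptions, not part of the definition above.

```python
import math

def correlation_coefficient(xs, ys):
    """Compute r from paired data using the sum-based formula."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of values")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        # Happens when every x (or every y) is the same value.
        raise ValueError("r is undefined when either variable has no variation")
    return numerator / denominator

# Hypothetical data: y falls by 1 each time x rises by 1.
r = correlation_coefficient([1, 2, 3], [3, 2, 1])
print(r)  # -1.0
```

A negative return value signals negative association. The function raises an error when the denominator is zero, because r is undefined in that case.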
Worked Example
Problem: A teacher records the number of hours students spent watching TV per week and their test scores. The data are: (2, 90), (4, 80), (6, 70), (8, 60), (10, 50). Show that these data are negatively associated by computing the correlation coefficient r.
Step 1: List the values and compute the required sums. Here n = 5.
$$\sum x = 2 + 4 + 6 + 8 + 10 = 30$$
Step 2: Find the sum of y values and the sum of the products xy.
$$\sum y = 90 + 80 + 70 + 60 + 50 = 350$$
$$\sum xy = 180 + 320 + 420 + 480 + 500 = 1900$$
Step 3: Compute the sums of squares.
$$\sum x^2 = 4 + 16 + 36 + 64 + 100 = 220$$
$$\sum y^2 = 8100 + 6400 + 4900 + 3600 + 2500 = 25500$$
Step 4: Substitute into the correlation coefficient formula.
$$r = \frac{5(1900) - (30)(350)}{\sqrt{\left[5(220) - 30^2\right]\left[5(25500) - 350^2\right]}} = \frac{9500 - 10500}{\sqrt{\left[1100 - 900\right]\left[127500 - 122500\right]}}$$
Step 5: Simplify the numerator and denominator to find r.
$$r = \frac{-1000}{\sqrt{(200)(5000)}} = \frac{-1000}{\sqrt{1000000}} = \frac{-1000}{1000} = -1$$
Answer: r = −1, which confirms a perfect negative association. As TV hours increase, test scores decrease in a perfectly linear pattern.
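The arithmetic in Steps 1 through 5 can be checked with a few lines of Python. This is only a sketch of the hand calculation; the variable names are chosen here for illustration.

```python
import math

# (TV hours, test score) pairs from the problem statement.
data = [(2, 90), (4, 80), (6, 70), (8, 60), (10, 50)]
n = len(data)

sum_x = sum(x for x, _ in data)        # 30
sum_y = sum(y for _, y in data)        # 350
sum_xy = sum(x * y for x, y in data)   # 1900
sum_x2 = sum(x * x for x, _ in data)   # 220
sum_y2 = sum(y * y for _, y in data)   # 25500

numerator = n * sum_xy - sum_x * sum_y  # -1000
denominator = math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)  # 1000.0

r = numerator / denominator
print(r)  # -1.0
```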
Another Example
Unlike the first example, this data set does not follow a perfectly linear pattern. It demonstrates that real-world negatively associated data often have r between −1 and 0 rather than exactly −1.
Problem: A store tracks the price of a product (in dollars) and the number of units sold over four months: (5, 40), (10, 35), (15, 20), (20, 25). Determine whether the data are negatively associated.
Step 1: Record the sums with n = 4.
$$\sum x = 50, \quad \sum y = 120, \quad \sum xy = 200 + 350 + 300 + 500 = 1350$$
Step 2: Compute sums of squares.
$$\sum x^2 = 25 + 100 + 225 + 400 = 750$$
$$\sum y^2 = 1600 + 1225 + 400 + 625 = 3850$$
Step 3: Substitute into the formula for r.
$$r = \frac{4(1350) - (50)(120)}{\sqrt{\left[4(750) - 2500\right]\left[4(3850) - 14400\right]}} = \frac{5400 - 6000}{\sqrt{(500)(1000)}}$$
Step 4: Simplify to find r.
$$r = \frac{-600}{\sqrt{500000}} \approx \frac{-600}{707.1} \approx -0.849$$
Answer: r ≈ −0.849, indicating a strong (but not perfect) negative association between price and units sold.
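As a cross-check, the sketch below recomputes r for this data set with the deviation-from-the-mean form of the correlation coefficient. That form is algebraically equivalent to the Key Formula, but it is not the one used in the steps above, and the variable names are illustrative.

```python
import math

# (price in dollars, units sold) pairs from the problem statement.
data = [(5, 40), (10, 35), (15, 20), (20, 25)]
n = len(data)

mean_x = sum(x for x, _ in data) / n   # 12.5
mean_y = sum(y for _, y in data) / n   # 30.0

# Sums of products and squares of deviations from the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in data)   # -150.0
s_xx = sum((x - mean_x) ** 2 for x, _ in data)             # 125.0
s_yy = sum((y - mean_y) ** 2 for _, y in data)             # 250.0

r = s_xy / math.sqrt(s_xx * s_yy)
print(round(r, 3))  # -0.849
```

Both forms give the same value, so agreement here suggests the hand calculation is free of arithmetic slips.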
Frequently Asked Questions
What is the difference between negatively associated data and positively associated data?
With negatively associated data, one variable tends to decrease as the other increases, producing a downward trend on a scatterplot and a negative correlation coefficient (r < 0). With positively associated data, both variables tend to increase together, producing an upward trend and a positive correlation coefficient (r > 0).
Does negative association mean one variable causes the other to decrease?
No. Negative association describes a pattern or trend, not causation. Two variables can move in opposite directions because of a third hidden variable or pure coincidence. You need a controlled experiment or additional evidence to establish that one variable actually causes the other to change.
What does it mean when r is close to 0 but still negative?
A value of r near 0 (such as −0.1) indicates a very weak negative association. The data points are widely scattered, and the downward trend is barely detectable. In practice, such weak associations may not be meaningful.
Negatively Associated Data vs. Positively Associated Data
| | Negatively Associated Data | Positively Associated Data |
|---|---|---|
| Direction of trend | One variable increases while the other decreases | Both variables increase together |
| Scatterplot pattern | Downward slope, upper left to lower right | Upward slope, lower left to upper right |
| Correlation coefficient | r < 0 (between −1 and 0) | r > 0 (between 0 and 1) |
| Real-world example | More exercise → lower resting heart rate | More study hours → higher test scores |
Why It Matters
Recognizing negative association is essential in statistics courses when you analyze scatterplots and compute correlation. It appears in science classes (e.g., altitude vs. air pressure), economics (price vs. demand), and health studies (exercise vs. body fat percentage). Understanding that a negative r value quantifies this inverse relationship helps you interpret data and make predictions using linear regression.
Common Mistakes
Mistake: Assuming negative association means no relationship between the variables.
Correction: Negative association is a definite relationship — it means the variables move in opposite directions. 'No relationship' corresponds to r ≈ 0, where there is no clear trend at all.
Mistake: Confusing negative association with causation.
Correction: A negative correlation coefficient tells you two variables tend to move in opposite directions, but it does not prove that changes in one variable cause changes in the other. Always consider lurking variables and study design before inferring causation.
Related Terms
- Positively Associated Data — Opposite trend where both variables increase together
- Correlation Coefficient — Numerical measure of direction and strength
- Scatterplot — Graph used to visually display association
- Paired Data — Data format required to identify association
- Variable — Quantity that changes across observations
- Line of Best Fit — Has negative slope for negatively associated data
- Linear Regression — Method for modeling the inverse relationship
