Jensen's Inequality — Definition, Formula & Examples
Jensen's Inequality states that for a convex function, the function applied to an average is less than or equal to the average of the function applied to the individual values. Geometrically, a chord joining two points on a convex curve lies on or above the curve, so averaging the outputs can only raise the result.
If φ is a convex function on an interval I and X is a random variable taking values in I whose expectation E[X] exists, then φ(E[X]) ≤ E[φ(X)]. The inequality reverses if φ is concave.
Key Formula
φ(E[X]) ≤ E[φ(X)]
Where:
- φ = A convex function (inequality reverses for concave functions)
- X = A random variable (or a set of values being averaged)
- E[·] = Expected value (or weighted average)
How It Works
Identify whether your function is convex (curves upward, like x² or e^x) or concave (curves downward, like ln x or √x). For a convex function, applying it to the mean gives a result no larger than the mean of the applied values. For a concave function, the direction flips: the function of the mean is at least as large as the mean of the function values. This relationship holds for weighted averages, expectations, and integrals alike.
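The check described above can be sketched numerically. This is a minimal illustration, not a proof: the helper name `jensen_sides` and the sample values and weights are made up for the example, and `math.exp` and `math.log` stand in for a convex and a concave function respectively.

```python
import math

def jensen_sides(phi, values, weights):
    """Return (phi(E[X]), E[phi(X)]) for a discrete weighted distribution."""
    total = sum(weights)
    probs = [w / total for w in weights]  # normalize weights into probabilities
    mean = sum(p * x for p, x in zip(probs, values))
    lhs = phi(mean)                                        # function of the mean
    rhs = sum(p * phi(x) for p, x in zip(probs, values))   # mean of the function
    return lhs, rhs

# Hypothetical sample data (positive values so that log is defined).
values = [0.5, 1.0, 4.0]
weights = [1, 2, 1]

convex_lhs, convex_rhs = jensen_sides(math.exp, values, weights)
concave_lhs, concave_rhs = jensen_sides(math.log, values, weights)

print("convex  (exp): lhs <= rhs is", convex_lhs <= convex_rhs)
print("concave (log): lhs >= rhs is", concave_lhs >= concave_rhs)
```

Because the weights are normalized inside the helper, the same function covers both plain and weighted averages.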
Worked Example
Problem: Let X take values 1 and 3, each with probability 1/2. Verify Jensen's Inequality for the convex function φ(x) = x².
Compute E[X]: Find the expected value of X.
E[X] = (1/2)(1) + (1/2)(3) = 2
Compute the left side, φ(E[X]): Apply the function to the expected value.
φ(E[X]) = φ(2) = 2² = 4
Compute the right side, E[φ(X)]: Apply the function to each value first, then take the expectation.
E[φ(X)] = (1/2)(1²) + (1/2)(3²) = (1 + 9)/2 = 5
Compare: Check whether the inequality holds.
4 ≤ 5
Answer: The inequality holds: φ(E[X]) = 4 ≤ 5 = E[φ(X)].
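The same steps can be reproduced in a few lines of Python. This sketch assumes the convex function is φ(x) = x² and uses exact fractions to avoid rounding; the variable names are illustrative only.

```python
from fractions import Fraction

# Distribution from the worked example: X is 1 or 3, each with probability 1/2.
outcomes = {1: Fraction(1, 2), 3: Fraction(1, 2)}

def phi(x):
    # Assumed convex function for this check: phi(x) = x^2.
    return x ** 2

mean = sum(p * x for x, p in outcomes.items())         # E[X]
left = phi(mean)                                       # phi(E[X])
right = sum(p * phi(x) for x, p in outcomes.items())   # E[phi(X)]

print(mean, left, right)  # prints: 2 4 5
assert left <= right      # Jensen's Inequality for a convex function
```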
Why It Matters
Jensen's Inequality underpins core results in statistics and information theory. Applying it with φ(x) = x² gives (E[X])² ≤ E[X²], which is exactly the statement that the variance E[X²] − (E[X])² is nonnegative. Applied to the concave logarithm, it yields the lower bound on the log-likelihood that the EM algorithm works with. In finance, it explains why the expected return of a nonlinear portfolio differs from the return at the expected price, a concept central to risk management and option pricing.
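The link to variance can be made concrete: for φ(x) = x², the gap between the two sides of Jensen's Inequality equals the variance. The sketch below checks this on a small made-up distribution; the function names and the data are assumptions for illustration.

```python
from fractions import Fraction

def jensen_gap_square(values, probs):
    """E[X^2] - (E[X])^2, the Jensen gap for phi(x) = x^2."""
    mean = sum(p * x for x, p in zip(values, probs))
    mean_of_squares = sum(p * x * x for x, p in zip(values, probs))
    return mean_of_squares - mean ** 2

def variance(values, probs):
    """Variance from its definition, E[(X - E[X])^2]."""
    mean = sum(p * x for x, p in zip(values, probs))
    return sum(p * (x - mean) ** 2 for x, p in zip(values, probs))

# Hypothetical distribution: four equally likely outcomes.
values = [-2, 0, 1, 5]
probs = [Fraction(1, 4)] * 4

gap = jensen_gap_square(values, probs)
var = variance(values, probs)
print(gap, var)  # prints: 13/2 13/2
```

Since the variance is an average of squares, it cannot be negative, which is the same fact Jensen's Inequality delivers for this choice of φ.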
Common Mistakes
Mistake: Applying the inequality in the wrong direction, for example assuming φ(E[X]) ≥ E[φ(X)] for a convex function.
Correction: Remember: convex means the function bulges downward between endpoints, so evaluating at the average underestimates. The mnemonic 'convex = cup = ≤' can help: a convex curve is cup-shaped, its chords sit on or above the graph, and so the larger quantity E[φ(X)] goes on the right side of φ(E[X]) ≤ E[φ(X)].
