Pareto Distribution — Definition, Formula & Examples
The Pareto distribution is a continuous probability distribution that models phenomena where a large portion of outcomes come from a small fraction of causes — the classic "80/20 rule." It is defined on values greater than or equal to a minimum threshold and has a heavy right tail, meaning extreme values are more probable than in distributions like the normal.
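A continuous random variable \(X\) follows a Pareto distribution with shape parameter \(\alpha > 0\) and scale parameter \(x_m > 0\) if its probability density function is \(f(x) = \frac{\alpha x_m^{\alpha}}{x^{\alpha + 1}}\) for \(x \ge x_m\), and \(f(x) = 0\) otherwise. Its cumulative distribution function is \(F(x) = 1 - \left(\frac{x_m}{x}\right)^{\alpha}\) for \(x \ge x_m\).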
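Key Formula
\[ f(x) = \frac{\alpha \, x_m^{\alpha}}{x^{\alpha + 1}}, \quad x \ge x_m \]
Where:
- \(\alpha\) = Shape parameter (controls tail heaviness), must be positive
- \(x_m\) = Scale parameter (minimum possible value of X), must be positive
- \(x\) = Value of the random variable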
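The density and cumulative distribution function can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `pareto_pdf` and `pareto_cdf` and the argument names `alpha` and `x_m` are chosen here for the example, and the formulas are the standard Pareto (Type I) forms.

```python
def pareto_pdf(x, alpha, x_m):
    """Density of a Pareto (Type I) distribution at x."""
    if alpha <= 0 or x_m <= 0:
        raise ValueError("alpha and x_m must both be positive")
    if x < x_m:
        return 0.0  # no density below the scale parameter
    return alpha * x_m ** alpha / x ** (alpha + 1)


def pareto_cdf(x, alpha, x_m):
    """P(X <= x) for a Pareto (Type I) distribution."""
    if alpha <= 0 or x_m <= 0:
        raise ValueError("alpha and x_m must both be positive")
    if x < x_m:
        return 0.0  # no probability mass below the scale parameter
    return 1.0 - (x_m / x) ** alpha


print(pareto_pdf(1.0, 2.0, 1.0))  # 2.0
print(pareto_cdf(2.0, 2.0, 1.0))  # 0.75
```

At the threshold itself the density equals alpha / x_m, which is its maximum; it decreases monotonically from there.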
How It Works
The scale parameter \(x_m\) sets the minimum possible value of \(X\). The shape parameter \(\alpha\) controls how rapidly the tail decays: smaller \(\alpha\) produces a heavier tail with more extreme values. When \(\alpha \le 1\), the mean is infinite; when \(\alpha \le 2\), the variance is infinite. To find the probability that \(X\) exceeds some value \(x \ge x_m\), use the survival function \(P(X > x) = \left(\frac{x_m}{x}\right)^{\alpha}\).
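The behaviour described above can be sketched in Python. The helper names below are illustrative, and the finite-moment formulas used are the standard ones for the Pareto (Type I) distribution: mean alpha * x_m / (alpha - 1) for alpha > 1, and variance x_m^2 * alpha / ((alpha - 1)^2 * (alpha - 2)) for alpha > 2.

```python
import math


def pareto_survival(x, alpha, x_m):
    """P(X > x): equals 1 below the minimum, then decays as a power law."""
    if x < x_m:
        return 1.0
    return (x_m / x) ** alpha


def pareto_mean(alpha, x_m):
    """Mean; infinite when alpha <= 1."""
    if alpha <= 1:
        return math.inf
    return alpha * x_m / (alpha - 1)


def pareto_variance(alpha, x_m):
    """Variance; reported as infinite when alpha <= 2, following the text."""
    if alpha <= 2:
        return math.inf
    return x_m ** 2 * alpha / ((alpha - 1) ** 2 * (alpha - 2))


# A smaller shape parameter leaves more probability far out in the tail.
for alpha in (1.0, 2.0, 3.0):
    print(alpha, pareto_survival(10.0, alpha, 1.0),
          pareto_mean(alpha, 1.0), pareto_variance(alpha, 1.0))
```

With x_m = 1, the probability of exceeding 10 falls from about 0.1 at alpha = 1 to about 0.001 at alpha = 3, which is the tail-heaviness effect in numbers.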
Worked Example
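Problem: Incomes in a region follow a Pareto distribution with minimum income \(x_m = \$30{,}000\) and shape parameter \(\alpha = 3\). What is the probability that a randomly selected person earns more than $60,000?
Write the survival function: For a Pareto distribution, the probability of exceeding a value \(x \ge x_m\) is:
\[ P(X > x) = \left(\frac{x_m}{x}\right)^{\alpha} \]
Substitute values: Plug in \(x_m = 30{,}000\), \(\alpha = 3\), and \(x = 60{,}000\):
\[ P(X > 60{,}000) = \left(\frac{30{,}000}{60{,}000}\right)^{3} \]
Compute: Evaluate the expression:
\[ \left(\frac{1}{2}\right)^{3} = \frac{1}{8} = 0.125 \]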
Answer: There is a 12.5% probability that a randomly selected person earns more than $60,000.
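The same calculation can be checked numerically. The sketch below takes x_m = 30,000 and alpha = 3 as assumed inputs (they reproduce the 12.5% answer); the variable names are illustrative.

```python
# Assumed inputs for the worked example: minimum income, shape, and threshold.
x_m = 30_000.0
alpha = 3.0
x = 60_000.0

# Survival function of the Pareto distribution: P(X > x) = (x_m / x) ** alpha
p_exceed = (x_m / x) ** alpha

print(f"P(X > {x:,.0f}) = {p_exceed:.3f}")  # P(X > 60,000) = 0.125
```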
Why It Matters
The Pareto distribution appears in insurance (modeling large claims), network traffic analysis, and economics (wealth and income distributions). Understanding its heavy tail is essential in risk management, where underestimating the probability of extreme events can be costly.
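To make the risk of underestimating extremes concrete, the sketch below compares the Pareto tail with the tail of a normal distribution that has the same mean and variance. The parameter choice (alpha = 3, x_m = 1) is an assumption made purely for illustration, and the helper names are hypothetical.

```python
import math


def pareto_tail(x, alpha, x_m):
    """P(X > x) for a Pareto (Type I) distribution."""
    return 1.0 if x < x_m else (x_m / x) ** alpha


def normal_tail(x, mu, sigma):
    """P(X > x) for a normal distribution, via the complementary error function."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))


# Illustrative parameters (assumed): alpha > 2 so mean and variance are finite.
alpha, x_m = 3.0, 1.0
mu = alpha * x_m / (alpha - 1.0)
sigma = math.sqrt(x_m ** 2 * alpha / ((alpha - 1.0) ** 2 * (alpha - 2.0)))

# Compare tail probabilities at increasingly extreme thresholds.
for x in (2.0, 5.0, 10.0):
    print(x, pareto_tail(x, alpha, x_m), normal_tail(x, mu, sigma))
```

Near the centre the two tails are comparable, but at x = 10 the Pareto tail is about one in a thousand while the matched normal tail is many orders of magnitude smaller. A model that assumes normality would treat such an event as practically impossible.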
Common Mistakes
Mistake: Applying the Pareto PDF or CDF formula to values below the scale parameter \(x_m\).
Correction: The closed-form expressions are valid only for \(x \ge x_m\). Below this threshold the probability density is exactly zero and \(F(x) = 0\), so there is no probability mass to the left of \(x_m\).
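A short sketch shows what goes wrong when the closed-form CDF is applied below the threshold, and how a support check avoids it. Both function names are hypothetical and exist only for this comparison.

```python
def naive_cdf(x, alpha, x_m):
    """The mistake: applies the closed form without checking x >= x_m."""
    return 1.0 - (x_m / x) ** alpha


def guarded_cdf(x, alpha, x_m):
    """Checks the support first: the CDF is zero below the scale parameter."""
    if x < x_m:
        return 0.0
    return 1.0 - (x_m / x) ** alpha


# Evaluate both at x = 0.5, which lies below x_m = 1.
print(naive_cdf(0.5, 2.0, 1.0))    # -3.0, not a valid probability
print(guarded_cdf(0.5, 2.0, 1.0))  # 0.0
```

A negative "probability" is an obvious failure, but the same unchecked formula can also return plausible-looking wrong values in downstream calculations, so the check belongs inside the function rather than at each call site.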
