Distribution Function — Definition, Formula & Examples
A distribution function (also called the cumulative distribution function or CDF) is a function that gives the probability that a random variable takes a value less than or equal to a given number. For any real number $x$, it tells you $P(X \le x)$.
The (cumulative) distribution function of a random variable $X$ is the function $F$ defined by $F(x) = P(X \le x)$ for all $x \in \mathbb{R}$. It is non-decreasing, right-continuous, and satisfies $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$.
Key Formula
$$F(x) = P(X \le x)$$
Where:
- $F(x)$ = The probability that the random variable X takes a value at most x
- $X$ = A random variable
- $x$ = Any real number at which the CDF is evaluated
How It Works
To find $F(x)$ for a discrete random variable, sum the probabilities of all outcomes less than or equal to $x$. For a continuous random variable, integrate the probability density function from $-\infty$ to $x$. The value $F(x)$ always lies between 0 and 1, and it never decreases as $x$ increases. You can also find the probability that $X$ falls in an interval $(a, b]$ by computing $P(a < X \le b) = F(b) - F(a)$.
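The two recipes above can be sketched in Python. This is a minimal illustration, not a reference implementation: the helper names `discrete_cdf` and `continuous_cdf`, the two-coin example, and the exponential density are all assumptions chosen for demonstration, and the integral is approximated with a simple midpoint rule.

```python
from fractions import Fraction
import math

def discrete_cdf(pmf, x):
    """Sum P(X = k) over all outcomes k <= x (pmf maps outcome -> probability)."""
    return sum(p for k, p in pmf.items() if k <= x)

def continuous_cdf(pdf, x, lower, steps=10_000):
    """Approximate the integral of pdf from `lower` to x with the midpoint rule.

    `lower` stands in for minus infinity, so it must lie at or below
    the start of the density's support.
    """
    if x <= lower:
        return 0.0
    width = (x - lower) / steps
    return sum(pdf(lower + (i + 0.5) * width) for i in range(steps)) * width

# Hypothetical discrete example: number of heads in two fair coin flips.
coin_pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# Hypothetical continuous example: exponential density with rate 1.
def exp_pdf(t):
    return math.exp(-t) if t >= 0 else 0.0

f1 = discrete_cdf(coin_pmf, 1)                                    # P(X <= 1) = 3/4
interval = discrete_cdf(coin_pmf, 2) - discrete_cdf(coin_pmf, 0)  # P(0 < X <= 2) = 3/4
g1 = continuous_cdf(exp_pdf, 1.0, lower=0.0)                      # close to 1 - e^(-1)
```

Using `Fraction` keeps the discrete sums exact, while the continuous result is only a numerical approximation whose accuracy depends on `steps`.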
Worked Example
Problem: A fair die is rolled once. Let X be the number shown. Find the distribution function F(x) and compute P(X ≤ 4).
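Step 1: List the probability of each outcome. Each face has probability 1/6, so $P(X = k) = \tfrac{1}{6}$ for $k = 1, 2, \dots, 6$.
Step 2: The CDF at x = 4 is the sum of probabilities for all outcomes ≤ 4: $F(4) = P(X=1) + P(X=2) + P(X=3) + P(X=4) = \tfrac{4}{6}$. The same running total gives the full distribution function: $F(x) = 0$ for $x < 1$, $F(x) = \lfloor x \rfloor / 6$ for $1 \le x < 6$, and $F(x) = 1$ for $x \ge 6$.
Step 3: Simplify the fraction: $\tfrac{4}{6} = \tfrac{2}{3}$.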
Answer: F(4) = 2/3, so there is approximately a 66.7% probability that the die shows 4 or less.
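The die calculation can be checked with a short Python sketch. The names `die_pmf` and `die_cdf` are illustrative choices, and exact fractions are used so the result matches the hand computation.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face 1..6 has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def die_cdf(x):
    """F(x) = P(X <= x): add up the probabilities of faces not exceeding x."""
    return sum((p for face, p in die_pmf.items() if face <= x), Fraction(0))

print(die_cdf(4))                               # 2/3
print(die_cdf(0.5), die_cdf(3.7), die_cdf(10))  # 0 1/2 1
```

Evaluating at non-integer points such as 3.7 shows that the CDF is defined for every real number and stays flat between the faces of the die.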
Why It Matters
Distribution functions are central to hypothesis testing, confidence intervals, and reliability engineering. When you look up a probability in a z-table, you are reading values of the standard normal CDF, and the p-values and critical values found in t-tables are derived from the CDF of the t-distribution in the same way. Any probability question about a random variable can ultimately be answered using its distribution function.
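As a small illustration of the table lookup described above, the standard library's `statistics.NormalDist` exposes the normal CDF directly. The z-value 1.96 is a hypothetical test statistic chosen for the example, and the two-sided p-value formula assumes a symmetric two-tailed z-test.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

z = 1.96  # hypothetical observed z-statistic
left_area = standard_normal.cdf(z)                   # area to the left of z
two_sided_p = 2 * (1 - standard_normal.cdf(abs(z)))  # two-tailed p-value

print(round(left_area, 4))    # 0.975
print(round(two_sided_p, 4))  # 0.05
```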
Common Mistakes
Mistake: Confusing the distribution function (CDF) with the probability mass function (PMF) or probability density function (PDF).
Correction: The CDF gives cumulative probability P(X ≤ x), which is a running total. The PMF gives P(X = x) for discrete variables, and the PDF gives the density (not probability) at a point for continuous variables. The CDF is obtained by summing the PMF or integrating the PDF.
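The distinction can be made concrete with a short sketch. The uniform distribution on [0, 0.5] is a hypothetical example chosen because its density equals 2, which shows that a PDF value is not a probability, while the CDF never exceeds 1.

```python
from fractions import Fraction
from itertools import accumulate

# Discrete case: PMF of a fair die and its CDF as a running total.
faces = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6
cdf = list(accumulate(pmf))  # 1/6, 1/3, 1/2, 2/3, 5/6, 1

# Continuous case (hypothetical example): uniform distribution on [0, 0.5].
def uniform_pdf(x):
    """Density: height 2 on [0, 0.5], zero elsewhere."""
    return 2.0 if 0.0 <= x <= 0.5 else 0.0

def uniform_cdf(x):
    """Integral of the density up to x, clipped to the range [0, 1]."""
    return min(max(2.0 * x, 0.0), 1.0)

print(pmf[3], cdf[3])     # 1/6 2/3  (P(X = 4) versus P(X <= 4))
print(uniform_pdf(0.25))  # 2.0      (a density, allowed to exceed 1)
print(uniform_cdf(0.25))  # 0.5      (a probability, always between 0 and 1)
```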
