Hypergeometric Distribution — Definition, Formula & Examples

The hypergeometric distribution gives the probability of drawing exactly $k$ successes from a finite population when you sample without replacement. It applies whenever you pull items from a group that contains two types (success/failure) and do not put them back.

A discrete random variable $X$ follows a hypergeometric distribution with parameters $N$ (population size), $K$ (number of success states in the population), and $n$ (number of draws) if its probability mass function is $P(X = k) = \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}$ for $\max(0,\, n+K-N) \le k \le \min(n,\,K)$.

Key Formula

$$P(X = k) = \frac{\dbinom{K}{k}\dbinom{N - K}{n - k}}{\dbinom{N}{n}}$$
Where:
  • $N$ = Total population size
  • $K$ = Number of success states in the population
  • $n$ = Number of draws (sample size)
  • $k$ = Number of observed successes in the sample
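
As a minimal sketch, the formula above can be evaluated directly with Python's standard-library `math.comb`. The function name `hypergeom_pmf` and the sample parameters (10 items, 4 successes, 3 draws) are assumptions chosen for illustration, not part of the definition.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): k successes in n draws without replacement
    from a population of N items containing K successes."""
    lo, hi = max(0, n + K - N), min(n, K)
    if k < lo or k > hi:
        return 0.0  # k is outside the support, so the probability is zero
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Hypothetical parameters: N = 10, K = 4, n = 3.
probs = [hypergeom_pmf(k, N=10, K=4, n=3) for k in range(0, 4)]
print(probs)       # one probability per value of k in the support
print(sum(probs))  # the probabilities over the support should total 1
```

Returning zero outside the support mirrors the range condition $\max(0,\, n+K-N) \le k \le \min(n,\,K)$ and avoids passing a negative argument to `comb`.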

How It Works

You use the hypergeometric distribution when three conditions hold: the population is finite, each member is classified as success or failure, and sampling is done without replacement. To find $P(X=k)$, count the ways to choose $k$ successes from $K$, multiply by the ways to choose $n-k$ failures from $N-K$, then divide by the total ways to choose $n$ items from $N$. The expected value is $E(X) = \frac{nK}{N}$, and the variance is $\text{Var}(X) = n\frac{K}{N}\frac{N-K}{N}\frac{N-n}{N-1}$. As $N$ grows large relative to $n$ with the success fraction $K/N$ held fixed, the hypergeometric distribution approaches the binomial distribution with success probability $p = K/N$.
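
The mean and variance formulas can be checked numerically by summing over the support. The sketch below uses exact fractions so the comparison is not affected by rounding. The helper name `pmf` and the parameter values are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def pmf(k, N, K, n):
    # Exact hypergeometric probability as a fraction.
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))

# Illustrative parameters (an assumption for this sketch).
N, K, n = 20, 6, 5
support = range(max(0, n + K - N), min(n, K) + 1)

# Mean and variance computed from the definition, term by term.
mean = sum(k * pmf(k, N, K, n) for k in support)
var = sum((k - mean) ** 2 * pmf(k, N, K, n) for k in support)

# Closed-form expressions: nK/N and n (K/N) ((N-K)/N) ((N-n)/(N-1)).
mean_formula = Fraction(n * K, N)
var_formula = Fraction(n * K * (N - K) * (N - n), N * N * (N - 1))

print(mean, mean_formula)
print(var, var_formula)
```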

Worked Example

Problem: A deck contains 20 cards: 6 red and 14 black. You draw 5 cards without replacement. What is the probability of getting exactly 2 red cards?
Identify parameters: Population $N = 20$, successes $K = 6$, draws $n = 5$, desired successes $k = 2$.
Count favorable outcomes: Choose 2 red from 6 and 3 black from 14.
$$\binom{6}{2}\binom{14}{3} = 15 \times 364 = 5{,}460$$
Count total outcomes: Choose any 5 from 20.
$$\binom{20}{5} = 15{,}504$$
Compute probability: Divide favorable by total.
$$P(X = 2) = \frac{5{,}460}{15{,}504} \approx 0.3522$$
Answer: The probability of drawing exactly 2 red cards is approximately $0.352$, or about 35.2%.
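
One way to double-check the worked example is to compute the exact fraction and compare it with a simple simulation of the card draws. This is only a sketch: the function name `simulate`, the trial count, and the random seed are arbitrary assumptions.

```python
import random
from fractions import Fraction
from math import comb

N, K, n, k = 20, 6, 5, 2  # deck size, red cards, draws, target red count

# Exact probability from the formula, kept as a reduced fraction.
exact = Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))
print(exact, float(exact))

def simulate(trials, seed=0):
    """Estimate P(X = k) by repeatedly drawing n cards without replacement."""
    rng = random.Random(seed)
    deck = ["red"] * K + ["black"] * (N - K)
    hits = 0
    for _ in range(trials):
        hand = rng.sample(deck, n)  # sample() never repeats a card
        if hand.count("red") == k:
            hits += 1
    return hits / trials

estimate = simulate(50_000)
print(estimate)  # should land near the exact value, up to sampling noise
```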

Why It Matters

Quality control relies on this distribution: when an inspector pulls a sample from a finite lot of products, the hypergeometric model gives the exact probability of finding a certain number of defectives. It also underpins Fisher's exact test, a standard tool in biostatistics for analyzing contingency tables with small sample sizes.
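
To make the quality-control use concrete, here is a small sketch of a lot-inspection calculation. The lot size, defect count, sample size, and acceptance rule are all hypothetical numbers chosen for illustration, and `hypergeom_cdf` is a helper name introduced here.

```python
from math import comb

def hypergeom_cdf(c, N, K, n):
    """P(X <= c): at most c successes in n draws without replacement."""
    lo = max(0, n + K - N)
    hi = min(c, n, K)
    favorable = sum(comb(K, k) * comb(N - K, n - k) for k in range(lo, hi + 1))
    return favorable / comb(N, n)

# Hypothetical lot: 50 items, 5 of them defective. The inspector samples 10
# and accepts the lot only if the sample contains no defectives.
p_accept = hypergeom_cdf(0, N=50, K=5, n=10)
print(p_accept)
```

Here a "success" in the distribution's sense is a defective item, so the acceptance probability is $P(X \le 0)$.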

Common Mistakes

Mistake: Using the binomial distribution instead when sampling without replacement from a small population.
Correction: The binomial assumes independence between draws (replacement). When the sample is a non-negligible fraction of the population (often cited as more than 5-10%), the hypergeometric distribution is the correct model because each draw changes the composition of the remaining pool.
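
The size of the error can be seen by comparing the two models on the 20-card deck from the worked example, and then on a much larger population with the same 30% success fraction. The larger population (2,000 items, 600 successes) is a hypothetical added for contrast, and the function names are illustrative.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    # Sampling without replacement.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    # Sampling with replacement (independent draws).
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, k = 5, 2
small = hypergeom_pmf(k, N=20, K=6, n=n)      # sample is 25% of the population
large = hypergeom_pmf(k, N=2000, K=600, n=n)  # sample is 0.25% of the population
approx = binom_pmf(k, n, p=0.3)

print(small, approx, abs(small - approx))  # noticeable gap for the small deck
print(large, approx, abs(large - approx))  # gap shrinks for the large population
```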