Hypergeometric Distribution Calculator
Probability when you draw without putting things back. From a population of N with K "successes", draw n and ask how many successes you get: the chance of exactly, at most or at least k — plus the mean, spread and the full distribution. It is the honest model for defectives pulled from a finite batch, cards from a deck or numbers in a lottery, where each draw changes the odds for the next.
Example as set: a batch of 50 items with 5 defective; you inspect 10 and find 1 defective. “Success” just means the category you are counting — defective, ace, winning number.
In plain English
The hypergeometric distribution counts successes in a sample drawn without replacement from a finite group. That last detail is everything: once you have pulled a defective item, there is one fewer left, so the probability shifts with every draw. It is the without-replacement twin of the binomial, and the two agree only when the population is so large that removing a few barely changes the odds.
- N, K, n, k: population size N, the K successes it contains, the n items you draw, and k, the number of successes you are asking about.
- P(X = k): the probability of exactly k successes, C(K,k)·C(N−K, n−k) ∕ C(N,n). Choose your successes and non-successes, over all possible draws.
- Cumulative P(X ≤ k), P(X ≥ k): the chance of at most, or at least, k successes, found by adding up the individual probabilities.
- Mean & variance: expect n·K∕N successes on average. The variance carries a finite-population correction (N−n)∕(N−1) that the binomial lacks.
- vs the binomial: with replacement (or a huge N), draws are independent and the binomial applies. Without replacement from a small N, use this.
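The quantities above can be sketched in a few lines of Python. This is a minimal illustration using exact integer binomial coefficients from the standard library, not the calculator's own code; the function names are made up for the example, and the inputs are the batch example from the top of the page (N = 50, K = 5, n = 10, k = 1).

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): exactly k successes in n draws without replacement."""
    # k must leave room for the non-successes and cannot exceed K or n.
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def hypergeom_at_most(k, N, K, n):
    """P(X <= k): add up the individual probabilities from 0 to k."""
    return sum(hypergeom_pmf(i, N, K, n) for i in range(0, k + 1))

def hypergeom_at_least(k, N, K, n):
    """P(X >= k): add up the individual probabilities from k to n."""
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, n + 1))

# Batch of 50 with 5 defective, inspect 10, ask about 1 defective.
N, K, n, k = 50, 5, 10, 1
print(f"P(X = {k})  = {hypergeom_pmf(k, N, K, n):.4f}")
print(f"P(X <= {k}) = {hypergeom_at_most(k, N, K, n):.4f}")
print(f"P(X >= {k}) = {hypergeom_at_least(k, N, K, n):.4f}")
```

For this example the chance of exactly one defective comes out at roughly 0.43.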
Frequently asked
When do I use the hypergeometric instead of the binomial distribution?
Use the hypergeometric when you draw without replacement from a finite population, so each draw changes what is left, as when inspecting items from a batch, dealing cards or drawing lottery numbers. Use the binomial when draws are independent: with replacement, or when the population is so large that removing a few barely moves the proportion. A common rule of thumb is that the sample should be no more than about 5% of the population, that is, N at least about 20 times n. As N grows with the proportion K∕N held fixed, the hypergeometric converges to the binomial with p = K∕N.
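To see the convergence numerically, the sketch below holds the success proportion at 10% and the sample at n = 10, then grows N and compares P(X = 1) under both models. The population sizes and function names are chosen for illustration only.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """Exact hypergeometric P(X = k)."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    """Binomial P(X = k) for n independent draws with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, k, p = 10, 1, 0.1
binom = binom_pmf(k, n, p)

gaps = []
for N in (50, 500, 5000, 50000):
    K = N // 10  # keep K/N fixed at 10%
    hyper = hypergeom_pmf(k, N, K, n)
    gaps.append(abs(hyper - binom))
    print(f"N={N:>6}  hypergeometric={hyper:.5f}  binomial={binom:.5f}")
```

At N = 50 the sample is a fifth of the population and the two answers differ noticeably; by the time N is thousands of times n the gap is negligible.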
How do you calculate hypergeometric probability?
P(X = k) = C(K, k) · C(N − K, n − k) ∕ C(N, n), where C(a, b) is "a choose b". The numerator counts the ways to pick k successes from the K available and the remaining n − k draws from the N − K non-successes; the denominator counts all ways to draw n from N. The calculator computes this with log-factorials so it stays accurate even for large N.
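One way to work in log space is sketched below. The page says only that the calculator uses log-factorials; using `math.lgamma` for them (since ln(a!) = lgamma(a + 1)) is an assumption made for this example, and the function names are hypothetical.

```python
from math import lgamma, exp

def log_comb(a, b):
    """Natural log of C(a, b), using ln(a!) = lgamma(a + 1)."""
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)

def hypergeom_pmf_log(k, N, K, n):
    """P(X = k) computed in log space to avoid enormous intermediate values."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    log_p = log_comb(K, k) + log_comb(N - K, n - k) - log_comb(N, n)
    return exp(log_p)

# Small example from the top of the page, then a much larger population.
print(hypergeom_pmf_log(1, 50, 5, 10))
print(hypergeom_pmf_log(5, 100_000, 1_000, 500))
```

Adding and subtracting logarithms keeps every intermediate number modest in size, whereas the factorials themselves would overflow floating point long before N reaches the thousands.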
What are the mean and variance of the hypergeometric distribution?
The mean is n·K∕N, the same as the binomial with p = K∕N. The variance is n·(K∕N)·(1 − K∕N)·(N − n)∕(N − 1). That last factor, (N − n)∕(N − 1), is the finite-population correction. It is below 1 whenever n > 1, so the hypergeometric variance is smaller than the binomial's: draws without replacement are negatively correlated, because every success drawn makes the next one less likely. When n = 1 the factor is 1 and the two variances match; when n = N you draw everything and the variance is zero.
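As a check on these formulas, the sketch below evaluates them for the batch example (N = 50, K = 5, n = 10) and compares them with the mean and variance obtained by summing over the full distribution. The function names are illustrative, and the variance formula assumes N > 1.

```python
from math import comb

def hypergeom_mean(N, K, n):
    """Closed-form mean n*K/N."""
    return n * K / N

def hypergeom_variance(N, K, n):
    """Closed-form variance with the finite-population correction (needs N > 1)."""
    p = K / N
    return n * p * (1 - p) * (N - n) / (N - 1)

def moments_from_pmf(N, K, n):
    """Mean and variance computed directly from the probabilities."""
    lo, hi = max(0, n - (N - K)), min(n, K)
    probs = {k: comb(K, k) * comb(N - K, n - k) / comb(N, n)
             for k in range(lo, hi + 1)}
    mean = sum(k * p for k, p in probs.items())
    var = sum((k - mean) ** 2 * p for k, p in probs.items())
    return mean, var

N, K, n = 50, 5, 10
print(hypergeom_mean(N, K, n), hypergeom_variance(N, K, n))
print(moments_from_pmf(N, K, n))
# Binomial variance with the same p, for comparison (no correction factor).
print(n * (K / N) * (1 - K / N))
```

Here the mean is 1 defective, and the correction factor 40∕49 pulls the variance from the binomial's 0.9 down to about 0.73.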
What are some real-world examples of the hypergeometric distribution?
Anywhere you sample without replacement from a fixed pool: quality control (defective items drawn from a batch), card games (the chance of two aces in a five-card hand), lotteries (matching the drawn numbers), capture–recapture estimates of wildlife populations, and the maths behind Fisher’s exact test. The tell is simple — if items already drawn are not put back, so each draw changes the odds, it is hypergeometric rather than binomial.
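Two of these examples are easy to work through. The sketch below assumes a standard 52-card deck with 4 aces, and, purely for illustration, a lottery that draws 6 numbers from 49 (the page does not name a specific lottery format).

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """Exact hypergeometric P(X = k)."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Cards: N = 52 cards, K = 4 aces, n = 5 dealt, k = 2 aces wanted.
p_two_aces = hypergeom_pmf(2, 52, 4, 5)
print(f"Exactly two aces in five cards: {p_two_aces:.4f}")

# Lottery (assumed 6-from-49 format): your 6 picks are the "successes",
# the draw takes n = 6 numbers, and you want all k = 6 to match.
p_jackpot = hypergeom_pmf(6, 49, 6, 6)
print(f"All six numbers matched: 1 in {round(1 / p_jackpot):,}")
```

The two-ace hand turns up about 4% of the time, while the six-number match is roughly a 1-in-14-million event.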