
The hypergeometric distribution arises when choosing "without replacement" from a finite population. Consider a population of size N which contains M elements having some property. We select a sample of size K from this population. The number in the sample that have this property is a hypergeometric random variable X.

| Parameter | Range | Description |
|---|---|---|
| N | N = 0, 1, ... | Total size of the population |
| M | M = 0, 1, ..., N | Number of elements with the property |
| K | K = 0, 1, ..., N | Number of elements selected |

Probability Mass Function

$$P(X = x \mid N, M, K) = \frac{\binom{M}{x}\binom{N-M}{K-x}}{\binom{N}{K}}$$
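The mass function above can be evaluated directly with binomial coefficients. The following is a minimal sketch, not a reference implementation; the function name `hypergeom_pmf` is an illustrative choice, and values of x outside the feasible range are assumed to have probability zero.

```python
from math import comb


def hypergeom_pmf(x, N, M, K):
    """Return P(X = x) for X ~ Hypergeometric(N, M, K).

    N: population size, M: elements with the property, K: sample size.
    """
    # x must leave room for K - x elements drawn from the N - M without
    # the property, and cannot exceed either K or M.
    if x < max(0, K - (N - M)) or x > min(K, M):
        return 0.0
    return comb(M, x) * comb(N - M, K - x) / comb(N, K)


# Probability that exactly 2 of 4 cups chosen from 8 (4 with sugar) have sugar.
print(hypergeom_pmf(2, 8, 4, 4))  # 36/70
```

Using `math.comb` keeps the numerator and denominator as exact integers until the final division, which avoids overflow for moderately large N.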

Support

$$x = 0, 1, \ldots, K$$

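Mean

$$E(X) = \frac{KM}{N}$$

Variance

$$\operatorname{Var}(X) = K \, \frac{M}{N} \left(1 - \frac{M}{N}\right) \frac{N-K}{N-1}$$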
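As a sanity check, the standard closed forms E(X) = KM/N and Var(X) = K(M/N)(1 - M/N)(N - K)/(N - 1) can be compared with moments computed by summing over the mass function. This sketch uses exact rational arithmetic; the helper names `moments_by_summation` and `moments_closed_form` are illustrative, not taken from the page.

```python
from fractions import Fraction
from math import comb


def moments_by_summation(N, M, K):
    """Mean and variance obtained by summing over the pmf exactly."""
    lo, hi = max(0, K - (N - M)), min(K, M)
    total = comb(N, K)
    pmf = {
        x: Fraction(comb(M, x) * comb(N - M, K - x), total)
        for x in range(lo, hi + 1)
    }
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, var


def moments_closed_form(N, M, K):
    """Mean and variance from the closed-form expressions (requires N > 1)."""
    mean = Fraction(K * M, N)
    var = Fraction(K * M * (N - M) * (N - K), N * N * (N - 1))
    return mean, var


# The two approaches agree for the deck-of-cards parameters.
print(moments_by_summation(52, 13, 6))
print(moments_closed_form(52, 13, 6))
```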

| Example | N | M | K |
|---|---|---|---|
| Eight cups of tea are made, four of which contain sugar. Four cups are chosen at random. Let X be the number chosen that contain sugar. | 8 | 4 | 4 |
| A standard deck of cards contains 52 cards total, of which 13 are hearts. Six cards are chosen at random. Let X be the number of hearts chosen. | 52 | 13 | 6 |
| A lottery selects 5 numbers out of 59 total. Let X be the number of matches between the numbers on a random ticket and the numbers drawn. | 59 | 5 | 5 |
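To make the examples concrete, the sketch below evaluates one probability for each of them with exact fractions. The dictionary keys and the helper name `pmf` are illustrative labels, not part of the original page.

```python
from fractions import Fraction
from math import comb


def pmf(x, N, M, K):
    """Exact P(X = x) for X ~ Hypergeometric(N, M, K)."""
    if x < max(0, K - (N - M)) or x > min(K, M):
        return Fraction(0)
    return Fraction(comb(M, x) * comb(N - M, K - x), comb(N, K))


# (N, M, K) for each example above.
examples = {"tea": (8, 4, 4), "hearts": (52, 13, 6), "lottery": (59, 5, 5)}

# All four chosen cups contain sugar.
print(pmf(4, *examples["tea"]))      # 1/70
# No hearts among the six cards.
print(float(pmf(0, *examples["hearts"])))
# All five lottery numbers match.
print(pmf(5, *examples["lottery"]))  # 1/5006386
```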

X ~ Hypergeometric(N, M, K)

[Chart: hypergeometric pmf, cdf, visualization, and simulation]


Note that:

  • X ≤ M. Since only M elements have the given property, the maximum number of elements that can be chosen with the property is M.
  • X ≥ K - (N - M). Since M elements have the given property, N - M elements do not. So, if K items are chosen, at least K - (N - M) of them must have the property.
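Combined with the obvious limits 0 ≤ X ≤ K, the two bounds above give the set of values that X can take with positive probability. A small sketch, with `feasible_values` as a hypothetical helper name:

```python
def feasible_values(N, M, K):
    """Values x with P(X = x) > 0 for X ~ Hypergeometric(N, M, K)."""
    lo = max(0, K - (N - M))  # forced picks once the N - M others run out
    hi = min(K, M)            # cannot exceed the sample size or M
    return list(range(lo, hi + 1))


# With N = 10, M = 7, K = 5 only 3 elements lack the property,
# so at least 2 of the 5 chosen must have it.
print(feasible_values(10, 7, 5))  # [2, 3, 4, 5]
print(feasible_values(8, 4, 4))   # [0, 1, 2, 3, 4]
```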

The illustration above shows a sample of K items chosen from a set of size N, of which M items have some property shown in green. The random variable X denotes the number of items chosen with this property, which has a hypergeometric(N, M, K) distribution.

The simulation above shows a sample of K items chosen from a set of size N, of which M items have some property. Items with the property are shown in green, and items without the property are shown in grey. The random variable X denotes the number of items chosen with this property, which has a hypergeometric(N, M, K) distribution. The histogram accumulates the results of each simulation.
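A simulation of this kind can be approximated in a few lines. The sketch below is not the page's own implementation: it assumes each run draws K items without replacement using Python's `random.sample` and tallies the number of items with the property. The function name `simulate` and the `seed` parameter are illustrative.

```python
import random
from collections import Counter


def simulate(N, M, K, trials, seed=None):
    """Tally X over repeated samples of size K drawn without replacement."""
    rng = random.Random(seed)
    # True marks an item with the property (green), False one without (grey).
    population = [True] * M + [False] * (N - M)
    counts = Counter()
    for _ in range(trials):
        sample = rng.sample(population, K)
        counts[sum(sample)] += 1  # number of chosen items with the property
    return counts


counts = simulate(8, 4, 4, trials=10_000, seed=1)
for x in sorted(counts):
    print(x, counts[x] / 10_000)  # relative frequency, to compare with the pmf
```

As the number of trials grows, the relative frequencies should settle near the values given by the probability mass function.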

Y ~ Hypergeometric(N, M, 1) is Bernoulli(M/N).

As N, M → ∞ with M/N held fixed, X converges in distribution to Binomial(K, M/N).
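Both relationships can be checked numerically. The sketch below compares the hypergeometric and binomial mass functions as the population grows with M/N held at 1/4, and confirms the K = 1 case directly. The helper names and the particular population sizes are illustrative choices.

```python
from math import comb


def hypergeom_pmf(x, N, M, K):
    """P(X = x) for X ~ Hypergeometric(N, M, K)."""
    if x < max(0, K - (N - M)) or x > min(K, M):
        return 0.0
    return comb(M, x) * comb(N - M, K - x) / comb(N, K)


def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def max_gap(N, M, K):
    """Largest pointwise difference between the two pmfs over x = 0..K."""
    p = M / N
    return max(
        abs(hypergeom_pmf(x, N, M, K) - binom_pmf(x, K, p))
        for x in range(K + 1)
    )


# K = 1 reduces to a single Bernoulli(M/N) trial.
print(hypergeom_pmf(1, 8, 4, 1))  # 0.5

# The gap shrinks as N and M grow with M/N = 1/4 and K = 6 fixed.
for scale in (1, 10, 100):
    print(52 * scale, max_gap(52 * scale, 13 * scale, 6))
```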

[Chart: related distribution pdf, cdf, visualization, and simulation]
