The normal or Gaussian distribution is the classic "bell curve". This is a continuous symmetric distribution defined over all real numbers. A location parameter μ specifies the mean of the distribution, while the variance of the distribution is given by a parameter σ².
The cdf of X is F(x) = Φ((x − μ)/σ), where Φ denotes the standard normal cdf. Φ has the property that Φ(−x) = 1 − Φ(x).
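The symmetry of Φ can be checked numerically. The following is a minimal sketch that writes Φ in terms of the error function from Python's standard library; the function name `std_normal_cdf` is illustrative and not part of the original page.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal cdf Φ(x), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Compare Φ(-x) with 1 - Φ(x) at a few arbitrary points.
for x in (0.0, 0.5, 1.0, 1.96):
    left = std_normal_cdf(-x)
    right = 1.0 - std_normal_cdf(x)
    print(f"x = {x}: Φ(-x) = {left:.6f}, 1 - Φ(x) = {right:.6f}")
```

The two printed columns should agree up to floating-point rounding.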
Parameter | Range | Description |
---|---|---|
μ | −∞ < μ < ∞ | Expected value |
σ² | σ² > 0 | Variance |
Probability Density Function: f(x) = (1 / (σ√(2π))) · exp(−(x − μ)² / (2σ²))
Support: −∞ < x < ∞
Mean: E(X) = μ
Variance: Var(X) = σ²
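The density, mean and variance can be cross-checked by numerical integration. This is a rough sketch using the usual normal density f(x) = exp(−(x − μ)²/(2σ²)) / √(2πσ²); the helper names `normal_pdf` and `trapezoid`, the integration range of ten standard deviations, and the parameter values (taken from the birth-weight example) are choices made for illustration.

```python
import math


def normal_pdf(x: float, mu: float, var: float) -> float:
    """Density of a Normal(mu, var) variable, parameterised by the variance."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def trapezoid(f, a: float, b: float, steps: int = 20_000) -> float:
    """Composite trapezoid rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, steps))
    return total * h


mu, var = 3.4, 0.3249  # illustrative values: birth weight in kilograms
sd = math.sqrt(var)
lo, hi = mu - 10 * sd, mu + 10 * sd  # the tails beyond this range are negligible

area = trapezoid(lambda x: normal_pdf(x, mu, var), lo, hi)
mean = trapezoid(lambda x: x * normal_pdf(x, mu, var), lo, hi)
variance = trapezoid(lambda x: (x - mu) ** 2 * normal_pdf(x, mu, var), lo, hi)

print(f"total probability ≈ {area:.6f}")
print(f"mean ≈ {mean:.6f}, variance ≈ {variance:.6f}")
```

The three integrals should come out close to 1, μ and σ² respectively.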
Example | μ | σ² |
---|---|---|
The life of laptop batteries has a normal distribution with mean 4 hours and standard deviation 1 hour. Let X be the battery life of a randomly chosen laptop. | 4.000 | 1.000 |
Birth weights of babies born in the United States are normally distributed with mean 3.4 kilograms and standard deviation 0.57 kilograms. Let X be the birth weight of a randomly chosen baby. | 3.400 | 0.3249 |
Heights of male red kangaroos are normally distributed with approximate mean 1.5 meters and standard deviation 0.12 meters. Let X be the height of a randomly chosen male red kangaroo. | 1.500 | 0.0144 |
X ∼ Normal(μ, σ²)
E(X) = μ, Var(X) = σ²
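The examples quote a standard deviation, while the distribution is parameterised by the variance σ², so the table squares each standard deviation. A small sketch of that conversion, using `statistics.NormalDist` from the Python standard library (which takes the standard deviation as its `sigma` argument); the labels and the tail-probability question about the battery are added here for illustration.

```python
from statistics import NormalDist

# (label, mean, standard deviation) as quoted in the examples table
examples = [
    ("laptop battery life (hours)", 4.0, 1.0),
    ("US birth weight (kg)", 3.4, 0.57),
    ("male red kangaroo height (m)", 1.5, 0.12),
]

for label, mean, sd in examples:
    X = NormalDist(mu=mean, sigma=sd)
    print(f"{label}: E(X) = {X.mean:.3f}, Var(X) = {X.variance:.4f}")

# Illustrative question: probability that a battery lasts more than 5 hours.
battery = NormalDist(mu=4.0, sigma=1.0)
p_over_5 = 1.0 - battery.cdf(5.0)
print(f"P(X > 5) = {p_over_5:.4f}")
```

Five hours is one standard deviation above the mean, so the last line is 1 − Φ(1), roughly 0.1587.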
The normal distribution is one of the most commonly used distributions in all of probability theory. This is a consequence of the "Central Limit Theorem", which states that the suitably standardized sample mean of n independent, identically distributed random variables with finite variance converges in distribution to a normal distribution as n increases. Many quantities that result from the addition of many independent factors are therefore approximately normally distributed.
Since the pdf of the normal distribution is symmetric about the location parameter μ, the cdf has the property that F(μ - x) = 1 - F(μ + x).
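The identity can also be derived from the symmetry of Φ, using the standardisation F(x) = Φ((x − μ)/σ):

```latex
\begin{aligned}
F(\mu - x) &= \Phi\!\left(\frac{(\mu - x) - \mu}{\sigma}\right)
            = \Phi\!\left(-\frac{x}{\sigma}\right) \\
           &= 1 - \Phi\!\left(\frac{x}{\sigma}\right)
            = 1 - \Phi\!\left(\frac{(\mu + x) - \mu}{\sigma}\right)
            = 1 - F(\mu + x).
\end{aligned}
```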
The graph above displays the survival function S(x) = P(X > x) = 1 - F(x), where F(x) is the cumulative distribution function (cdf).
Survival functions are used in survival analysis, a branch of statistics concerned with the expected duration until an event occurs such as death or the failure of a mechanical system.
The graph above displays the hazard function h(x). This equals f(x)/S(x), where f(x) is the pdf and S(x) = P(X > x) is the survival function.
Note that h(x) approaches the line (x - μ)/σ² as x → ∞.
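The survival and hazard functions, and the limiting behaviour of h(x), can be explored numerically. The sketch below is one possible implementation, not the page's own: the function names are illustrative, the parameters come from the battery example, and the survival function is computed with `math.erfc` rather than as 1 − F(x) to avoid losing precision far in the upper tail.

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density f(x) of a normal distribution with mean mu and std. dev. sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_survival(x: float, mu: float, sigma: float) -> float:
    """Survival function S(x) = P(X > x), via the complementary error function."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))


def normal_hazard(x: float, mu: float, sigma: float) -> float:
    """Hazard function h(x) = f(x) / S(x)."""
    return normal_pdf(x, mu, sigma) / normal_survival(x, mu, sigma)


mu, sigma = 4.0, 1.0  # illustrative values: battery life in hours
for x in (4.0, 6.0, 10.0, 20.0):
    h = normal_hazard(x, mu, sigma)
    line = (x - mu) / sigma**2
    print(f"x = {x}: h(x) = {h:.4f}, (x - mu)/sigma^2 = {line:.4f}, gap = {h - line:.4f}")
```

The printed gap between h(x) and (x − μ)/σ² shrinks as x grows, in line with the asymptotic statement above.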
The illustration above shows a sample of n points chosen independently and at random from a uniform(μ − σ√(3n), μ + σ√(3n)) distribution. The population has mean μ and variance σ²n, while the sample mean X̅ (shown as a light blue line) has mean μ and variance σ². By the Central Limit Theorem, the distribution of X̅ converges to a normal(μ, σ²) distribution as n approaches infinity.
The uniform distribution is used in the illustration for simplicity. Note that the Central Limit Theorem holds for the sample mean of any distribution, as long as the variance is finite.
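The setup can be reproduced with a short simulation. This is a sketch under stated assumptions rather than the code behind the illustration: the function names, the seed, the parameter values (μ = 4, σ = 1, n = 30) and the number of replications are all chosen here for demonstration.

```python
import math
import random
import statistics


def sample_mean_uniform(mu: float, sigma: float, n: int, rng: random.Random) -> float:
    """Mean of n draws from uniform(mu - sigma*sqrt(3n), mu + sigma*sqrt(3n))."""
    half_width = sigma * math.sqrt(3 * n)  # gives population variance sigma^2 * n
    return statistics.fmean(
        rng.uniform(mu - half_width, mu + half_width) for _ in range(n)
    )


def simulate_sample_means(mu, sigma, n, replications, seed=0):
    """Repeat the experiment and collect the sample means."""
    rng = random.Random(seed)
    return [sample_mean_uniform(mu, sigma, n, rng) for _ in range(replications)]


mu, sigma, n = 4.0, 1.0, 30
means = simulate_sample_means(mu, sigma, n, replications=5000)

mean_of_means = statistics.fmean(means)
var_of_means = statistics.variance(means)
# Share of sample means within one sigma of mu; about 0.683 for a normal law.
within_one_sigma = sum(abs(m - mu) <= sigma for m in means) / len(means)

print(f"mean of sample means     ≈ {mean_of_means:.3f}")
print(f"variance of sample means ≈ {var_of_means:.3f}")
print(f"share within one sigma   ≈ {within_one_sigma:.3f}")
```

The mean and variance of the simulated sample means should land near μ and σ², and the share within one σ of μ should be close to the normal value of about 0.683.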
The simulation above shows a sample of n points (marked red) chosen independently and at random from a uniform(μ − σ√(3n), μ + σ√(3n)) distribution with mean μ and variance σ²n. The light blue line shows the sample mean X̅, which itself has mean μ and variance σ². By the Central Limit Theorem, X̅ converges to a normal(μ, σ²) distribution as n approaches infinity. The histogram accumulates the results of each simulation.
The uniform distribution is used in the simulation for simplicity. Note that the Central Limit Theorem holds for the sample mean of any distribution with finite variance.