The Pareto distribution is a power-law distribution that models many phenomena that become less common at larger scales. The shape parameter α controls the exponent of the power law, while the scale parameter xₘ sets the lower bound of the distribution.
Parameter | Range | Description |
---|---|---|
α | α > 0 | Shape parameter |
xₘ | xₘ > 0 | Scale parameter |
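Probability Density Function: f(x) = α xₘ^α / x^(α+1) for x ≥ xₘ
Support: x ∈ [xₘ, ∞)
Mean: E(X) = α xₘ / (α - 1) for α > 1
Variance: Var(X) = α xₘ² / ((α - 1)² (α - 2)) for α > 2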
Example | α | xₘ |
---|---|---|
For the 633 bestselling books in the US that sold 2 million or more copies between 1895 and 1965, the number of books sold (in millions) follows an approximate Pareto distribution with α = 3.51. | 3.510 | 2.000 |
The magnitude of earthquakes occurring in California that register above 3.8 on the Richter scale follows an approximate Pareto distribution with α = 3.04. | 3.040 | 3.800 |
For AT&T customers in the US receiving 10 or more phone calls per day, the number of daily phone calls follows an approximate Pareto distribution with α = 2.22. | 2.220 | 10.00 |
X ∼ Pareto(α, xₘ)
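As a rough illustration, the mean and variance for the three examples in the table can be computed from the standard Pareto formulas E(X) = α xₘ / (α - 1) for α > 1 and Var(X) = α xₘ² / ((α - 1)² (α - 2)) for α > 2. The helper names `pareto_mean` and `pareto_variance` are hypothetical, introduced only for this sketch, and raising an error outside those parameter ranges is an assumption about how to handle the undefined cases.

```python
def pareto_mean(alpha: float, x_m: float) -> float:
    """Mean of Pareto(alpha, x_m); treated as undefined unless alpha > 1."""
    if alpha <= 1:
        raise ValueError("mean is only defined for alpha > 1")
    return alpha * x_m / (alpha - 1)


def pareto_variance(alpha: float, x_m: float) -> float:
    """Variance of Pareto(alpha, x_m); treated as undefined unless alpha > 2."""
    if alpha <= 2:
        raise ValueError("variance is only defined for alpha > 2")
    return alpha * x_m**2 / ((alpha - 1) ** 2 * (alpha - 2))


# (alpha, x_m) pairs copied from the examples table.
examples = {
    "bestselling books (millions sold)": (3.51, 2.0),
    "California earthquakes (magnitude)": (3.04, 3.8),
    "AT&T daily phone calls": (2.22, 10.0),
}

for name, (alpha, x_m) in examples.items():
    mean = pareto_mean(alpha, x_m)
    var = pareto_variance(alpha, x_m)
    print(f"{name}: E(X) = {mean:.3f}, Var(X) = {var:.3f}")
```

All three examples have α > 2, so both quantities exist for each of them.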
The Pareto distribution is scale-invariant. Consider a threshold x₀ ≥ xₘ and a multiplier k ≥ 1. The Pareto cdf gives P(X > x) = (xₘ/x)^α for x ≥ xₘ, so P(X > kx₀ | X > x₀) = P(X > kx₀) / P(X > x₀) = (1/k)^α, which is independent of x₀. For example, if US household incomes follow a Pareto distribution, then the ratio of households with income over $100K to those over $50K is the same as the ratio of those over $50K to those over $25K.
Note that the mean of the Pareto distribution is only defined for α > 1, while the variance is only defined for α > 2.
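The scale-invariance property can be checked numerically. The sketch below assumes the survival function P(X > x) = (xₘ/x)^α for x ≥ xₘ. The helper names and the parameter values α = 2, xₘ = 10 are illustrative choices, not figures taken from income data.

```python
def pareto_survival(x: float, alpha: float, x_m: float) -> float:
    """P(X > x) for X ~ Pareto(alpha, x_m); equals 1 below the lower bound x_m."""
    if x < x_m:
        return 1.0
    return (x_m / x) ** alpha


def conditional_tail(k: float, x0: float, alpha: float, x_m: float) -> float:
    """P(X > k*x0 | X > x0), intended for x0 >= x_m and k >= 1."""
    return pareto_survival(k * x0, alpha, x_m) / pareto_survival(x0, alpha, x_m)


alpha, x_m = 2.0, 10.0  # illustrative parameters (assumed, not from data)
k = 2.0

# The conditional probability should match (1/k)**alpha at every threshold x0.
for x0 in (25.0, 50.0, 100.0):
    print(x0, conditional_tail(k, x0, alpha, x_m), (1 / k) ** alpha)
```

Each row should show the same conditional probability, up to floating-point rounding, regardless of the threshold x₀.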
The graph above displays the survival function S(x) = P(X > x) = 1 - F(x), where F(x) is the cumulative distribution function (cdf). For the Pareto distribution, S(x) = (xₘ/x)^α for x ≥ xₘ.
Survival functions are used in survival analysis, a branch of statistics concerned with the expected duration until an event occurs, such as death or the failure of a mechanical system.
The graph above displays the hazard function h(x). This equals f(x)/S(x), where f(x) is the pdf and S(x) = P(X > x) is the survival function.
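To make the relationship h(x) = f(x)/S(x) concrete, the following sketch evaluates the pdf, survival function, and hazard function for a Pareto distribution. It assumes the standard forms f(x) = α xₘ^α / x^(α+1) and S(x) = (xₘ/x)^α on x ≥ xₘ, under which the ratio simplifies to α/x. The function names are hypothetical, and the parameters are borrowed from the earthquake example.

```python
def pareto_pdf(x: float, alpha: float, x_m: float) -> float:
    """Density f(x) of Pareto(alpha, x_m); zero below the lower bound x_m."""
    if x < x_m:
        return 0.0
    return alpha * x_m**alpha / x ** (alpha + 1)


def pareto_survival(x: float, alpha: float, x_m: float) -> float:
    """Survival function S(x) = P(X > x); equals 1 below x_m."""
    if x < x_m:
        return 1.0
    return (x_m / x) ** alpha


def pareto_hazard(x: float, alpha: float, x_m: float) -> float:
    """Hazard function h(x) = f(x) / S(x)."""
    return pareto_pdf(x, alpha, x_m) / pareto_survival(x, alpha, x_m)


alpha, x_m = 3.04, 3.8  # parameters from the earthquake example

# On the support, the hazard should agree with alpha / x.
for x in (3.8, 5.0, 7.0):
    print(x, pareto_hazard(x, alpha, x_m), alpha / x)
```

Under these assumptions the hazard decreases as x grows, since it is inversely proportional to x on the support.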
The illustration above shows a point U chosen from a standard uniform distribution. The random variable X = xₘ / U^(1/α) has a Pareto(α, xₘ) distribution.
The simulation above shows a point U chosen from a standard uniform distribution on the y-axis. The light blue circle shows the value of the random variable X = xₘ / U^(1/α) on the x-axis, which has a Pareto(α, xₘ) distribution. The histogram accumulates the results of each simulation.
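The transformation X = xₘ / U^(1/α) can be sketched in a few lines using only Python's standard `random` module. The function name `sample_pareto`, the seed, the sample size, and the use of the book-sales parameters are assumptions made for this illustration. It is not a reconstruction of the page's own simulation.

```python
import random


def sample_pareto(alpha: float, x_m: float, rng: random.Random) -> float:
    """Draw one Pareto(alpha, x_m) value via X = x_m / U**(1/alpha)."""
    # rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1],
    # which is still uniform and avoids dividing by zero.
    u = 1.0 - rng.random()
    return x_m / u ** (1.0 / alpha)


rng = random.Random(0)   # fixed seed so the run is repeatable
alpha, x_m = 3.51, 2.0   # parameters from the book-sales example
n = 100_000

samples = [sample_pareto(alpha, x_m, rng) for _ in range(n)]

# Compare the sample mean with alpha * x_m / (alpha - 1).
sample_mean = sum(samples) / n
print("sample mean:", sample_mean, "theoretical mean:", alpha * x_m / (alpha - 1))

# Compare the fraction of samples above 2 * x_m with (1/2)**alpha.
tail_fraction = sum(s > 2 * x_m for s in samples) / n
print("P(X > 2 x_m):", tail_fraction, "theoretical:", 0.5 ** alpha)
```

Every sample is at least xₘ, and with this many draws the empirical mean and tail fraction should land close to their theoretical values. Python's standard library also provides `random.paretovariate(alpha)`, which draws from a Pareto distribution with scale 1. Multiplying its result by xₘ should give the same distribution as this sketch.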