MTH 337: Week 11

In [1]:
%pylab inline
Populating the interactive namespace from numpy and matplotlib

Sums of Uniform Random Numbers

Note the normed=True keyword argument to hist. This scales the histogram to be a probability density function i.e. to have an integral of one.

Summing pairs of uniform random numbers

In [2]:
npts = 100000
M = 2
a = random.rand(M, npts)
b = sum(a, axis=0)
hist(b, bins=50, normed=True)
xlabel('Sum')
Out[2]:
<matplotlib.text.Text at 0x7fa2e953d350>

Summing M-tuples of uniform random numbers

In [3]:
npts = 100000
for M in xrange(2, 7):
    results = random.rand(M, npts)
    totals = sum(results, axis=0)
    hist(totals, bins=50, normed=True)

Note on the histogram below:

  • The histtype='step' keyword argument just plots the outline of the histogram. This avoids the overlapping seen above, which obscures the relationship between the histograms.
  • Each histogram can be labeled using the label keyword argument.
  • The function legend allows us to identify different plots on the same axes, using the labels given.
In [4]:
npts = 100000
for M in xrange(2, 7):
    results = random.rand(M, npts)
    totals = sum(results, axis=0)
    hist(totals, bins=50, normed=True, histtype='step', label=str(M))
legend()
Out[4]:
<matplotlib.legend.Legend at 0x7fa2c246c7d0>

Generating Samples from a Non-Uniform Distribution

The function random.rand generates samples from a uniform distribution. In general however, we would like a way to generate samples from any probability distribution. Such samples could (for example) be used as the input to a Monte Carlo simulation.

Rather than attempting to write different random number generators for each probability distribution, we would like to devise some function g such that g(u) has the required probability distribution when u is a uniform random number.

The method used is as follows:

  1. Let v be a random number with probability density function p(v).
  2. Let u = $\int$ p(v) dv.
  3. Calculate the integral, to give u as a function of v.
  4. Solve for v as a function of u, given by v = g(u).
  5. The random number v will then have the required p.d.f., p(v)

Example: p(v) = $\frac{3}{8} v^2$ on the interval [0, 2]

Following the steps above, we have:

  1. u = $\int$ p(v) dv = $\int \dfrac{3}{8} v^2 dv = \dfrac{v^3}{8} + C$
  2. Since the interval [0, 1] for u maps to the interval [0, 2] for v, then u = 0 corresponds to v = 0.
  3. Solcing for C gives C = 0, so u = $\dfrac{v^3}{8}$.
  4. Solving for v then gives v = 2$\sqrt[3]{u}$ = g(u).

This can be tested by plotting the normed histogram of v and the probability density function p(v) on the same graph. They should match.

In [5]:
u = random.rand(100000)
v = 2*u**(1./3)
hist(v, normed=True, bins=50, fc='c', label='v = g(u)')
x = linspace(0, 2, 400)
y = .375*x**2
plot(x, y, 'r', lw=2, label='p(v)')
xlabel('v')
legend(loc='upper left')
Out[5]:
<matplotlib.legend.Legend at 0x7fa2c1ff1890>

The Gaussian or "Normal" Distribution

The Gaussian or "normal" distribution occurs frequently in the natural or social sciences. We saw that the sum of M-tuples converges to a normal distribution as M increases. This result is in accordance with the "Central Limit Theorem", one of the most important theorems in probability.

The "standard normal" distribution is a normal distribution centered on the origin with a standard deviation of 1. The probability density function for this distribution is:

$p(v) = \dfrac{1}{\sqrt{2 \pi}} e^{-v^2/2}$

Since samples drawn from a normal distribution are frequently needed for simulation, the numpy.random module includes a function for this, random.normal.

In [6]:
npts=100000
x = random.normal(size=npts)
figure(figsize=(8,5))
hist(x, bins=50, normed=True, fc='c', label = 'random.normal')
v = linspace(-6, 6, 400)
pv = exp(-.5*v**2)/(sqrt(2*pi))
plot(v, pv, 'r', lw=2, label = r'$p(v) = \frac{e^{-v^2/2}}{\sqrt{2\pi}}$')
xlabel('v')
legend()
Out[6]:
<matplotlib.legend.Legend at 0x7fa2c23aac10>

Contour Plots

A contour plot is a technique for representing a function of 2 variables in a 2D plane by plotting the level curves of the function.

Matplotlib provides two functions for contour plotting:

  • contour just plots the level curves of the function.
  • contourf fills the areas between the level curves with solid color.

The syntax is contour(x, y, z).

  • x is a 1D array of x-axis grid values.
  • y is a 1D array of y-axis grid values.
  • z is a 2D array of values f(x, y), corresponding to points (x, y).

There are lots of options to specify the levels, colors, etc.

In [7]:
x = linspace(2, 8, 100)
y = linspace(0, 6, 100)
X, Y = meshgrid(x, y)
Z = (X - 5)**2 + (Y - 3)**2 + 17
contour(x, y, Z)
Out[7]:
<matplotlib.contour.QuadContourSet instance at 0x7fa2c25392d8>

The function colorbar adds a bar describing the values of the function on the level curves.

In [8]:
x = linspace(2, 8, 100)
y = linspace(0, 6, 100)
X, Y = meshgrid(x, y)
Z = (X - 5)**2 + (Y - 3)**2 + 17
contour(x, y, Z)
colorbar()
Out[8]:
<matplotlib.colorbar.Colorbar instance at 0x7fa2c223f4d0>
In [9]:
x = linspace(2, 8, 100)
y = linspace(0, 6, 100)
X, Y = meshgrid(x, y)
Z = (X - 5)**2 + (Y - 3)**2 + 17
contourf(x, y, Z)
colorbar()
Out[9]:
<matplotlib.colorbar.Colorbar instance at 0x7fa2c2718098>

The cmap keyword argument allows a different colormap to be used, which maps numbers to colors.

Google "matplotlib colormap" to see the range of different colormaps available.

In [10]:
x = linspace(2, 8, 100)
y = linspace(0, 6, 100)
X, Y = meshgrid(x, y)
Z = (X - 5)**2 + (Y - 3)**2 + 17
contourf(x, y, Z, cmap='cool')
colorbar()
Out[10]:
<matplotlib.colorbar.Colorbar instance at 0x7fa2c1d56ef0>