.. Created by Adam Cunnningham on Fri Jun 3 2016.
.. Major rewrite on Thurs Oct 27 2016.
.. Separated report description and hints, Weds Dec 14 2016.

.. include:: isogrk3.rst

.. |rarrow|  unicode:: U+2192 .. RIGHT ARROW

.. |x0| replace:: x\ :sub:`0`
.. |x_i| replace:: x\ :sub:`i`
.. |xi-1| replace:: x\ :sub:`i-1`
.. |xi+1| replace:: x\ :sub:`i+1`

.. |ui| replace:: u\ :sub:`i`
.. |ui+1| replace:: u\ :sub:`i+1`

**The Generation and Use of Random Numbers**
============================================

Linear Congruential Generators
------------------------------

A *linear congruential generator* generates a sequence of numbers in the interval [0, 1) using the following algorithm:

- Pick a multiplier a and modulus m, both large integers.
- Pick a starting integer |x0| (your choice) in {1, 2, ... , m - 1}.
- Generate a sequence of real numbers |ui| by:

.. math::

    x_{i} &= a * x_{i-1} \mod m \\
    u_{i} &= x_{i}/m

Report Description
------------------

**Exercise 1. Linear Congruential Generators**

- Write a function ``rng`` that implements a linear congruential generator with the values a = 427419669081, m = 999999999989.
- Write a function ``randu`` that implements a linear congruential generator with the values a = 2 :superscript:`16` + 3, m = 2 :superscript:`31`.
- Test the linear congruential generators in your ``rng`` and ``randu`` functions, and the generator in **numpy.random.rand** for (i) uniformity, and (ii) lack of (coarse-scale) successive pair correlation.

**Exercise 2. Estimate the Value of Pi**

- Use Monte Carlo Integration to estimate the area of a circle of radius 1 (which gives us an estimate of the value of pi).

**Exercise 3. Bernoulli Random Variables**

In a game of tennis, Serena has a 60% probability to win any given point against Venus. Games are won by the first player to get to 4 points.

- What do you expect Serena's probability of winning a game to be - less than, greater than, or equal to 60%?
- Simulate the results of a large number of games.
- Plot a histogram of the number of points scored by Serena in each game.
- Numerically estimate the probability of Serena winning a game. Is this the result you expected?

**Exercise 4. Sums of Random Variables**

- Produce a histogram of the sums of pairs of uniformly distributed random numbers produced by **numpy.random.rand**.
- Produce histograms for the sums of M-tuples of uniform random numbers for values of M greater than 2.
- See if it is possible to shift and stretch the sums of M-tuples so that their distribution converges to something as M increases without bounds.

**Exercise 5: The Exponential Distribution**

The *exponential distribution* is used to model the time interval between events that occur randomly and independently in time. The probability density function for this distribution is:

.. math:: p(t) = \begin{cases} \lambda e^{-\lambda t} & t \geq 0 \\ 0 & t < 0 \end{cases}

where |lambda| > 0 is a parameter governing the mean time interval between events.

An example of an exponential distribution is radioactive decay, such as when Carbon-14 atoms decay to Nitrogen-14. In this case, the exponential distribution is a probability density function describing the time for an atom to decay. Since the mean time for an atom to decay is 8267 years, |lambda| is given by 1/8267.

- Using |lambda| = 1/8267, construct a function g(u) so that t = g(u) has the pdf p(t) given above.
- Confirm that the histogram of a large number of random numbers generated using t = g(u) does actually match the probability density function p(t).
- Plot the sums of M-tuples of exponentially distributed random numbers to see if these sums also converge to some limiting distribution. There is no need to do any kind of detailed analysis here.

**Exercise 6. To Switch or Not to Switch**

The math department is having a competition. There are three sealed boxes. Two of the boxes are empty, and the third box contains a bonus quiz card for MTH 337 (we don't have a lot of money for prizes). You're not told which box contains the card, but the instructor knows which one does.

You choose a box. The instructor then opens one other box, and shows you that it's empty. You are then offered a choice. You can stick with the box you initially chose, or you can switch to the remaining other box. What should you do?

- Describe the choice you would make, and why.
- Using a large number of trials, simulate the result of sticking with your original choice, and of switching to the other box.
- How often would you win (on average) using each of the two strategies?

**Exercise 7. Stock Market Model**

The simple stock price model defined in class is:

    s(t + 1) = (1 + |mu| + |sigma| |epsiv|) s(t)

    where:

    - s(t) is the stock price at time t.
    - |mu| is an underlying rate of growth per step.
    - |epsiv| is a "stochastic" or random term with a normal distribution.
    - |sigma| is the volatility, the amplitude of the stochastic term.

- Implement this model using a value for |mu| of 0.0 and |sigma| of 0.05 to find out what happens to an initial investment of 1 unit of money over 365 time steps.
- Plot the changing price of several stocks over this period.
- Plot a histogram of final stock prices after 365 steps using a large number of trials.
- Estimate the "expected value" (average over all trials) of the investment after 1 year.
- Calculate the likelihood of losing money. This is the proportion of trials where s(365) ends up less than 1.
- Calculate the "most likely" value of the investment after 1 year, using an appropriate definition of "most likely".
- Draw some conclusion about investing, or on the value of money.
- Run another simulation using a more modest value for |sigma| of 1%. Describe how changing |sigma| affects the final distribution i.e. the histogram of final prices.

Hints and Suggestions for the Report
------------------------------------

Linear Congruential Generators
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Testing for uniformity is done by plotting a histogram showing the distribution of the points in the interval [0, 1].

    - Use enough bins for the histogram to provide a clear idea of the shape of the distribution.
    - Use enough points that every bin in the histogram contains a large number - the more points the better.
- Testing for pairwise correlation is done by plotting successive pairs (|ui|, |ui+1|) as points in the plane.

    - Use lots of points, to see how effectively the points fill the unit square.
    - Use a small marker size, so that markers do not overlap.

Estimating Pi
~~~~~~~~~~~~~

- The circle is contained in a bounding box [-1, 1] :math:`\times` [-1, 1] in the xy-plane.
- Use **rand** to generate a large number of points, then scale the output of this function so that points (x, y) are uniform inside this box. An appropriate transformation is u |rarrow| 2u - 1.
- Use **hypot** to find the value of r corresponding to each (x, y) pair.
- A point (x, y) is inside the circle if r < 1.

Bernoulli Random Variables
~~~~~~~~~~~~~~~~~~~~~~~~~~

- Treat each point as a Bernoulli random number with p = 0.60.
- Numerically estimate Serena's probability of winning a game by running a large number of games, each of which consists of 7 points. We want the proportion of games where Serena scores 4 or more points.
- The most compact way to do this is to start with a 2D NumPy array of shape (7, ngames) generated using **rand**.
- A combination of operations applied directly to this array (to turn a uniform random number into a Bernoulli random number, for example), and an axis-dependent operation (for counting the points per game), will allow the result to be found without using any "for" loops.

Generating Histograms of M-Tuple Sums
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Use **rand** to create a 2D NumPy array of shape (M, npts).
- Sum all the M-tuples using **sum** as an axis-dependent operation i.e. by specifying which axis to sum over.

Finding the "Right" Scaling
~~~~~~~~~~~~~~~~~~~~~~~~~~~

- The "right" scaling is a function of M. When divided by this scaling, the histograms produced for different values of M should converge to the same distribution as M increases without bound.
- For each histogram generated for a different value of M, first subtract the mean (M/2) to shift the histogram so that it is centered on zero.
- Use the ``normed=True`` keyword argument to **hist** to turn the histogram into a probability distribution.
- Consider using the ``histtype='step'`` keyword argument when plotting multiple histograms on the same axes. This just plots the outline of a histogram, so it may be easier to see when convergence is happening.

The Exponential Distribution
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Construct a function g(u) so that t = g(u) has the exponential pdf p(t). Remember from class that the steps to construct this function are:

    1. Let :math:`u = \int p(t) dt`
    2. Plug in your probability density function p(t), and integrate to find u as a function of t.
    3. Solve for t as a function of u, given by t = g(u).
- Sum M-tuples of large numbers of samples drawn from p(t) using t = g(u). Remember that using a NumPy 2D array of shape (M, npts) will allow this kind of summing to be performed very efficiently.

To Switch or Not to Switch
~~~~~~~~~~~~~~~~~~~~~~~~~~

- Use an integer from the set {1, 2, 3} to represent each box.
- Use ``numpy.random.randint`` to randomly select winning boxes for a large number of trials.
- Construct an array of box choices made using each of the two strategies.
- Compare the winning box to your choice in both cases using ``==``, then use ``np.sum`` to sum the number of wins.

Plotting the Changing Stock Prices
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- The random |epsiv| term needs to be calculated separately for every step.
- A "for" loop will be needed since the stock price each day is calculated from the previous day.
- Prices {s(0), s(1), ..., s(365)} should be saved either to a list or into an array.
- Plot enough stock prices to see how much variation there is in the different stock trajectories.
- Note that **plot** can be passed a 2D array to plot. When this is done, each column is plotted as a separate line.

Plotting the Histogram of Final Prices
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- A large number of stock prices need to be tracked to get a good idea of the final distribution.
- A 1D array can be used to store the current prices of the different stocks. These prices can all be updated together each day inside the "for" loop.
- After the "for" loop is complete, this 1D array will contain the final price for every stock.

Analyzing the Distribution of Final Prices
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- The array of final stock prices can be used to find both the expected value, the likelihood of losing money, and the "most likely" value of the investment after 1 year.