.. Created by Adam Cunnningham on Fri Jun 3 2016.
.. Major rewrite on Thurs Oct 27 2016.
.. Separated report description and hints, Weds Dec 14 2016.

.. include:: isogrk3.rst

**The Generation and Use of Random Numbers**
============================================

Report Description
------------------

**Exercise 1. Linear Congruential Generators**

- Write a function ``rng`` that implements a linear congruential generator with the values a = 427419669081, m = 999999999989.
- Write a function ``randu`` that implements a linear congruential generator with the values a = 2 :superscript:`16` + 3, m = 2 :superscript:`31`.
- Test the linear congruential generators in your ``rng`` and ``randu`` functions, and the generator in **random.rand** for (i) uniformity, and (ii) lack of (coarse-scale) successive pair correlation.

**Exercise 2. Estimate a Mean using Monte Carlo Integration**

- Download/save the :download:`mystery <../Reports/mystery.pyc>` module from the website to the same directory where Jupyter notebook is being run.
- Use Monte Carlo integration to find the mean of the function ``mysteryf`` in the mystery module by random sampling over the interval [0, 5].

**Exercise 3. Estimate an Area using Monte Carlo Integration**

- Use Monte Carlo integration to estimate the area of the "flower" whose boundary has the equation :math:`r(\theta) = 2 + \cos(7\theta)` in polar coordinates.

**Exercise 4. Bernoulli Random Variables**

- Numerically estimate the probability of getting 29 hits in 100 swings in a baseball game if the probability of getting a hit on any one swing is 0.29 (and is independent of other swings).

**Exercise 5. Sums of Random Variables**

- Produce a histogram of the sums of pairs of uniformly distributed random numbers produced by **random.rand**.
- Produce histograms for the sums of M-tuples of uniform random numbers for values of M greater than 2.
- See if it is possible to shift and stretch the sums of M-tuples so that their distribution converges to something as M increases without bounds.
- Plot the sums of M-tuples of non-uniform random numbers to see if these sums also converge to some limiting distribution. There is no need to do any kind of detailed analysis here.

**Exercise 6. Stock market Model**

- Implement the simple stock price model defined in class using a value for |mu| of 0.0 and |sigma| of 0.05 to find out what happens to an initial investment of 1 unit of money over 365 time steps. The model is:

    .. math:: s(t + 1) = (1 + \mu + \sigma \varepsilon) s(t)`

    where:

    - s(t) is the stock price at time t.
    - |mu| is an underlying rate of growth per step.
    - |epsiv| is a "stochastic" or random term with a normal distribution.
    - |sigma| is the volatility, the amplitude of the stochastic term.

- Plot the changing price of several stocks over this period.
- Plot a histogram of final stock prices after 365 steps using a large number of trials.
- Estimate the "expected value" (average over all trials) of the investment after 1 year.
- Calculate the likelihood of losing money. This is the proportion of trials where s(365) ends up less than 1.
- Calculate the "most likely" value of the investment after 1 year, using an appropriate definition of "most likely".
- Draw some conclusion about investing, or on the value of money.
- Run another simulation using a more modest value for |sigma| of 1%. Describe how changing |sigma| affects the final distribution i.e. the histogram of final prices.

Hints and Suggestions for the Report
------------------------------------

Linear Congruential Generator
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Testing for uniformity is done by plotting a histogram showing the distribution of the points in the interval [0, 1].
- Testing for pairwise correlation is done by plotting successive pairs :math:`(x_i, x_{i+1})` as points in the plane.
- Use enough bins for the histogram to provide a clear idea of the shape of the distribution.
- Use enough points that every bin in the histogram contains a large number - the more points the better.

Finding the Mean of the "Mystery" Function
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Use an appropriately large number of samples generated using **rand**.
- Scale the points returned by **rand** so they are uniform in the interval [0, 5], not [0, 1].
- Import the function mysteryf using the code: :code:`from mystery import mysteryf`
- The formula for the mean of a function is:

.. math::
    \overline{f} = \dfrac{1}{n} \sum_{i=1}^{n} f(x_i)

- This can be calculated using NumPy array operations and **mean** without needing "for" loops.
- The mean can be estimated a few times to get an idea of how accurate the estimate is.

Counting the Points Inside the "Flower"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- The "flower" is contained in a bounding box [-3, 3] :math:`\times` [-3, 3] in the xy-plane.
- Use **rand** to generate a large number of points, then scale the output of this function so that points (x, y) are uniform inside this box. An appropriate transformation is :math:`u \rightarrow 6u - 3`.
- Use **hypot** and **arctan2** to find the values of r and |theta| corresponding to each (x, y) pair.
- If a point (x, y) has polar coordinates (r, |theta|), then it is inside the curve if r < 2 + cos(7\ |theta|).

Bernoulli Random Variable
~~~~~~~~~~~~~~~~~~~~~~~~~

- Treat each swing as a Bernoulli random number with p = 0.29.
- Numerically estimate the probability by running a large number of trials, each of which consists of 100 swings. We want the proportion of trials with exactly 29 hits.
- The most compact way to do this is to start with a 2D NumPy array of shape (100, ntrials) generated using **rand**.
- A combination of operations applied directly to this array (to turn a uniform random number into a Bernoulli random number, for example), and an axis-dependent operation (for counting the hits per trial), will allow the result to be found without using any "for" loops.

Generating Histograms of M-Tuple Sums
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Use **rand** to create a 2D NumPy array of shape (M, npts).
- Sum all the M-tuples using **sum** as an axis-dependent operation i.e. by specifying which axis to sum over.

Finding the "Right" Scaling
~~~~~~~~~~~~~~~~~~~~~~~~~~~

- The "right" scaling is a function of M. When divided by this scaling, the histograms produced for different values of M should converge to the same distribution as M increases without bound.
- For each histogram generated for a different value of M, first subtract the mean (M/2) to shift the histogram so that it is centered on zero.
- Use the *normed=True* keyword argument to **hist** to turn the histogram into a probability distribution.
- Consider using the *histtype='step'* keyword argument when plotting multiple histograms on the same axes. This just plots the outline of a histogram, so it may be easier to see when convergence is happening.

M-Tuples of Non-Uniform Random Numbers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Choose some other non-uniform probability density function, p(v). It can be any function you like, as long as both:

    - :math:`p(v) \geq 0` for all v
    - :math:`\int\limits_{-\infty}^{\infty} p(v) dv = 1`.
- Construct a function g(u) so that v = g(u) has the pdf p(v). Remember from class that the steps to construct this function are:

    1. Let :math:`u = \int p(v) dv`
    2. Plug in your probability density function p(v), and integrate to find u as a function of v.
    3. Solve for v as a function of u, given by v = g(u).
- Sum M-tuples of large numbers of samples drawn from p(v) using v = g(u). Remember that using a NumPy 2D array of shape (M, npts) will allow this kind of summing to be performed very compactly.

Plotting the Changing Stock Prices
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- The random |epsiv| term needs to be calculated separately for every step.
- A "for" loop will be needed since the stock price each day is calculated from the previous day.
- Prices :math:`\{s(0), s(1), ..., s(365)\}` should be saved either to a list or into an array.
- Plot enough stock prices to see how much variation there is in the different stock trajectories.
- Note that **plot** can be passed a 2D array to plot. When this is done, each column is plotted as a separate line.

Plotting the Histogram of Final Prices
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- A large number of stock prices need to be tracked to get a good idea of the final distribution.
- A 1D array can be used to store the current prices of the different stocks. These prices can all be updated together each day inside the "for" loop.
- After the "for" loop is complete, this 1D array will contain the final price for every stock.

Analyzing the Distribution of Final Prices
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- The array of final stock prices can be used to find both the expected value, the likelihood of losing money, and the "most likely" value of the investment after 1 year.