.. Created by Adam Cunnningham on Fri Jun 3 2016. .. Major rewrite on Thurs Oct 27 2016. .. Separated report description and hints, Weds Dec 14 2016. .. include:: isogrk3.rst .. |rarrow| unicode:: U+2192 .. RIGHT ARROW .. |x0| replace:: x\ :sub:`0` .. |x_i| replace:: x\ :sub:`i` .. |xi-1| replace:: x\ :sub:`i-1` .. |xi+1| replace:: x\ :sub:`i+1` .. |ui| replace:: u\ :sub:`i` .. |ui+1| replace:: u\ :sub:`i+1` **The Generation and Use of Random Numbers** ============================================ Linear Congruential Generators ------------------------------ A *linear congruential generator* generates a sequence of numbers in the interval [0, 1) using the following algorithm: - Pick a multiplier a and modulus m, both large integers. - Pick a starting integer |x0| (your choice) in {1, 2, ... , m - 1}. - Generate a sequence of real numbers |ui| by: .. math:: x_{i} &= a * x_{i-1} \mod m \\ u_{i} &= x_{i}/m Report Description ------------------ **Exercise 1. Linear Congruential Generators** - Write a function ``rng`` that implements a linear congruential generator with the values a = 427419669081, m = 999999999989. - Write a function ``randu`` that implements a linear congruential generator with the values a = 2 :superscript:`16` + 3, m = 2 :superscript:`31`. - Test the linear congruential generators in your ``rng`` and ``randu`` functions, and the generator in **numpy.random.rand** for (i) uniformity, and (ii) lack of (coarse-scale) successive pair correlation. **Exercise 2. Estimate the Value of Pi** - Use Monte Carlo Integration to estimate the area of a circle of radius 1 (which gives us an estimate of the value of pi). **Exercise 3. Bernoulli Random Variables** In a game of tennis, Serena has a 60% probability to win any given point against Venus. Games are won by the first player to get to 4 points. - What do you expect Serena's probability of winning a game to be - less than, greater than, or equal to 60%? - Simulate the results of a large number of games. - Plot a histogram of the number of points scored by Serena in each game. - Numerically estimate the probability of Serena winning a game. Is this the result you expected? **Exercise 4. Sums of Random Variables** - Produce a histogram of the sums of pairs of uniformly distributed random numbers produced by **numpy.random.rand**. - Produce histograms for the sums of M-tuples of uniform random numbers for values of M greater than 2. - See if it is possible to shift and stretch the sums of M-tuples so that their distribution converges to something as M increases without bounds. **Exercise 5: The Exponential Distribution** The *exponential distribution* is used to model the time interval between events that occur randomly and independently in time. The probability density function for this distribution is: .. math:: p(t) = \begin{cases} \lambda e^{-\lambda t} & t \geq 0 \\ 0 & t < 0 \end{cases} where |lambda| > 0 is a parameter governing the mean time interval between events. An example of an exponential distribution is radioactive decay, such as when Carbon-14 atoms decay to Nitrogen-14. In this case, the exponential distribution is a probability density function describing the time for an atom to decay. Since the mean time for an atom to decay is 8267 years, |lambda| is given by 1/8267. - Using |lambda| = 1/8267, construct a function g(u) so that t = g(u) has the pdf p(t) given above. - Confirm that the histogram of a large number of random numbers generated using t = g(u) does actually match the probability density function p(t). - Plot the sums of M-tuples of exponentially distributed random numbers to see if these sums also converge to some limiting distribution. There is no need to do any kind of detailed analysis here. **Exercise 6. To Switch or Not to Switch** The math department is having a competition. There are three sealed boxes. Two of the boxes are empty, and the third box contains a bonus quiz card for MTH 337 (we don't have a lot of money for prizes). You're not told which box contains the card, but the instructor knows which one does. You choose a box. The instructor then opens one other box, and shows you that it's empty. You are then offered a choice. You can stick with the box you initially chose, or you can switch to the remaining other box. What should you do? - Describe the choice you would make, and why. - Using a large number of trials, simulate the result of sticking with your original choice, and of switching to the other box. - How often would you win (on average) using each of the two strategies? **Exercise 7. Stock Market Model** The simple stock price model defined in class is: s(t + 1) = (1 + |mu| + |sigma| |epsiv|) s(t) where: - s(t) is the stock price at time t. - |mu| is an underlying rate of growth per step. - |epsiv| is a "stochastic" or random term with a normal distribution. - |sigma| is the volatility, the amplitude of the stochastic term. - Implement this model using a value for |mu| of 0.0 and |sigma| of 0.05 to find out what happens to an initial investment of 1 unit of money over 365 time steps. - Plot the changing price of several stocks over this period. - Plot a histogram of final stock prices after 365 steps using a large number of trials. - Estimate the "expected value" (average over all trials) of the investment after 1 year. - Calculate the likelihood of losing money. This is the proportion of trials where s(365) ends up less than 1. - Calculate the "most likely" value of the investment after 1 year, using an appropriate definition of "most likely". - Draw some conclusion about investing, or on the value of money. - Run another simulation using a more modest value for |sigma| of 1%. Describe how changing |sigma| affects the final distribution i.e. the histogram of final prices. Hints and Suggestions for the Report ------------------------------------ Linear Congruential Generators ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Testing for uniformity is done by plotting a histogram showing the distribution of the points in the interval [0, 1]. - Use enough bins for the histogram to provide a clear idea of the shape of the distribution. - Use enough points that every bin in the histogram contains a large number - the more points the better. - Testing for pairwise correlation is done by plotting successive pairs (|ui|, |ui+1|) as points in the plane. - Use lots of points, to see how effectively the points fill the unit square. - Use a small marker size, so that markers do not overlap. Estimating Pi ~~~~~~~~~~~~~ - The circle is contained in a bounding box [-1, 1] :math:`\times` [-1, 1] in the xy-plane. - Use **rand** to generate a large number of points, then scale the output of this function so that points (x, y) are uniform inside this box. An appropriate transformation is u |rarrow| 2u - 1. - Use **hypot** to find the value of r corresponding to each (x, y) pair. - A point (x, y) is inside the circle if r < 1. Bernoulli Random Variables ~~~~~~~~~~~~~~~~~~~~~~~~~~ - Treat each point as a Bernoulli random number with p = 0.60. - Numerically estimate Serena's probability of winning a game by running a large number of games, each of which consists of 7 points. We want the proportion of games where Serena scores 4 or more points. - The most compact way to do this is to start with a 2D NumPy array of shape (7, ngames) generated using **rand**. - A combination of operations applied directly to this array (to turn a uniform random number into a Bernoulli random number, for example), and an axis-dependent operation (for counting the points per game), will allow the result to be found without using any "for" loops. Generating Histograms of M-Tuple Sums ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Use **rand** to create a 2D NumPy array of shape (M, npts). - Sum all the M-tuples using **sum** as an axis-dependent operation i.e. by specifying which axis to sum over. Finding the "Right" Scaling ~~~~~~~~~~~~~~~~~~~~~~~~~~~ - The "right" scaling is a function of M. When divided by this scaling, the histograms produced for different values of M should converge to the same distribution as M increases without bound. - For each histogram generated for a different value of M, first subtract the mean (M/2) to shift the histogram so that it is centered on zero. - Use the ``normed=True`` keyword argument to **hist** to turn the histogram into a probability distribution. - Consider using the ``histtype='step'`` keyword argument when plotting multiple histograms on the same axes. This just plots the outline of a histogram, so it may be easier to see when convergence is happening. The Exponential Distribution ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Construct a function g(u) so that t = g(u) has the exponential pdf p(t). Remember from class that the steps to construct this function are: 1. Let :math:`u = \int p(t) dt` 2. Plug in your probability density function p(t), and integrate to find u as a function of t. 3. Solve for t as a function of u, given by t = g(u). - Sum M-tuples of large numbers of samples drawn from p(t) using t = g(u). Remember that using a NumPy 2D array of shape (M, npts) will allow this kind of summing to be performed very efficiently. To Switch or Not to Switch ~~~~~~~~~~~~~~~~~~~~~~~~~~ - Use an integer from the set {1, 2, 3} to represent each box. - Use ``numpy.random.randint`` to randomly select winning boxes for a large number of trials. - Construct an array of box choices made using each of the two strategies. - Compare the winning box to your choice in both cases using ``==``, then use ``np.sum`` to sum the number of wins. Plotting the Changing Stock Prices ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - The random |epsiv| term needs to be calculated separately for every step. - A "for" loop will be needed since the stock price each day is calculated from the previous day. - Prices {s(0), s(1), ..., s(365)} should be saved either to a list or into an array. - Plot enough stock prices to see how much variation there is in the different stock trajectories. - Note that **plot** can be passed a 2D array to plot. When this is done, each column is plotted as a separate line. Plotting the Histogram of Final Prices ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - A large number of stock prices need to be tracked to get a good idea of the final distribution. - A 1D array can be used to store the current prices of the different stocks. These prices can all be updated together each day inside the "for" loop. - After the "for" loop is complete, this 1D array will contain the final price for every stock. Analyzing the Distribution of Final Prices ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - The array of final stock prices can be used to find both the expected value, the likelihood of losing money, and the "most likely" value of the investment after 1 year.