780.20: 1094 Session 12

Handouts: gaussian_random.cpp, random_walk.cpp, and other printouts.

Today we'll play some games with the GSL random number generators.

Your goals for today:

Please work in pairs (more or less). The instructors will bounce around 1094 and answer questions.

Random Number Generation

The program gaussian_random.cpp calls GSL routines to generate both uniformly distributed and gaussian distributed numbers.

  1. Look at the gaussian_random.cpp code (there is a printout) and identify where the random number generators are allocated, seeded, and called. If you were creating a RandomNumber class, what statement(s) would you put in the constructor and destructor? What private variables would you define?

    Compile and link the code (use make_gaussian_random) then generate pairs of uniformly and gaussian distributed numbers in random_numbers.dat.
  2. Devise and carry out a way to use gnuplot to roughly check that the random numbers are uniformly distributed. [Hint: Your eye is a good judge of nonuniformity in two dimensions.] What did you do?

  3. You can check the distributions more quantitatively by making histograms of the random numbers. Think about how you would do that. Then take a look at gaussian_random_new.cpp, which has added crude histogramming (as well as automatic seeding). Use the makefile to compile and run with about 100,000 points. Look at random_histogram.dat. Use gnuplot to plot appropriate columns (with appropriate ranges of y) to check the uniform and gaussian distributions. Do they look random?

  4. Run gaussian_random_new.plt to plot and fit the gaussian distributions with gnuplot. Try 1,000,000 points and 10,000 points. Do you reproduce the parameters of the gaussian distribution? Attach a plot. (You may need to set b to a reasonable starting point such as the approximate peak height to get a useful fit.)

  5. [Extra] Take a look at (and run) rolling_dice.nb to see how to do random numbers in Mathematica.

Random Walking

We'll generate random walks in two dimensions using method 2 from the list in Section b of the Session 12 notes. In particular we'll start at the origin: (x,y) = (0,0) and for each step select Delta_x at random in the range [-sqrt(2), sqrt(2)] and Delta_y in the same range. So positive and negative steps in each direction are equally likely. The code random_walk.cpp implements this plan.

  1. What is the rms step length?

  2. Look at the random_walk.cpp code and identify where the random number generator is allocated, seeded, and called. Compile and link the code (use make_random_walk) and generate a random walk of 6400 steps.
  3. Plot the random walk (stored in "random_walk.dat") using gnuplot (use "with lines" to connect the points). Repeat a couple of times to get some intuition for what the walks look like.
  4. Check (using an editor) for the endpoints of a few walks. Roughly how does a typical distance R from the origin scale with N? (Can you reproduce the derivation from the notes of how the average of R scales with N?)

  5. Now we'll study more systematically how the final distance from the origin R = sqrt(x_final^2 + y_final^2) scales with the number of steps N. Note that now we don't need to save anything from a run except the value of R. The value of R will fluctuate from run to run, so for each N we want to average over a number of trials. How many trials should you use?

    Edit the code to make multiple runs for each value of N and takes the average of R. Make (and attach) an appropriate plot that reveals the dependence of R on N. [The code random_walk_length.cpp and plot file random_walk_length.plt implement this task. Try it yourself before looking at those.] Does it agree with expectations?

Monte Carlo Integration: Uniform Sampling

Your goal is to calculate the 10-dimensional integral of

  (x1 + x2 + x3 + ... + x10)^2
where each of the variables ranges from 0 to 1. The exact answer is 155/6. The basic Monte Carlo method is described in Section d of the Session 12 notes. In particular, equations (12.15) and (12.16) show that the integral is given approximately by the range(s) times the average of the function evaluated at N random vectors. (So for a 5-dimensional integral, each vector is a set of 5 random numbers {x1,x2,x3,x4,x5}.)

[Unless you are way ahead, use mc_integration.cpp, make_mc_integration, and mc_integration.plt below instead of writing your own.]

  1. Modify gauss_random.cpp or random_walk.cpp (copy one of them to mc_int.cpp) to make a Monte Carlo estimate of the 10-dimensional integral using N vectors (where N is input as in random_walk.cpp). Run it a few times with different seeds.
  2. Now for each value of N, make 16 trials and take the average as your official estimate of the integral.
  3. Try sample sizes of N = 10, 20, 40, ..., (i.e., from 10 to some suitably large number by multiples of 2).
  4. Plot the absolute value of the error versus N on a log-log plot and try to identify the approximate dependence.

Monte Carlo Integration: GSL Routines

Run a test program to do the same integral as in the last section but with vegas and miser.

  1. Take a look at the program gsl_monte_carlo_test.cpp while also looking at Monte Carlo integration in the online GSL library.
  2. The integral above is not a great test. After compiling and running the program, change the integrand to something more interesting (use your imagination!). Don't worry about knowing the exact answer; compare the results from the different routines. What do you find?

C++ Class for a Random Walk

The random walk code random_walk.cpp is basically written as a C code with C++ input and output. Here we reimplement the code as a C++ class.

  1. In the RandomWalk directory, compile and link RandomWalk_test (using make_RandomWalk_class_test). Run it to generate "RandomWalk_test.dat", which you should plot with gnuplot to verify that the output looks the same as from random_walk.cpp.
  2. Compare the old and new code (you have printouts of each). Discuss with your partner the advantages (and any disadvantages) of the definition of RandomWalk as a class. List some.

  3. An advantage of programming with classes is the ease of extending or generalizing a code. List two ways to extend the class definition.

  4. As time permits, modify the code to do the following:

Checking for Bad Input Values

You may have noticed that the nonlinear pendulum codes with the menus went crazy if you accidently typed a letter or a symbol rather than an integer. Here we see how to avoid this.

  1. Use make_input_check to create input_check and run it. First try some integers and then -1 to see how it is supposed to work. Then restart it and put in a letter or symbol (or whatever you want). To stop it, use Ctrl-c.
  2. A fix is given in input_check_fixed (which has its own makefile. Try it out. Look at the printout or on the screen to see what was done. What was done to fix it?

Debugging Revisited: Factorial Program

Time for more debugging practice. This time we have a program called factorial_debug.cpp that recursively calculates factorials and stores them in arrays.

  1. Try to compile using make_factorial_debug.cpp. There should be an error about a "variable-sized" array. The problem is that we can't declare stored_factorials when n_max is not a constant. Solution? Add "const" before "int n_max = 5;". Try it!
  2. Ok, now it should compile but it dies with a segmentation fault. Use GDB, including a traceback, to identify and fix the problem. What was wrong?

  3. Now there is either a new segmentation fault or wrong output (depending on your computer). What is the problem?

  4. With that fixed, you can get to the part of the code where an array is dynamically generated with "new". What statement would you use to create an array called "x_array" of doubles with integer length "x_length"?

  5. If you put in too large an integer, bad output happens. How could you store factorials differently to avoid this problem?

780.20: 1094 Session 12. Last modified: 09:28 am, February 18, 2011.