User guide to built-in random number generators
Your basic function for generating random numbers that is "built-in" to Java, C++, C
generates a sequence of pseudo-random numbers. The system supplied random number generator
(abbreviated random()
starting now) should be regarded with a great deal of suspicion.
For one thing, the sequence cycles -- it repeats when random()
is called often
enough. For another thing, the numbers themselves do
not usually show the amount of randomness one really needs.
A typical implementation of random()
uses this recurrence relation:
ij+1 = a * ij + c
That is, each number in the sequence is computed from a simple arithmetic expression
using the previous number in the sequence. a
and c
are constants
that are local to random()
.
The sequence is initialized with a seed (i0
). The computed
ij+1
is both returned to the caller and kept as a local static variable that is
used on the subsequent call to random()
.
It is a very good idea when debugging to initialize the sequence with some fixed seed. That way, the random number sequences generated will be the same over all the program runs -- much easier to debug that way! After the program is debugged, you'll want to start the sequence with different seeds. The most common way to do that is to use some number related to the system clock (often, just a call to the system clock will do).
random()
produces a uniform distribution
A uniform distribution has a flat histogram: all numbers generated are equally likely.
One quick (but crude) test of random()
's quality is to build a histogram by
calling it a gazillion times and incrementing the count for each number returned from random()
.
Scaling and translating
Typical beginning computer science problems calling for random numbers involve rolling a die
or pulling a card out of a deck. In the case of the die, the random number you want will range
between 1 and 6 to correspond to the pips on the different die faces. The
random number generator you're using may supply int
s between 0 and RAND_MAX
(the maximum
int
value generated); or it may supply a double
between 0 and 1; or
it may supply an int
between 0 and a user supplied k
.
In any case, you'll almost certainly need to "massage" the number supplied by random()
so that it fits in your desired range. The way to do this is to scale and translate. Suppose
we want to work with a die (numbers ranging between 1 and 6). To scale,
you divide the random number obtained by the range of values over which random()
runs.
Suppose random()
returns n
, an int
number that range between 0 and RAND_MAX
.
Scale n
by dividing by RAND_MAX + 1
and multiplying by 6.
Watch out for integer division! Scaling with 6/(RAND_MAX+1)
changes the range of
[0 .. RAND_MAX
] to [0 .. 5 ]. Now translate by adding 1 to get your desired range of [1..6] .
Gaussian distribution
random()
produces a uniform distribution. Very often, you'll be wanting to generate
random data that has a Gaussian (normal, bell-shaped curve) distribution. How do you get from
uniform to Gaussian? Details can be found in Numerical Recipes (see References). But basically,
you'll call a different function (called Gaussian()
here) that uses random()
. I'm thinking I got this code from Numerical Recipes, but I'm
not sure now (and watch out for errors! I haven't checked it in years). This is C code for
a function that returns Gaussian distributed random numbers and an example driver (main()
).
The random number generator supplied with C is called rand()
.
#include < math.h > double Gaussian() { static int iset = 0; static float gset; double fac, rsq, v1, v2; if (iset == 0) { do { v1 = 2.0 * rand()/ (double) RAND_MAX - 1.0; v2 = 2.0 * rand()/ (double) RAND_MAX - 1.0; rsq = (v1 * v1) + (v2 * v2); } while (rsq >= 1.0 || rsq == 0.0); fac = sqrt ( -2.0 * log (rsq) / rsq); gset = v1 * fac; iset = 1; return v2 * fac; } else { iset = 0; return gset; } } void main () { int i; double result; printf("Calling Gaussian 10 times\n"); for (i=0; i< 10; i++) { result = Gaussian(); printf(">\t%f\n", result); } printf("Execution completed\n"); }
Like "massaging" random()
to get the right range, you'll need to
massage the results of Gaussian to get the right mean and standard deviation. Gaussian()
returns a sequence with mean 0.0, and standard deviation 1.0.
Poisson distribution
Most people have a good intuitive understanding of the Gaussian distribution, as well as
for uniform distribution. However, for queueing problems you need the Poisson distribution.
The Poisson distribution does the job for modelling random arrivals (in time) at
the queue, when the arrivals are independent of each other.
Each number in a sequence of Poisson distributed numbers says how many arrivals
there have been during the last time interval.
Suppose you obtain the Poisson distributed sequence: 0, 0, 1, 0, 3, 2, 0, 0, ...
The meaning of this is that during time interval t0, there were 0 arrivals.
During time interval t1, also 0 arrivals. During t2 there was 1 arrival.
During t3, 0. During t4, 3. Et cetera.
#include < stdlib.h > #include < math.h > int Poisson ( double ev ) { int n = 0; // counter of iterations double em; // e^(-ev), where v is the expected value double x; // pseudorandom number em = exp (-ev); x = rand() / (double) RAND_MAX; // check your C compiler docs // for the correct constant name while (x > em) { n++; x *= rand() / (double) RAND_MAX; } return n; } main () { int i; for (i = 0; i< 1000; i++) { printf("new Poisson value: %d\n", Poisson(.133333) ); } }In order to use
Poisson()
, you need to know ev
, the "load factor". This
is the average number of arrivals per unit of time. Suppose, for example, the unit of time
we are simulating is seconds. From empirical evidence, we also know that there are, on the
average, 75 arrivals per hour, that is 75 arrivals in 60 seconds, that is 1.25 arrivals per second.
ev
, then, is 1.25.
The way this Poisson()
is implemented, you can have interleaved calls to it with different
values of ev
.
Note also, that this Poisson handles the "massaging" differently. The only parameter
necessary to get the proper Poisson distribution is the load factor, ev
, and
that is supplied as a parameter to Poisson()
. On the other hand, random()
,
giving the uniform distribution, supplies numbers that you have to scale and translate to fit into
a desired range; Gaussian()
also must be scaled and translated.