[vox-tech] random number question (longish)
Peter Jay Salzman
vox-tech@lists.lugod.org
Wed, 15 May 2002 14:22:07 -0700
hi mike,
begin msimons@moria.simons-clan.com <msimons@moria.simons-clan.com>
> Pete,
>
> - Could you explain you want in the way of random numbers?
>
> (reproducible or not, how many over the life of the program, how long
> the program runs, etc...)
sure. reproducible is a requirement, since i've already used "same seed
give same sequence" twice now to track down bugs in my program.
as for the number of random numbers:
a typical program of mine runs through 65,536 iterations where i don't
take data and 50,000,000 iterations where i do data. each iteration
calls:
/* Returns an integer between high and low (inclusive)
*/
int Irand(int low, int high)
{
return low + (int)( (double)(high-low+1) * rand()/(RAND_MAX + 1.0) );
}
twice, and:
/* Returns a long double between 0 and 1 inclusive
*/
long double LDrand(void)
{
return (long double)( (long double)rand() / RAND_MAX );
}
once. btw, should that last RAND_MAX be (RAND_MAX + 1.0L)?
so all in all, there are about 150,196,608 calls to rand() in this code.
the life of each random number is very short -- past any given
iteration, i throw away the 3 random numbers and generate 3 more for the
next iteration.
the heart of the program (it's much snipped) is:
int spins[N][M];
unsigned int seed;
/* Seed the random generator and set the initial state */
seed = SeedRandomGenerator();
SetIC(UP, spins);
E = UpEnergy(spins) + RightEnergy(spins);
for(i=0; i<trials; ++i) {
/* Choose the site to change */
y = Irand(0, N-1);
x = Irand(0, M-1);
trialE = E - 2.0L * SingleSiteEnergy(y, x, spins);
dE = trialE - E;
/* If we accept the change, change the spin at location (y,x) and
* update the energy. If we reject the change, don't do anything.
* We don't need to check dE < 0 since expl(-beta*dE) would be > 0.
*/
if (LDrand() < expl(-beta*dE)) {
ChangeSpin(y, x, spins);
E = trialE;
}
m = 0.0L;
for (y=0; y<N; ++y) {
for (x=0; x<M; ++x) {
m += spins[y][x];
}
}
AvgE += E;
AvgESqr += E*E;
mAvg += m;
mabsAvg += fabsl(m);
msqrAvg += m*m;
/* End of iteration */
}
/* Examine results */
and i used your idea about /dev/random for the seed generation:
unsigned int SeedRandomGenerator(void)
{
FILE *fp;
unsigned int seed;
if((fp = fopen("/dev/random", "r")) == NULL)
die("Can't open /dev/random for reading.");
fread(&seed, 1, sizeof(unsigned int), fp);
fclose(fp);
srand(seed);
return(seed);
}
> > i remember, but reading a file is fine for setting a seed (that's what i
> > do) but for generating a random number, it can be unsuitable. a monte
> > carlo simulation needs between 100,000 and 100,000,000 random numbers.
> > reading a file for all those numbers would be prohibitive.
>
> If there is a need for true random numbers and then /dev/urandom isn't
> bad at all... even for large numbers of fetches.
even for this many random numbers?
> Reading is about 25 times slower... but if any significant CPU work being
> done with the random numbers after they are fetched, then the 25 times
> slower will be lost in the noise.
interesting. as you can see, i'm not really doing complicated work
here. it's mostly +, -, * and / with a couple of expl(), fabsl() in
each iteration. it's certainly not like my semi-classical simulator
which is all about hardcore number crunching.
> It wouldn't be hard to provide an interface that mimics the srand/rand
> interface so that at link time you pick if you want true random or pseudo
> random.
>
>
> Following times on a pentium 100 ...
>
> call rand:
> parent reporting 1048576 randoms fetched in 1.472942 seconds
> Command being timed: "./rand"
> User time (seconds): 1.27
> System time (seconds): 0.01
> Percent of CPU this job got: 85%
> Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.49
>
> /dev/urandom:
> parent reporting 1048576 randoms fetched in 35.903687 seconds
> Command being timed: "./read"
> User time (seconds): 0.01
> System time (seconds): 30.72
> Percent of CPU this job got: 85%
> Elapsed (wall clock) time (h:mm:ss or m:ss): 0:35.92
that looks significant to me. ??
one thing i learned about numerical computing -- it's much different
from other types of coding. often, simpler less elegant schemes are
preferable to faster more intricate schemes. the problem is that often
a numerical scientist has no idea when the numbers are bad. most of us
trade speed for tried-and-true'dness unless the code in question runs
prohibitively long. i'd never code my physics stuff the way i'd code
linux-crypt, my usenet post sucker/canceler or my binary editor, for
example. when something fails in those programs, it's obvious.
pete