[vox-tech] random number question (longish)

Wed, 15 May 2002 14:22:07 -0700

hi mike,

begin msimons@moria.simons-clan.com <msimons@moria.simons-clan.com> 
> Pete,
> 
> - Could you explain you want in the way of random numbers?
> 
>   (reproducible or not, how many over the life of the program, how long
>   the program runs, etc...)

sure.  reproducible is a requirement, since i've already used "same seed
give same sequence" twice now to track down bugs in my program.

as for the number of random numbers:

a typical program of mine runs through 65,536 iterations where i don't
take data and 50,000,000 iterations where i do data.   each iteration
calls:

  /*  Returns an integer between high and low (inclusive)
  */
  int Irand(int low, int high)
  {
     return low + (int)( (double)(high-low+1) * rand()/(RAND_MAX + 1.0) );
  }

twice, and:

   /*  Returns a long double between 0 and 1 inclusive
   */
   long double LDrand(void)
   {
      return (long double)( (long double)rand() / RAND_MAX );
   }

once.  btw, should that last RAND_MAX be (RAND_MAX + 1.0L)?

so all in all, there are about 150,196,608 calls to rand() in this code.

the life of each random number is very short -- past any given
iteration, i throw away the 3 random numbers and generate 3 more for the
next iteration.

the heart of the program (it's much snipped) is:

   int spins[N][M];
   unsigned int seed;

   /* Seed the random generator and set the initial state */
   seed = SeedRandomGenerator();
   SetIC(UP, spins);
   E = UpEnergy(spins) + RightEnergy(spins);

   for(i=0; i<trials; ++i) {
      /* Choose the site to change */
      y = Irand(0, N-1);
      x = Irand(0, M-1);
      trialE = E - 2.0L * SingleSiteEnergy(y, x, spins);
      dE = trialE - E;

      /*  If we accept the change, change the spin at location (y,x) and
       *  update the energy.  If we reject the change, don't do anything.
       *  We don't need to check dE < 0 since expl(-beta*dE) would be > 0.
       */
      if (LDrand() < expl(-beta*dE)) {
         ChangeSpin(y, x, spins);
         E = trialE;
      }

      m = 0.0L;
      for (y=0; y<N; ++y) {
         for (x=0; x<M; ++x) {
            m += spins[y][x];
         }
      }

      AvgE    += E;
      AvgESqr += E*E;
      mAvg    += m;
      mabsAvg += fabsl(m);
      msqrAvg +=  m*m;

      /* End of iteration */
   }

   /* Examine results */

and i used your idea about /dev/random for the seed generation:

unsigned int SeedRandomGenerator(void)
{
   FILE *fp;
   unsigned int seed;

   if((fp = fopen("/dev/random", "r")) == NULL)
      die("Can't open /dev/random for reading.");
   fread(&seed, 1, sizeof(unsigned int), fp);
   fclose(fp);

   srand(seed);
   return(seed);
}

> > i remember, but reading a file is fine for setting a seed (that's what i
> > do) but for generating a random number, it can be unsuitable.  a monte
> > carlo simulation needs between 100,000 and 100,000,000 random numbers.
> > reading a file for all those numbers would be prohibitive.
> 
>   If there is a need for true random numbers and then /dev/urandom isn't 
> bad at all... even for large numbers of fetches.

even for this many random numbers?

>   Reading is about 25 times slower... but if any significant CPU work being 
> done with the random numbers after they are fetched, then the 25 times
> slower will be lost in the noise.

interesting.  as you can see, i'm not really doing complicated work
here.  it's mostly +, -, * and / with a couple of expl(), fabsl() in
each iteration.  it's certainly not like my semi-classical simulator
which is all about hardcore number crunching.

>   It wouldn't be hard to provide an interface that mimics the srand/rand
> interface so that at link time you pick if you want true random or pseudo
> random.
> 
> 
> Following times on a pentium 100 ...
> 
> call rand:
>   parent reporting 1048576 randoms fetched in 1.472942 seconds
>         Command being timed: "./rand"
>         User time (seconds): 1.27
>         System time (seconds): 0.01
>         Percent of CPU this job got: 85%
>         Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.49
> 
> /dev/urandom:
>   parent reporting 1048576 randoms fetched in 35.903687 seconds
>         Command being timed: "./read"
>         User time (seconds): 0.01
>         System time (seconds): 30.72
>         Percent of CPU this job got: 85%
>         Elapsed (wall clock) time (h:mm:ss or m:ss): 0:35.92

that looks significant to me.   ??

one thing i learned about numerical computing -- it's much different
from other types of coding.  often, simpler less elegant schemes are
preferable to faster more intricate schemes.  the problem is that often
a numerical scientist has no idea when the numbers are bad.  most of us
trade speed for tried-and-true'dness unless the code in question runs
prohibitively long.  i'd never code my physics stuff the way i'd code
linux-crypt, my usenet post sucker/canceler or my binary editor, for
example.  when something fails in those programs, it's obvious.

pete