[vox-tech] random number question

vox-tech@lists.lugod.org vox-tech@lists.lugod.org
Wed, 15 May 2002 13:34:26 -0400


Pete,

  Didn't study heavy math, this is what I have gathered... if someone
with a heavy math background would like to chip in that would be great.

    Later,
      Mike

On Wed, May 15, 2002 at 09:23:07AM -0700, Peter Jay Salzman wrote:
> using the same seed will produce the same set of random numbers.
> 
> i assume this is NOT true across reboots?

  False, sequences generated will be the same across reboots.


> and i assume it's not true across different computers?

  False, sequences will be the same across machines with the same CPU (1)
using the same algorithm.

1: The main question is that the algorithm be the same, for integer
   based algorithms (which I think rand is) the CPU may not mater at all.
   Just don't count on being able to generate the same sequence between
   a sparc, alpha, and intel line of processor until you test it.

from rand(3)   (random(3) has almost the same text)
#      The srand() function sets its argument as the seed  for  a
#      new  sequence  of pseudo-random integers to be returned by
#      rand().  These sequences are repeatable by calling srand()
#      with the same seed value.


> i'd do the experiment, but i'd rather not reboot my system right now.
> does anyone know the answer offhand?

  The c library rand and random functions produce a sequence of 
pseudo-random numbers based on an initial seed value.  This means 
that while the sequence itself is not random (every single time
the algorithm is started with a seed of X it will spit out certain
numbers in a certain order).

  In practice if you seed the number generator with the current
time and you do not run your program more than once a second, you 
will get good enough "random" numbers for most purposes.

  These algorithms also tend to "cycle" back upon themselves, so
if they spit out (10, 5, 8, 20, ...) sometime millions of reads
later the same millions of numbers will be pulled in the same order.
  In one complete cycle the numbers in a sequence should be
randomly distributed in the range of output numbers... so if the 
outputs are between 0 and 1999 and a cycle takes 2 million reads 
each number should have been hit 1000 times each.

  I think each sequence is completely independent of the others...
in that the order of numbers in one cycle of all sequences will be
different.


  Sometimes is very valuable to be able to have pseudo-random numbers
which can re-generate a certain sequence of numbers: this is because if 
your program uses "rand" and crashes sometimes, you can set the seed
value to some constant number that you know leads to a crash then rerun
a few times with gdb to figure out what is going wrong...


  As I said to Mark last time stuff about rand/random came up.  If
you want lots of non-reproducible random numbers you should open
up the /dev/urandom device and read bytes from there... from read to
read you will get randomized stuff.  If you are doing something
cryptographically sensitive then open /dev/random, but don't try to
read much data from that device it will block readers until it
has enough "collected entropy"...

  I remember reading that some work was put into preventing local DOS
of server applications, but I don't know how well that works.  You may 
be able to slow things like ssh session creation down by attaching a
bunch of processes to /dev/random and reading all they can
  "for l in `seq 1 100`; do cat /dev/random > /dev/null; done"

from random(4)
#      The random number generator  gathers  environmental  noise
#      from  device  drivers  and  other  sources into an entropy
#      pool.  The generator also keeps an estimate of the  number
#      of  bit  of  the  noise  in  the  entropy pool.  From this
#      entropy pool random numbers are created.
#
#      When read, the /dev/random device will only return  random
#      bytes  within the estimated number of bits of noise in the
#      entropy pool.  /dev/random should  be  suitable  for  uses
#      that  need  very  high quality randomness such as one-time
#      pad or key generation.  When the entropy  pool  is  empty,
#      reads  to /dev/random will block until additional environ-
#      mental noise is gathered.
#
#      When read, /dev/urandom device will return as  many  bytes
#      as are requested.  As a result, if there is not sufficient
#      entropy in the entropy pool, the returned values are theo-
#      retically  vulnerable  to  a  cryptographic  attack on the
#      algorithms used by the driver.  Knowledge  of  how  to  do
#      this is not available in the current non-classified liter-
#      ature, but it  is  theoretically  possible  that  such  an
#      attack  may  exist.  If this is a concern in your applica-
#      tion, use /dev/random instead.