[vox-tech] random number question

Jeff Newmiller vox-tech@lists.lugod.org
Wed, 15 May 2002 16:24:23 -0700 (PDT)


On Wed, 15 May 2002 msimons@moria.simons-clan.com wrote:

> On Wed, May 15, 2002 at 11:49:52AM -0700, Jeff Newmiller wrote:
> > On Wed, 15 May 2002 msimons@moria.simons-clan.com wrote:
> > >   I think each sequence is completely independent of the others...
> > > in that the order of numbers in one cycle of all sequences will be
> > > different.
> > 
> > Not quite sure I understand what you are saying here.  If you find
> > yourself at a particular point in the cycle, the numbers that follow are
> > predetermined. The problem is that the "cycle" for a good pseudo-random
> > number generator is often much longer than can be represented by the seed
> > value, so you may not have complete flexibility to choose where in the
> > cycle you will start.  A good seeding algorithm will distribute the
> > starting points throughout the cycle, so that each seed will appear to
> > yield a different sequence.  A bad one could assign multiple seed values
> > to the same points in the cycle. :)
> 
> Jeff,
> 
>   I'm not sure that this is correct, this is what I think happens...
> 
>   I think that there are X different cycles, not one cycle that the
> seed defines an initial offset into.  I'm not certain of this... so 
> if you know otherwise let me know and I'll stop thinking that.  ;)

After looking at the source, I think it is time for you to stop thinking
that. :)

The glibc 2.2 version of random can operate with either a linear
congruential generator (LCG) or one of four different varieties of linear
feedback shift register generators (LFSRG).  The default method is one of
the latter.

The LFSRG uses a working array of initially randomized (with a LCG) bits
arranged as longs, and an index (actually a pointer) in the array.  Each
time you request a new random number, it munges the value at the current
location with one a fixed "distance" behind to generate the new result,
and increments the index.  In effect, the array acts like a card "shoe" in
a casino, extending the random cycle far beyond the number of available
result values.  Another way to look at it is that the random number that
is generated has lots and lots of bits, and you only use some of them on
each iteration.

Note that an algorithm that yields x every single time you have y is not
really very random.  That is like playing poker and knowing that your
opponent cannot have four aces, because you do.  A truly random card sho
would allow all of the players to have four-aces... unlikely, but
possible.

srandom uses an LCG to expand the seed into enough bits to fill
the LFSRG array, gambling that this will yield a random distribution
of starting points throughout the full repetition cycle.

> 
>     Later,
>       Mike
> 
> ps: I do understand that algorithms where all seeds share a common cycle
>     just start at different offsets in it, exist and are easy to come up
>     with... I thought that the real rand functions don't have that property.
> 
> 
> to phrase what I'm thinking a few other ways:
> 
> what I was what I was trying to say there is...
> - there is a completely different number sequence for each seed
> - no two seeds cover the numbers in the same orders even if you shift
>   the numbers in a cycle around.
> - if you pretend that all the numbers in a cycle are one big loop or 
>   ring of numbers.  and you pull the loop for seed 1, and the loop for
>   seed 2, you will not be able to "spin" the loops around to have them
>   match number for number.
> - if you find a place in seed 1's list that gives (4, 12, 32) and you find
>   a place in seed 2's list that gives (4, 12, 32)... if start pulling one
>   number from both lists they will not continue to be the same.
> 
> 
>   to phrase another way... the following is an example of how I 
> think the rand/random functions do _not_ operate.
> 
> 
>   pretend I designed a pseudo-random number generator that returns
> numbers from 0 to 16.  It uses the following static table of numbers,
> the seed sets the starting position in the array, when a random number
> is returned the next_value index is incremented and wraps to zero when
> 16... like the following:
> =============
> #define MY_MAX_RAND 16
> 
> static int data_table[] = {
>   2, 5, 1, 12, 14, 6, 7, 4, 15, 8, 9, 10, 13, 11, 3, 16
> };
> static unsigned char next_value = 1;
> 
> int dumb_random()
> {
>   int result = data_table[next_value];
>   next_value = next_value == 16 ? 0 : next_value++;
>   return result;
> }
> 
> int dump_srandom(int seed)
> {
>   next_value = seed % (MY_MAX_RAND + 1);
> }

Your "dumb_random" is in fact a simplified version of exactly how the GNU
random works.  The differences are that data_table is generated on the
fly.

> =============
> 
> 
>   To build on the stuff above, how I think things do work is something
> like:
> static int master_data_table[][] = {
>   { 2, 5, 1, 12, 14, 6, 7, 4, 15, 8, 9, 10, 13, 11, 3, 16 },
>   { 6, 7, 4, 2, 14, 15, 8, 9, 10, 13, 5, 3, 16, 1, 12, 11 },
>   /* a bunch more arrays with completely different orders */
> };
> static int *data_table = master_data_table[1];
> static unsigned char next_value = 1;
> 
> int dump_srandom(int seed)
> {
>   data_table = master_data_table[seed];
>   next_value = 1;
> }
> =============

No.  This throws away a lot of the potential psuedo-randomness-per-int in
your table, because each seed leads to a short cycle.

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...2k
---------------------------------------------------------------------------