[m-dev.] Re: Random number generation package

Kenneth Almquist kalmquist2 at hotmail.com
Sat Sep 1 06:39:54 AEST 2001


Thomas Conway wrote:
> If you search through the mercury-users archive, you'll see the source
> for a robust, no holds barred, RNG that I translated from George Marsaglia's
> "Mother of all random number generators". It'll be slower than yours,
> but it is very robust.

I found it, and it looks interesting.  Here are my comments.

1)  The C code operates on 16 bit signed values whereas the Mercury code
    operates on 15 bit unsigned values.  This change means that the
    Mercury code produces different output from the C code.  Otherwise
    the algorithms appear to be identical.

2)  The Mercury code takes 17.3 microseconds, which is an order of
    magnitude slower than the package I sent.  I cleaned up the C code,
    timed the core algorithm, and got a time of 0.34 microseconds
    to generate a 32 bit random number.  (Converting the result to
    floating point takes another 0.06 microseconds using a floating
    point multiply.)  This is not much slower than the 0.22 microseconds
    for the algorithm I used, so the algorithm is not inherently slow.

    My guess is that the efficient way to code this algorithm in Mercury
    is to extract all the values of the state into individual variables,
    compute the values which will go into the new state, and then create
    the new state.  This would result in only one memory allocation per
    call, other than the allocations required to box floats.  Three
    words of the state are used only to hold intermediate values in the
    random number generation process, so this would also make the state
    smaller.
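
    To illustrate, here is a rough C sketch of this style of update,
    using one half of a Mother-style lag-8 multiply-with-carry step.
    The multipliers are placeholders for illustration, not necessarily
    the constants from Marsaglia's posting:

        #include <stdint.h>

        /* One half of a Mother-style generator: eight lagged 16 bit
           values plus a carry.  The multipliers are illustrative. */
        struct mwc8 {
            uint16_t x[8];     /* previous outputs, most recent first */
            uint32_t carry;
        };

        static uint16_t mwc8_next(struct mwc8 *s)
        {
            /* Read the whole state, compute the new value, then write
               the state back in one pass.  A Mercury version would
               bind all the old state arguments, compute, and construct
               the new state term with a single allocation. */
            uint32_t sum = s->carry
                + 1941u * s->x[0] + 1860u * s->x[1]
                + 1812u * s->x[2] + 1776u * s->x[3]
                + 1492u * s->x[4] + 1215u * s->x[5]
                + 1066u * s->x[6] + 12013u * s->x[7];

            for (int i = 7; i > 0; i--)    /* shift the lag window */
                s->x[i] = s->x[i - 1];
            s->x[0]  = (uint16_t)(sum & 0xFFFF);   /* new output */
            s->carry = sum >> 16;                  /* new carry */
            return s->x[0];
        }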

3)  The interface includes a number of ways to use the basic random
    number generation functionality.  You appear to have thought more
    about the interface issues than I have.

4)  The irange predicate produces incorrect results if Max = max_int,
    because the expression Max + 1 overflows.  This could be fixed by
    performing the addition in floating point.
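
    A minimal C sketch of that fix (the names are mine, not the ones
    in the Mercury code):

        #include <stdint.h>

        /* Map a 32 bit random value r to [lower, upper].  The range
           width is computed in floating point, so upper = INT_MAX
           cannot overflow. */
        int irange(int lower, int upper, uint32_t r)
        {
            double width = (double)upper - (double)lower + 1.0;
            return (int)((double)lower
                         + ((double)r / 4294967296.0) * width);
        }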

5)  Only the low order 30 bits of the seed are used.  If these bits
    are all zero, then the pseudo-random sequence produced will consist
    of nothing but zeros.  To avoid this, and to avoid duplicate
    sequences, the seed should be required to be in the range
    1 .. pow(2, 30) - 1.
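
    A sketch of the suggested precondition:

        #include <stdint.h>

        /* Accept only seeds in 1 .. 2^30 - 1: bits above the low 30
           are ignored, and a seed of zero yields an all-zero state. */
        int valid_seed(uint32_t seed)
        {
            return seed >= 1 && seed <= (1u << 30) - 1;
        }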

6)  The state of the generator consists mostly of previous outputs.
    Computing the remaining 32 bits takes no more work than computing
    the next random number.  Therefore code can be written which
    efficiently distinguishes between the output of the generator and
    the output of a true source of random numbers.
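
    As a skeleton (recover_hidden_bits is hypothetical; it stands for
    whatever computation derives the 32 bits of state that are not
    previous outputs):

        #include <stdint.h>

        struct state;                            /* generator state */
        uint32_t next_output(struct state *s);   /* one generator step */
        struct state *recover_hidden_bits(const uint32_t window[9]);

        /* Rebuild the state from nine observed outputs, step the
           generator once, and check the prediction against the tenth
           output.  A true random source passes with probability
           only 2^-32. */
        int looks_like_mother(const uint32_t window[9],
                              uint32_t following)
        {
            struct state *s = recover_hidden_bits(window);
            return next_output(s) == following;
        }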

7)  In an ideal random function, flipping one input bit should change a
    given output bit about 50% of the time (depending on the values of
    the remaining input bits).  In the random number generator I posted,
    this is not the case for all bits, but for each bit in the A output
    there are at least 64 input bits where this is true.  For the B
    output, which appears in the generated random numbers only as the
    low order bits of floating point numbers, the figure is 56 input bits.
    For the C version of the "Mother" algorithm, the number is no more
    than 12 for any output bit, and is zero for the low order output
    bit.  For two rounds, the number varies from 7 to 23, with a median
    of 10.  The two round statistics are going to be similar to the
    statistics for nine sequential outputs (since the operation of
    converting between a state and a sequence of nine outputs is
    simple).  So a pattern detection algorithm might work as follows:

    Let C1 = the last nine outputs,
        C2 = another set of nine sequential outputs, and
        F2 = the output that immediately followed the outputs in C2.
    We want to predict the next output of the generator, which we will
    call F1, and we attempt to do this by exploiting the relationship
    between xor(F1, F2) and xor(C1, C2).  For each bit F2[i], we ask
    whether the corresponding bit in F1 will be the same or different.
    To do that, we look at the bits which differ between C1 and C2.
    If, for every one of these bits, flipping the bit is known to be
    either more likely or less likely to change bit i of the output,
    then we can make a prediction of F1[i].  The prediction will be
    quite weak, but if we repeat the above operation with enough
    different values of C2, we will eventually accumulate a large
    collection of predictions for each output bit, which we can
    combine.
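
    Here is a skeleton of that scheme in C.  The bias table is
    hypothetical: bias[j][i] would be measured offline, holding +1 or
    -1 where flipping input bit j is more or less likely to change
    output bit i, and 0 where no usable bias is known:

        #include <stdint.h>

        #define STATE_BITS (9 * 32)

        extern int bias[STATE_BITS][32];   /* measured offline */

        /* Accumulate one weak vote per output bit from a single
           (C2, F2) pair; after many pairs, the sign of votes[i] is
           the prediction of F1[i]. */
        void add_votes(const uint32_t c1[9], const uint32_t c2[9],
                       uint32_t f2, long votes[32])
        {
            for (int i = 0; i < 32; i++) {
                int flip = 0;      /* most likely net effect on bit i */
                int usable = 1;
                for (int j = 0; j < STATE_BITS && usable; j++) {
                    if (((c1[j / 32] ^ c2[j / 32]) >> (j % 32)) & 1) {
                        if (bias[j][i] > 0)
                            flip ^= 1;         /* likely flips bit i */
                        else if (bias[j][i] == 0)
                            usable = 0;        /* no usable bias */
                    }
                }
                if (usable)
                    votes[i] += ((((f2 >> i) & 1) ^ flip) ? 1 : -1);
            }
        }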

    It's hard to imagine the bit correlations discussed here actually
    mattering to any real application, so this is not a real weakness.
    It only matters if you are trying to judge how much of a margin
    the algorithm provides beyond the minimum requirements for
    pseudo-randomness.
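
    For reference, figures like the ones quoted above can be measured
    with a short harness.  This sketch extends the mwc8 example from
    point 2, and uses rand() purely for illustration:

        #include <stdio.h>
        #include <stdlib.h>

        #define NBITS ((int)sizeof(struct mwc8) * 8)

        /* For each state bit j and output bit i, estimate how often
           flipping bit j changes bit i of the next output. */
        void avalanche(int trials)
        {
            for (int j = 0; j < NBITS; j++) {
                int flips[16] = { 0 };
                for (int t = 0; t < trials; t++) {
                    struct mwc8 a, b;
                    unsigned char *p = (unsigned char *)&a;
                    for (size_t k = 0; k < sizeof a; k++)
                        p[k] = (unsigned char)(rand() & 0xFF);
                    b = a;
                    ((unsigned char *)&b)[j / 8] ^=
                        (unsigned char)(1u << (j % 8));
                    uint16_t d = mwc8_next(&a) ^ mwc8_next(&b);
                    for (int i = 0; i < 16; i++)
                        if ((d >> i) & 1)
                            flips[i]++;
                }
                for (int i = 0; i < 16; i++)
                    printf("in %3d out %2d: %.2f\n",
                           j, i, (double)flips[i] / trials);
            }
        }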

    I notice that the function to compute half the output contains
    8 multiplications by constants and 8 addition operations.  These
    are all linear over the field of real numbers.  Then we perform
    a shift and a mask operation (equivalent to integer division with
    remainder), which are not linear.  Since we have only a couple of
    nonlinear operations, a promising area for discovering regularities
    is to look at the effect of numerical differences, rather than
    at the effect of bit flipping.  If I were going to spend more time
    on this, that is what I would look at next.  (In my package I have
    10 exclusive ors and 4 right shifts to break up patterns based
    on numerical differences.)
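
    The kind of mixing that parenthetical refers to, sketched as a
    generic xor-shift finalizer rather than the actual code from my
    package:

        #include <stdint.h>

        /* xor of shifted copies is cheap but not linear over the
           reals, so it breaks up the simple numerical-difference
           relationships that survive multiply-and-add chains. */
        uint32_t mix(uint32_t x)
        {
            x ^= x >> 16;
            x ^= x >> 8;
            x ^= x >> 4;
            x ^= x >> 2;
            return x;
        }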
					Kenneth Almquist