Path: cactus.org!milano!cs.utexas.edu!sdd.hp.com!swrinde!mips!mips!decwrl!
+     deccrl!news.crl.dec.com!nntpd.lkg.dec.com!cadsys.enet.dec.com
From: cooper@cadsys.enet.dec.com (Topher Cooper)
Newsgroups: sci.crypt,comp.compression

Subject: Re: Source for _real_ random numbers
Message-ID: <34378@nntpd.lkg.dec.com>
Date: 18 Mar 92 15:33:16 GMT
References: 
+           <1992Mar17.185517.2014@athena.mit.edu>  <1992Mar18.022728.25075@athena.mit.edu> 
Sender: news@nntpd.lkg.dec.com
Reply-To: cooper@cadsys.enet.dec.com (Topher Cooper)
Followup-To: sci.crypt
Distribution: usa
Organization: Digital Equipment Corporation, Hudson MA
Lines: 51
Xref: cactus.org sci.crypt:5598 comp.compression:2559


In article , bruce@harry.ugcs.caltech.edu
(Bruce J Bell) writes:

|>One simple and effective(?) way to remove non-randomness from
|>physically-generated random numbers is to xor several near-
|>or semi-random bits together for 1 output bit.

    If your "non-randomness" is in the form of bias (i.e., the probability
    of a 1 being not equal to .5) then this will indeed increase the
    "randomness".

    An even better method (using slightly more bits -- if your bias is
    low) goes back to, I believe, Von Neumann.  It will remove *all* bias
    from a biased but otherwise random sequence.  Look at the bits in
    pairs.  If both bits are the same, discard them.  If they are different
    emit the first bit.

    There is a problem with both these methods -- if there is a correlation
    between adjacent bits (another form of "non-randomness") then these
    methods will both *increase* the bias.  For many applications (though
    by no means all) a slight correlation is less harmful then a slight
    bias, so these procedures will make the non-randomness worse for those
    applications.

    Probably your best bet for removing subtle non-randomness from physical
    random number sequences -- if you can afford the computational load --
    is to XOR them with a statistically well-behaved (but deterministic)
    pseudo-random source -- such as can be gotten from DEA.  The physical
    random source provides the non-deterministic part of "randomness" while
    the pseudo-random source provides the statistical properties.

    If there is some degree of non-randomness (predictability such as produced
    by simple bias, periodic bias, or correlations) this will technically
    leave a degree of determinism in your output sequence (though it will
    be computationally infeasable to recover it if you use a pseudo-random
    sequence which is cryptographically strong in practice, such as DEA).
    If you can set bounds on how much determinism exists, algorithms can
    be devised which will hash your semi-deterministic mixed-random sequence
    to a non-deterministic mixed-random sequence.

    For example, let us say that you are confident that there is at least
    one bit of non-determinism for each two bits of output from your physical
    random number generator (not too hard to accomplish).  Instead of xoring
    with a psuedo-random sequence as I suggested previously, encrypt the
    physical RNG with DEA and an arbitrary password.  Then just use the
    top half of the encrypted result.  Since DEA distributes the input well
    over the output, you have hashed down the input to non-deterministic
    level.

				Topher