Newsgroups: sci.crypt
Path: cactus.org!ritter
From: ritter@cactus.org (Terry Ritter)

Subject: Re: IBM-PC random generator, source included
Message-ID: <1992Jun25.071702.8945@cactus.org>
Organization: Capital Area Central Texas UNIX Society, Austin, Tx
References: <2673@accucx.cc.ruu.nl> <1992Jun23.080147.15804@cactus.org>
+           <1992Jun24.164407.28468@cl.cam.ac.uk>
Date: Thu, 25 Jun 1992 07:17:02 GMT


 In <1992Jun24.164407.28468@cl.cam.ac.uk> rja14@cl.cam.ac.uk
 (Ross Anderson) writes:


>The current flame war:

 Perhaps a frank exchange of views may seem like a "flame war"
 to some readers in more sanguine societies like the UK.
 I certainly do not consider such an exchange a "flame war."

 I hope I have not insulted anyone, nor the work of anyone,
 by pointing out that there exists no theoretical basis nor
 any known precedent for generating nondeterministic sequences
 on a deterministic machine.  This would not be a new design,
 it would be a major breakthrough.  Finding a useful
 nondeterministic aspect hidden in a normal computer would be
 a significant and important advance.

 It is important to be able to propose new ideas, and have them
 considered seriously from various points of view.  And it is
 especially important to be able to make mistakes without this
 becoming a social issue.

 If we have lost the ability to have a frank open discussion
 without having it demeaned as a "flame war," I think we have
 lost one of the main advantages of open communication.  We
 learn from our errors better than any quiet exchange of
 congratulation, and others find our errors better than we do.

 That said, communication between various societies may be
 trickier than we might expect, even when we use the same
 language.


>should be amenable to birthday testing.

>If Nico's generator does indeed approximate to a virtual state machine with
>less than 32 bits of state, then this could be detected by drawing at most a
>few hundred thousand samples and counting the number of doubles (and triples
>if any).

 Well, I know just about enough statistics to know that good
 statistical tests can be a lot trickier than they seem.


 Is this new work?  The "birthday test" is not a normal RNG
 test, to say the least [6,7].  It is not detailed in the
 extensive analysis in Knuth [5].  It is not a standard crypto
 RNG test [1,2,7].  And it is not one of the common statistical
 tests [3,4,8,9].

 I do have a reference to the "birthday paradox," with the
 equation for qn, the probability that no two of n random
 samples from m possibilities are the same:

         P(m,n)   m (m-1) (m-2) ... (m-n+1)
    qn = ----- =  -------------------------
          m**n    m   m     m   ...    m

 Given that we have m = 2**32, and n typically 10*5, I'm not
 even sure how to compute the probability.  I would like to
 see a discussion of this and its significance.


 It should be obvious that if we have a well-designed 32-bit
 RNG and use it to produce "a few hundred thousand" consecutive
 32-bit internal state results, there will be no duplicates at
 all.  A normal statistical RNG steps through (almost) its entire
 state without repetition.  But we may not agree that we have
 direct access to the entire state in Nico's design.

 Suppose we take the same well-designed 32-bit RNG and, for each
 sample, we accumulate 256 results, take only the lsb from each,
 and combine the resulting 256 bits into a single 32-bit value
 using XOR; wouldn't we expect duplicates?  Nico's extensive
 output processing is another unusual situation for an RNG.

 Let's do this, and compare the results.  Now, from "a few hundred
 thousand samples," can we distinguish between, say, 16, 24, 32
 or 40 bits (or more) of internal state in different RNG's?

 I dunno.  I'm dubious, but ready to listen.  (I'm also not at
 all sure how one does this on a DOS PC.)


 Normally I would run a number of Chi Square tests with, say,
 512 bins (use the top 9 bits) or mod 512 (use the bottom 9
 bits), but I doubt that would be acceptable in this case.


 References:

 [1] Beker, H. and F. Piper.  1982.  Cipher Systems.  Wiley.

 [2] Beker, H. and F. Piper.  1985.  Secure Speech
     Communications.  Academic Press.

 [3] Conover, W.  1980.  Practical Nonparametric Statistics.
     2nd Ed.  Wiley.

 [4] Crow, E., F. Davis, and M. Maxfield.  1960.  Statistics
     Manual.  Dover.

 [5] Knuth, D.  1981.  The Art of Computer Programming.
     Vol. 2, 2nd Ed.  Addison-Wesley.

 [6] Marsaglia, G.  1985.  A Current View of Random Number
     Generators.  Proceedings of the Sixteenth Symposium on
     the Interface between Computer Science and Statistics.
     Atlanta, March 1984.  3-10.

 [7] Maurer, U.  1990.  A Universal Statistical Test for
     Random Bit Generators.  Advances in Cryptology--CRYPTO '90.
     409-420.

 [8] Press, W., and others.  1986.  Numerical Recipes.
     Cambridge University Press.

 [9] Williams, F.  1992.  Reasoning with Statistics.  4th Ed.
     Harcourt Brace Jovanovich.

 ---
 Terry Ritter    ritter@cactus.org