94121001.HTM

Path: illuminati.io.com!uunet!news.mathworks.com!uhog.mit.edu!bloom-beacon.
+     mit.edu!gatech!swrinde!cs.utexas.edu!not-for-mail
From: ritter@io.com (Terry Ritter)
Newsgroups: sci.crypt

Subject: Re: How random is "random data"?
Date: 11 Dec 1994 14:06:05 -0600
Organization: UTexas Mail-to-News Gateway
Lines: 122
Sender: nobody@cs.utexas.edu
Message-ID: <199412112006.OAA00275@pentagon.io.com>
NNTP-Posting-Host: news.cs.utexas.edu

 On 1994-12-10, stevalln@news.dorsai.org (Steve Allen) posted
 :

>   See Ueli Maurer's "A Universal Statistical Test For Random Bit
>Generators", Advances In Cryptology-- Crypto '90 (Springer-Verlag).
>This test does exactly that: for your n-bit generator (n=1..16), it
>takes the logs of inter-arrival times for all values 0..2^n-1, and
>compares the average to a perfect reference. He claims, as I
>understand it, that this test reveals *any* weakness in a RNG due
>to memory [...].


 This test has already been placed in a better context:
 On 1994-11-1 mjohnson@netcom.com (Mark Johnson) posted
 :

In article  stevalln@dorsai.org (Steve Allen) writes:
  >
  >   How universal is Ueli Maurer's Universal Statistical Test For
  > Random Bit Generators?
  >

Regrettably, Maurer's test does not reject the infamous RANDU
algorithm.  This algorithm is singled out for special scorn
and abuse in Knuth's Volume 2 (section 3.3.4, page 104 of the
2nd edition):

  " ... its very name RANDU is enough to bring dismay into
   the eyes and stomachs of many computer scientists.
   ... the generator fails most three dimensional criteria
   for randomness and it should never have been used."


The RANDU algorithm is a linear congruential, defined thus

     X[n+1] = (65539 * X[n]) mod 2^31

Maurer's test doesn't reject RANDU because of a regrettable
deficiency in modern computers: they're just not big enough
and fast enough to run his testing program.  His problem can
be found at the top of page 416 ("A Universal Statistical
Test for Random Bit Generators", proceedings of Crypto 90,
pp. 409-420).

   "The test can be implemented by using a table
    (denoted below as TAB) of size 2^L that stores for
    each L-bit block the time index of its most recent
    occurrence."

Unfortunately for Maurer, RANDU has L=31, so he's building
and using an array of 2^31 elements. The third for-loop
of his test algorithm, ("for n:= ...") performs a total
of 10000 * (2^L) iterations.  This takes a long time :-).

Equally unfortunately, other generators (e.g. the
Marsaglia generator posted here less than a week
ago) use L=250 or more.

Maurer's test seems best adapted for inspecting the
output of _physical_ random number generators ... things
that contain nuclear scintillation counters and
1/f differentiators and radio receivers tuned between
channels and Zener diodes and other random physical
phenomena.

 [Here I (ritter) disagree: I think a physically random generator
 should be seen as state machine with huge (essentially infinite)
 state.  Thus, the test is even *less* useful in that environment.
 Whereas we can analyze distribution and population from the design
 of a state machine (e.g. software), these additional quantities
 must be *measured* in a physically-random device.  There are just
 more things which can be wrong.]

Software generators, even extremely well-known
examples of BAD random numbers, can easily cheat Maurer
and obtain a passing grade they obviously don't deserve.



 And, on 1994-11-1 don@cam.ov.com (Donald T. Davis) posted
 <396moo$28m@deep-fried-beagle.cam.ov.com>:

Steve Allen writes:
>  How universal is Ueli Maurer's Universal Statistical Test For
>Random Bit Generators?
>[...]

it's universal in the sense that it's an entropy estimator,
and entropy is fundamental in information theory.
in principle, a perfect entropy measurement gives you the real
thing you want to know, which correlation and other statistical
tests only hint at: how much variation does the signal really
contain?

however, an entropy estimate is necessarily a very imperfect
measure, because the longer the strings that the estimator
takes into account, the more sensitive the estimator is to
long-term correlation.
[...]
running the estimator with shorter wordsizes is not useful,
because the longer wordsizes capture all short-term redundancies.


 And if neither of the above is satisfying, one might look at [1:188]:

       "There have been several recent papers on universal tests for
    pseudorandom number generators, mostly based on information
    theory [Maurer].  It is important to note that universal tests
    have been known to statisticians for half a century, but are
    little used because they tend to be very weak.  [...]  Note
    that we need to test generators with 1000 bits of state and
    a corresponding period."


 [1]  Maclaren, N.  1993.  Cryptographic Pseudo-random Numbers in
      Simulation.  In Fast Software Encryption, Ross Anderson, Ed.
      Springer-Verlag Lecture Notes in Computer Science, 809.

 ---
 Terry Ritter  ritter@io.com