Path: cactus.org!cs.utexas.edu!uunet!europa.eng.gtefsd.com!avdms8.msfc.nasa.
+     gov!sol.ctr.columbia.edu!spool.mu.edu!bloom-beacon.mit.edu!senator-
+     bedfellow.mit.edu!athena.mit.edu!fuellen
From: fuellen@athena.mit.edu (Georg Fuellen)
Newsgroups: sci.crypt

Subject: Re: What is Random?
Date: 9 Dec 1993 03:18:54 GMT
Organization: Massachusetts Institute of Technology
Lines: 117
Distribution: world
Message-ID: <2e65eu$lli@senator-bedfellow.MIT.EDU>
References: <93340.021832N13CC@CUNYVM.CUNY.EDU> 
+           <2e2k29INNglq@roche.csl.sri.com>
NNTP-Posting-Host: m1-142-8.mit.edu

In article <2e2k29INNglq@roche.csl.sri.com>, boucher@csl.sri.com (Peter K. Bouch er) writes:
|> [someone else wrote]
|> |> Another related issue is the attempt to make RNGs mimic an uniform
|> |> distribution. I would think any distribution that is both locally and
|> |> globally unpredictable would be as good even if it deviated significantly
|> |> from an uniform distribuiton. Maybe someone else can join in and comment.

If the sequence deviates significantly from a uniform distribution, it
is no longer unpredictable. Just predict the most frequent subsequence,
and you win more often than not.

|> Read Knuth Vol. II.
|> 
|> Anyone else have a better source for information on this issue?

The really clean definition of randomness is an asymptotic one, 
using the concept of a 'cryptographically strong' random number generator.
There are no "random" strings, but there can be algorithms ( random number 
generators ) that have a "random" output.

[ I am at it again... the following is put together from previous posts ;-]

A 'cryptographically strong' random number generator passes ANY polynomial time
statistical test asymptotically. ( Some people call such a generator 
"unpredictable", but that's just an equivalent concept. You can prove that.)

In this setting, any probabilistic algorithm can be viewed as a test for the 
random number generator.

That's because a statistical test is formally just a family of functions,
mapping a bitstring input to a test result, i.e.

         poly(k)             t
C : {0,1}         -->  {0.1}      (with fixed t, take  t=1  w.l.o.g.)
 k

and "passing all polynomial time tests" means that

  any polynomial time algorithm which can be viewed as computing such a function 
  WILL  NOT  EXHIBIT DIFFERENT BEHAVIOUR IF IT IS GIVEN A TRULY RANDOM STRING
  INSTEAD OF A STRONG RANDOM NUMBER GENERATOR'S OUTPUT, at least asymptotically. 

(I'll formalize this below.)

[someone else wrote]
|> To be
|> sure, in principle the LCG is not random, but then neither is any
|> algorithm for generating random numbers;

Of course, the algorithms are all deterministic. That's why you need all this
"asymptotic" crap.

It's the same thing with "polynomial time".
This can also be defined in an asymptotic way only. (Remember that O(k)
notation ?) You cannot say in a clean way that an algorithm using 20
microseconds on your PC is "fast", just as it is difficult to say that
1010010001 is more random than 1111111111. What you need is the  asymptotic
scaling behaviour of your algorithm. (Ok, ok, the fastest way to multiply uses
Fourier Transforms and its good scaling behaviour doesnt really pay off in
current applications ;-)


In our case  good  asymptotic scaling behaviour means that G (x),
                                                          k
a bitstring taken according to the probability distribution of the generator's
output bitstrings, which you get by feeding it with uniformly distributed seeds, --MORE--(66%)
*is indistinguishable* from a random string x' (again uniformly distributed)
for all polynomial time tests C  and for all but a finite number of k :
                               k

|                                    |      1
| Pr(C (G (x)) = 1) - Pr(C (x') = 1) | < -------
|     k  k                k          |   poly(k)


where the generator itself is a family of functions

         k            poly(k)
G : {0,1}  -->  {0.1}
 k

and poly(k) indicates values polynomial in k.

                                                  k
This is a strong property because there are only 2  different G (x),
                                                               k
     poly(k)                                 45
but 2       different x' and poly(k) may be k   !


In most cases this "passing all polynomial time tests" is proven
starting with a "plausible" assumption, e.g. "Large Numbers are hard to factor", 
or "The RSA crypto function cannot be broken". I know one generator which
does not even rely on such an assumption [ZMI89], but it is only
"locally strong".

[someone else wrote]
|> and LCG passes every test I've
|> ever heard about.

LCG's are not random asymptotically, see, among others, [Stern87].

A general reference is [Kranakis85], but I'm still looking for something
better.

[Kranakis85] E. Kranakis, Primality and Cryptography, 1985.
[Stern87]  J. Stern, Secret Linear Congruential Generators are
   not Cryptographically secure. Proc. of IEEE STOC, 1987, pp. 421-426
[ZMI89]    Y. Zheng, T. Matsumoto, H. Imai, On the Construction of Block Ciphers 
   Provably Secure and Not Relying on Any Unproved Hypothesis,
   Proc. of CRYPTO 89, Springer Verlag, pp.461-480.

  regards,

  georg
fuellen@mit.edu
The convex hull of all disclaimers made on usenet last year applies to this mess