Newsgroups: sci.crypt
Path: cactus.org!ritter
From: ritter@cactus.org (Terry Ritter)

Subject: Re: generating one-time pads
Message-ID: <1992Jun27.064216.21116@cactus.org>
Summary: Initial Analysis of Nico's RNG
Organization: Capital Area Central Texas UNIX Society, Austin, Tx
References:  <60814@cup.portal.com> <2668@accucx.cc.ruu.nl> <2796@accucx.cc.ruu.nl>
Date: Sat, 27 Jun 1992 06:42:16 GMT


 Article 6295 in sci.crypt:

 In <2815@accucx.cc.ruu.nl> nevries@accucx.cc.ruu.nl (Nico E de Vries)
 writes:


>I DO think the non constant
>nature of the crystals in the PC is measurable and believe my algo
>measures it.

 In that case, you should be able to estimate the extent of the
 crystal deviations from the experimental results.

 Conversely, by analyzing the experiment, you should be able to
 estimate the level of deviations the experiment could detect.



>You claim the non deterministic part of the process is
>to small to measure. I don't really see why.

 In that case, I am about to introduce you to analysis:


 First of all, the experiment:

   1.  There is a tight software loop decrementing an 8-bit
       counter.

   2.  A periodic timer-interrupt interrupts the count, samples
       the current count value, and saves the sample to an array.

 This is the extent of the data-gathering machinery.  Eventually,
 the rest of the program decimates the sample values, accumulating
 only the least-significant bits into a 32-bit value.  Eight such
 values are collected, XORed together, and returned as a 32-bit
 result.
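
 As a concrete picture of that post-processing, here is a rough C
 sketch of how I read the description; the names and loop structure
 are mine, and the real program surely differs in detail:

     #include <stdio.h>

     #define BITS_PER_VALUE    32    /* one sampled LSB per bit        */
     #define VALUES_PER_RESULT  8    /* eight values XORed together    */

     /* Rough sketch, as I read the description:  take the least-
        significant bit of each sampled count, pack 32 such bits into
        a value, build eight values, and XOR them into one result.
        Every step here is completely deterministic.                   */
     unsigned long postprocess(const unsigned char *sample)
     {
        unsigned long result = 0;
        int v, b;

        for (v = 0; v < VALUES_PER_RESULT; v++) {
           unsigned long value = 0;
           for (b = 0; b < BITS_PER_VALUE; b++)
              value = (value << 1) | (unsigned long)(*sample++ & 1);
           result ^= value;
        }
        return result;
     }

     int main(void)    /* trivial driver so the sketch compiles alone  */
     {
        static unsigned char samples[BITS_PER_VALUE * VALUES_PER_RESULT];

        printf("%08lx\n", postprocess(samples));
        return 0;
     }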

 Since the post-acquisition processing is obviously completely
 deterministic, any non-determinism must arise in the data-
 gathering machinery alone.  Thus, we can limit the analysis to
 what that machinery can do.


    Basically we have two nominally periodic processes:  the first,
    a hardware-precise periodic probe, samples the second, a software
    counter.  The software process differs from the hardware process
    in that software execution is often interrupted or delayed by
    background processes.  Essentially, we can infer the existence,
    and to some degree the extent, of the background processes by
    comparing successive samples of the software counter.

    Note that the hardware and software processes are not
    synchronized, nor are their periods necessarily related.
    Thus, without any extra interference at all, we can expect
    to see a generally periodic "jitter" of one count more or
    less.  This is the expected synchronization effect due to
    fractional cycle rates in the two processes, and it occurs
    even with absolutely constant clock signals.


 Next, the software loop:

 To detect different amounts of overhead between sample periods
 we must compare counts.  Clearly, if the difference in the software
 process is less than the period of one count, we cannot detect it.
 So it is useful to consider how fast the count can be.

 Note that the interrupt process must access a variable decremented
 by the software.  The compiler is not going to put that variable
 in a register, but hand-coded assembly could, so I assume a
 register count in order to bound the maximum software count rate
 (the actual compiler-produced rate may be half of this):

 The fastest loop I found on the 386 is this:

     . . .

    loop: Dec AX     2   cy
          Jnz loop   7+1 cy
     . . .           ---
                     11  cy  @ (e.g.) 20 MHz

 Each count takes about 11 cycles of a 20 MHz clock, or 550 nsec.
 Although we would be able to detect somewhat smaller events on faster
 machines, we will require *far* more significant events in the context
 of a 4.77 MHz 8088.  We will assume that 550 nsec is the absolute
 minimum detectable event on a modern machine.
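
 As a throwaway check on that arithmetic (nothing here is part of
 the generator):

     #include <stdio.h>

     /* One pass of the loop above is 11 cycles, and at 20 MHz each
        cycle is 50 nsec, so one count is the smallest software delay
        we could possibly hope to resolve.                             */
     int main(void)
     {
        double cycles_per_count = 11.0;     /* Dec + Jnz, from above   */
        double clock_hz         = 20.0e6;   /* e.g. a 20 MHz 386       */

        printf("minimum detectable event: %.0f nsec\n",
               cycles_per_count * 1.0e9 / clock_hz);
        return 0;
     }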


 Next, the hardware timer interaction:

 The hardware timer is driven at about 1.19 MHz.  (On the original
 PC, this was a division by 12 from the main 14.318180 MHz clock.)
 Normally, timer 0 produces the periodic "tick" interrupt by
 dividing by 2**16, giving a tick rate of about 18.2065 ticks
 per second (usually described as 18.2 per second), which DOS
 uses to maintain its internal time count.

 For our experiment, the timer is initialized to divide by 2**10,
 leading to about 1165.2 interrupts per second, or a period between
 interrupts of about 858.2 usec.  Thus, we have 858.2 / 0.55 =
 1560.36 decrements per sample, maximum.  Since the decrementer is
 only a single-byte counter, however, it will underflow 6 times per
 sample, leaving at most 24.36 counts beyond the full wraps.
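
 The same figures as a quick calculation (again a throwaway program,
 not part of the generator):

     #include <stdio.h>

     /* Quick check of the timer arithmetic above.                     */
     int main(void)
     {
        double timer_hz  = 14318180.0 / 12.0;  /* about 1.19318 MHz    */
        double sample_hz = timer_hz / 1024.0;  /* timer divides by 2**10 */
        double period_us = 1.0e6 / sample_hz;  /* usec between samples */
        double counts    = period_us / 0.55;   /* 550 nsec per count   */
        double wraps     = (double)(int)(counts / 256.0);
        double leftover  = counts - 256.0 * wraps;

        printf("interrupts per second : %.1f\n", sample_hz);  /* 1165.2 */
        printf("usec per sample       : %.1f\n", period_us);  /*  858.2 */
        printf("counts per sample     : %.2f\n", counts);     /* 1560.4 */
        printf("full 256-count wraps  : %.0f\n", wraps);      /*    6   */
        printf("counts beyond wraps   : %.2f\n", leftover);   /*   24.4 */
        return 0;
     }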

 Note that the fractional portion is real and significant; here
 it suggests that in about 3 out of 10 samples (more precisely,
 about 36 out of 100) normal operation will yield one extra
 count, and more samples would pin the fraction down further.
 This is the normal synchronization "jitter" associated with two
 asynchronous processes.  Note that in order to ascribe count
 differences to some secondary process, we must compute and
 compensate for these expected sync jitter effects, or demand
 that a measured event exceed them.
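
 To see that this jitter needs no noise at all, here is a little toy
 model (again, not Nico's code) of two absolutely constant clocks
 whose rates are not integer multiples of each other:

     #include <stdio.h>

     /* Toy model:  a counter advances exactly 1560.36 counts per
        sample period, driven by two absolutely constant clocks.  The
        *sampled* counter still shows an extra count in 36 of every
        100 samples; sync "jitter" with no randomness anywhere.        */
     int main(void)
     {
        long prev = 0, extra = 0;
        int  i;

        for (i = 1; i <= 100; i++) {
           long cur  = (156036L * i) / 100;  /* 1560.36 counts/sample  */
           long diff = cur - prev;           /* always 1560 or 1561    */

           if (diff == 1561)
              extra++;
           prev = cur;
        }
        printf("%ld of 100 samples carry one extra count\n", extra);
        return 0;
     }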

 Also note again that the analysis so far assumes a total lack of
 background interference with the software process.

 Note also that the claimed data rate is 128 bits per second, while
 we compute 1165.2 samples per second.  However, post-processing
 uses only one bit per sample and needs 8 samples per output bit,
 so the actual data rate can be at most 1165.2 / 8 = 145.65 bits
 per second, a good check between claim and math.
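
 Or, as one line of arithmetic:

     #include <stdio.h>

     /* Rate check:  each output bit consumes 8 samples, so the output
        rate can be no more than the sample rate divided by 8.         */
     int main(void)
     {
        printf("max bits per second: %.2f (claimed: 128)\n",
               1165.2 / 8.0);
        return 0;
     }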


 Next, the presumed source:

 What can we assume about the distribution of crystal oscillator
 phase jitter?  Well, if all the phase jitter were unidirectional,
 we (and all the equipment) would perceive that the frequency had
 changed.  We define "frequency" as the average over a long period.
 Thus, any phase jitter will be described as bipolar around some
 center average frequency, with magnitude and distribution such
 that the average jitter over many cycles is zero.  It is OK
 for the frequency to be a bit different; we just don't want it
 bobbling around.

 With two clocks, we measure one against the other, and can
 measure differences, but could never say which was "right."
 We can assume that either one is "right," make measurements,
 and relate any results to differences in the "other" clock.
 Here we assume separate clocks:  14.318 MHz for the timer, and
 20 MHz for the CPU.

 Now, we know that the hardware samples occur each 858.2 usec.
 Over that period, if the software process is delayed by even
 550 nsec, we may see it (provided we can differentiate it from
 the effects of normal sync jitter).  This is a precision of
 about 1 part in 1560, or 0.064%.

 Note, however, that we can only measure the *sum* of all
 deviations over the sample period.  During a single 858.2 usec
 sample period, there will be 17,164 clock cycles at 20 MHz.
 Recall that any phase jitter will be bipolar with an average
 of zero.  In order to measure main clock deviations, we must
 believe that 17,164 supposedly random events on a 50 nsec base
 will happen to sum to an excess or deficit of at least 550 nsec,
 or a total of 11 full clocks.  This must occur, on average,
 1 time out of 8.


 Expected deviations:

 In <1992Jun26.012015.20842@massey.ac.nz>, T.Drawneek@massey.ac.nz
 (Ted Drawneek) responded to my query:

>> Perhaps Mr. Drawneek would care to speculate on the magnitude of
>> crystal oscillator nondeterministic phase shifts, and on the

>Therefore if you took two of these
>poorly designed oscillators and mixed their outputs,  you would get noise
>out to at least 3kHz which you could digitize.

 The 20,000,000 Hz cycle time is 50.0000 nsec.
 The 20,003,000 Hz cycle time is 49.9925 nsec.
 The 19,997,000 Hz cycle time is 50.0075 nsec.

 (This is 3 times my estimate of 0.0025 nsec.)


 Measurable results:

 Therefore, in order to get a measurable difference we must
 assume that the sum total of 17,164 supposedly random events,
 each of maximum size 0.0075 nsec, will sum to 550 nsec.  This
 is impossible.

 If every one of the 17,164 supposedly random events conspired
 in the same direction, they would each have to be 0.032 nsec
 wide, 4.27 times the expected maximum crystal jitter.

 If most of the 17,164 supposedly random events canceled
 each other out, except for 10% in the same direction, then
 each of these would have to be 0.320 nsec wide, 42.7 times
 the expected maximum crystal jitter.
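
 The same two cases, worked as a quick calculation:

     #include <stdio.h>

     /* How big would each per-cycle phase event have to be for the
        sum over one sample period to reach one software count
        (550 nsec)?                                                    */
     int main(void)
     {
        double cycles = 17164.0;   /* 20 MHz cycles per 858.2 usec     */
        double needed = 550.0;     /* nsec, one software count         */
        double maxjit = 0.0075;    /* nsec, from the 3 kHz figure      */

        double all   = needed / cycles;           /* every cycle helps */
        double tenth = needed / (cycles * 0.1);   /* only 10% net      */

        printf("all conspire : %.3f nsec each, %.1f times %.4f nsec\n",
               all,   all   / maxjit, maxjit);
        printf("10%% net      : %.3f nsec each, %.1f times %.4f nsec\n",
               tenth, tenth / maxjit, maxjit);
        return 0;
     }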

 These numbers are for the fastest possible software loop,
 with expected jitter compensation (which is not part of the
 original system).  Without such compensation, we would want
 to see a deviation of two or three counts (instead of just
 one) before we could claim to have measured these events.

 I would say this is unlikely.


 Last, the background process:

 Consider the refresh process.  Every 18 counts of 1.19 MHz,
 that is, every 15.12 usec, timer 1 generates a DMA request
 which is used as a refresh cycle.  (In this way we get 128
 dynamic memory refresh cycles every 2 msec.)

 Alas, I find myself without a 386 timing reference, but at
 least in 8088 designs the processor had to specifically
 give up the system bus for DMA to operate, and this means
 that the software process would stop for the duration.

 Even if not, should the sample timer elapse during DMA,
 the interrupt could not be handled until after the DMA
 completed.  Thus, the sample could apparently occur
 "early."

 In either case, CPU slow-down or sample delay, we get a
 substantial, measurable, software deviation.

 Suppose that DMA takes 5 CPU clocks.  Over our sample period
 of 858.2 usec, there will be about 56.76 refresh events,
 consuming about 14,190 nsec of CPU time.  This is about 26
 times the width of the minimal measurable event.
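
 The refresh arithmetic, as another quick check:

     #include <stdio.h>

     /* Refresh overhead per sample period, assuming (as above) that
        each refresh DMA cycle steals 5 CPU clocks of 50 nsec.         */
     int main(void)
     {
        double sample_usec  = 858.2;
        double refresh_usec = 15.12;                /* 18 / 1.19 MHz    */
        double events       = sample_usec / refresh_usec;
        double stolen_nsec  = events * 5.0 * 50.0;  /* 5 clocks each    */

        printf("refresh events per sample : %.2f\n", events);
        printf("CPU time stolen           : %.1f nsec\n", stolen_nsec);
        printf("vs. the 550 nsec minimum  : %.1f times\n",
               stolen_nsec / 550.0);
        return 0;
     }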

 We can also expect to see jitter here, as the two hardware
 processes and the one software process interact at different
 rates.  I note that this interaction, once started, is
 generally deterministic.  Clock-rate differences between
 different PC's, or at different times, will lead to tiny
 changes in the ratios between the processes.  However, as we
 saw above, such differences express themselves over extended
 time as extra pulses due to fractional cycle rates.


 So, the bottom line is:  Which is more likely?

 a) The experimental design is detecting nondeterministic crystal
    clock jitter.

 b) The experimental design is detecting deterministic but complex
    interactions with background DMA operations.



 Back to the response:


>Using two crystals to
>measure differences can be very accurate and I don't see why that
>would not be the case here.

 Nobody does until they go through the analysis.

 

>> Until you have some realistic analysis, you cannot validly say
>> that your "flawless true random number generator" "DOES work,"
>> even to yourself.

>I did many (as said before) and others are doing the same.

 Measurement is not analysis.


 ---
 Terry Ritter    ritter@cactus.org