From: (Bryan G. Olson; CMSC (G))
Newsgroups: sci.crypt

Subject: Re: Toward Axiomatic Fenced DES (long!)
Date: 5 Jun 1994 23:34:03 GMT
Organization: University of Maryland, Baltimore County
Lines: 348
Message-ID: <2stndb$>
References: <>
X-Newsreader: TIN [version 1.2 PL2]

Terry Ritter ( wrote:
:  In <2sjjjg$> (Bryan G. Olson; CMSC (G))
:  responds to my Fenced DES proof attempts:

: >[...]
: >:  The Proofs
: >
: >Please don't call them theorems and proofs if they are not.

:  Please don't nit-pick new work to death.  Few of these statements
:  are actually false; [...]

Thanks for reading my comments.  You and I still disagree about the
value of your results.  

Maybe I'm nit-picking again, but if you want to re-include stuff I
edited out of your post, please use a different annotation mark from
what you use to denote my writing.

: >:  2.9  THEOREM:  (Average distance and distribution.)  The expected
: >:  number of elements which differ between two n-position code values
: >:  is n * (1 - 1/|S|), and the distribution is binomial.
: >
: >You need to state the distribution of code value pairs in order for
: >the expected value you talk about to be defined.  Also, the proof
: >needs work.  First justify the binomial distribution, then show
: >this implies your result.

:  In all cases I take the "expectation" to be the mean or arithmetic
:  average over all possibilities, each one taken once.  Ironically,
:  the critical comments themselves state:

:   >When no
:   >distribution is given, we generally assume the uniform distribution.

That default applies only if you say, as I did, that the selection is
made at random from some set.  If you don't use the language of
statistics, the default assumption is universal quantification, or in
some contexts existential quantification.

:  Here, that means all possible code value pairs.

:  As for the proof, first note that 2.7 is available:

: > 2.7  THEOREM:  (Position difference counts.)   The number of
: > n-position code values which differ in exactly m positions is
: > (n)          m
: > (m) * (|S|-1) .
: >

I didn't pick on this one, but you better say "from any given code
word", or it's not clear what they differ from.
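
For what it's worth, the count in 2.7 is easy to confirm by brute
force once a fixed code word is named.  A Python sketch (the
particular n and alphabet size are just illustrative choices):

```python
# Check 2.7: among all n-position code values over an alphabet S, the
# number differing from a fixed code word in exactly m positions is
# C(n,m) * (|S|-1)**m.
from itertools import product
from math import comb

n, S = 4, range(3)          # 4 positions over a 3-symbol alphabet
base = (0,) * n             # any fixed code word works
counts = {}
for word in product(S, repeat=n):
    m = sum(a != b for a, b in zip(word, base))
    counts[m] = counts.get(m, 0) + 1

for m in range(n + 1):
    assert counts[m] == comb(n, m) * (len(S) - 1) ** m
print(counts)   # {0: 1, 1: 8, 2: 24, 3: 32, 4: 16}
```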

Below is the proof we were talking about.  I had said of this, "the
proof needs work.  First justify the binomial distribution, then show
this implies your result."

: > 2.9  THEOREM:  (Average distance and distribution.)  The expected
: > number of elements which differ between two n-position code values
: > is n * (1 - 1/|S|), and the distribution is binomial.
: >
: >      (Proof:  Assume the number of code differences is the binomial
: >                                             (n)   m   n-m
: >      probability of a difference B(m;n,p) = (m)  p   q   ,
: >      where p = 1 - 1/|S| and q = 1-p, times the total number of
: >                       n
: >      code values (1/q) :
: >
: >      (n)          m   (n)  m  n-m   -n
: >      (m) * (|S|-1)  = (m) p  q     q
: >
[Terry added:]
:        ^                ^
:        |                |
:        This is 2.7      This is the binomial equation

: >                       (n)            m
: >                     = (m) (p / (1-p))
: >
:                     m                  m
:              (|S|-1)  =     (p / (1-p))

:               |S|-1   =      p / (1-p)

:                       = (1-1/|S|) / (1 - (1-1/|S|))

:                       = (1-1/|S|) / (1 - 1 + 1/|S|)

:                       = (1-1/|S|) / (1/|S|)

:                       = |S| (1-1/|S|)

:                       = |S| - |S|/|S|

:               |S|-1   = |S| - 1

:                         Look, Ma!  They're the same!

The first line of your proof asks us to assume the theorem is true.
From this assumption you derive something known to be true.  So ?
If you derive the result from something true, that's a proof; if you
derive something false from the assumption that the theorem is false,
that's also a proof.  But showing that the theorem implies something
true is not a proof.  I tell my students to be clear about whether
each line is something you know or something you want to prove.

:  This proof (2.9) looks good to me as it stands.
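
Whatever the status of the proof, the expectation claimed in 2.9 does
check out numerically.  A Python sketch, averaging over all ordered
pairs of code values, each taken once, as you describe (n and |S|
are small so the full enumeration is feasible):

```python
# Check 2.9: the mean number of differing positions between two
# n-position code values, over all ordered pairs, is n * (1 - 1/|S|).
from itertools import product
from fractions import Fraction

n, S = 3, range(4)                  # 3 positions, |S| = 4
words = list(product(S, repeat=n))

total = Fraction(0)
for a in words:
    for b in words:
        total += sum(x != y for x, y in zip(a, b))
mean = total / len(words) ** 2      # exact arithmetic, no rounding

assert mean == Fraction(n) * (1 - Fraction(1, len(S)))
print(mean)   # 9/4
```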

: >:  4.7  THEOREM:  (Probable change propagation.)  Any change whatsoever
: >:  to the input value to a complete invertible substitution is likely
: >:  to change about half the bits in the output value.
: >
: >:       (Proof:  Changing the input value selects among all remaining
: >:       output code values.  If the output is considered to be binary
: >:       bits, we expect about half those bits to change (2.9).)
: >
: >Not true.  The identity function is a complete invertible permutation.
: >I can give you many input changes which result in significantly less than
: >half the output bits changing.
: >
: >You need some idea of probability here.

:  I imply that in "expect."  I take the expectation to be the
:  arithmetic average over all possibilities, each one taken once.

You don't even say "expect" in the theorem, only the proof.  Without
the idea of probability you can't assume all the remaining output
codes are equally likely.

:  It may be that one could be more explicit about the exact
:  expectation, but an exact expectation is not what we normally
:  mean by "avalanche."  For example, we do not see measurements
:  which say "this construct achieves only 83.2 percent of the
:  expected avalanche."

:  With the understanding of the term "expect," I think the proof
:  of 4.7 is satisfactory as it stands.

: >For example:
: >
: >THEOREM: Given n-bit strings a and b with a <> b, if P(x) is chosen
: >(independently of (x,y) ) from a uniform distribution of all
: >permutations of the set of n-bit strings, then the expected Hamming
: >weight of P(a) xor P(b) is:
: >
: >       2**(n-1) / (2**n -1)

In a followup I had to correct this: I lost a factor of n.  The
right expectation is n * 2**(n-1) / (2**n - 1).
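
With the factor of n restored, the theorem can be checked
exhaustively for tiny n; a Python sketch (n = 2, so all 24
permutations of the 2-bit codewords can be enumerated, with exact
arithmetic via fractions):

```python
# Average, over ALL permutations P of the n-bit codewords, of the
# Hamming weight of P(a) xor P(b) for a fixed pair a != b.  The
# theorem says this equals n * 2**(n-1) / (2**n - 1).
from itertools import permutations
from fractions import Fraction

n = 2                       # small enough to enumerate every permutation
codes = range(2 ** n)
a, b = 0, 1                 # any fixed pair with a != b

total = Fraction(0)
count = 0
for P in permutations(codes):
    total += bin(P[a] ^ P[b]).count("1")
    count += 1
mean = total / count

assert mean == Fraction(n * 2 ** (n - 1), 2 ** n - 1)
print(mean)   # 4/3
```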

:  I specifically do not want to get into defining "choosing from
:  a uniform distribution" since that implies a random choice, and
:  then we get to define "random."  The proofs that ran into real
:  problems did so over that word.  While this could be fixed, any
:  such approach must actually be an indirect way of describing the
:  expectation of an input distribution which I always take to be
:  uniform.  Instead of defining a choice "at random," we can
:  instead choose every possibility and report the results.

No, you don't have to define "random".  My theorem is in accepted,
precise mathematical language.  You got into trouble trying to define
the term differently.

: >:  4.8  DEFINITION:  AVALANCHE is a statistical property of a discrete
: >:  function in which any change whatsoever on the input is expected to
: >:  produce a change in about half the bits in the output value.
: >
: >This one is going to be hard to fix.

:  It's a *definition*: there is nothing to fix.

It's a bad definition.  What discrete functions have this property ?

:  I suppose we could be more explicit in defining the expectation,
:  but a particular explicit expectation is not the conventional
:  quality one associates with avalanche.

: >For any permutation on n-bit
: >strings (n>0) there exists some input change which will flip just one
: >output bit.  Proof: for permutation P use the change:
: >            P ; flip-first-bit ; inverse-P.

:  Yes, I would say this is known.  For a complete 8-bit substitution,
:  there are in fact eight.  See 2.8:

: > 2.8  EXAMPLE:  The number of 8-bit binary codes which differ in m
: > bits is:
: >
: >      distance     count
: >            0         1
: >            1         8
:                        ^--------- see!
: >            2        28
: >            3        56

Well, here's another:
        P; if first bit is one, flip-second-bit, 
           otherwise flip-third-bit; inverse-P.

Did you count that one among your 8 ?
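
Both constructions are easy to check by brute force: composing P, a
one-bit flip, and inverse-P gives an input change whose output
differs from P(x) in exactly one bit, for every input x.  A Python
sketch (n = 4 is an arbitrary small choice):

```python
# Verify both "input change" constructions against a random permutation.
import random

n = 4
codes = list(range(2 ** n))
perm = codes[:]
random.shuffle(perm)                      # an arbitrary permutation P
P = dict(zip(codes, perm))
invP = {v: k for k, v in P.items()}

def C1(x):                                # P ; flip-first-bit ; inverse-P
    return invP[P[x] ^ 1]

def C2(x):                                # P ; if first bit is one, flip
    y = P[x]                              # second bit, else third ; inverse-P
    return invP[y ^ (2 if y & 1 else 4)]

for x in codes:
    for C in (C1, C2):
        assert x != C(x)                               # a genuine input change
        assert bin(P[x] ^ P[C(x)]).count("1") == 1     # one output bit flips
print("ok")
```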

(Hmm, given a permutation P(x) on (n-bit) codewords, how many
derangements D(x) of the codewords have the property that for all
codewords x, P(x) differs from P(D(x)) in exactly one of the n bit
positions ? )

Maybe you had better tell us exactly what is in the set of "any
changes".

More to the point, if there is always some change that induces a
one-bit change in the output of P, then no P on codewords of more
than 2 bits can fulfill your definition of having the avalanche
property.

: >:  4.9  THEOREM:  (Avalanche is automatic.)  Avalanche is an inherent
: >:  property of complete invertible substitution.
: >
: >:       (Proof:  See 4.5, 4.7, and 2.9.)
: >
: >Unfortunately 4.7 is incorrect, and the definition of avalanche is
: >nonsense.

:  The *proof* of 4.7 may be insufficient, but the *fact* of 4.7 *is*
:  correct.  These are not just arbitrary abstractions, but instead
:  statements about specific reality.  The issue is to write them so
:  they will be accepted on an abstract level.

To write them so they are well-defined and true.

:  Similarly, it may well be that the definition of avalanche could
:  be improved, but the current ad hoc definition is good enough to
:  understand, and is consistent with the usual definition.

:  This (4.9) looks OK to me.

You admit that 4.7 is talking about the average over the set of all
permutations of n-bit codewords.  You can't apply it to an arbitrary
fixed permutation.

: >4.10 and 4.11 again need to be defined for a probability distribution.

:  I take "expectation" to be the arithmetic average over all
:  possibilities; that *is* a distribution.  These (4.10, 4.11) seem
:  OK to me.

: >:  4.14  THEOREM:  (Reconstruction requires information linking output
: >:  values to input values.)  An unknown invertible substitution cannot
: >:  be resolved without simultaneous information about both the input
: >:  value or position and the output value.
: >
: >:       (Proof:  To the extent that a particular substitution can
: >:       be said to have an identity, that identity is the relation
: >:       between substitute values and their position.  This relation
: >:       is both necessary and sufficient to define the substitution.)
: >
: >Say what?   [<-Bryan]

:  I *intend* to say that a substitution is exclusively a relation
:  between input and output values.  If we need to resolve a
:  particular substitution (out of all possible invertible
:  substitutions), we must have simultaneous information about both
:  the input values and the output values.  In this way I try to
:  extend the definition of substitution to imply a limit on
:  cryptanalysis.

:  There is a temptation to say that actual values must be known,
:  which is probably wrong, and which I avoid.

:  I'm not sure what else would be clearer here, but perhaps 4.14
:  could be re-worded.

Well, if you want to know how much information you need to determine
one permutation on the set of n-bit codewords, given that all such
permutations are equally likely, it's approximately:

                n * 2**n  bits.

Don't take the large size of this number as good news.  It means that
when we design ciphers, we have to deal with that very small subset of
permutations which can be specified in a reasonable amount of space.
Numbers we get from averaging over all permutations may not apply to
that small subset.

: >:  5.2  THEOREM:  An invertible substitution is a bit-mixer.
: >
: >May or may not be.

:   >5.1  DEFINITION:  A BIT-MIXER combines multiple input bits such
:   >that each output value is defined by each and every input bit.

:  OK, you might like "function of" or a more explicit rendering.
:  Still, the point is clear.  The output value from a substitution
:  is "dependent upon" or "a function of" each and every input bit.

Are you saying that ALL invertible permutations are bit mixers ?
That there exists some invertible permutation which is a bit mixer ?
That most invertible permutations are bit mixers ?
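
Under one reading of 5.1 (my assumption: output bit i "depends on"
input bit j iff flipping input bit j can change output bit i), the
identity map answers the first question in the negative.  A Python
sketch:

```python
# The identity map is an invertible substitution, yet each of its
# output bits depends on exactly one input bit -- so it is not a
# bit-mixer under this reading of 5.1.
n = 3

def influence(f, n):
    """M[j][i] is True iff flipping input bit j can change output bit i."""
    M = [[False] * n for _ in range(n)]
    for x in range(2 ** n):
        for j in range(n):
            d = f(x) ^ f(x ^ (1 << j))
            for i in range(n):
                if (d >> i) & 1:
                    M[j][i] = True
    return M

M = influence(lambda x: x, n)          # the identity permutation
assert M == [[j == i for i in range(n)] for j in range(n)]
print(M)   # [[True, False, False], [False, True, False], [False, False, True]]
```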

:  My principal assumption is that an invertible substitution is the
:  ideal block cipher, and this is consistent with the usual analysis.
:  If an ideal block cipher is an invertible substitution, it must
:  behave like an invertible substitution, and have the same bit-change
:  distribution.  And, to the extent that the real block cipher
:  approaches the ideal, it must behave in the same way.

Perhaps most of the invertible substitutions on n-bit codewords for a
given blocksize n (say, greater than 64) are good block ciphers in the
sense that they are secure, but most are terrible because you can't
write the program that computes them in any reasonable amount of
space.