Path: illuminati.io.com!uunet!cs.utexas.edu!not-for-mail
From: ritter@indial1.io.com (Terry Ritter)
Newsgroups: sci.crypt
Subject: Re: Toward Axiomatic Fenced DES (long!)
Date: 5 Jun 1994 14:09:22 -0500
Organization: UTexas Mail-to-News Gateway
Lines: 392
Sender: nobody@cs.utexas.edu
Message-ID: <199406051905.OAA28522@indial1.io.com>
NNTP-Posting-Host: news.cs.utexas.edu


 In <2sjjjg$he0@news.umbc.edu> olson@umbc.edu (Bryan G. Olson;
 CMSC (G)) responds to my Fenced DES proof attempts:

>[...]
>: The Proofs
>
>Please don't call them theorems and proofs if they are not.

 Please don't nit-pick new work to death. Few of these statements are
 actually false; in some cases the wording needs to be refined. Of
 course, some *are* defined so poorly that--upon reflection--they are
 probably not worth having.

 The concept of mathematical "proof" is subjective; we don't have a
 global automatic theorem-checker. We can't just toss this stuff in a
 machine and have it decide yes or no, and where the weakness lies.
 If I had such a thing, my proofs would pass it.

 I obviously get complaints, but what, no compliments? No comments on
 my proofs of the example Block Mixing Transforms? No compliments for
 finding provable properties which can be used to expand the width of
 block ciphers as components?

 If you have been following this process, you know it has involved a
 serious amount of effort, trial-and-error, and public failure.
 Success was by no means assured. When it occurs, it should be
 uplifting to all of us.

>: 2.9 THEOREM: (Average distance and distribution.) The expected
>: number of elements which differ between two n-position code values
>: is n * (1 - 1/|S|), and the distribution is binomial.
>
>You need to state the distribution of code value pairs in order for
>the expected value you talk about to be defined. Also, the proof
>needs work. First justify the binomial distribution, then show
>this implies your result.
 In all cases I take the "expectation" to be the mean or arithmetic
 average over all possibilities, each one taken once. Ironically, the
 critical comments themselves state:

>When no
>distribution is given, we generally assume the uniform distribution.

 Here, that means all possible code value pairs.

 As for the proof, first note that 2.7 is available:

> 2.7 THEOREM: (Position difference counts.) The number of
> n-position code values which differ in exactly m positions is
>
>    C(n,m) * (|S|-1)**m .
>
> (Proof: There are C(n,m) combinations of m positions out of n
> possible positions, and in any particular combination of m
> positions each position can take on |S|-1 other symbols
> producing (|S|-1)**m other code values for each combination.)
>[...]
>
> 2.9 THEOREM: (Average distance and distribution.) The expected
> number of elements which differ between two n-position code values
> is n * (1 - 1/|S|), and the distribution is binomial.
>
> (Proof: Assume the number of code differences is the binomial
> probability of a difference B(m;n,p) = C(n,m) p**m q**(n-m),
> where p = 1 - 1/|S| and q = 1-p, times the total number of
> code values (1/q)**n :
>
>    C(n,m) * (|S|-1)**m  =  C(n,m) p**m q**(n-m) q**(-n)
>    ^                       ^
>    |                       |
>    This is 2.7             This is the binomial equation
>
>                         =  C(n,m) (p / (1-p))**m

    (|S|-1)**m = (p / (1-p))**m

    |S|-1 = p / (1-p)
          = (1-1/|S|) / (1 - (1-1/|S|))
          = (1-1/|S|) / (1 - 1 + 1/|S|)
          = (1-1/|S|) / (1/|S|)
          = |S| (1-1/|S|)
          = |S| - |S|/|S|
    |S|-1 = |S| - 1

 Look, Ma! They're the same! The proven expression from 2.7 is shown
 to be the binomial equation in drag. This "justifies" the contention
 that the distribution is binomial. The binomial expectation is np,
 which is n * (1 - 1/|S|).

> which is correct, so the expected number of different elements
> is the binomial expectation np.)

 This proof (2.9) looks good to me as it stands.

>[...]
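 The counting in 2.7 and the mean in 2.9 are easy to check
 exhaustively for small parameters. A brute-force sketch (my own
 illustration, not part of the original derivation; n = 4 and
 |S| = 3 are arbitrary small values) that tallies distances over all
 code value pairs:

```python
# Brute-force check of 2.7 and 2.9 for small parameters (n = 4, |S| = 3).
from itertools import product
from math import comb

n, S = 4, 3
codes = list(product(range(S), repeat=n))   # all |S|**n code values

counts = [0] * (n + 1)                      # pairs at each distance m
dist_sum = 0
for a in codes:
    for b in codes:
        m = sum(x != y for x, y in zip(a, b))
        counts[m] += 1
        dist_sum += m

# 2.7: per code value, C(n,m) * (|S|-1)**m others differ in exactly
# m positions.
per_code = [c // len(codes) for c in counts]
assert per_code == [comb(n, m) * (S - 1) ** m for m in range(n + 1)]

# 2.9: the mean distance over all code value pairs is n * (1 - 1/|S|).
mean = dist_sum / (len(codes) ** 2)
assert abs(mean - n * (1 - 1 / S)) < 1e-9
print(per_code, mean)
```

 For these parameters the per-code counts come out [1, 8, 24, 32, 16]
 and the mean is 8/3, exactly n * (1 - 1/|S|).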
>: 3.2 DEFINITION: A RANDOM discrete function allows each output
>: code value to be selected independently for each possible input
>: condition.
>
>Bad definition. A function fixes the output value for each input
>value, so there is nothing to allow. A random function is one chosen
>at random from some distribution of possible functions. When no
>distribution is given, we generally assume the uniform distribution.

 Yes, I agree. Moreover, since the whole point of the "random" thing
 was to eventually argue that an invertible substitution is not a
 random function (4.12), which is then never used, it is probably not
 worth having. And this means that the whole of section 3 is probably
 not worth having.

>[...]
>: 3.3 THEOREM: (Number of random functions.) There are 2**2n
>: possible random functions with an n-bit binary input code and an
>: n-bit binary output code.
>
>: (Proof: An n-bit binary code can make 2**n possible
>: selections, each of which can be 2**n possible values,
>: and (2**n)*(2**n) = 2**2n.)
>
>Given a function there is no way to tell if it is random (under your
>definition or mine). A uniformly distributed random function from
>n-bit strings to n-bit strings is equally likely to take any of
>(2**n)**(2**n) different values.

 Let's just delete section 3 and also 4.12.

>[...]
>: 4.7 THEOREM: (Probable change propagation.) Any change whatsoever
>: to the input value to a complete invertible substitution is likely
>: to change about half the bits in the output value.
>
>: (Proof: Changing the input value selects among all remaining
>: output code values. If the output is considered to be binary
>: bits, we expect about half those bits to change (2.9).)
>
>Not true. The identity function is a complete invertible
>permutation. I can give you many input changes which result in
>significantly less than half the output bits changing.
>
>You need some idea of probability here.

 I imply that in "expect."
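 Reading "expect" as the arithmetic average over every possible pair
 of distinct inputs, the claim of 4.7 can be checked directly on a
 small invertible substitution. A sketch (my own illustration; the
 4-bit width and seed are arbitrary):

```python
# Average output-bit changes over ALL distinct input pairs of a small
# invertible substitution (a permutation of 4-bit values).
import random

n = 4
table = list(range(2 ** n))
random.seed(1)
random.shuffle(table)    # any permutation gives the same average

total = pairs = 0
for a in range(2 ** n):
    for b in range(2 ** n):
        if a != b:
            total += bin(table[a] ^ table[b]).count("1")
            pairs += 1

avg = total / pairs
# Exact value for ANY permutation: n * 2**(n-1) / (2**n - 1),
# which is just over n/2.
assert abs(avg - n * 2 ** (n - 1) / (2 ** n - 1)) < 1e-9
print(avg)
```

 Because a permutation only relabels the output values, the sum of
 distances over all distinct pairs is the same for every permutation
 (identity included); individual pairs can of course differ in just
 one bit, which is the point of Olson's counterexample.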
 I take the expectation to be the arithmetic average over all
 possibilities, each one taken once. It may be that one could be more
 explicit about the exact expectation, but an exact expectation is
 not what we normally mean by "avalanche." For example, we do not see
 measurements which say "this construct achieves only 83.2 percent of
 the expected avalanche."

 With this understanding of the term "expect," I think the proof of
 4.7 is satisfactory as it stands.

>For example:
>
>THEOREM: Given n-bit strings a and b with a <> b, if P(x) is chosen
>(independently of (x,y) ) from a uniform distribution of all
>permutations of the set of n-bit strings, then the expected Hamming
>weight of P(a) xor P(b) is:
>
>    2**(n-1) / (2**n - 1)

 I specifically do not want to get into defining "choosing from a
 uniform distribution," since that implies a random choice, and then
 we get to define "random." The proofs that ran into real problems
 did so over that word. While this could be fixed, any such approach
 must actually be an indirect way of describing the expectation over
 an input distribution, which I always take to be uniform. Instead of
 defining a choice "at random," we can instead choose every
 possibility and report the results.

 [...]

>: 4.8 DEFINITION: AVALANCHE is a statistical property of a discrete
>: function in which any change whatsoever on the input is expected
>: to produce a change in about half the bits in the output value.
>
>This one is going to be hard to fix.

 It's a *definition*: there is nothing to fix. I suppose we could be
 more explicit in defining the expectation, but a particular explicit
 expectation is not the conventional quality one associates with
 avalanche. Maybe someone has a suggestion for an alternative
 definition of "avalanche," which might well affect the subsequent
 proofs. Right now, I see no reason to change it (4.8).

>For any permutation on n-bit
>strings (n>0) there exists some input change which will flip just
>one output bit.
>Proof: for permutation P use the change:
>    P ; flip-first-bit ; inverse-P.

 Yes, I would say this is known. For a complete 8-bit substitution,
 there are in fact eight. See 2.8:

> 2.8 EXAMPLE: The number of 8-bit binary codes which differ in m
> bits is:
>
>    distance   count
>        0          1
>        1          8   ^--------- see!
>        2         28
>        3         56
>        4         70
>        5         56
>        6         28
>        7          8
>        8          1
>                 ---
>                 256   = 2**8
>
> (Comment: There are 256 8-bit binary code values, and 255
> values which differ in at least one position from any
> particular code value.)
>
>[...]
>: 4.9 THEOREM: (Avalanche is automatic.) Avalanche is an inherent
>: property of complete invertible substitution.
>
>: (Proof: See 4.5, 4.7, and 2.9.)
>
>Unfortunately 4.7 is incorrect, and the definition of avalanche is
>nonsense.

 The *proof* of 4.7 may be insufficient, but the *fact* of 4.7 *is*
 correct. These are not just arbitrary abstractions, but instead
 statements about specific reality. The issue is to write them so
 they will be accepted on an abstract level.

 Similarly, it may well be that the definition of avalanche could be
 improved, but the current ad hoc definition is good enough to
 understand, and is consistent with the usual definition.

 This (4.9) looks OK to me.

>4.10 and 4.11 again need to be defined for a probability
>distribution.

 I take "expectation" to be the arithmetic average over all
 possibilities; that *is* a distribution. These (4.10, 4.11) seem OK
 to me.

>[...]
>: 4.12 THEOREM: (Not a random function.) An invertible substitution
>: cannot be a random function.
>
>: (Proof: Suppose a value is selected for placement somewhere
>: in a substitution. Since an invertible substitution cannot
>: allow another occurrence of that same value, other values
>: cannot be selected independently.)
>
>Nope. As much as it's defined it's incorrect.
>
>If a function f(x) is chosen from a uniform distribution of all
>functions from n-bit strings to n-bit strings, the probability that
>f(x) is an invertible substitution (permutation) is:
>
>    (2**n)!
>    / (2**n)**(2**n).

 I suggest we just throw out 4.12.

>[...]
>: 4.14 THEOREM: (Reconstruction requires information linking output
>: values to input values.) An unknown invertible substitution
>: cannot be resolved without simultaneous information about both
>: the input value or position and the output value.
>
>: (Proof: To the extent that a particular substitution can
>: be said to have an identity, that identity is the relation
>: between substitute values and their position. This relation
>: is both necessary and sufficient to define the substitution.)
>
>Say what?

 I *intend* to say that a substitution is exclusively a relation
 between input and output values. If we need to resolve a particular
 substitution (out of all possible invertible substitutions), we must
 have simultaneous information about both the input values and the
 output values. In this way I try to extend the definition of
 substitution to imply a limit on cryptanalysis. There is a
 temptation to say that actual values must be known, which is
 probably wrong, and which I avoid. I'm not sure what else would be
 clearer here, but perhaps 4.14 could be re-worded.

>[...]
>: 5.2 THEOREM: An invertible substitution is a bit-mixer.
>
>May or may not be.
>
>5.1 DEFINITION: A BIT-MIXER combines multiple input bits such
>that each output value is defined by each and every input bit.

 OK, you might like "function of" or a more explicit rendering.
 Still, the point is clear. The output value from a substitution is
 "dependent upon" or "a function of" each and every input bit.
 However, upon reflection, I think the definition is insufficient to
 capture my real intent, so this is something I will have to work on.

 Note that the entire purpose for this line of reasoning is reflected
 in the comment of 12.2:

> (Comment: DES with a known key is an example of a block
> mixing transform with absolutely no strength at all by itself,
> which nevertheless adds strength through bit mixing.)
 which is significant because of previous questions of whether a
 mixer without any "strength" can have any cryptographic advantage.
 This development also justifies the examination of Block Mixing
 Transforms which have no "strength" but nevertheless mix.

>You need to be much more careful in your definitions and proofs.

 Probably. It would be nice to be able to write proofs which would
 convince mathematicians, but my main goal is to express reality. If
 the concepts are true, they can be placed in a good format.

>I'm
>afraid if you clean up the early stuff, it won't be strong enough to
>prove the later stuff. You can show bit-change probabilities for
>random permutations, but then you can't apply them to your block
>operations without making unproven assumptions.

 I don't think things are nearly that bad.

 I prove the bit-change distribution of an invertible substitution
 over all possible input changes, something which might be considered
 fairly obvious, but I don't see it in the texts. My principal
 assumption is that an invertible substitution is the ideal block
 cipher, and this is consistent with the usual analysis. If an ideal
 block cipher is an invertible substitution, it must behave like an
 invertible substitution, and have the same bit-change distribution.
 And, to the extent that the real block cipher approaches the ideal,
 it must behave in the same way.

 I also give specific proofs for the example Block Mixing Transform
 which allow it to be included in the reasoning.

 Given this development, we can reason about the output distribution
 over all possible inputs at every level of the Fenced DES cipher.
 Then we can reason about the combination of these levels in the
 final cipher, with results that show the essence of a block cipher.

 At this point, I claim a block cipher is a block cipher because it
 is invertible and avalanches every input bit. The first is assumed
 by construction, and the second is something we can measure (to some
 extent).
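 That per-input-bit avalanche measurement can be sketched on a toy
 cipher (my own illustration: an 8-bit permutation stands in for the
 real block cipher, and the seed is arbitrary):

```python
# For each input bit, average the number of output bits that flip over
# all inputs of a toy 8-bit invertible substitution.
import random

n = 8
table = list(range(2 ** n))
random.seed(2)
random.shuffle(table)    # stand-in for the cipher under test

for bit in range(n):
    flips = [bin(table[x] ^ table[x ^ (1 << bit)]).count("1")
             for x in range(2 ** n)]
    avg = sum(flips) / len(flips)
    # A good substitution should flip roughly n/2 = 4 output bits
    # on average for every input bit.
    assert 2.0 < avg < 6.0
    print(f"input bit {bit}: avg output bits changed = {avg:.2f}")
```

 The same loop, pointed at a real cipher's encryption function
 instead of a toy table, is the measurement being claimed: flip one
 input bit at a time, average over inputs, and compare to n/2.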
 That's all I know a block cipher must be. If I had other
 requirements for a block cipher, then I could try to prove those as
 well.

 Terry