Measured Boolean Function Nonlinearity in Variable Size Block Ciphers

A Ciphers By Ritter Page

Terry Ritter

Nonlinearity is the number of bits which must change in the truth table of a Boolean function to reach the closest affine function, and thus is a measure of one kind of cipher strength. Although we cannot measure the nonlinearity of a cipher of practical size, we can measure modest substitution tables and small block constructions. Since a random substitution table is the ideal model of a block cipher, we can measure many random tables and develop an ideal nonlinearity distribution. We can then measure block constructions of the same size and compare distributions. Experimental results indicate that VSBC constructions can produce nonlinearity distributions which are surprisingly close to the ideal. These experiments tend to contradict the claim that block ciphers which use VSBC constructions are inherently weak.
From: ritter@io.com (Terry Ritter)
Newsgroups: sci.crypt
Subject: Measured Nonlinearity in Variable Size Block Ciphers (LONG!)
Date: Sun, 18 Jan 1998 06:14:36 GMT
Lines: 690
Message-ID: <34c19dc1.27169161@news.io.com>


MEASURED NONLINEARITY IN VARIABLE SIZE BLOCK CIPHERS

Terry Ritter
Ritter Software Engineering
http://www.io.com/~ritter/

1998-01-17



Abstract

Nonlinearity is the number of bits which must change in the truth
table of a Boolean function to reach the closest affine function,
and thus is a measure of one kind of cipher strength.  Although
we cannot measure the nonlinearity of a cipher of practical size,
we *can* measure modest substitution tables and small block
constructions.  Since a random substitution table is the ideal
model of a block cipher, we can measure many random tables and
develop an ideal nonlinearity distribution.  We can then measure
block constructions of the same size and compare distributions.
Experimental results indicate that VSBC constructions can produce
nonlinearity distributions which are surprisingly close to the ideal.
These experiments tend to contradict the claim that block ciphers
which use VSBC constructions are inherently weak.


Introduction

The ideal block cipher is a keyed simple substitution table of
sufficient size.  Unfortunately, with 128-bit blocks, there would
be 2**128 entries in that table, which is completely out of the
question.  So the modern block cipher is a *construction* intended
to *simulate* a keyed substitution of the desired size.  At issue
is the effectiveness of the construction technology.  One way to
investigate this is by using Boolean function theory, since a
substitution table -- or cipher -- can be considered a set of
independent Boolean functions, one for each output bit.

A Boolean function produces a single-bit result for each possible
combination of values from perhaps many Boolean variables.  The
*nonlinearity* of a Boolean function is the Hamming distance to
the closest affine function [e.g., PIE88, PIE89, PIE89B].  (Also
see "Measuring Nonlinearity by Walsh Transform" by this author:

   http://www.io.com/~ritter/ARTS/MEASNONL.HTM

.)  That is, nonlinearity is the number of bits which must change
in the truth table of a Boolean function to reach the closest
affine function.  If "linearity" is considered a significant
cryptographic weakness, nonlinearity is an explicit measure of the
*lack* of that weakness.  So nonlinearity measures one form of
cipher "strength."

For cryptographic purposes, it is desired to take the nonlinearity
of a substitution table to be the *minimum* of the nonlinearity
values for each output bit in that table.  Nonlinearity is measured
by forming the (one-bit-wide) truth table for a particular output
bit, then performing a Fast Walsh-Hadamard Transform (FWT) on that
array.  Each result value is essentially a correlation count to a
particular affine function, and the minimum distance is found by
scanning the transform results.

In measuring nonlinearity, it is generally necessary to record the
function result for each possible combination of input variables.
If we have an "8-bit" table, we must record and then transform
256 elements, and if we have a "16-bit" table, we must record and
transform 64K elements.  Thus, measuring large functions rapidly
becomes impossible.  So, although we cannot hope to measure the
nonlinearity of a real 64-bit or 128-bit block cipher, we *can*
measure nonlinearity in substitution tables and small block
constructions.


Ideal Nonlinearity Distributions

By properly shuffling tables using a large-state cryptographic RNG
(in particular:

   http://www.io.com/~ritter/KEYSHUF.HTM

) we can sample a presumably uniform distribution of different table
arrangements.  We can measure the nonlinearity of each of these
tables, and accumulate the results.  For 4-bit tables, we get
something like that shown in Figure 1.


  Nonlinearity Distribution in 4-Bit Tables    Fig. 1

   0.6 |
   0.5 |     *  *
   0.4 |     *  *
   0.3 |     *  *
   0.2 |     *  *
   0.1 |     *  *
   0.0 |  *  *  *
  Prob +--+--+--+--
          0  2  4   Nonlinearity


In balanced Boolean functions, nonlinearity always takes even
values:  If we have two balanced functions at some distance from
each other and change one bit in the second function, the distance
between them will change by one bit, and the second function will
also no longer be balanced.  To balance the second function, we
have to change *another* bit, which either adds to the original
change, or cancels it out.  In the end, the distance between the
functions has changed by +2, -2, or 0 bits.

The 4-bit invertible substitutions are surprisingly weak: Despite
having 16 bits of Boolean function description, no 4-bit table was
found to be more than 4 bits away from an affine function, and
almost half were only 2 bits away.  A few of these tables
(probability 0.01) even contained a true affine function.

In contrast, the 8-bit tables have the opportunity to be far more
complex, since there are 256 bits in each 8-bit Boolean function.
This is shown in Figure 2.


  Nonlinearity Distribution in 8-Bit Tables    Fig. 2

  0.35  |                 *
  0.3   |                 *
  0.25  |              *  *
  0.2   |              *  *  *
  0.15  |              *  *  *
  0.1   |           *  *  *  *
  0.05  |        *  *  *  *  *
  0.00  |     *  *  *  *  *  *  *
  Prob  +--+--+--+--+--+--+--+--+--
             92    96    100   104   Nonlinearity


Some of the literature asserts that *random* S-Boxes will have
various good qualities provided the tables are "large enough"
[AY82, GOR82, OC91, OC93, YOU95B].  This leads to some anxiety
over the question of "How large is 'large enough'?"  In our
experiments, measurement of the nonlinearity of 1,000,000 random
8-bit tables found exactly one table with a nonlinearity as low
as 78; this is a 0.000001 probability (1 count in 1E6).  So not
only are random 8-bit tables with even a single affine coordinate
(nonlinearity = 0) *unlikely*, they are *so* unlikely as to be
almost impossible to find.  (Indeed, [GOR82] gives the probability
as 10**-72.)  So finding accidental "linearity" in random 8-bit
tables seems to be an issue only in theory: this is a non-issue
in practice.

Our block cipher construction problem essentially involves using
small random tables (here 4-bit tables with 16 entries each and a
mean nonlinearity of 3.0) to somehow emulate a table which is
three times as wide (here a 12-bit table of 4096 entries and a mean
nonlinearity of 1907.9).  Stated in this way, a solution seems quite
unlikely, for how can even 15 tables of nonlinearity 3 ever produce
a nonlinearity of 1908?


Variable Size Block Ciphers

A Variable Size Block Cipher (VSBC) (see, for example:

   http://www.io.com/~ritter/CRYPHTML.HTM#VSBCTech

and especially:

   http://www.io.com/~ritter/VSBCCORE.HTM

) differs from most other ciphers in its ability to handle blocks
of dynamically arbitrary size to the byte.  This is accomplished
with one-way mixing layers (see Figure 3) which guarantee that any
change to any input is reflected in all subsequent mixed bytes.
By using multiple such layers we can guarantee that any single-bit
change in the data will affect all columns of the ciphertext result.


  A One-Way Variable-Size Mixing Layer    Fig. 3

        |     |     |             |
        |     |     |             |
        +--->MIX-->MIX--> ... -->MIX----+
              |     |             |     |
              v     v             v     v


Figure 3 shows a single one-way mixing layer:  Each element value
(in practice, each element is typically a byte) enters at the top,
and the mixed results flow out the bottom.  The typical Variable
Size Block Cipher uses multiple layers of mixing, usually in
opposite directions, to produce high quality overall mixing in a
fixed-depth structure.

VSBC's generally use multiple substitution tables, each of which
is shuffled for primary keying.  (See again, for example:

   http://www.io.com/~ritter/KEYSHUF.HTM

.)  The issue here is whether this sort of structure can produce
higher nonlinearity values than the small substitution tables which
are the only nonlinear components.  Obviously, we would like to see
the result approach the ideal nonlinearity distribution for the
larger block.

Figure 4 shows a two-element VSBC construction:  The input block
enters at the top, is split into two and each half substituted.
Those results are mixed into two new block values; this is an
unusual invertible mixing which we call Balanced Block Mixing
(see, for example:

   http://www.io.com/~ritter/CRYPHTML.HTM#BBMTech

and especially:

   http://www.io.com/~ritter/BBM.HTM

).  After mixing, the results are substituted again.  There is
another mixing, another substitution and the two-element block
flows out the bottom.


    A Two-Column Toy VSBC    Fig. 4

        |     |
       SUB   SUB
        |     |
        +--->MIX----+
              |     |
             SUB   SUB
              |     |
        +----MIX<---+
        |     |
       SUB   SUB
        |     |
        v     v


We can compare the model of Figure 4 to the Mixing structure
investigated previously (see

   http://www.io.com/~ritter/ARTS/MIXNONLI.HTM

), as shown in Figure 5, and see that they turn out to be the
same, when reduced to this minimal structure.


   A Two-Element Mixing Cipher   Fig. 5

         |     |
        SUB   SUB
          \   /
           MIX
          /   \
        SUB   SUB
          \   /
           MIX
          /   \
        SUB   SUB
         |     |


The identity between the two-element-width Mixing cipher and the
toy VSBC of similar width means that the results from the earlier
investigation apply here:

   1. A 2-element-width construction with 3 mixing levels, using
   8 random 4-bit substitution tables, does appear to produce a
   nonlinearity distribution consistent with the ideal for an
   8-bit table.

   2. A 2-element-width construction with 2 mixing levels, using
   6 random 5-bit substitution tables, does appear to produce a
   nonlinearity distribution consistent with the ideal for a
   10-bit table.

On the other hand, the earlier results also make clear the inherent
weakness in 4-bit tables:  A 2-element-width construction with just
2 mixing levels, using 6 random 4-bit tables (the structure shown
in Figure 5) was clearly *not* ideal.  And in the current
investigation we *must* use 4-bit tables if we are to have a
reasonable amount of desktop computation.

Here we investigate whether a 12-bit VSBC will produce something like
the nonlinearity distribution of the ideal 12-bit cipher.  So first
we need to identify the ideal 12-bit nonlinearity distribution.


The 12-Bit Ideal

The ideal 12-bit nonlinearity distribution was constructed from
three different 12-hour trials of 100,000 random 12-bit tables each.
(We really would like to have trials at least 10 times this size,
but that would mean three 5-day runs for me, so the current result
should be thought of as a decent guess.)  Each ideal value is near
the average of the three trials, weighted toward any two similar
values, with the counts reduced by a factor of 10 and rounded to
an integer, as shown in Figure 6.


The Ideal 12-Bit Nonlinearity Distribution   Fig. 6

    NL     Trial 1   Trial 2   Trial 3     Ideal

  1838           1         0         0         0
  1840           0         0         0         0
  1842           0         0         0         0
  1844           0         0         2         0
  1846           3         0         0         0
  1848           0         0         2         0
  1850           3         0         0         0
  1852           2         2         1         0
  1854           2         2         2         0
  1856           2         4         2         0
  1858           2         0         4         1
  1860          10        14        12         0
  1862          14        10         6         1
  1864          16        11        14         2
  1866          32        28        17         3
  1868          29        28        32         3
  1870          27        52        58         5
  1872          68        64        66         7
  1874          92        84        98         9
  1876         119       134       127        12
  1878         185       162       197        19
  1880         253       247       249        25
  1882         323       353       312        32
  1884         488       467       507        48
  1886         652       643       674        65
  1888         926       885       915        91
  1890        1152      1185      1207       118
  1892        1506      1566      1660       157
  1894        2094      2132      2108       211
  1896        2762      2772      2751       276
  1898        3711      3643      3552       364
  1900        4596      4572      4665       460
  1902        5614      5753      5624       563
  1904        6974      7050      7022       702
  1906        8158      8210      8136       816
  1908        9395      9489      9313       939
  1910       10155     10101     10132      1013
  1912       10414     10276     10306      1032
  1914        9584      9482      9693       958
  1916        8092      8062      8105       809
  1918        6020      5915      5866       593
  1920        3701      3701      3715       370
  1922        1855      1903      1839       186
  1924         697       753       766        74
  1926         214       201       201        20
  1928          48        39        40         4
  1930           8         5         3         1
  1932           1         0         0         0



The Toy Model

A reasonable first step to investigating VSBC nonlinearity is to
extend the 2-element structure of Figure 4 into the 3-element
structure shown in Figure 7.


   A Three-Column Toy Variable-Size Block Cipher    Fig. 7

        |     |     |
       SUB   SUB   SUB
        |     |     |
        +--->MIX-->MIX----+
              |     |     |
             SUB   SUB   SUB
              |     |     |
        +----MIX<--MIX<---+
        |     |     |
       SUB   SUB   SUB
        |     |     |
        v     v     v


Here, for the first time, the one-way characteristic of the VSBC
mixing layers becomes apparent.  In the figure, three 4-bit elements
enter the top, are substituted and mixed, and the resulting ciphertext
flows out the bottom.  The overall result is a keyed permutation;
a simulated large simple substitution table.


Toy Model Test

In each trial, we first initialize all the substitutions from a
random key.  This produces a simulated 12-bit simple substitution
which we then measure for nonlinearity.  At first we will perform
just a small trial of 1,000 different 12-bit VSBC initializations,
and this result is shown in Figure 8.


  Chi-Square For 12-Bit VSBC   Fig. 8

    NL     Expect     Found

   1408        0         1
   1472        0         2
   1536        0        10
   1568        0        32
   1600        0        52
   1632        0        90
   1664        0       164
   1696        0       216
   1712        0         1
   1728        0       215
   1736        0         1
   1756        0         1
   1760        0       150
   1788        0         1
   1792        0        56
   1824        0         8


The figure shows the results of measuring 1,000 randomly-keyed
constructions, and *not* *one* of these values is as high as the
*minimum* value (typically 1850) we might expect from the ideal
12-bit distribution.  These results are so far from the expectation
as to scarcely require computation:  a computed chi-square total
would be something like 100,000.  Thus, the toy VSBC model of
Figure 7 does *not* even *begin* to approach the distribution we
want from an ideal cipher.


The Full Model

It is known from previous work that a two-level VSBC mixing is weak,
so the next step is to extend the structure of Figure 7 to have
2 one-way mixing levels in each direction, as shown in Figure 9.
Again, we must use the weak 4-bit tables to make the computation
reasonable.


  A Three-Column Variable-Size Block Cipher    Fig. 9

        |     |     |
       SUB   SUB   SUB
        |     |     |
        +--->MIX-->MIX----+
              |     |     |
             SUB   SUB   SUB
              |     |     |
              +--->MIX-->MIX----+
                    |     |     |
                   SUB   SUB   SUB
                    |     |     |
              +----MIX<--MIX<---+
              |     |     |
             SUB   SUB   SUB
              |     |     |
        +----MIX<--MIX<---+
        |     |     |
       SUB   SUB   SUB
        |     |     |
        v     v     v


As before, three 4-bit elements enter the top, are substituted and
mixed multiple times, and the resulting ciphertext flows out the
bottom.

(In practice, a working design would use the vastly stronger 8-bit
tables, and would also use dynamic table selection, so that the same
tables would only rarely occur in the same ciphering position.  But
neither of those improvements are used in these tests.)


Full Model Tests

In each trial, we first initialize all the substitutions from a
random key.  This produces a particular 12-bit cipher -- a simulated
12-bit simple substitution.  We then traverse the complete
transformation -- all 4096 possible data values -- and collect all
4096 12-bit ciphertext results.  We then traverse all 12 ciphertext
bit-columns, one-by-one, and perform a nonlinearity computation.
The minimum nonlinearity value found across all columns is then
accumulated into the growing distribution.  Here we will perform
full trials of 10,000 different 12-bit VSBC initializations each,
with one such trial shown in Figure 10.


  Chi-Square For 12-Bit VSBC   Fig. 10

    NL     Expect     Found       Chi-Sq   DF

   1850        0         0      }
   1852        0         2      }
   1854        0         0      }
   1856        0         0      }
   1858        1         2      }
   1860        0         0      }
   1862        1         0      }
   1864        2         1      }
   1866        3         4      }
   1868        3         2      }  0.100   0
   1870        5         7      }
   1872        7         5      }  0.100   1
   1874        9         5      }
   1876       12        16      }  0.100   2
   1878       19        28         4.363   3
   1880       25        25         4.363   4
   1882       32        46        10.488   5
   1884       48        42        11.238   6
   1886       65        85        17.392   7
   1888       91        77        19.546   8
   1890      118       126        20.088   9
   1892      157       153        20.190  10
   1894      211       214        20.233  11
   1896      276       263        20.845  12
   1898      364       346        21.735  13
   1900      460       453        21.842  14
   1902      563       576        22.142  15
   1904      702       719        22.554  16
   1906      816       813        22.565  17
   1908      939       947        22.633  18
   1910     1013      1030        22.918  19
   1912     1032       973        26.291  20
   1914      958       967        26.376  21
   1916      809       816        26.436  22
   1918      593       588        26.478  23
   1920      370       362        26.651  24
   1922      186       199        27.560  25
   1924       74        81        28.222  26
   1926       20        24      } 28.382  27
   1928        4         3      }
   1930        1         0      }
   1932        0         0      }
   1934        0         0      }
   1936        0         0      }
   1938        0         0      }


With 27 "degrees of freedom," the chi-square value of 28.382
found in Figure 8 is very believable.  The critical chi-square
values for DF = 27 are shown in Figure 11.


 Chi-Square Critical Values for DegFree = 27   Fig. 11

       1%      5%     25%     50%     75%     95%     99%
   12.878  16.151  21.749  26.336  31.528  40.113  46.962


Additional trials were performed (at 1.4 hours each), producing
the overall chi-square results of Figure 12.


 Chi-Square Values for 10,000-Cipher Trials    Fig. 12

   11.362   13.933   17.418   17.822   19.520
   20.052   20.954   20.992   21.024   21.084
   21.415   21.936   22.629   22.913   24.171
   24.224   24.291   25.306   26.263   26.781
   27.754   28.382   28.619   29.252   30.271
   31.369   31.388   32.555   33.015   33.367
   33.990   34.883   35.674   36.171   37.987
   39.853   39.980   43.197   45.514


Most of the collected values seem reasonable.

One value *is* unexpectedly low, meaning that the sample distribution
was unexpectedly *close* to the ideal.  The actual probability of
finding a chi-square value of 11.367 (by random sampling the ideal
distribution) with 27 degrees of freedom is 0.0036.  This is about
4 in 1000, so we would *expect* to get something like this if we
would just conduct 278 trials.  But random trials do not know how
many trials have passed before, and the fact is that the value
occurred early.  There is no preponderance of such occurrences.

Also, in this data set, the tails seem to be represented more
frequently than the center.

Despite these variations, all but one of the values seem at least
reasonable, and -- as we saw previously -- chi-square values in this
particular range do not repeatedly occur by chance alone.  We are
thus forced to conclude that the sampled distribution is reasonably
close to the constructed ideal distribution for random 12-bit
substitutions.


Conclusions

From these results we conclude that the 3-element VSBC construction
using 15 random 4-bit tables does in fact produce nonlinearity levels
and distributions *remarkably* *similar* to those of an ideal 12-bit
cipher.  These experiments thus support the VSBC approach to block
cipher design, and do *not* support the claim that such ciphers are
inherently weak.

Note that the ability to perform these experiments is based on the
"scalability" of the cipher design.  This is an important advantage
that very few designs offer:  Scalability supports far deeper
experimental analysis than is possible in a full-size cipher.

I now believe that the inability of a cipher to support the deep
analysis available to a scaled design is nothing less than a flaw
in the cipher design itself.  There is no theory of cipher strength
such that, if we only follow those rules, we will build a "strong"
cipher.  Nor can we hope to thoroughly review any large cipher
experimentally.  It would seem that the only way we can hope to
find completely unexpected strength problems is to *scale* the
cipher to a size which we *can* address experimentally.  Since
large ciphers cannot be thoroughly reviewed, they also cannot be
fully trusted.


References and Bibliography

[AY82]  Ayoub, F.  1982.  Probabilistic completeness of
substitution-permutation encryption networks.  IEE Proceedings,
Part E.  129(5): 195-199.

[DAE94]  Daemen, J., R. Govaerts and J. Vandewalle.  1994.
Correlation Matrices.  Fast Software Encryption.  275-285.

[FOR88]  Forre, R.  1988.  The Strict Avalanche Criterion:
Spectral Properties of Boolean Functions and an Extended
Definition.  Advances in Cryptology -- CRYPTO '88.  450-468.

[GOR82]  Gordon, J. and H. Retkin.  1982.  Are Big S-Boxes Best?
Cryptography.  Proceedings of the Workshop on Cryptography,
Burg Feuerstein, Germany, March 29-April 2, 1982.  257-262.

[HEY94]  Heys, H. and S. Tavares.  1994.  On the security of the
CAST encryption algorithm.  Canadian Conference on Electrical and
Computer Engineering.  Halifax, Nova Scotia, Canada, Sept. 1994.
332-335.

[HEY95]  Heys, H. and S. Tavares.  1995.  Known plaintext
cryptanalysis of tree-structured block ciphers.  Electronics
Letters.  31(10): 784-785.

[MEI89]  Meier, W. and O. Staffelbach.  1989.  Nonlinearity Criteria
for Cryptographic Functions.  Advances in Cryptology --
Eurocrypt '89.  549-562.

[MIR97]  Mirza, F.  1997.  Linear and S-Box Pairs Cryptanalysis
of the Data Encryption Standard.

[OC91]  O'Connor, L.  1991.  Enumerating nondegenerate permutations.
Advances in Cryptology -- Eurocrypt '91.  368-377.

[OC93]  O'Connor, L.  1993.  On the Distribution Characteristics in
Bijective Mappings.  Advances in Cryptology -- EUROCRYPT '93.
360-370.

[PIE88]  Pieprzyk, J. and G. Finkelstein. 1988.  Towards effective
nonlinear cryptosystem design.  IEE Proceedings, Part E.
135(6): 325-335.

[PIE89]  Pieprzyk, J. and G. Finkelstein.  1989.  Permutations
that Maximize Non-Linearity and Their Cryptographic Significance.
Computer Security in the Age of Information.  63-74.

[PIE89B]  Pieprzyk, J.  1989.  Non-linearity of Exponent Permutations.
Advances in Cryptology -- EUROCRYPT '89. 80-92.

[PIE93]  Pieprzyk, J., C. Charnes and J. Seberry.  1993.  Linear
Approximation Versus Nonlinearity.  Proceedings of the Workshop on
Selected Areas in Cryptography (SAC '94).  82-89.

[PRE90]  Preneel, B., W. Van Leekwijck, L. Van Linden, R. Govaerts
and J. Vandewalle.  1990.  Propagation Characteristics of Boolean
Functions.  Advances in Cryptology -- Eurocrypt '90.  161-173.

[RUE86]  Rueppel, R.  1986.  Analysis and Design of Stream Ciphers.
Springer-Verlag.

[XIO88]  Xiao, G-Z. and J. Massey.  1988.  A Spectral Characterization
of Correlation-Immune Combining Functions.  IEEE Transactions on
Information Theory.  34(3): 569-571.

[YOU95]  Youssef, A. and S. Tavares.  1995.  Resistance of Balanced
S-boxes to Linear and Differential Cryptanalysis.  Information
Processing Letters.  56: 249-252.

[YOU95B]  Youssef, A. and S. Tavares.  1995.  Number of Nonlinear
Regular S-boxes.  Electronics Letters.  31(19): 1643-1644.

[ZHA95]  Zhang, X. and Y. Zheng.  1995.  GAC -- the Criterion for
Global Avalanche Characteristics of Cryptographic Functions.
Journal for Universal Computer Science.  1(5): 316-333.


---
Terry Ritter   ritter@io.com   http://www.io.com/~ritter/
The Crypto Glossary:  http://www.io.com/~ritter/GLOSSARY.HTM
Also see: Measuring Nonlinearity by Walsh Transform (1998) (20K)
and: Measured Nonlinearity in Mixing Constructions (1997) (25K)
Terry Ritter, his current address, and his top page.
Last updated: 1998-02-26