Nonlinearity is the number of bits which must change in the truth table of a Boolean function to reach the closest affine function, and thus is a measure of one kind of cipher strength. Although we cannot measure the nonlinearity of a cipher of practical size, we can measure modest substitution tables and small block constructions. Since a random substitution table is the ideal model of a block cipher, we can measure many random tables and develop an ideal nonlinearity distribution. We can then measure block constructions of the same size and compare distributions. Experimental results indicate that VSBC constructions can produce nonlinearity distributions which are surprisingly close to the ideal. These experiments tend to contradict the claim that block ciphers which use VSBC constructions are inherently weak.
From: ritter@io.com (Terry Ritter)
Newsgroups: sci.crypt
Subject: Measured Nonlinearity in Variable Size Block Ciphers (LONG!)
Date: Sun, 18 Jan 1998 06:14:36 GMT
Lines: 690
Message-ID: <34c19dc1.27169161@news.io.com>
MEASURED NONLINEARITY IN VARIABLE SIZE BLOCK CIPHERS
Terry Ritter
Ritter Software Engineering
http://www.io.com/~ritter/
1998-01-17
Abstract
Nonlinearity is the number of bits which must change in the truth
table of a Boolean function to reach the closest affine function,
and thus is a measure of one kind of cipher strength. Although
we cannot measure the nonlinearity of a cipher of practical size,
we *can* measure modest substitution tables and small block
constructions. Since a random substitution table is the ideal
model of a block cipher, we can measure many random tables and
develop an ideal nonlinearity distribution. We can then measure
block constructions of the same size and compare distributions.
Experimental results indicate that VSBC constructions can produce
nonlinearity distributions which are surprisingly close to the ideal.
These experiments tend to contradict the claim that block ciphers
which use VSBC constructions are inherently weak.
Introduction
The ideal block cipher is a keyed simple substitution table of
sufficient size. Unfortunately, with 128-bit blocks, there would
be 2**128 entries in that table, which is completely out of the
question. So the modern block cipher is a *construction* intended
to *simulate* a keyed substitution of the desired size. At issue
is the effectiveness of the construction technology. One way to
investigate this is by using Boolean function theory, since a
substitution table -- or cipher -- can be considered a set of
independent Boolean functions, one for each output bit.
A Boolean function produces a single-bit result for each possible
combination of values from perhaps many Boolean variables. The
*nonlinearity* of a Boolean function is the Hamming distance to
the closest affine function [e.g., PIE88, PIE89, PIE89B]. (Also
see "Measuring Nonlinearity by Walsh Transform" by this author:
http://www.io.com/~ritter/ARTS/MEASNONL.HTM
.) That is, nonlinearity is the number of bits which must change
in the truth table of a Boolean function to reach the closest
affine function. If "linearity" is considered a significant
cryptographic weakness, nonlinearity is an explicit measure of the
*lack* of that weakness. So nonlinearity measures one form of
cipher "strength."
For cryptographic purposes, it is desired to take the nonlinearity
of a substitution table to be the *minimum* of the nonlinearity
values for each output bit in that table. Nonlinearity is measured
by forming the (one-bit-wide) truth table for a particular output
bit, then performing a Fast Walsh-Hadamard Transform (FWT) on that
array. Each result value is essentially a correlation count to a
particular affine function, and the minimum distance is found by
scanning the transform results.
In measuring nonlinearity, it is generally necessary to record the
function result for each possible combination of input variables.
If we have an "8-bit" table, we must record and then transform
256 elements, and if we have a "16-bit" table, we must record and
transform 64K elements. Thus, measuring large functions rapidly
becomes impossible. So, although we cannot hope to measure the
nonlinearity of a real 64-bit or 128-bit block cipher, we *can*
measure nonlinearity in substitution tables and small block
constructions.
Ideal Nonlinearity Distributions
By properly shuffling tables using a large-state cryptographic RNG
(in particular:
http://www.io.com/~ritter/KEYSHUF.HTM
) we can sample a presumably uniform distribution of different table
arrangements. We can measure the nonlinearity of each of these
tables, and accumulate the results. For 4-bit tables, we get
something like that shown in Figure 1.
Nonlinearity Distribution in 4-Bit Tables Fig. 1
0.6 |
0.5 | * *
0.4 | * *
0.3 | * *
0.2 | * *
0.1 | * *
0.0 | * * *
Prob +--+--+--+--
0 2 4 Nonlinearity
In balanced Boolean functions, nonlinearity always takes even
values: If we have two balanced functions at some distance from
each other and change one bit in the second function, the distance
between them will change by one bit, and the second function will
also no longer be balanced. To balance the second function, we
have to change *another* bit, which either adds to the original
change, or cancels it out. In the end, the distance between the
functions has changed by +2, -2, or 0 bits.
The 4-bit invertible substitutions are surprisingly weak: Despite
having 16 bits of Boolean function description, no 4-bit table was
found to be more than 4 bits away from an affine function, and
almost half were only 2 bits away. A few of these tables
(probability 0.01) even contained a true affine function.
In contrast, the 8-bit tables have the opportunity to be far more
complex, since there are 256 bits in each 8-bit Boolean function.
This is shown in Figure 2.
Nonlinearity Distribution in 8-Bit Tables Fig. 2
0.35 | *
0.3 | *
0.25 | * *
0.2 | * * *
0.15 | * * *
0.1 | * * * *
0.05 | * * * * *
0.00 | * * * * * * *
Prob +--+--+--+--+--+--+--+--+--
92 96 100 104 Nonlinearity
Some of the literature asserts that *random* S-Boxes will have
various good qualities provided the tables are "large enough"
[AY82, GOR82, OC91, OC93, YOU95B]. This leads to some anxiety
over the question of "How large is 'large enough'?" In our
experiments, measurement of the nonlinearity of 1,000,000 random
8-bit tables found exactly one table with a nonlinearity as low
as 78; this is a 0.000001 probability (1 count in 1E6). So not
only are random 8-bit tables with even a single affine coordinate
(nonlinearity = 0) *unlikely*, they are *so* unlikely as to be
almost impossible to find. (Indeed, [GOR82] gives the probability
as 10**-72.) So finding accidental "linearity" in random 8-bit
tables seems to be an issue only in theory: this is a non-issue
in practice.
Our block cipher construction problem essentially involves using
small random tables (here 4-bit tables with 16 entries each and a
mean nonlinearity of 3.0) to somehow emulate a table which is
three times as wide (here a 12-bit table of 4096 entries and a mean
nonlinearity of 1907.9). Stated in this way, a solution seems quite
unlikely, for how can even 15 tables of nonlinearity 3 ever produce
a nonlinearity of 1908?
Variable Size Block Ciphers
A Variable Size Block Cipher (VSBC) (see, for example:
http://www.io.com/~ritter/CRYPHTML.HTM#VSBCTech
and especially:
http://www.io.com/~ritter/VSBCCORE.HTM
) differs from most other ciphers in its ability to handle blocks
of dynamically arbitrary size to the byte. This is accomplished
with one-way mixing layers (see Figure 3) which guarantee that any
change to any input is reflected in all subsequent mixed bytes.
By using multiple such layers we can guarantee that any single-bit
change in the data will affect all columns of the ciphertext result.
A One-Way Variable-Size Mixing Layer Fig. 3
| | | |
| | | |
+--->MIX-->MIX--> ... -->MIX----+
| | | |
v v v v
Figure 3 shows a single one-way mixing layer: Each element value
(in practice, each element is typically a byte) enters at the top,
and the mixed results flow out the bottom. The typical Variable
Size Block Cipher uses multiple layers of mixing, usually in
opposite directions, to produce high quality overall mixing in a
fixed-depth structure.
VSBC's generally use multiple substitution tables, each of which
is shuffled for primary keying. (See again, for example:
http://www.io.com/~ritter/KEYSHUF.HTM
.) The issue here is whether this sort of structure can produce
higher nonlinearity values than the small substitution tables which
are the only nonlinear components. Obviously, we would like to see
the result approach the ideal nonlinearity distribution for the
larger block.
Figure 4 shows a two-element VSBC construction: The input block
enters at the top, is split into two and each half substituted.
Those results are mixed into two new block values; this is an
unusual invertible mixing which we call Balanced Block Mixing
(see, for example:
http://www.io.com/~ritter/CRYPHTML.HTM#BBMTech
and especially:
http://www.io.com/~ritter/BBM.HTM
). After mixing, the results are substituted again. There is
another mixing, another substitution and the two-element block
flows out the bottom.
A Two-Column Toy VSBC Fig. 4
| |
SUB SUB
| |
+--->MIX----+
| |
SUB SUB
| |
+----MIX<---+
| |
SUB SUB
| |
v v
We can compare the model of Figure 4 to the Mixing structure
investigated previously (see
http://www.io.com/~ritter/ARTS/MIXNONLI.HTM
), as shown in Figure 5, and see that they turn out to be the
same, when reduced to this minimal structure.
A Two-Element Mixing Cipher Fig. 5
| |
SUB SUB
\ /
MIX
/ \
SUB SUB
\ /
MIX
/ \
SUB SUB
| |
The identity between the two-element-width Mixing cipher and the
toy VSBC of similar width means that the results from the earlier
investigation apply here:
1. A 2-element-width construction with 3 mixing levels, using
8 random 4-bit substitution tables, does appear to produce a
nonlinearity distribution consistent with the ideal for an
8-bit table.
2. A 2-element-width construction with 2 mixing levels, using
6 random 5-bit substitution tables, does appear to produce a
nonlinearity distribution consistent with the ideal for a
10-bit table.
On the other hand, the earlier results also make clear the inherent
weakness in 4-bit tables: A 2-element-width construction with just
2 mixing levels, using 6 random 4-bit tables (the structure shown
in Figure 5) was clearly *not* ideal. And in the current
investigation we *must* use 4-bit tables if we are to have a
reasonable amount of desktop computation.
Here we investigate whether a 12-bit VSBC will produce something like
the nonlinearity distribution of the ideal 12-bit cipher. So first
we need to identify the ideal 12-bit nonlinearity distribution.
The 12-Bit Ideal
The ideal 12-bit nonlinearity distribution was constructed from
three different 12-hour trials of 100,000 random 12-bit tables each.
(We really would like to have trials at least 10 times this size,
but that would mean three 5-day runs for me, so the current result
should be thought of as a decent guess.) Each ideal value is near
the average of the three trials, weighted toward any two similar
values, with the counts reduced by a factor of 10 and rounded to
an integer, as shown in Figure 6.
The Ideal 12-Bit Nonlinearity Distribution Fig. 6
NL Trial 1 Trial 2 Trial 3 Ideal
1838 1 0 0 0
1840 0 0 0 0
1842 0 0 0 0
1844 0 0 2 0
1846 3 0 0 0
1848 0 0 2 0
1850 3 0 0 0
1852 2 2 1 0
1854 2 2 2 0
1856 2 4 2 0
1858 2 0 4 1
1860 10 14 12 0
1862 14 10 6 1
1864 16 11 14 2
1866 32 28 17 3
1868 29 28 32 3
1870 27 52 58 5
1872 68 64 66 7
1874 92 84 98 9
1876 119 134 127 12
1878 185 162 197 19
1880 253 247 249 25
1882 323 353 312 32
1884 488 467 507 48
1886 652 643 674 65
1888 926 885 915 91
1890 1152 1185 1207 118
1892 1506 1566 1660 157
1894 2094 2132 2108 211
1896 2762 2772 2751 276
1898 3711 3643 3552 364
1900 4596 4572 4665 460
1902 5614 5753 5624 563
1904 6974 7050 7022 702
1906 8158 8210 8136 816
1908 9395 9489 9313 939
1910 10155 10101 10132 1013
1912 10414 10276 10306 1032
1914 9584 9482 9693 958
1916 8092 8062 8105 809
1918 6020 5915 5866 593
1920 3701 3701 3715 370
1922 1855 1903 1839 186
1924 697 753 766 74
1926 214 201 201 20
1928 48 39 40 4
1930 8 5 3 1
1932 1 0 0 0
The Toy Model
A reasonable first step to investigating VSBC nonlinearity is to
extend the 2-element structure of Figure 4 into the 3-element
structure shown in Figure 7.
A Three-Column Toy Variable-Size Block Cipher Fig. 7
| | |
SUB SUB SUB
| | |
+--->MIX-->MIX----+
| | |
SUB SUB SUB
| | |
+----MIX<--MIX<---+
| | |
SUB SUB SUB
| | |
v v v
Here, for the first time, the one-way characteristic of the VSBC
mixing layers becomes apparent. In the figure, three 4-bit elements
enter the top, are substituted and mixed, and the resulting ciphertext
flows out the bottom. The overall result is a keyed permutation;
a simulated large simple substitution table.
Toy Model Test
In each trial, we first initialize all the substitutions from a
random key. This produces a simulated 12-bit simple substitution
which we then measure for nonlinearity. At first we will perform
just a small trial of 1,000 different 12-bit VSBC initializations,
and this result is shown in Figure 8.
Chi-Square For 12-Bit VSBC Fig. 8
NL Expect Found
1408 0 1
1472 0 2
1536 0 10
1568 0 32
1600 0 52
1632 0 90
1664 0 164
1696 0 216
1712 0 1
1728 0 215
1736 0 1
1756 0 1
1760 0 150
1788 0 1
1792 0 56
1824 0 8
The figure shows the results of measuring 1,000 randomly-keyed
constructions, and *not* *one* of these values is as high as the
*minimum* value (typically 1850) we might expect from the ideal
12-bit distribution. These results are so far from the expectation
as to scarcely require computation: a computed chi-square total
would be something like 100,000. Thus, the toy VSBC model of
Figure 7 does *not* even *begin* to approach the distribution we
want from an ideal cipher.
The Full Model
It is known from previous work that a two-level VSBC mixing is weak,
so the next step is to extend the structure of Figure 7 to have
2 one-way mixing levels in each direction, as shown in Figure 9.
Again, we must use the weak 4-bit tables to make the computation
reasonable.
A Three-Column Variable-Size Block Cipher Fig. 9
| | |
SUB SUB SUB
| | |
+--->MIX-->MIX----+
| | |
SUB SUB SUB
| | |
+--->MIX-->MIX----+
| | |
SUB SUB SUB
| | |
+----MIX<--MIX<---+
| | |
SUB SUB SUB
| | |
+----MIX<--MIX<---+
| | |
SUB SUB SUB
| | |
v v v
As before, three 4-bit elements enter the top, are substituted and
mixed multiple times, and the resulting ciphertext flows out the
bottom.
(In practice, a working design would use the vastly stronger 8-bit
tables, and would also use dynamic table selection, so that the same
tables would only rarely occur in the same ciphering position. But
neither of those improvements are used in these tests.)
Full Model Tests
In each trial, we first initialize all the substitutions from a
random key. This produces a particular 12-bit cipher -- a simulated
12-bit simple substitution. We then traverse the complete
transformation -- all 4096 possible data values -- and collect all
4096 12-bit ciphertext results. We then traverse all 12 ciphertext
bit-columns, one-by-one, and perform a nonlinearity computation.
The minimum nonlinearity value found across all columns is then
accumulated into the growing distribution. Here we will perform
full trials of 10,000 different 12-bit VSBC initializations each,
with one such trial shown in Figure 10.
Chi-Square For 12-Bit VSBC Fig. 10
NL Expect Found Chi-Sq DF
1850 0 0 }
1852 0 2 }
1854 0 0 }
1856 0 0 }
1858 1 2 }
1860 0 0 }
1862 1 0 }
1864 2 1 }
1866 3 4 }
1868 3 2 } 0.100 0
1870 5 7 }
1872 7 5 } 0.100 1
1874 9 5 }
1876 12 16 } 0.100 2
1878 19 28 4.363 3
1880 25 25 4.363 4
1882 32 46 10.488 5
1884 48 42 11.238 6
1886 65 85 17.392 7
1888 91 77 19.546 8
1890 118 126 20.088 9
1892 157 153 20.190 10
1894 211 214 20.233 11
1896 276 263 20.845 12
1898 364 346 21.735 13
1900 460 453 21.842 14
1902 563 576 22.142 15
1904 702 719 22.554 16
1906 816 813 22.565 17
1908 939 947 22.633 18
1910 1013 1030 22.918 19
1912 1032 973 26.291 20
1914 958 967 26.376 21
1916 809 816 26.436 22
1918 593 588 26.478 23
1920 370 362 26.651 24
1922 186 199 27.560 25
1924 74 81 28.222 26
1926 20 24 } 28.382 27
1928 4 3 }
1930 1 0 }
1932 0 0 }
1934 0 0 }
1936 0 0 }
1938 0 0 }
With 27 "degrees of freedom," the chi-square value of 28.382
found in Figure 8 is very believable. The critical chi-square
values for DF = 27 are shown in Figure 11.
Chi-Square Critical Values for DegFree = 27 Fig. 11
1% 5% 25% 50% 75% 95% 99%
12.878 16.151 21.749 26.336 31.528 40.113 46.962
Additional trials were performed (at 1.4 hours each), producing
the overall chi-square results of Figure 12.
Chi-Square Values for 10,000-Cipher Trials Fig. 12
11.362 13.933 17.418 17.822 19.520
20.052 20.954 20.992 21.024 21.084
21.415 21.936 22.629 22.913 24.171
24.224 24.291 25.306 26.263 26.781
27.754 28.382 28.619 29.252 30.271
31.369 31.388 32.555 33.015 33.367
33.990 34.883 35.674 36.171 37.987
39.853 39.980 43.197 45.514
Most of the collected values seem reasonable.
One value *is* unexpectedly low, meaning that the sample distribution
was unexpectedly *close* to the ideal. The actual probability of
finding a chi-square value of 11.367 (by random sampling the ideal
distribution) with 27 degrees of freedom is 0.0036. This is about
4 in 1000, so we would *expect* to get something like this if we
would just conduct 278 trials. But random trials do not know how
many trials have passed before, and the fact is that the value
occurred early. There is no preponderance of such occurrences.
Also, in this data set, the tails seem to be represented more
frequently than the center.
Despite these variations, all but one of the values seem at least
reasonable, and -- as we saw previously -- chi-square values in this
particular range do not repeatedly occur by chance alone. We are
thus forced to conclude that the sampled distribution is reasonably
close to the constructed ideal distribution for random 12-bit
substitutions.
Conclusions
From these results we conclude that the 3-element VSBC construction
using 15 random 4-bit tables does in fact produce nonlinearity levels
and distributions *remarkably* *similar* to those of an ideal 12-bit
cipher. These experiments thus support the VSBC approach to block
cipher design, and do *not* support the claim that such ciphers are
inherently weak.
Note that the ability to perform these experiments is based on the
"scalability" of the cipher design. This is an important advantage
that very few designs offer: Scalability supports far deeper
experimental analysis than is possible in a full-size cipher.
I now believe that the inability of a cipher to support the deep
analysis available to a scaled design is nothing less than a flaw
in the cipher design itself. There is no theory of cipher strength
such that, if we only follow those rules, we will build a "strong"
cipher. Nor can we hope to thoroughly review any large cipher
experimentally. It would seem that the only way we can hope to
find completely unexpected strength problems is to *scale* the
cipher to a size which we *can* address experimentally. Since
large ciphers cannot be thoroughly reviewed, they also cannot be
fully trusted.
References and Bibliography
[AY82] Ayoub, F. 1982. Probabilistic completeness of
substitution-permutation encryption networks. IEE Proceedings,
Part E. 129(5): 195-199.
[DAE94] Daemen, J., R. Govaerts and J. Vandewalle. 1994.
Correlation Matrices. Fast Software Encryption. 275-285.
[FOR88] Forre, R. 1988. The Strict Avalanche Criterion:
Spectral Properties of Boolean Functions and an Extended
Definition. Advances in Cryptology -- CRYPTO '88. 450-468.
[GOR82] Gordon, J. and H. Retkin. 1982. Are Big S-Boxes Best?
Cryptography. Proceedings of the Workshop on Cryptography,
Burg Feuerstein, Germany, March 29-April 2, 1982. 257-262.
[HEY94] Heys, H. and S. Tavares. 1994. On the security of the
CAST encryption algorithm. Canadian Conference on Electrical and
Computer Engineering. Halifax, Nova Scotia, Canada, Sept. 1994.
332-335.
[HEY95] Heys, H. and S. Tavares. 1995. Known plaintext
cryptanalysis of tree-structured block ciphers. Electronics
Letters. 31(10): 784-785.
[MEI89] Meier, W. and O. Staffelbach. 1989. Nonlinearity Criteria
for Cryptographic Functions. Advances in Cryptology --
Eurocrypt '89. 549-562.
[MIR97] Mirza, F. 1997. Linear and S-Box Pairs Cryptanalysis
of the Data Encryption Standard.
[OC91] O'Connor, L. 1991. Enumerating nondegenerate permutations.
Advances in Cryptology -- Eurocrypt '91. 368-377.
[OC93] O'Connor, L. 1993. On the Distribution Characteristics in
Bijective Mappings. Advances in Cryptology -- EUROCRYPT '93.
360-370.
[PIE88] Pieprzyk, J. and G. Finkelstein. 1988. Towards effective
nonlinear cryptosystem design. IEE Proceedings, Part E.
135(6): 325-335.
[PIE89] Pieprzyk, J. and G. Finkelstein. 1989. Permutations
that Maximize Non-Linearity and Their Cryptographic Significance.
Computer Security in the Age of Information. 63-74.
[PIE89B] Pieprzyk, J. 1989. Non-linearity of Exponent Permutations.
Advances in Cryptology -- EUROCRYPT '89. 80-92.
[PIE93] Pieprzyk, J., C. Charnes and J. Seberry. 1993. Linear
Approximation Versus Nonlinearity. Proceedings of the Workshop on
Selected Areas in Cryptography (SAC '94). 82-89.
[PRE90] Preneel, B., W. Van Leekwijck, L. Van Linden, R. Govaerts
and J. Vandewalle. 1990. Propagation Characteristics of Boolean
Functions. Advances in Cryptology -- Eurocrypt '90. 161-173.
[RUE86] Rueppel, R. 1986. Analysis and Design of Stream Ciphers.
Springer-Verlag.
[XIO88] Xiao, G-Z. and J. Massey. 1988. A Spectral Characterization
of Correlation-Immune Combining Functions. IEEE Transactions on
Information Theory. 34(3): 569-571.
[YOU95] Youssef, A. and S. Tavares. 1995. Resistance of Balanced
S-boxes to Linear and Differential Cryptanalysis. Information
Processing Letters. 56: 249-252.
[YOU95B] Youssef, A. and S. Tavares. 1995. Number of Nonlinear
Regular S-boxes. Electronics Letters. 31(19): 1643-1644.
[ZHA95] Zhang, X. and Y. Zheng. 1995. GAC -- the Criterion for
Global Avalanche Characteristics of Cryptographic Functions.
Journal for Universal Computer Science. 1(5): 316-333.
---
Terry Ritter ritter@io.com http://www.io.com/~ritter/
The Crypto Glossary: http://www.io.com/~ritter/GLOSSARY.HTM
Terry Ritter, his current address, and his top page.
Last updated: 1998-02-26