Nonlinearity is the number of bits which must change in the truth table of a Boolean function to reach the closest affine function, and thus is a measure of one kind of cipher strength. Although we cannot measure the nonlinearity of a cipher of practical size, we can measure modest substitution tables and small block constructions. Since a random substitution table is the ideal model of a block cipher, we can measure many random tables and develop an ideal nonlinearity distribution. We can then measure block constructions of the same size and compare distributions. Experimental results indicate that VSBC constructions can produce nonlinearity distributions which are surprisingly close to the ideal. These experiments tend to contradict the claim that block ciphers which use VSBC constructions are inherently weak.
From: ritter@io.com (Terry Ritter) Newsgroups: sci.crypt Subject: Measured Nonlinearity in Variable Size Block Ciphers (LONG!) Date: Sun, 18 Jan 1998 06:14:36 GMT Lines: 690 Message-ID: <34c19dc1.27169161@news.io.com> MEASURED NONLINEARITY IN VARIABLE SIZE BLOCK CIPHERS Terry Ritter Ritter Software Engineering http://www.io.com/~ritter/ 1998-01-17 Abstract Nonlinearity is the number of bits which must change in the truth table of a Boolean function to reach the closest affine function, and thus is a measure of one kind of cipher strength. Although we cannot measure the nonlinearity of a cipher of practical size, we *can* measure modest substitution tables and small block constructions. Since a random substitution table is the ideal model of a block cipher, we can measure many random tables and develop an ideal nonlinearity distribution. We can then measure block constructions of the same size and compare distributions. Experimental results indicate that VSBC constructions can produce nonlinearity distributions which are surprisingly close to the ideal. These experiments tend to contradict the claim that block ciphers which use VSBC constructions are inherently weak. Introduction The ideal block cipher is a keyed simple substitution table of sufficient size. Unfortunately, with 128-bit blocks, there would be 2**128 entries in that table, which is completely out of the question. So the modern block cipher is a *construction* intended to *simulate* a keyed substitution of the desired size. At issue is the effectiveness of the construction technology. One way to investigate this is by using Boolean function theory, since a substitution table -- or cipher -- can be considered a set of independent Boolean functions, one for each output bit. A Boolean function produces a single-bit result for each possible combination of values from perhaps many Boolean variables. The *nonlinearity* of a Boolean function is the Hamming distance to the closest affine function [e.g., PIE88, PIE89, PIE89B]. (Also see "Measuring Nonlinearity by Walsh Transform" by this author: http://www.io.com/~ritter/ARTS/MEASNONL.HTM .) That is, nonlinearity is the number of bits which must change in the truth table of a Boolean function to reach the closest affine function. If "linearity" is considered a significant cryptographic weakness, nonlinearity is an explicit measure of the *lack* of that weakness. So nonlinearity measures one form of cipher "strength." For cryptographic purposes, it is desired to take the nonlinearity of a substitution table to be the *minimum* of the nonlinearity values for each output bit in that table. Nonlinearity is measured by forming the (one-bit-wide) truth table for a particular output bit, then performing a Fast Walsh-Hadamard Transform (FWT) on that array. Each result value is essentially a correlation count to a particular affine function, and the minimum distance is found by scanning the transform results. In measuring nonlinearity, it is generally necessary to record the function result for each possible combination of input variables. If we have an "8-bit" table, we must record and then transform 256 elements, and if we have a "16-bit" table, we must record and transform 64K elements. Thus, measuring large functions rapidly becomes impossible. So, although we cannot hope to measure the nonlinearity of a real 64-bit or 128-bit block cipher, we *can* measure nonlinearity in substitution tables and small block constructions. Ideal Nonlinearity Distributions By properly shuffling tables using a large-state cryptographic RNG (in particular: http://www.io.com/~ritter/KEYSHUF.HTM ) we can sample a presumably uniform distribution of different table arrangements. We can measure the nonlinearity of each of these tables, and accumulate the results. For 4-bit tables, we get something like that shown in Figure 1. Nonlinearity Distribution in 4-Bit Tables Fig. 1 0.6 | 0.5 | * * 0.4 | * * 0.3 | * * 0.2 | * * 0.1 | * * 0.0 | * * * Prob +--+--+--+-- 0 2 4 Nonlinearity In balanced Boolean functions, nonlinearity always takes even values: If we have two balanced functions at some distance from each other and change one bit in the second function, the distance between them will change by one bit, and the second function will also no longer be balanced. To balance the second function, we have to change *another* bit, which either adds to the original change, or cancels it out. In the end, the distance between the functions has changed by +2, -2, or 0 bits. The 4-bit invertible substitutions are surprisingly weak: Despite having 16 bits of Boolean function description, no 4-bit table was found to be more than 4 bits away from an affine function, and almost half were only 2 bits away. A few of these tables (probability 0.01) even contained a true affine function. In contrast, the 8-bit tables have the opportunity to be far more complex, since there are 256 bits in each 8-bit Boolean function. This is shown in Figure 2. Nonlinearity Distribution in 8-Bit Tables Fig. 2 0.35 | * 0.3 | * 0.25 | * * 0.2 | * * * 0.15 | * * * 0.1 | * * * * 0.05 | * * * * * 0.00 | * * * * * * * Prob +--+--+--+--+--+--+--+--+-- 92 96 100 104 Nonlinearity Some of the literature asserts that *random* S-Boxes will have various good qualities provided the tables are "large enough" [AY82, GOR82, OC91, OC93, YOU95B]. This leads to some anxiety over the question of "How large is 'large enough'?" In our experiments, measurement of the nonlinearity of 1,000,000 random 8-bit tables found exactly one table with a nonlinearity as low as 78; this is a 0.000001 probability (1 count in 1E6). So not only are random 8-bit tables with even a single affine coordinate (nonlinearity = 0) *unlikely*, they are *so* unlikely as to be almost impossible to find. (Indeed, [GOR82] gives the probability as 10**-72.) So finding accidental "linearity" in random 8-bit tables seems to be an issue only in theory: this is a non-issue in practice. Our block cipher construction problem essentially involves using small random tables (here 4-bit tables with 16 entries each and a mean nonlinearity of 3.0) to somehow emulate a table which is three times as wide (here a 12-bit table of 4096 entries and a mean nonlinearity of 1907.9). Stated in this way, a solution seems quite unlikely, for how can even 15 tables of nonlinearity 3 ever produce a nonlinearity of 1908? Variable Size Block Ciphers A Variable Size Block Cipher (VSBC) (see, for example: http://www.io.com/~ritter/CRYPHTML.HTM#VSBCTech and especially: http://www.io.com/~ritter/VSBCCORE.HTM ) differs from most other ciphers in its ability to handle blocks of dynamically arbitrary size to the byte. This is accomplished with one-way mixing layers (see Figure 3) which guarantee that any change to any input is reflected in all subsequent mixed bytes. By using multiple such layers we can guarantee that any single-bit change in the data will affect all columns of the ciphertext result. A One-Way Variable-Size Mixing Layer Fig. 3 | | | | | | | | +--->MIX-->MIX--> ... -->MIX----+ | | | | v v v v Figure 3 shows a single one-way mixing layer: Each element value (in practice, each element is typically a byte) enters at the top, and the mixed results flow out the bottom. The typical Variable Size Block Cipher uses multiple layers of mixing, usually in opposite directions, to produce high quality overall mixing in a fixed-depth structure. VSBC's generally use multiple substitution tables, each of which is shuffled for primary keying. (See again, for example: http://www.io.com/~ritter/KEYSHUF.HTM .) The issue here is whether this sort of structure can produce higher nonlinearity values than the small substitution tables which are the only nonlinear components. Obviously, we would like to see the result approach the ideal nonlinearity distribution for the larger block. Figure 4 shows a two-element VSBC construction: The input block enters at the top, is split into two and each half substituted. Those results are mixed into two new block values; this is an unusual invertible mixing which we call Balanced Block Mixing (see, for example: http://www.io.com/~ritter/CRYPHTML.HTM#BBMTech and especially: http://www.io.com/~ritter/BBM.HTM ). After mixing, the results are substituted again. There is another mixing, another substitution and the two-element block flows out the bottom. A Two-Column Toy VSBC Fig. 4 | | SUB SUB | | +--->MIX----+ | | SUB SUB | | +----MIX<---+ | | SUB SUB | | v v We can compare the model of Figure 4 to the Mixing structure investigated previously (see http://www.io.com/~ritter/ARTS/MIXNONLI.HTM ), as shown in Figure 5, and see that they turn out to be the same, when reduced to this minimal structure. A Two-Element Mixing Cipher Fig. 5 | | SUB SUB \ / MIX / \ SUB SUB \ / MIX / \ SUB SUB | | The identity between the two-element-width Mixing cipher and the toy VSBC of similar width means that the results from the earlier investigation apply here: 1. A 2-element-width construction with 3 mixing levels, using 8 random 4-bit substitution tables, does appear to produce a nonlinearity distribution consistent with the ideal for an 8-bit table. 2. A 2-element-width construction with 2 mixing levels, using 6 random 5-bit substitution tables, does appear to produce a nonlinearity distribution consistent with the ideal for a 10-bit table. On the other hand, the earlier results also make clear the inherent weakness in 4-bit tables: A 2-element-width construction with just 2 mixing levels, using 6 random 4-bit tables (the structure shown in Figure 5) was clearly *not* ideal. And in the current investigation we *must* use 4-bit tables if we are to have a reasonable amount of desktop computation. Here we investigate whether a 12-bit VSBC will produce something like the nonlinearity distribution of the ideal 12-bit cipher. So first we need to identify the ideal 12-bit nonlinearity distribution. The 12-Bit Ideal The ideal 12-bit nonlinearity distribution was constructed from three different 12-hour trials of 100,000 random 12-bit tables each. (We really would like to have trials at least 10 times this size, but that would mean three 5-day runs for me, so the current result should be thought of as a decent guess.) Each ideal value is near the average of the three trials, weighted toward any two similar values, with the counts reduced by a factor of 10 and rounded to an integer, as shown in Figure 6. The Ideal 12-Bit Nonlinearity Distribution Fig. 6 NL Trial 1 Trial 2 Trial 3 Ideal 1838 1 0 0 0 1840 0 0 0 0 1842 0 0 0 0 1844 0 0 2 0 1846 3 0 0 0 1848 0 0 2 0 1850 3 0 0 0 1852 2 2 1 0 1854 2 2 2 0 1856 2 4 2 0 1858 2 0 4 1 1860 10 14 12 0 1862 14 10 6 1 1864 16 11 14 2 1866 32 28 17 3 1868 29 28 32 3 1870 27 52 58 5 1872 68 64 66 7 1874 92 84 98 9 1876 119 134 127 12 1878 185 162 197 19 1880 253 247 249 25 1882 323 353 312 32 1884 488 467 507 48 1886 652 643 674 65 1888 926 885 915 91 1890 1152 1185 1207 118 1892 1506 1566 1660 157 1894 2094 2132 2108 211 1896 2762 2772 2751 276 1898 3711 3643 3552 364 1900 4596 4572 4665 460 1902 5614 5753 5624 563 1904 6974 7050 7022 702 1906 8158 8210 8136 816 1908 9395 9489 9313 939 1910 10155 10101 10132 1013 1912 10414 10276 10306 1032 1914 9584 9482 9693 958 1916 8092 8062 8105 809 1918 6020 5915 5866 593 1920 3701 3701 3715 370 1922 1855 1903 1839 186 1924 697 753 766 74 1926 214 201 201 20 1928 48 39 40 4 1930 8 5 3 1 1932 1 0 0 0 The Toy Model A reasonable first step to investigating VSBC nonlinearity is to extend the 2-element structure of Figure 4 into the 3-element structure shown in Figure 7. A Three-Column Toy Variable-Size Block Cipher Fig. 7 | | | SUB SUB SUB | | | +--->MIX-->MIX----+ | | | SUB SUB SUB | | | +----MIX<--MIX<---+ | | | SUB SUB SUB | | | v v v Here, for the first time, the one-way characteristic of the VSBC mixing layers becomes apparent. In the figure, three 4-bit elements enter the top, are substituted and mixed, and the resulting ciphertext flows out the bottom. The overall result is a keyed permutation; a simulated large simple substitution table. Toy Model Test In each trial, we first initialize all the substitutions from a random key. This produces a simulated 12-bit simple substitution which we then measure for nonlinearity. At first we will perform just a small trial of 1,000 different 12-bit VSBC initializations, and this result is shown in Figure 8. Chi-Square For 12-Bit VSBC Fig. 8 NL Expect Found 1408 0 1 1472 0 2 1536 0 10 1568 0 32 1600 0 52 1632 0 90 1664 0 164 1696 0 216 1712 0 1 1728 0 215 1736 0 1 1756 0 1 1760 0 150 1788 0 1 1792 0 56 1824 0 8 The figure shows the results of measuring 1,000 randomly-keyed constructions, and *not* *one* of these values is as high as the *minimum* value (typically 1850) we might expect from the ideal 12-bit distribution. These results are so far from the expectation as to scarcely require computation: a computed chi-square total would be something like 100,000. Thus, the toy VSBC model of Figure 7 does *not* even *begin* to approach the distribution we want from an ideal cipher. The Full Model It is known from previous work that a two-level VSBC mixing is weak, so the next step is to extend the structure of Figure 7 to have 2 one-way mixing levels in each direction, as shown in Figure 9. Again, we must use the weak 4-bit tables to make the computation reasonable. A Three-Column Variable-Size Block Cipher Fig. 9 | | | SUB SUB SUB | | | +--->MIX-->MIX----+ | | | SUB SUB SUB | | | +--->MIX-->MIX----+ | | | SUB SUB SUB | | | +----MIX<--MIX<---+ | | | SUB SUB SUB | | | +----MIX<--MIX<---+ | | | SUB SUB SUB | | | v v v As before, three 4-bit elements enter the top, are substituted and mixed multiple times, and the resulting ciphertext flows out the bottom. (In practice, a working design would use the vastly stronger 8-bit tables, and would also use dynamic table selection, so that the same tables would only rarely occur in the same ciphering position. But neither of those improvements are used in these tests.) Full Model Tests In each trial, we first initialize all the substitutions from a random key. This produces a particular 12-bit cipher -- a simulated 12-bit simple substitution. We then traverse the complete transformation -- all 4096 possible data values -- and collect all 4096 12-bit ciphertext results. We then traverse all 12 ciphertext bit-columns, one-by-one, and perform a nonlinearity computation. The minimum nonlinearity value found across all columns is then accumulated into the growing distribution. Here we will perform full trials of 10,000 different 12-bit VSBC initializations each, with one such trial shown in Figure 10. Chi-Square For 12-Bit VSBC Fig. 10 NL Expect Found Chi-Sq DF 1850 0 0 } 1852 0 2 } 1854 0 0 } 1856 0 0 } 1858 1 2 } 1860 0 0 } 1862 1 0 } 1864 2 1 } 1866 3 4 } 1868 3 2 } 0.100 0 1870 5 7 } 1872 7 5 } 0.100 1 1874 9 5 } 1876 12 16 } 0.100 2 1878 19 28 4.363 3 1880 25 25 4.363 4 1882 32 46 10.488 5 1884 48 42 11.238 6 1886 65 85 17.392 7 1888 91 77 19.546 8 1890 118 126 20.088 9 1892 157 153 20.190 10 1894 211 214 20.233 11 1896 276 263 20.845 12 1898 364 346 21.735 13 1900 460 453 21.842 14 1902 563 576 22.142 15 1904 702 719 22.554 16 1906 816 813 22.565 17 1908 939 947 22.633 18 1910 1013 1030 22.918 19 1912 1032 973 26.291 20 1914 958 967 26.376 21 1916 809 816 26.436 22 1918 593 588 26.478 23 1920 370 362 26.651 24 1922 186 199 27.560 25 1924 74 81 28.222 26 1926 20 24 } 28.382 27 1928 4 3 } 1930 1 0 } 1932 0 0 } 1934 0 0 } 1936 0 0 } 1938 0 0 } With 27 "degrees of freedom," the chi-square value of 28.382 found in Figure 8 is very believable. The critical chi-square values for DF = 27 are shown in Figure 11. Chi-Square Critical Values for DegFree = 27 Fig. 11 1% 5% 25% 50% 75% 95% 99% 12.878 16.151 21.749 26.336 31.528 40.113 46.962 Additional trials were performed (at 1.4 hours each), producing the overall chi-square results of Figure 12. Chi-Square Values for 10,000-Cipher Trials Fig. 12 11.362 13.933 17.418 17.822 19.520 20.052 20.954 20.992 21.024 21.084 21.415 21.936 22.629 22.913 24.171 24.224 24.291 25.306 26.263 26.781 27.754 28.382 28.619 29.252 30.271 31.369 31.388 32.555 33.015 33.367 33.990 34.883 35.674 36.171 37.987 39.853 39.980 43.197 45.514 Most of the collected values seem reasonable. One value *is* unexpectedly low, meaning that the sample distribution was unexpectedly *close* to the ideal. The actual probability of finding a chi-square value of 11.367 (by random sampling the ideal distribution) with 27 degrees of freedom is 0.0036. This is about 4 in 1000, so we would *expect* to get something like this if we would just conduct 278 trials. But random trials do not know how many trials have passed before, and the fact is that the value occurred early. There is no preponderance of such occurrences. Also, in this data set, the tails seem to be represented more frequently than the center. Despite these variations, all but one of the values seem at least reasonable, and -- as we saw previously -- chi-square values in this particular range do not repeatedly occur by chance alone. We are thus forced to conclude that the sampled distribution is reasonably close to the constructed ideal distribution for random 12-bit substitutions. Conclusions From these results we conclude that the 3-element VSBC construction using 15 random 4-bit tables does in fact produce nonlinearity levels and distributions *remarkably* *similar* to those of an ideal 12-bit cipher. These experiments thus support the VSBC approach to block cipher design, and do *not* support the claim that such ciphers are inherently weak. Note that the ability to perform these experiments is based on the "scalability" of the cipher design. This is an important advantage that very few designs offer: Scalability supports far deeper experimental analysis than is possible in a full-size cipher. I now believe that the inability of a cipher to support the deep analysis available to a scaled design is nothing less than a flaw in the cipher design itself. There is no theory of cipher strength such that, if we only follow those rules, we will build a "strong" cipher. Nor can we hope to thoroughly review any large cipher experimentally. It would seem that the only way we can hope to find completely unexpected strength problems is to *scale* the cipher to a size which we *can* address experimentally. Since large ciphers cannot be thoroughly reviewed, they also cannot be fully trusted. References and Bibliography [AY82] Ayoub, F. 1982. Probabilistic completeness of substitution-permutation encryption networks. IEE Proceedings, Part E. 129(5): 195-199. [DAE94] Daemen, J., R. Govaerts and J. Vandewalle. 1994. Correlation Matrices. Fast Software Encryption. 275-285. [FOR88] Forre, R. 1988. The Strict Avalanche Criterion: Spectral Properties of Boolean Functions and an Extended Definition. Advances in Cryptology -- CRYPTO '88. 450-468. [GOR82] Gordon, J. and H. Retkin. 1982. Are Big S-Boxes Best? Cryptography. Proceedings of the Workshop on Cryptography, Burg Feuerstein, Germany, March 29-April 2, 1982. 257-262. [HEY94] Heys, H. and S. Tavares. 1994. On the security of the CAST encryption algorithm. Canadian Conference on Electrical and Computer Engineering. Halifax, Nova Scotia, Canada, Sept. 1994. 332-335. [HEY95] Heys, H. and S. Tavares. 1995. Known plaintext cryptanalysis of tree-structured block ciphers. Electronics Letters. 31(10): 784-785. [MEI89] Meier, W. and O. Staffelbach. 1989. Nonlinearity Criteria for Cryptographic Functions. Advances in Cryptology -- Eurocrypt '89. 549-562. [MIR97] Mirza, F. 1997. Linear and S-Box Pairs Cryptanalysis of the Data Encryption Standard. [OC91] O'Connor, L. 1991. Enumerating nondegenerate permutations. Advances in Cryptology -- Eurocrypt '91. 368-377. [OC93] O'Connor, L. 1993. On the Distribution Characteristics in Bijective Mappings. Advances in Cryptology -- EUROCRYPT '93. 360-370. [PIE88] Pieprzyk, J. and G. Finkelstein. 1988. Towards effective nonlinear cryptosystem design. IEE Proceedings, Part E. 135(6): 325-335. [PIE89] Pieprzyk, J. and G. Finkelstein. 1989. Permutations that Maximize Non-Linearity and Their Cryptographic Significance. Computer Security in the Age of Information. 63-74. [PIE89B] Pieprzyk, J. 1989. Non-linearity of Exponent Permutations. Advances in Cryptology -- EUROCRYPT '89. 80-92. [PIE93] Pieprzyk, J., C. Charnes and J. Seberry. 1993. Linear Approximation Versus Nonlinearity. Proceedings of the Workshop on Selected Areas in Cryptography (SAC '94). 82-89. [PRE90] Preneel, B., W. Van Leekwijck, L. Van Linden, R. Govaerts and J. Vandewalle. 1990. Propagation Characteristics of Boolean Functions. Advances in Cryptology -- Eurocrypt '90. 161-173. [RUE86] Rueppel, R. 1986. Analysis and Design of Stream Ciphers. Springer-Verlag. [XIO88] Xiao, G-Z. and J. Massey. 1988. A Spectral Characterization of Correlation-Immune Combining Functions. IEEE Transactions on Information Theory. 34(3): 569-571. [YOU95] Youssef, A. and S. Tavares. 1995. Resistance of Balanced S-boxes to Linear and Differential Cryptanalysis. Information Processing Letters. 56: 249-252. [YOU95B] Youssef, A. and S. Tavares. 1995. Number of Nonlinear Regular S-boxes. Electronics Letters. 31(19): 1643-1644. [ZHA95] Zhang, X. and Y. Zheng. 1995. GAC -- the Criterion for Global Avalanche Characteristics of Cryptographic Functions. Journal for Universal Computer Science. 1(5): 316-333. --- Terry Ritter ritter@io.com http://www.io.com/~ritter/ The Crypto Glossary: http://www.io.com/~ritter/GLOSSARY.HTM
Terry Ritter, his current address, and his top page.
Last updated: 1998-02-26