United States Patent 5,623,549; Part C

Balanced Block Mixers

All of the mixers referred to above are balanced block mixers which map multiple input code values to the same number of output code values.

Throughout this application, including the claims, the term "balanced block mixer" is intended to include all possible implementations (such as electronic hardware, computer implemented software, hybrids, or the like) or the process performed. Thus a balanced block mixer refers to either a process for balanced block mixing or to any implementation (for example, hardware, computer implemented software, hybrid, or the like) for performing balanced block mixing.

[Figs. 9(a) and 9(b)]

FIGURES 9(a) and 9(b) show the operations performed by balanced block mixers 174 and 176 in terms of their four component combiners 178-184. In the mechanisms of FIGURES 9(a) and 9(b), balanced block mixer 174 consists of combiners 178 and 180, and balanced block mixer 176 consists of combiners 182 and 184. A 2n-bit input message is split into two n-bit data blocks A and B which are input to balanced block mixer 174. Within the balanced block mixer 174, data blocks A and B are input to combiners 178 and 180. The two combiners 178 and 180 produce two n-bit output data blocks X and Y. Data blocks X and Y may then be further encrypted prior to transmission or use. FIGURE 9(b) illustrates apparatus for recreating input message I from data blocks X and Y. Upon receipt of the data blocks X and Y (perhaps as a single 2n-bit data block), balanced block mixer 176 (the inverse of balanced block mixer 174) takes two n-bit input data blocks X and Y and uses two combiners 182 and 184 to produce two n-bit data blocks A and B which are joined to reproduce the original input message.

Desirably, a balanced block mixer has the following properties:

The mapping is one-to-one and invertible. That is, every possible input message to the mixer produces a different output, and every possible output from the mixer is produced by a different input message to the mixer.
Each output of a mixer is a function of all inputs to the mixer. That is, treating the inputs and outputs to the mixer as values (e.g., binary values), every output value is a function of all input values. With reference to FIGURES 9(a) and 9(b), output blocks X and Y from balanced block mixer 174 are each functions of both input blocks A and B. That is, the outputs (X and Y) of balanced block mixer 174 are both a function of the entire input message I. Similarly, output blocks (A and B) of balanced block mixer 176 are both functions of both input blocks (X and Y).
Changes Propagate to All Outputs. Any change to any one of the input values to the mixer will change all of the output code values of the mixer. For example, with reference to FIGURES 9(a) and 9(b), a change to any value in input data block A to balanced block mixer 174 will cause a change in both output data blocks X and Y. Similarly, a change to any value in input data block B will cause a change in both output data blocks.
Balance and Input Independence. Stepping any input to the mixer through all possible values (while keeping the other inputs fixed) will step every output of the mixer through all possible values. For example, with reference to FIGURES 9(a) and 9(b), keeping the value of input B fixed and changing the value of input A through all possible values will cause outputs X and Y to each step though all possible values.

In the preferred embodiments, balanced block mixers (performing block mixing transforms) are size preserving. That is, the size of the output blocks is the same as the size of the input blocks. With reference to FIGURES 9(a) and 9(b), the sizes of data blocks X and Y are the same as the sizes of data blocks A and B. Thus, for example, if the input message I is 128 bits and A and B are 64-bit data blocks then X and Y are also 64-bit data blocks.

Creating and Generating Balanced Block Mixers

Balanced block mixers with the above-described properties can be obtained using orthogonal Latin squares. Essentially a balanced block mixer is implemented as two orthogonal Latin squares. Because the Latin squares are orthogonal, their accumulated output is reversible. The generation of orthogonal Latin squares is described in "The Completely Orthogonalized Latin Square," Stevens, W. L., Annals of Eugenics, 9: 82-93, 1939, which is hereby incorporated herein by reference. (A way of constructing orthogonal Latin Squares is also described in "Direct Construction of Mutually Orthogonal Latin Squares," Kull, H. et al., Computational Theory and Logic, 1987.)'

A Latin square is a two-dimensional array or matrix of symbols, such that each row and each column contains each symbol exactly once. A Latin square is used to mix two values by using one value to select a row of the square, and the other to select a column of the square, and then reporting the value of the Latin square at that row and column. This Latin square mixing is balanced: for any particular value on one input, changing the other input produces every possible output symbol.

A known Latin square is invertible if one of the mixing inputs is known, which is the usual situation for a stream cipher.

The values from a pair of Latin squares can be transformed to their original values without outside information, provided the Latin squares are "orthogonal". That is, each pair of output symbols is produced by just a single selection of input values. (If two pairs of input values produce the same output values, then it cannot be determined which of those input pairs to choose when deciphering.) Note that two exclusive-OR functions are not orthogonal, and so cannot be used in this way.

Using a more formal notation, a Latin Square of order (size) m is an m x m square array of the symbols S₀, S₁, . . ., S_m-1 such that each row and each column are a permutation of the symbols S₀, S₁, . . ., S_m-1, and each symbol appears only once in each row and in each column (clearly the m symbols can be replaced by m digits). Thus, for example, when m is 3, the following is a Latin square:

          B    C    A
          C    A    B
          A    B    C

Two Latin squares of the same size are said to be orthogonal to each other if, when the two squares are superimposed, each symbol of one square coincides exactly once with each symbol of the other square. For example, two Latin squares of size three with this property are:

          B    C    A                        B    C    A
          C    A    B                        A    B    C
          A    B    C                        C    A    B

When the above two Latin squares are superimposed (as shown below),

          (B,B)     (C,C)     (A,A)
          (C,A)     (A,B)     (B,C)
          (A,C)     (B,A)     (C,B)

each possible ordered pair of symbols occurs exactly once.

First, one method of constructing a Latin square is described. The Latin square L of size m, having the digits {0, 1, . . ., m-1} can be written as L[x, y] = (kx + y) mod m, where k is a non-zero constant and x and y are in the range 0 to m-1. That is, "(kx + y) mod m" is the number to be placed in the xth row and the yth column of the square. For example, a value of m=3 gives the following Latin square (only for the sake of notation, we denote "a mod b" as "a|b" herein):

          0         1         2
          k|3       (k+1)|3   (k+2)|3
          2k|3      (2k+1)|3  (2k+2)|3

For k=1, this gives the Latin square

          0    1    2
          1    2    0
          2    0    1

Construction of Orthogonal Latin Squares

The following construction derives from Stevens, cited above.

Assuming that a field of m numbers exists, a square S of the digits may be written as:

S = {u_ku_x + u_y}, where u_k is a non-zero constant, and where u_x, u_y = 0, 1, u₂, . . ., u_m-1, where u_ku_x+u_y is the number in row u_x and column u_y. The numbers may have no obvious order, but we may designate the m rows and columns by means of the numbers 0, 1, u₂, . . ., u_m-1 in any way. Stevens shows that any two squares defined as above, for different values of u_k, are orthogonal.

Clearly, for large values of n, the Latin squares of size 2ⁿ are too large and unwieldy to be usefully generated and stored. However, since the method of generating the squares uses functions which can be easily computed, the actual Latin squares need not ever be generated. In other words, whenever a Latin square is referred to or needed, its generating function can be used.

One way to create a finite field is to use a prime number. For some prime number p, the values 0, 1, . . ., p-1, constitute a finite field modulo p. However, for efficient computer implementation, we would like to have a field of exactly 2ⁿ elements, and 2ⁿ is not prime for n above 1. We might use primes close to 2ⁿ, provided that we can accept some unbalance, or we can use polynomial operations, as discussed below. Polynomial operations modulo an irreducible polynomial provide a way to generate fields of exactly 2ⁿ elements with exact balance and with efficient operation.

Given a Latin square LS of size m having as elements the numbers 0, 1, . . ., m-1, if a particular row value is fixed at say A, the value of LS[A, c] will take on all values in {0, 1, . . ., m-1} as c varies over all possible values 0, 1, . . ., m-1. Similarly, if a particular column is fixed at say B, the value of LS[r, B] will take on all values in {0, 1, . . ., m-1} as r varies over all possible values 0, 1, . . ., m-1. This corresponds to desired property number 4 for balanced block mixers, listed above.

Notes on Polynomial Arithmetic

A polynomial over S is an expression of the form

               u(x) = u_nxⁿ + . . . u₁x + u₀,

where the coefficients ui are elements of some algebraic system S, and the variable x may be regarded as a formal symbol with an indeterminate meaning.

The algebraic system S is usually the set of integers, or the rational numbers. When S is the set of integers {0, 1, . . ., m-1}, with addition, subtraction, and multiplication performed mod m, this is called polynomial arithmetic modulo m (or mod m).

Thus, a mod p polynomial P is one of the form:

          P(x) = p_nxⁿ + . . . p₁x + p₀

where the coefficients pi are integers in {0, 1, . . ., p-1} and the addition, subtraction and multiplication are performed mod p. A mod 2 polynomial P has all coefficients 0 or 1 and the addition, subtraction and multiplication are performed mod 2.

Notes on Fields

Definitions of mathematical fields can be found in most basic books on number theory. Some definitions are included here (from Koblitz, A Course in Number Theory and Cryptography, 2nd Edition, Springer-Verlag, 1994).

A field is a set with multiplication and addition operations which are associative, commutative and distributive. The set has an additive identity (zero), a multiplicative identity (one), additive inverses and multiplicative inverses for everything except zero. For example, the set of all real numbers, R, is a field and the set of integers modulo a prime number is a field.

The notion of a field extends to polynomials. As with integer fields (modulo some prime number), polynomial fields are described modulo a prime (irreducible) polynomial. For example, if P is a polynomial field modulo the prime polynomial p, we refer to P as a modulo p (or mod p) field. This essentially means that all arithmetic on polynomials in the field is performed modulo the prime polynomial p.

Polynomials being "mod 2, mod p" (as used herein) means that each polynomial is a mod 2 polynomial (i.e., has coefficients that are 0 or 1) and that the arithmetic over the polynomials is performed modulo p.

Examples of Balanced Block Mixers

The following combiners, with their operations performed modulo 2 and modulo p, for some irreducible mod 2 polynomial p of appropriate degree for the data blocks X, Y, A and B (polynomial p need not be primitive, but must be irreducible to generate a field), provide an example of a balanced block mixers for balanced block mixer 174 and its inverse balanced block mixer 176 of FIGURES 9(a) and 9(b):

     Combiner₁₇₈(A, B) = 3A + 2B,
     Combiner₁₈₀(A, B) = 2A + 3B,
     Combiner₁₈₂(X, Y) = 3X + 2Y,
     Combiner₁₈₄(X, Y) = 2X + 3Y.

Each input value to a balanced block mixer is a binary number which corresponds to the coefficients of a mod 2 polynomial. So, in the above example, A, B, X, and Y are each considered as mod 2 polynomials. Then, the operations performed by the combiners, e.g., 2A+3B, etc., are first performed as mod 2 polynomials, and those results are reduced mod p (i.e., modulo the irreducible polynomial p). So, for example, "2A + 3B" above could also be written as "(2A + 3B) mod p".

The data transform performed by the balanced block mixers shown above is a self-inverse, has good mixing correlation properties, is statistically balanced, and has a processing cost which is linear with block size.

Using the data or signal transformation X = Combiner(A, B) = 3A + 2B; Y = Combiner(A, B) = 2A + 3B, in a balanced block mixer, any value change to either of the input signals A or B must be reflected in both the X and Y output signals. Suppose, for example, that some change C is made to the input signal A:

           X  = 3A + 2B  (mod 2, mod p)
           X' = 3(A+C) + 2B
           X' = 3A + 3C + 2B
           dX = X' - X = 3C

but 3C is non-zero (thus affecting the output signal) for any signal change C which is not zero, and if C is zero, there has been no change to the input signal A. Suppose some change C is made to input signal B:

           X  = 3A + 2B  (mod 2, mod p)
           X' = 3A + 2(B+C)
           X' = 3A + 2B + 2C
           dX = X' - X = 2C

Similarly, 2C is also non-zero for any signal change C which is not zero.

Suppose that the signal change C is made half the value of p plus the highest bit 2^deg(p)-1, so that p will be activated in the process of determining the output signal and 2C will cancel the lower bits of p. Since p is irreducible there is no q such that 2q = p. Similar arguments apply for Y = 2A + 3B.

A present embodiment of the invention uses the degree-128 irreducible 0100004000000400200002000004000001 (hex), and the degree-64 irreducible 010002000000800201 as block mixing polynomials. Polynomials with about half the number of possible terms would be desirable some implementations.

[Figs. 9(a) and 9(b)]

With reference to FIGURES 9(a) and 9(b), and the above combiners, a 2n-bit input message signal I is split into two n-bit data blocks A and B which are input to balanced block mixer 174. Within balanced block mixer 174, data blocks A and B are each input to combiners 178 and 180. Combiner 178 takes the two input data blocks A and B and produces an output data block X (which is 3A+2B mod 2 mod p). Combiner 180 takes the same two input blocks A and B and produces an output data block Y (which is 2A+3B mod 2 mod p).

Upon receipt, data blocks X and Y are input to inverse balanced block mixer 176 where each data block is input to combiners 182 and 184. Combiner 182 takes its two input data blocks X and Y and produces a data block 3X+2Y mod 2 mod p, i.e., data block A. Combiner 184 takes its two input data blocks X and Y and produces a data block 2X+3Y mod 2 mod p, i.e., data block B.

Fenced Cryptographic Mechanisms

A fenced cryptographic mechanism is a cryptographic mechanism 100 (FIGURE 1) that uses a cipher mechanism as a component in a cryptographic mechanism which is at least as powerful and potentially stronger than the cipher mechanism, yet almost as fast. For example, a fenced DES mechanism uses the DES cipher mechanism as a component and a fenced IDEA mechanism uses the IDEA cipher mechanism as a component.

A core component mechanism of a fenced cryptographic mechanism is a substitution mechanism described below.

A 1x fenced cipher mechanism

[Fig. 10]

With reference to FIGURE 10, an embodiment of the present invention being cryptographic mechanism 100 in the form of a 1x fenced cipher mechanism for an n-bit cipher mechanism 180 is shown. (The term "fenced" is used to describe the "picket fence" appearance of the arrays 183a, 183b of substitution mechanisms 182 (S-mechanisms or S-devices herein) surrounding the cipher mechanism 180 in the schematic diagrams.) Each of substitution mechanisms 182 performs an eight-bit wide (256-byte) substitution randomized under the control of a cryptographic key. (The randomization is described below.) Therefore, for an n-bit cryptographic mechanism, two arrays of n/8 substitution mechanisms 182 are needed, one array on each side of the cipher mechanism 180, for a total of n/4 substitution mechanisms 182. For example, if the cipher mechanism 180 is a DES mechanism or an IDEA mechanism (with a 64-bit input block), then one array 183a, 183b of eight (8) substitution mechanisms 182 is used on either side of the cipher mechanism 180.

In operation, a fenced cryptographic mechanism 100 receives an n-bit signal I which is split into n/8 8-bit components, each of which is passed through a substitution mechanism 182 in the array 183a of substitution mechanisms. The combined output of the array 183a of substitution mechanisms 182 is an n-bit data value which is then input to cipher mechanism 180 for ciphering. The output of cipher mechanism 180 is an n-bit ciphered message which is then passed (in 8-bit components) through another array 183b of n/8 substitution mechanisms 182 to produce an n-bit plaintext signal.

Functionally, the cryptographic mechanism 100 in FIGURE 10 can be written as:

          for each x_i, i = 1 to n/8,
               S_i(x_i) -> x'_i
          CIPHER(x'₁ x'₂ . . . x'_n/8, k) -> y₁ y₂ . . . y_n/8
          for each y_i, i = 1 to n/8,
               S_n/8+i(y_i) -> y'_i

Substitution Mechanisms

The fenced cryptographic mechanisms use, in combination with other elements, substitution mechanisms. Substitution mechanisms essentially implement keyed substitution tables which can be initialized before ciphering, and left unchanged during ciphering or which can change during ciphering. The substitution mechanisms operate by transforming one k-bit input value into another k-bit output value using a substitution table of size 2^k. In the preferred embodiments described herein, the value of k is eight, and the substitution mechanisms effectively have tables with 2⁸=256 entries. However, other values of k could be used.

[Fig. 11]

FIGURE 11 shows a single substitution mechanism 182 in detail. Input data I to substitution mechanism 182 is transformed by the substitution mechanism 182 to output data O. The input data I is an index into substitution table or array 184, and selects one of the entries in the table. The value stored in the selected entry forms the output data O. A substitution table S is an indexable n-element vector of output codes or values. Essentially a substitution table is a memory device which stores values which can be selected by address or index value.

In cryptographic service, substitution tables are often initialized to contain exactly one occurrence of each possible index value, thus making the transformation they perform invertible. The arrangement of values in the substitution table can be controlled by a cryptographic key. Such a properly-initialized device (or area of memory) is called a substitution mechanism.

For each substitution mechanism, the inverse mechanism can be formed by properly arranging the data in another substitution mechanism. Thus, in substitution mechanism 182, if input I maps to output O, then in the inverse substitution mechanism, input O maps to output I.

The substitution mechanisms perform a one-to-one mapping of input values to output values, so that no two inputs will produce the same output, and no two outputs can derive from the same input.

Throughout this application, including the claims, the term "substitution mechanism" is intended to include all possible implementations (such as electronic hardware, computer implemented software, hybrids, or the like) or the process performed. Thus a substitution mechanism refers to either a substitution process or any implementation (for example, hardware, computer implemented software, hybrid, or the like) for performing the substitution process.

1x Fenced DES

[Fig. 10]

With reference to FIGURE 10, if the cipher mechanism 180 is a DES mechanism, then FIGURE 10 depicts a 1x fenced DES mechanism in which a 64-bit input message block X is divided into eight message sub-blocks x₁ x₂ . . . x₈, where each x_i is an 8-bit sub-block of X. Each of the blocks x_i is transformed via a corresponding substitution mechanism 182, S_i, to produce an 8-bit value x'_i. The eight 8-bit data values x'_i are input as a 64-bit data block to DES mechanism 180, which, using key k, produces a 64-bit output data block Y made up of 8-bit sub-blocks y₁ y₂ . . . y₈. Each of the 8-bit data sub-blocks y_i is transformed by a substitution block to an 8-bit data block y'_i. The output of the mechanism is then formed by joining or concatenating the eight y'_i sub-blocks to form a 64-bit ciphertext message Y'.

If cipher mechanism 180 is a DES cipher mechanism, then functionally, the mechanism in FIGURE 10 can be written as:

          for each x_i, i = 1 to 8,
               S_i(x_i) -> x'_i
          DES ENCIPHER(x'₁ x'₂ . . . x'₈, k) -> y₁ y₂ . . . y₈
          for each y_i, i = 1 to 8,
               S_8+i(y_i) -> y'_i

Decryption of the message encrypted as above operates by first inverting what were the last substitutions performed by the substitution mechanisms 182, then deciphering the message with the cipher mechanism 180 (e.g., a DES mechanism), and then inverting what were the first round of substitutions.

Functionally, decryption of a message encrypted by the mechanism shown in FIGURE 10 can be written as:

          for each y'_i, i = 1 to n/8,
               S-inverse_n/8+i(y'_i) -> y_i
          DECIPHER(y₁ y₂ . . . y_n/8, k) -> x₁' x₂' . . . x_n/8'
          for each x_i', i = 1 to n/8,
               S-inverse_i(x_i') -> x_i

A 1x Fenced DES is a three-layer structure like triple DES. Since all information flows through DES itself, the overall cipher cannot possibly be weaker than DES. Moreover, the 1x Fenced DES structure is clearly stronger than DES, because a "known plaintext" attack requires knowing the actual input and output values at the DES cipher itself, and these values are hidden by the input and output substitution layers. If even a single input bit to DES is wrong (that is, if even one value in one input substitution is off by even one bit), DES will avalanche and about half of the output bits from DES will be wrong.

The extreme sensitivity of good block ciphers to change -- a tiny change in the input (plaintext) value tends to change about half of the output bits -- is known as avalanche. In the context of an iterated product cipher (such as DES) which has multiple rounds, a small input change will affect a few adjacent bits each round, until, eventually, about half of the output bits change. Avalanche is a requirement in a good block cipher, and is an inherent property of a normal invertible substitution -- any change to the input value of an invertible substitution selects a new and necessarily different output value.

Thus, a single wrong bit has the same statistical effect as a mostly-wrong value; this means that it is not possible to know when the input value is "close," so it does not appear possible to solve the substitutions in an advantageous way.

[Fig. 12]

In some preferred embodiments, as shown, for example, in the cryptographic mechanism of FIGURE 12, arrays of substitution mechanisms are used to fence multiple parallel block cipher mechanisms. In the cryptographic mechanism 100 of FIGURE 12, an array 186a of sixteen 8-bit substitution mechanisms 182 (each labelled "S" in the drawing) takes a 128-bit input signal into two 64-bit cipher mechanisms 188. The output of the cipher mechanisms 188 is passed through another array 186b of sixteen 8-bit substitution mechanisms. Note that half the output of each substitution mechanism 182 is input to each of the two cipher mechanisms 188. In this manner, the output from each substitution mechanism 182 affects the output of each cipher mechanism 188. Thus, for an 8-bit substitution mechanism 182, up to eight cipher mechanisms 188 can be used in this configuration, with one bit of output from each substitution mechanism 182 going into each of the eight cipher mechanisms 188. In general, b-bit substitution mechanisms 182 can be used with up to b n-bit cipher mechanisms to produce an b.n-bit cryptographic mechanism 100.

Generating Keys for the Substitution Mechanisms

In order to enhance the strength of a cipher mechanism in a fenced cryptographic mechanism, each substitution mechanism must be shuffled by a cryptographic random number generator before ciphering begins (the random number generator is not needed during actual ciphering). The same random number generator can also generate the cipher keys.

Presently, the most efficient basis for such a random number generator is the additive random number generator described in Knuth, The Art of Programming: Vol 2: Seminumerical Algorithms, 2d Edition, Addison-Wesley, 1981. This is basically a linear feedback shift register (LFSR) using addition mod 2ⁿ. The additive random number generator is especially suited to a fast software implementation. In one version, there are three pointers into a circular queue, and the random number generator "step" operation consists of taking two of the pointed-to elements, adding them, and placing the result in the queue through the third pointer. The pointers are then incremented within the queue, and the result returned.

A reasonable structure for the Fenced DES random number generator is thirty-one (31) elements of thirty-two (32) bits each, for a total state of 992 bits. (DES has a 56-bit key.) This state can be initialized easily from a user key using an array of thirty-two (32) degree thirty-one (31) primitive mod-2 polynomials as cyclic redundancy checks (CRC's). This produces thirty-two (32) 31-bit values which can be re-arranged into thirty-one (31) 32-bit values to become the state for the random number generator. This CRC-based initialization allows the user key to be ASCII or binary, of arbitrary length, and thus provides a strong universal key interface.

Since the additive random number generator is essentially a linear mechanism, it is necessary to nonlinearize the sequence. One way to do this, mentioned in Ritter, "The Efficient Generation of Cryptographic Confusion Sequences," Cryptologia, 15(2): 81-139, 1991, by dropping out pseudo-random length sections of the linear sequence, leaving pseudo-random length "take" islands, and then offsetting each take sequence with a different pseudo-random offset value. With fairly short "take" islands, this renders the usual linear attacks worthless, at a cost of dropping some moderate fraction of the sequence. Additional isolation is provided by the cheap width of the random number generator, since only eight bits are needed, but thirty-two bits are calculated. This means that 3/4 of the random number generator state is always hidden, but must nevertheless be resolved before the random number generator can be completely exposed.

Increased Block Size Fenced Cryptographic Mechanisms Using Balanced Block Mixers and Substitution Mechanisms

In some embodiments of cryptographic mechanisms 100, combinations of substitution mechanisms and balanced block mixers are used. These can be used alone or in combination with other ciphers, and using these mechanisms, the block size of other ciphers can be increased.

2x Fenced Block Cipher Mechanisms

[Fig. 13]

In one embodiment, with a combination of substitution mechanisms and balanced block mixers, the block size n of a cipher mechanism can be doubled. (In the following, only for the sake of generality, it is assumed that n is an integer power of 2.) With reference to FIGURE 13, a 2x fenced block cryptographic mechanism consists of two keyed cipher mechanisms 190a, 190b, each taking n-bit blocks, n/2 substitution mechanisms (labelled S1 to Sn/2) in two arrays 192a, 192b, and two balanced block mixers 194a, 194b. A 2n-bit input signal I is split into n/4 8-bit parts X1, . . ., Xn/4 (as with the other signal splitting described herein, this splitting can be done merely based on the order of the bits in the input message, or some complex form of splitting can be used). Each 8-bit part Xi is passed through an array of substitution mechanism Si, i=1, . . ., n/4, producing a substituted 8-bit data part Xi'. The n/4 8-bit data parts Xi' are input to balanced block mixer 194a which produces two n-bit data blocks A and B. Data block A is encrypted by cipher mechanism 190a to produced n-bit enciphered block A', and data block B is encrypted by cipher mechanism 190b to produced n-bit enciphered block B'. Data blocks A' and B' are input to a second balanced block mixer 194b which produces balanced mixed output blocks. The output blocks are split into n/4 8-bit parts, each of which is passed through a substitution mechanism Sj, j=n/4+1, . . ., n/2, to produce n/4 substituted 8-bit data blocks yj. The n/4 8-bit data blocks yj are joined to form a 2n-bit ciphertext output signal.

2x Fenced DES

Where the cipher mechanisms 190a and 190b are DES mechanisms, n is 64 bits. In this case, using a combination of substitution mechanisms and balanced block mixers, the effective block size of DES can be doubled. With reference to FIGURE 13, two balanced block mixers 194a and 194b are used to produce two 64-bit inputs to DES mechanisms 190a, 190b from a 128-bit input message signal. The first array 192a of substitution mechanisms S1-S16 map sixteen 8-bit inputs to 8-bit outputs, as described above. The input signal to the system is a 128-bit block, split into sixteen 8-bit blocks, x1-x16. Each of these sixteen blocks is transformed via a corresponding substitution mechanism. That is, for i from 1 to 16, 8-bit data block xi is transformed by substitution mechanism Si, to give 8-bit data block x'i. Eight of the transformed 8-bit blocks are input (as a 64-bit block A) to one DES mechanism 190a, and the other eight 8-bit blocks are input (as 64-bit block B) to the other DES mechanism 190b. Data blocks A and B are encrypted to produce to 64-bit output blocks A' and B', where, functionally, A'=DES(A, k1) and B'=DES(B, k2). As above, keys k1 and k2 can be the same as each other or different. Data blocks A' and B' are mixed using balanced block mixer 194b, and the output 128-bits (as two 64-bit blocks) are then transformed (in 8-bit blocks) by array 192b of substitution mechanisms S17-S32 to produce 128-bit output ciphertext message signal O=y17'y18'...y32'.

A ciphertext message signal O produced by an encryption mechanism as described is decrypted as follows. The signal is split back into its 8-bit component parts yi' which are then passed through substitution mechanisms S-inverse17 to S-inverse32 (which perform the inverse transformations of S17 to S32, respectively), giving back the 8-bit data blocks y17 to y32. The 8-bit data blocks y1 to y17 are passed through the inverse of balanced block mixer 194b, giving two 64-bit data blocks A' and B'. Data blocks A' and B' are decrypted by DES mechanisms 190a, 190b with the appropriate keys, giving 64-bit data blocks A and B, respectively. Data blocks A and B are passed through the inverse of balanced block mixer 194a to give outputs Xi' which are transformed by substitution mechanisms Si-inverse (which performs the inverse transformation of substitution mechanism Si), for i from 1 to 16. The output of the inverse substitutions forms a 128-bit plaintext message signal.

4x Fenced Block Cipher Mechanisms

In one embodiment of the present invention, with a combination of substitution blocks and block mixers, the block size of a cipher is quadrupled. With reference to FIGURE 14, a 4x fenced block cryptographic mechanism consists of four keyed cipher mechanisms 196a-196d, each processing n-bit data blocks to produce an n-bit data block, 2n substitution mechanisms (labelled S1 to S2n) in two arrays 198a, 198b, and six balanced block mixers 200a-200f. The corresponding decryption mechanism uses the inverse substitutions and the inverse mixers, in the appropriate order.

4x Fenced DES

Where cipher mechanisms 196a-196d are DES cipher mechanisms, the encryption illustrated in FIGURE 14 processes a 256-bit input message block.

[Fig. 14]

Here there are two arrays 198a, 198b of thirty-two (32) 8-bit input substitution mechanisms S1 to S32 and thirty-two (32) 8-bit output substitution mechanisms S33 to S64, respectively, each substitution mechanism effectively a separately-shuffled and independent substitution table. The overall block size of the cryptographic mechanism is 256 bits, there are four DES mechanisms, plus two levels of input mixing (by mixers 200a-200c) and two levels of output mixing (by mixers 200d-200f). Note that the outermost balanced block mixers join two 128-bit blocks.

This approach spreads the effect of each bit in an input message to each of the four DES mechanisms, and produces a particular output bit from a combination of all four DES mechanisms. This forces an opponent (trying to break the code) to search all four DES keyspaces simultaneously to match a particular known plaintext-ciphertext pair.

While in the above, the substitution mechanisms are effectively 8-bit substitution tables, a different size could be used for these mechanisms.

A software implementation of the 4x DES construct of FIGURE 14 performs all sixty-four substitutions and all six mixings in less time than a single DES computation. Currently, it ciphers four times the data with about 4.8 times the computation, and has a keyspace exceeding 224 bits. (A much faster hybrid implementation might perform the DES computations in hardware.)

In the implementation, table and key initialization take about 182 times the computation of a single 256-bit block ciphering. (This is mainly a consequence of shuffling sixty-four small substitution tables, an initialization which could be separated from the use of new DES mechanism keys. Table initialization can occur less frequently than key initialization.)

The current implementation uses a fast 992-bit additive random number generator and a non-linear "jitterizer" as mentioned in Ritter. In the present implementation, a user key of arbitrary length and content is hashed by thirty-two (32) separate degree-31 primitive mod-2 polynomials with 11-19 non-zero terms, producing the 992-bit random number generator state, which also eventually generates the DES keys. This approach eliminates the need for keys to have a specific format unique to this particular cipher, which enables the selection of an arbitrary cipher from among many different ciphers, all of which can use the exact same key.

Deciphering simply uses inverse substitutions (the inverse of each encipher output substitution is used for decipher input) and DES in decipher mode.

Recall that a change in any one input data bit produces a distribution of changes out of the associated input substitution, depending on the particular substitution, original input value, and change. Any possible byte input has a 50 percent probability of affecting each of the eight output bits from that substitution.

Each substitution mechanism used in the fenced mechanisms effectively implements a substitution table. A substitution table S is an indexable n-element vector of output codes. An invertible substitution table S with inverse table S^-1 has the property that for any input code i in n, S^-1S(i) = i. This implies that S contains n different output codes.

An invertible substitution table S contains each output code value exactly once. Since each possible index selects a different element, any index change will select a different output code. Since different code values must differ in at least one bit, any input change must produce a change in at least one output bit.

Given invertible substitution table S with shuffled contents, the output distribution for any input code change is an arbitrary selection from the output codes which differ from the current output code. If the output codes are a complete set of 2^m values 0..2^m-1 for some m, it is likely that about half of the output bits will change for any input code change.

Conversely, since each output bit is produced by an output code, and the selected output code is completely dependent upon every bit in the input code, each output bit is dependent on every bit of the input. A network with this property is normally called complete, and localized completeness is also the basis for avalanche in an iterated block cipher.

If two 128-bit blocks are first mixed, then the resulting data mixed as two pairs of 64-bit blocks, suppose that there is a change in any one input data bit. Without the mixing, this produces an 8-bit substituted result which would normally affect just a single DES block. But the 128-bit block mixing produces at least one bit change in both 128-bit resulting blocks, and the 64-bit block mixings extend those changes to all four 64-bit resulting blocks, each of which is processed by a separate DES mechanism. Thus, any change of even a single input bit will affect all four DES operations.

Even though the output from each DES operation is pseudo random, a three-level structure is still necessary to prevent various kinds of attacks, for example, so-called "fix-in-the-middle" attacks. Therefore the output substitutions are important. The mixing is also used so that the result from a single DES block cannot be isolated and worked on independently.

To Part D: