The Story of Combiner Correlation: A Literature Survey

Research Comments from Ciphers By Ritter

Terry Ritter

Once upon a time, a stream cipher was little more than a linear feedback shift register (LFSR) and a simple exclusive-OR combiner. The fly in the ointment was the known-plaintext attack, which quickly bypassed the simple combiner to reveal the sequence from the LFSR. (The Berlekamp-Massey algorithm will recover the unknown state of a simple n-bit LFSR, and its feedback polynomial, with just 2n known bits.) Because of the weakness of the data-confusion combiner and the simplicity of the LFSR, even a degree-32 LFSR, with a cycle length of 2³² - 1 or 4 x 10⁹ steps, can be penetrated by knowing (or guessing) just 8 plaintext characters (64 bits).

The need to generate a more "complex" sequence led to the idea of using multiple LFSR's and somehow mixing them so that the ultimate complexity was the product of the individual complexities. This led to the construction of new confusion combiners, and to the analysis of those combiners. Eventually, Siegenthaler found that input-output correlations could be used to attack many of these combiners, even without using known-plaintext. This led to a sequence of developments producing stronger combiners, and more effective attacks. This process is the story of combiner correlation.

1965
- "Before the Beginning", MacLaren and Marsaglia combine random number sequences with a table.
1972
- The State-of-the-Open-Art seems to be a simple LFSR with a simple exclusive-OR combiner, but that is attackable with only 2n known bits.
1973
- Combiner: Geffe proposes combining the sequences from two LFSR's by using a third LFSR to select or multiplex between the two.
1977
- Combiner: Pless proposes combining the sequences from two LFSR's by feeding each into the J or K input of a J-K Flip Flop, and also deleting alternate output bits from the FF.
1979
- Attack: Rubin shows that at least one Pless "Arrangement" can be broken at practical effort.
1984
- Combiner: Bruer proposes that the single output bits from each of three or more LFSR's be combined by integer addition, and the output be taken from a threshold function.
- Attack: Siegenthaler proposes a form of "divide and conquer" attack allowing each LFSR to be worked on separately. Siegenthaler finds this to be possible when the combiner has some amount of correlation between an input sequence and the output.
- Attack: Retter shows that a known-plaintext attack can break a MacLaren-Marsaglia combiner.
1985
- Attack: Siegenthaler continues on the "correlation attack" and presents graphs showing the performance of a ciphertext only version. Siegenthaler claims to be able to break Geffe, Pless, and Bruer, and presents graphs showing the performance of the attack, but no explicit algorithms.
- Combiner: Rueppel proposes to combine the single bit output of each of two LFSR's, plus a saved "carry" bit, by integer addition. The combiner output is the "sum" output from the adder, and the carry bit is set from the "carry" output. This is the "summation cipher."
- Theory: Siegenthaler presents the general form of correlation-immune combiners as a combinatoric circuit with delayed feedback.
- Attack: Retter shows that MacLaren-Marsaglia combiners support an effective divide-and-conquer attack, and so are not cryptographically effective.
- Attack: Siegenthaler addresses the situation of a single LFSR which is tapped at several places to provide multiple inputs for a nonlinear combining function.
1986
- Theory: Maurer (in German) shows that the "summation cipher" is correlation-immune.
- Theory: Rueppel writes the book on stream ciphers.
- Production: Ciarcia presents a fully-realized design, and kits for that design, for a supposedly-secure simple stream cipher. (Later, Pearson shows how to efficiently break it.)
1987
- Attack: Mund, Gollmann and Beth improve the efficiency of a correlation attack by using the Walsh-Hadamard transform.
1988
- Theory: Guo-Zhen and Massey show that, for any memoryless combining function, combiner correlation can be computed by a Walsh transform of the combining function.
- Attack: Meier and Staffelbach describe algorithms and give results for their own correlation attack.
- Attack: Zeng and Huang approach the correlation attack as an error-correction problem, where the (hidden) ideal LFSR sequence has been partially corrupted by another sequence.
- Attack: Pearson shows how to break the Ciarcia design.
1989
- Attack: Forre modifies an algorithm from Meier and Staffelbach to attack a different form of combined generator. In this form, a single LFSR is tapped at several places to provide multiple inputs for a nonlinear combining function.
1990
- Attack: Meier and Staffelbach attack and break a "summation cipher" which uses two sizable LFSR's. They show that the same approach would not work with more than two LFSR's.
- Attack: Mihaljevic and Golic give a detailed algorithm (extended from Meier and Staffelbach "Algorithm B") and its performance.
- Attack: Zeng, Yang and Rao find a detailed new approach which they call a "linear syndrome" algorithm. This is an error-correcting approach which also is applied iteratively.
- Attack: Staffelbach and Meier continue to review the "summation cipher," and find that the carry is asymptotically balanced for even numbers of LFSR's, and biased for odd numbers of LFSR's. In general, the more LFSR's the better, and, thus, their proposed three-LFSR summation cipher is weak. cipher
- Three-Input Combiner: Golic and Mihaljevic investigate the linear complexity of a LFSR sequence which is locally re-ordered by memory and separate read and write LFSR's.
- Combiner: Ritter presents the concept of a dynamic function as a reversible combiner, so being useful both for confusion-confusion and data-confusion combining.
1991
- Attack: Chepyzhov and Smeets present a new algorithm for LFSR state recovery from noisy data.
- Theory: Camion et. al. establish a link between correlation-immune functions, and orthogonal arrays by way of Algebraic Coding Theory.
- Attack: Mihaljevic and Golic continue to improve and test their iterative algorithms.
- Survey: Zeng et. al. survey the situation with respect to stream ciphers, LFSR's and combiners.
1992
- Attack: Mihaljevic and Golic present a different iterative algorithm in some detail.
1994
- Attack: MacKay presents yet another attack algorithm, including C-style pseudo-code, using "variational free energy minimization."
- Attack: Mihaljevic presents yet another attack on LFSR mixing.
- Attack: Menicocci shows that the sequence generated by the Golic and Mihaljevic Variable-Memory BSG is cryptographically weak.
- Attack: Golic presents a general theory of modelling various LFSR-based generators.
1995
- Attack: Golic et. al. presents several forms of error-correcting algorithm in detail, and compares results.
- Attack: Klapper and Goresky show that the summation cipher is in fact the addition of "2-adic" numbers, and present a sequence reconstruction algorithm analogous to Berlekamp-Massey.

1965 -- Before the Beginning

MacLaren, M. and G. Marsaglia. 1965. Uniform Random Number Generators. Journal of the Association for Computing Machinery. 12(1): 83-89.

"In attempting to improve on the congruential generators, we tried combining two of them. This gave a generator which seems to be better than either of the two congruential generators, but it has the disadvantage of being slower."

"Suppose the first number U₁ for a congruential generator . . . is picked at random. Then the sequence U₁, U₂, ... may be considered a sequence of random variables. Moreover, each U_i will be uniform on the set of numbers in [0,1] which can be represented exactly in the computer. However, the different U_i are not independent, and it turns out that the distribution of an n-tuple (U₁, ..., U_n) may be quite far from the correct distribution. To improve the distribution of n-tuples, we propose using two different generators . . . and having one shuffle the sequence produced by the other."

Although not originally developed for cryptography, the MacLaren-Marsaglia combiner was broken by known plaintext attack, and by divide and conquer.

1972 -- State-of-the-Open-Art

One example is an article by an Applications Engineer at National Semiconductor:

Twigg, T. 1972. Need to keep digital data secure? Electronic Design. 23: 68-71. November 9. (Also see: Smith, M. 1973. Correction suggested in encoding article. Electronic Design. 9: 7. April 26.)

Twigg, an Applications Engineer at National Semiconductor, proposes to encrypt data with a pseudorandom sequence generated by a shift-register (SR) and exclusive-OR gates. (This is known as a "linear feedback shift register" or LFSR.) The SR is composed of 7474 TTL "D" flip-flops. The feedback polynomial is arbitrarily selectable using a switch at each stage.

But following this article -- in the very same issue -- is the moderately-famous article by Meyer and Tuchman at IBM:

Meyer, C. and W. Tuchman. 1972. Pseudorandom codes can be cracked. Electronic Design. 23: 74-76. November 9.

"In general, the code of any n-stage code generator, with arbitrary feedback . . . can be broken with any 2n bits of clear and corresponding enciphered text. Breaking the code consists of determining the switch settings and the initial states of the flip-flop stages. Once these conditions are known, the complete text can be deciphered."

They then gave a matrix-based reconstruction.

1973 -- Geffe (U.S.A.)

Probably after digesting the irony presented by Twigg and Meyer-Touchman in the same issue, Geffe published his designs for stronger stream ciphers:

Geffe, P. 1973. How to protect data with ciphers that are really hard to break. Electronics. January 4. 99-101.

Geffe presented a general design using a 33-bit linear feedback shift register (LFSR) with key-selectable feedback taps. But his main innovation was an attempt to create a nonlinear keystream by combining three separate LFSR's. One LFSR was used solely to select between two other LFSR's, as follows:

     A ------|\
             |&)--+
          +--|/   +--)\                    -        IF C THEN
     C ---+          )+)--- Z    Z = AC + BC   or      Z := A
          +-o|\   +--)/                             ELSE
             |&)--+                                    Z := B;
     B ------|/

Note that the Geffe combiner is actually a multiplexer, which is now a common structure used to select between two inputs. Let's look at some simple probability results from the Geffe combiner:

     A  B  C   Z   A=Z  B=Z  C=Z

     0  0  0   0    1    1    1
     0  0  1   0    1    1    0
     0  1  0   1    0    1    0
     0  1  1   0    1    0    0
     1  0  0   0    0    1    1
     1  0  1   1    1    0    1
     1  1  0   1    1    1    0
     1  1  1   1    1    1    1
              ---  ---  ---  ---
              50%  75%  75%  50%

First we note that the Geffe combiner does indeed produce a balanced result. That is, assuming the input sequences are uncorrelated and each has an equal number of 1's and 0's, the output will also have and equal number of 1's and 0's.

But then we see that, surprisingly, whatever the output value Z, the probability is 75% that input A has that same value, and the same can be said for B. This is an input-to-output correlation or "information leak," and was eventually used to break through the combiner to find the state of the individual SR's.

The Geffe combiner is broken by Siegenthaler as a ciphertext only attack.

1977 -- Pless (U.S.A.)

Pless, V. 1977. Encryption Schemes for Computer Confidentiality. IEEE Transactions on Computers. C-26(11): 1133-1136.

Pless proposes that we feed the bit output from each of two LFSR's into the J and K inputs of a J-K Flip Flop, and then also delete alternate output bits with an "alternator." The J-K Flip Flop is the actual Pless combiner:

          +----+                 -   -
     A ---|J  Q|--- Q[t+1] = Q[t]K + Q[t]J
          |    |
          |    |
     B ---|K   |
          +----+

And here are the probability results:

     Q[t]  A  B   Q[t+1]   A=Q[t+1]  B=Q[t+1]

        0  0  0      0         1         1
        0  0  1      0         1         0
        0  1  0      1         1         0
        0  1  1      1         1         1
        1  0  0      0         1         1
        1  0  1      1         0         1
        1  1  0      0         1         1
        1  1  1      1         0         1
                    ---       ---       ---
                    50%       75%       75%

We note that the Pless combiner does produce a balanced result. But whatever the output value Q[t+1], the probability is 75% that input A has that same value, and the same can be said for B. This is an input-to-output correlation that was used to break through the combiner and find the state of the individual SR's.

After arguing for "alternators" in "Arrangements" B and C, Pless fails to include them in Arrangement D, and this is the configuration which was broken by Rubin in a known-plaintext attack. The Pless combiner was also broken by Siegenthaler in a ciphertext-only attack.

1979 -- Rubin (U.S.A.)

Rubin, F. 1979. Decrypting a Stream Cipher Based on J-K Flip-Flops. IEEE Transactions on Computers. C-28(7): 483-487.

(Also reprinted in Cryptologia,, 5(1), Jan. 1981 and Cryptology: Yesterday, Today and Tomorrow, 1987, Deavours, C et. al., eds., 283-293.)

"ABSTRACT: Pless has proposed a stream cipher based on J-K flip-flops that uses eight linear shift registers with feedback, having a combined length of 97 bits, four J-K flip-flops, and a four-stage cycling counter. The cipher has 2.54 x 10⁵¹ initial states (keys), and generates a presumably pseudorandom stream whose period is 1.52 x 10²⁹ bits. Despite these impressive statistics, it is computationally feasible to solve such a cipher with a known-plaintext attack, using as few as 15 characters."

1984 -- Bruer (Sweden)

Bruer, J. 1984. On Pseudo Random Sequences as Crypto Generators. Proceedings of the International Zurich Seminar on Digital Communications 157-161.

Bruer proposes that the single output bits from each of three or more LFSR's be combined by integer addition, and the output be taken from a threshold function. This combining is also broken by Siegenthaler in a ciphertext only attack.

1984 -- Siegenthaler (Switzerland)

Siegenthaler, T. 1984. Correlation-Immunity of Nonlinear Combining Functions for Cryptographic Applications. IEEE Transactions on Information Theory. IT-30: 776-780.

"Pseudonoise generators for cryptographic applications consisting of several linear feedback shift registers with a nonlinear combining function have been proposed as running key generators in stream ciphers." The purpose of the nonlinear combining function f . . . is to make the keystream difficult for the cryptanalyst to predict." "However, if the function f is not properly chosen, a cryptanalyst may make a selective attack on each subkey . . . ; this can be performed by correlating the ciphertext with the sequence generated by subgenerator S_i for each choice of K_i.

"In general, to make the generator . . . resistant to a correlation attack, one should ensure that there is no statistical dependence between any small subset of the n subgenerator sequences and the keystream sequence."

"LetX_j = (X_1j, X_2j, . . ., X_nj) be the n-tuple of subgenerator output digits at time j. We shall say that the combining function f is mth-order correlation-immune if every m-tuple obtained by choosing m components from X_j is statistically independent of Z_j for all j = 1, 2, 3, . . . ."

1984 -- Retter (U.S.A.)

Retter, C. 1984. Cryptanalysis of a MacLaren-Marsaglia System. Cryptologia. 8(2): 97-108.

"The research and development groups at Data General do much of their work on a large network of interconnected minicomputers." "For this reason, various file encryption programs have been developed. The early versions were trivial, but by 1980 a program was in use which its author claimed to be 'virtually unbreakable short of exhaustive search.' Since the key was 31 bits, exhaustive search might have been possible, but on the available minicomputers it would have taken days of CPU time even with known plaintext. The system proved to be far less secure, and can usually be broken in minutes using just a guess about the plaintext."

"This algorithm is a version of the MacLaren-Marsaglia algorithm, which Knuth [2] contends 'will satisfy virtually anyone's requirements for randomness." "The method of attack used was the known-plaintext attack."

1985 -- Siegenthaler (Switzerland)

In a manuscript generally contemporaneous with his previous article, Siegenthaler presents graphs showing the performance of his correlation attack. In particular, he takes on the Geffe combiner and shows that it supports a correlation ciphertext-only attack.

Siegenthaler, T. 1985. Decrypting a Class of Stream Ciphers Using Ciphertext Only. IEEE Transactions on Computers. C-34: 81-85.

1985 -- Rueppel (Switzerland)

Rueppel, R. 1985. Correlation Immunity and the Summation Generator. Advances in Cryptology -- CRYPTO '85. 260-272. Springer-Verlag.

". . . integer addition, when viewed over GF(2), defines an inherently nonlinear function with memory whose correlation-immunity is maximum."

". . . consider a classical running-key generator for use in a stream cipher system. Such a running-key generator consists of N driving linear feedback shift registers (LFSRs) and some nonlinear function operating on the N output sequences in order to produce the running-key." "Unfortunately, for . . . memoryless combining functions f there exists a tradeoff between the attainable nonlinear order and the attainable level of correlation-immunity."

". . . one bit of memory suffices to obtain nonlinear combiners that are maximally correlation-immune and have maximum nonlinear order at the same time."

Rueppel then goes on to propose that the one bit output from each of two LFSRs be combined by integer addition along with a "carry" memory bit. The carry is a one-bit delay for the carry output of the adder, and the sum output is the combined output.

1985 -- Siegenthaler (Switzerland)

Siegenthaler, T. 1985. Design of Combiners to Prevent Divide and Conquer Attacks. Advances in Cryptology -- CRYPTO '85. 273-279. Springer-Verlag.

"A common form of running key generators for use in stream ciphers consists of n driving sources and some combiner. We assume . . . that a finite state machine (FSM) combines the n input sequences . . . into an output sequence . . . ."

"A cryptanalyst possibly tries to break the above system by breaking the individual subkeys of the n sources. To prevent such divide and conquer attacks, the symbols generated by the FSM should be statistically independent on the symbols of one (or several) input sequences."

1985 -- Retter (U.S.A.)

Retter, C. 1985. A Key-Search Attack on MacLaren-Marsaglia Systems. Cryptologia. 9(2): 114-130.

"The idea of combining multiple pseudo-random number generators in order to produce a more secure keystream sequence has been proposed in various forms [5, section 6.4]. Most of these methods are intended to create nonlinear sequences by using linear generators, since linear sequences are easily invertible."

"The MacLaren-Marsaglia algorithm [1,2,3] is a somewhat more complex method of combining two generators. It stores a collection of previous values from one generator in a table, and uses the other generator to select which value to output from the table at each cycle."

"It is the purpose of this paper to show that the algorithm can be attacked by searching for the key to one of its generators, while ignoring the other." "Therefore, the amount of computation required to break the combined keystream generator is only about twice what would be required if one of the input generators had been used to generate the keystream directly."

1985 -- Siegenthaler (Switzerland)

Siegenthaler addresses the situation of a single LFSR which is tapped at several places to provide multiple inputs for a nonlinear combining function.

Siegenthaler, T. 1985. Cryptanalysts Representation of Nonlinearly Filtered M-Sequences. Advances in Cryptology: EUROCRYPT '85. 103-110. Springer-Verlag.

"Abstract: A running key generator consisting of a maximum-length (ML) linear feedback shift register (LFSR) and some nonlinear feedforward state filter function is investigated. It is shown how a cryptanalyst can find an equivalent system in a ciphertext-only attack. The analysis uses a Walsh orthogonal expansion of the state filter function and its relation to the crosscorrelation function (CCF) between the ML-sequence and the produced running key."

1986 -- Maurer (West Germany?)

Maurer, U. 1986. On the Linear Complexity and Correlation Immunity of the Summation Cipher. Mitteilungen AGEN 44: 5-12. (Dec. 1986) (In German.)

"Abstract. The Summation cipher, introduced by Massey and Rueppel, uses integer addition with carry as the combining function in the key stream generator for an additive stream cipher." "The correlation immunity of the summation cipher is proved. If the driving shift-registers are short, the resulting leakage of the input sequences through the combiner is shown to be the effect of the periodic repetition of a biased input sequence.

1986 -- Rueppel (Switzerland)

Rueppel releases not just a single article, but an entire book on stream cipher design.

Rueppel, R. 1986. Analysis and Design of Stream Ciphers. Springer-Verlag.

1986 -- Ciarcia (U.S.A.)

Ciarcia, S. 1986. Build a Hardware Data Encryptor. Byte. September. 97-111.

"This easy-to-build device is extremely difficult to crack."

"The technique used here is to combine the output of two pseudorandom sequencers to produce one pseudorandom stream." "The length of the bit stream generated by the top four shift register [devices] in figure 3 is 2³¹ - 1 or 2,147,483,647. The length of the bit stream of the lower three shift register [devices] is 2²³ - 1 or 8,388,607."

This design is broken by Pearson.

1987 -- Mund, Gollmann and Beth (West Germany)

Mund, S., D. Gollmann and T. Beth. 1987. Some remarks on the cross correlation analysis of pseudo random generators. Advances in Cryptology -- EUROCRYPT '87. 25-35. Springer-Verlag.

"Siegenthaler has shown how cross-correlation techniques can be applied to identify pseudo random generators consisting of linear feedback shift registers and a scrambling function." "It is possible to speed up this attack by using the Walsh-Hadamard Transform to compute simultaneously the cross correlation between (cⁿ) and the outputs of all possible initial states of the given register."

"We show that there exists a trade-off between the dimension of the Hadamard matrix and the number of bits required to compute the cross correlation analysis."

1988 -- Guo-Zhen and Massey (China and Switzerland)

Massey himself weighed in with a way to test (some) combining functions for correlation:

Guo-Zhen, X. and J. Massey. 1988. A Spectral Characterization of Correlation-Immune Combining Functions. IEEE Transactions on Information Theory. 34(3): 569-571.

"Abstract -- It is shown that a Boolean combining function f(x) of n variables is mth-order correlation immune if and only if its Walsh transform F(w) vanishes for all w with Hamming weights between 1 and m, inclusive. ..."

"A binary [i.e., GF(2)-valued] random variable is said to be balanced if it is equally likely to take on the values 0 and 1. Siegenthaler [2] has defined the combining function f(x) to be mth-order correlation immune if the random variable Z = f(X₁, X₂, ..., X_m) is statistically independent of every set of m random variables chosen from the balanced and independent binary random variables X₁, X₂, ..., X_N."

"Theorem: The Boolean combining function f(x) for n binary variables is mth-order correlation immune, where 1 <= m <= n, if and only if its Walsh transform satisfies F(w) = 0, for 1 <= W(w) <= m."

(Here W(w) is the Hamming weight; that is, the number of 1's in the vector.) The result basically says that if we perform a Walsh transformation on the sequence produced by stepping through the combining function, we can tell how correlation immune that function is. (Unfortunately, this cannot hold for dynamic functions.)

1988 -- Meier and Staffelbach (Switzerland)

Meier and Staffelbach give us two actual algorithms ("A" and "B") for developing LFSR state behind the combiner.

Meier, W. and O. Staffelbach. 1988. Fast Correlation Attacks on Stream Ciphers. Advances in Cryptology -- EUROCRYPT '88. 301-314. Springer-Verlag.

"This leads to new design criteria for stream ciphers:"

"Any correlation to a LFSR with lest than 10 taps should be avoided."
"There should be no correlation to a general LFSR of length shorter than 100 (especially when the feedback connection is assumed to be known)."

"It is remarkable that the importance of the number of LFSR taps for the correlation analysis was not recognized in the cryptographic literature so far."

1988 -- Zeng and Huang (China)

Zeng, K. and M. Huang. 1988. On the Linear Syndrome Method in Cryptanalysis. Advance in Cryptology -- CRYPTO '88. 469-478.

"Suppose the cryptanalyst has at his hand a sufficiently long segment of the binary sequence

B = A + X,

"where A is a linear sequence with known feedback polynomial f(x) and x is a sequence with unknown or very complicated algebraic structure, but is sparse in the sense that, if we denote its signals by x(i), i > 0, then we shall have

s = prob( x(i) = 1 ) = 1/2 - e, 0 < e < 1/2 .

"We call s the error rate of the sequence A in the sequence B."

"One way for tackling this problem is to make use of the ideas of error correction . . . ."

1988 -- Pearson (U.S.A.)

Pearson, P. 1988. Cryptanalysis of the Ciarcia Circuit Cellar Data Encryptor. Cryptologia. 12(1): 1-10.

"This paper abstracts the encryption algorithm presented in [the Ciarcia] article, and recasts it in matrix notation for further analysis."

"The two LFSRs in this design are binary registers of length 31 bits and 23 bits, resulting in a total of 2 possible internal states for the keystream generator. This large number (2 = 1.8 x 10¹⁶, more impressively pronounced as "eighteen million billion") effectively eliminates the possibility of finding the key by exhaustive search."

"Here is how equation (2) would be used in a known-plaintext attack. First, 54 consecutive bits of the keystream must be calculated by XORing 54 bits of intercepted ciphertext with the corresponding 54 bits of plaintext. (For example, if the attacker knows that the transmission starts with 'Dear Sir' in 8-bit ASCII, he already has 64 plaintext bits -- 8 characters of 8 bits each.) Next, these 54 bits are arranged into a column vector K₁ and multiplied by D_-1 to yield A₀."

"Finally, this A₀ is loaded into a keystream generator, where it will generate first the 54 keystream bits from which it was calculated, then the keystream bits needed to decrypt the rest of the intercepted message."

"If the attacking cryptanalyst doesn't know the first 54 bits of plaintext, there are other avenues still open."

1989 -- Forre (Switzerland)

Where previously we were concerned with a nonlinear combining of multiple separate LFSR's, here Forre is concerned with attacking the nonlinear combining of multiple bits of a single LFSR:

Forre, R. 1989. A Fast Correlation Attack on Nonlinearly Feedforward Filtered Shift-Register Sequences. Advances in Cryptology -- EUROCRYPT '89. 586-595. Springer-Verlag.

In the process, Forre discusses (but does not explicitly detail) the original algorithm, and identifies situations where it may fail. The algorithm is then modified for the desired structure, and graphs indicate fairly extensive experimentation with it.

"These experiments showed that the success of the attack depends on the following factors:

The number of feedback tabs of the LFSR: the more taps there are, the more bits are involved in each linear relation and the less reliable is the assignment of probabilities . . . ."
The (absolute and relative) heights of the correlation peaks between the running-key sequence and the LFSR-sequence. Higher peaks are much easier detected by the algorithm than lower ones. . . ."

1990 -- Meier and Staffelbach (Switzerland)

Meier and Staffelbach return with a response to combiners with memory.

Meier, W. and O. Staffelbach. 1990. Correlation Properties of Combiners with Memory in Stream Ciphers. Advances in Cryptology -- EUROCRYPT '90. 204-213. Springer-Verlag.

By now it is known that any memoryless combiner has a correlation sum equal to one. They say: "the 'total' correlation is independent of the combining function." Then they go on to show a similar result of combining functions with memory (apparently a 1-bit memory).

". . . the summation cipher with two LFSRs can be successfully cryptanalyzed for LFSRs of considerable length with arbitrary feedback connection." "It is shown in [8] that a similar cryptanalysis is no longer possible for a summation cipher with more than 2 LFSRs."

1990 -- Mihaljevic and Golic (Yugoslavia)

Mihaljevic, M. and J. Golic. 1990. A Fast Iterative Algorithm for a Shift Register Initial State Reconstruction Given the Noisy Output Sequence. Advances in Cryptology -- AUSCRYPT '90. 165-175. Springer-Verlag.

"This problem of cryptanalysis can be regarded as the problem of a LFSR initial state reconstruction using the noisy output sequence . . . ."

"In this paper we consider a class of algorithms to which Algorithm B [from Meier-Staffelbach] belongs. In this class the initial state reconstruction is based on the error correction principle. It means that the procedure is iterative: in each step we first calculate the posterior probabilities, bit-by-bit (phase I), and them make a bit-by-bit decision (phase II)."

1990 -- Zeng, Yang and Rao (China and USA)

Zeng, K., C. Yang and T. Rao. 1990. An Improved Linear Syndrome Algorithm in Cryptanalysis with Applications. Advances in Cryptology -- CRYPTO '90. 34-47.

"What is given is a certain segment of a binary sequence of the form B = A + X, where A is a linear recursive sequence with known feedback polynomial f(x) and the sequence X is unknown but sparse in the sense that Prob( x(t) = 1 ) = s₀ < 1/2, s₀ being called the initial error rate of the sequence A in the sequence B."

"The method suggests to fix an integer r >= 3, find out a set of r-nomial multiples . . . of the feedback polynomial f(x), compute an odd number, say 2m + 1, of syndromes . . . and revise the signals b(i) to new signals b'(i) according to the rule of majority decision, namely, put b'(i) = NOT b(i) if at least m + 1 syndromes are 1, otherwise b'(i) = b(i)."

"The main idea in the improved LS algorithm is to make the revisions with a reducing number of syndromes, with the length of the segment under processing being reduced correspondingly."

The article goes on to give explicit mathematical algorithms for the method. The process is iterated -- repeated -- to improve the "error correction."

1990 -- Staffelbach and Meier (Switzerland)

Staffelbach, O. and W. Meier. 1990. Cryptographic Significance of the Carry for Ciphers Based on Integer Addition. Advances in Cryptology -- CRYPTO '90. 601-614.

"Integer addition has been proposed for use in cryptographic transformations since this operation is nonlinear when considered over GF(2)."

"In these ciphers nonlinearity or confusion is achieved via the carry. In fact if the carry happens to be zero, integer addition is linear when considered over GF(2) . . . ." "Therefore the strength of these ciphers heavily relies on the randomness of the carry. In particular it is required that the least significant bit (l.s.b.) of the carry is balanced or nearly balanced. However it may happen that this postulate is satisfied in the average, but is violated locally. In fact for the summation combiner with n = 2 inputs it has been shown in [4] that the carry is balanced in the average, but is strongly biased in runs of consecutive equal output digits."

"The aim of the present paper is to investigate the probability distribution of the carry for a summation combiner with an arbitrary number n of inputs." ". . . it is proved that the carry is balanced for even n and biased for odd n."

1990 -- Golic and Mihaljevic (Yugoslavia)

Golic, J. and M. Mihaljevic. 1990. Minimal Linear Equivalent Analysis of a Variable-Memory Binary Sequence Generator. IEEE Transactions on Information Theory. 36(1): 190-192.

"Geffe [1] proposed a nonlinear binary sequence generator (BSG) with two linear feedback shift registers (LFSR's) and a memory: one LFSR is used to load the memory from which a binary sequence is read out according to the addresses taken from the other LFSR . . . ." "A somewhat similar principle due to MacLaren and Marsaglia, called randomizing by shuffling, was described in [2] as a way of improving on the statistical properties of random or pseudorandom sequences."

"We consider a BSG consisting of three LFSR's and a memory . . . ." "First, the output bit b(t) is read out of the memory location addressed by the read address X(t) [from LFSR₂]. Second, the output bit a(t) of LFSR₁ is written into the memory location addressed by the write address Y(t) [from LFSR₃]. The BSG just described will be referred to as a MEM-BSG. It realizes a time-varying nonlinear function of the phase shifts of a maximum-length sequence."

"It is proved that the linear complexity and the period of output sequences of MEM-BSG are, respectively, at least equal to the linear complexity and the period of output sequences of the corresponding multiplexer-based nonlinear generator, due to Jennings [3] (the MUX-BSG), which consists of two LFSR's and a multiplexer. Moreover, the hardware implementation of the MEM-BSG usually is much simpler than that of the corresponding MUX-BSG."

"Of special interest for spread-spectrum communication systems are the so-called bent-function sequences [9], [14], [15], which possess asymptotically optimal correlation properties." ". . . both the MEM-BSG and the bent-function BSG are suitable for generating fast binary sequences of sufficiently high linear complexities . . . ."

1990 -- Ritter (U.S.A.)

In a little-noticed article, I took combiner design in the other direction: Although previous combiners were concerned only with combining confusion (e.g., LFSR) sequences, I was concerned with the data-confusion combiner, because this was the outermost line of defense. Although this required that the design be potentially reversible, a non-reversible version could be used to combine confusion.

Ritter, T. 1990. Substitution Cipher with Pseudo-Random Shuffling: The Dynamic Substitution Combiner. Cryptologia. 14(4): 289-303.

"ABSTRACT: A cipher mechanism or process which can be viewed as a modified substitution cipher. A translation table is used to replace plaintext symbols with ciphertext symbols; the modification consists of changing the contents of the translation table after each substitution. The dynamic translation table acts to confuse symbol frequency statistics and so frustrate the usual cryptanalytic attacks."

1991 -- Chepyzhov and Smeets (USSR / Sweden)

Chepyzhov, V. and B. Smeets. 1991. On A Fast Correlation Attack on Certain Stream Ciphers. Advances in Cryptology -- EUROCRYPT '91. 176-185. Springer-Verlag.

"Abstract--In this paper we present a new algorithm for the recovery of the initial state of a linear feedback shift register when a noisy output sequence is given. Our work is focussed on the investigation of the asymptotical behaviour of the recovery process rather than on the construction of an optimal recovery procedure."

"In the correlation attack as it was originally described by Siegenthaler [1], one uses an exhaustive search through the state space of the shift register. Such a search is not very realistic when the degree r (= length of the LFSR) of the feedback polynomial of the LFSR exceeds 60 . . . ." "Recently it was shown by Meier and Staffelbach [2] [ not reviewed here ] that in certain cases one can avoid this exhaustive search. In particular, they showed that if the number t of feedback taps is small, then it is possible to restore the initial state by an iterative procedure with much less complexity than exhaustive search."

"Our algorithm is using the key stream almost as efficiently as possible at the expense of an increase of the complexity of the first stage. Our algorithm that we use for the first stage is derived from efficient algorithms for finding the non-zero codeword of lowest weight in a linear code [4], [5]. The second stage of our algorithm is almost identical to Gallager's algorithm for the decoding of low-density parity-check codes [6]."

1991 -- Camion et. al. (France)

Here, Camion and others "establish the link between correlation-immune functions and orthogonal arrays."

Camion, P., C. Carlet, P. Charpin and N. Sendrier. 1991. On Correlation-immune functions. Advances in Cryptology -- CRYPTO '91. 86-100. Springer-Verlag.

"Definition 3.1 An M x m matrix V with entries from a set of q elements is called an orthogonal array of size M, with m constraints, q levels, strength k, and also index u if any set of k columns of V contains all q^k possible row vectors exactly u times. Such an array is denoted by (M,m,q,k). Clearly M = uq^k."

"Theorem 3.1 A boolean function f on G is correlation immune of order k if and only if its truth table is an orthogonal array (M,m,2,k)."

1991 -- Mihaljevic and Golic (Yugoslavia)>

Mihaljevic, M. and J. Golic. 1991. A Comparison of Cryptanalytic Principles Based on Iterative Error-Correction. Advances in Cryptology -- EUROCRYPT '91. 527-531. Springer-Verlag.

"ABSTRACT: A cryptanalytic problem of a linear feedback shift register initial state reconstruction using a noise output sequence is considered . . . ."

"The following three principles are considered:

"P.1: Error-correction is based on the number of satisfied parity checks.

"P.2: Error-correction is based on the estimation of the relevant posterior probabilities obtained by using the average posterior probability estimated in the previous iteration as the prior probability in the current iteration.

"P.3: Error-correction is based on the estimation of the relevant posterior probabilities obtained by using the posterior probabilities estimated in the previous iteration as the prior probabilities in the current iteration."

Experiments indicate that P.1 is most efficient, and P.3 is somewhat more capable.

1991 -- Zeng, Wang, Wei and Rao (U.S.A.)

Zeng, K., C. Yang, D. Wei and T. Rao. 1991. Pseudorandom Bit Generators in Stream-Cipher Cryptography. IEEE Computer. February. 8-17.

"The central problem in stream-cipher cryptography . . . is the difficulty of generating a long unpredictable sequence of binary signals from a short and random key." "The problem is this: On which basis can one draw the conclusion that the output signals of a certain given keystream generator are hard to predict? No universally applicable and practically checkable criteria have been developed to certify the security of bit generators. For that matter, no general theory of cryptanalysis is known to exist except for an ever-expanding arsenal of concrete attack methods elaborated for various practical purposes."

1992 -- Mihaljevic and Golic (Yugoslavia)

Mihaljevic, M. and J. Golic. 1992. Convergence of a Bayesian Iterative Error-Correction Procedure on a Noisy Shift Register. Advances in Cryptology -- EUROCRYPT '92. 124-137. Springer-Verlag.

"ABSTRACT: Convergence of an algorithm for a linear feedback shift register initial state reconstruction using the noisy output sequence, based on a bitwise Bayesian iterative error-correction procedure, and different weight parity-checks, is analyzed. ..."

"Many of the published keystream generators are based on binary linear feedback shift registers (LFSRs) combined by a memoryless function. Such a generator is called a combination generator."

"In this paper, we consider an iterative algorithm employing the parity-checks of different weights and Bayesian decision rule in error-correction for each bit, assuming that the error-rate from the previous iteration is used as the noise probability in the current one."

1994 -- MacKay (U.K.)

MacKay, D. 1994. A Free Energy Minimization Framework for Inference Problems in Modulo 2 Arithmetic. Fast Software Encryption. 179-195. Springer-Verlag.

"ABSTRACT. This paper studies the task of inferring a binary vector s given noisy observations of the binary vector t = As modulo 2, where A is a M x N binary matrix." "The unknown binary vector is replaced by a real vector of probabilities that are optimized by variational free energy minimization."

"Consider three binary vectors: s [signal] . . . and r [received] and n [noise] . . . related by:

(As + n) mod 2 = r

where A is a binary matrix. Our task is to infer s given r and A, and given assumptions about the statistical properties of s and n."

"One way to attack a discrete combinatorial problem is to create a related problem in which the discrete variables s are replaced by real variables, over which a continuous optimization can then be performed." "In the present context, the question is then 'how should one generalize the posterior probability (4) to the case where s is replaced by a vector with real components?'"

"The variational free energy minimization method (Feynman 1972) takes an 'awkward' probability distribution such as the one in (4), and attempts to approximate it by a simpler distribution. . . ."

[ MacKay also includes a detailed description of the algorithms in C-like pseudo-code. ]

1994 -- Mihaljevic (Yugoslavia)

Mihaljevic, M. 1994, A Correlation Attack on the Binary Sequence Generators with Time-Varying Output Function. Advances in Cryptology -- ASIACRYPT '94. 67-79. Springer-Verlag.

"Abstract: A binary sequence generator (BSG) consisting of three regularly clocked linear feedback shift registers combined by a time-varying memoryless function is cryptanalysed. A novel distance measure for the binary sequences comparison relevant for the cryptanalysis is proposed . . . ." "It is pointed out that the novel distance based approach to cryptanalysis could be applied for attacking the binary MacLaren-Marsaglia shuffler . . . ."

"The problem is to find out the conditions under which it is possible to reconstruct the initial contents of individual shift registers knowing a segment of the keystream sequence, based on the correlation / statistical dependence between the keystream sequence and a set of the shift register sequences."

"The main objective of this paper is cryptanalysis of a BSG presented in [Golic 1990] which is an extension of similar structures from [Geffe 1973] (the BSG consisting of two LFSR's and a variable memory), [MacLaren and Marsaglia 1968], [Knuth II]. This generator consists of three regularly clocked linear feedback shift registers (LFSR's) combined by a time-varying memoryless function."

1994 -- Menicocci (Italy)

Menicocci, R. 1994, Intrinsic weakness of variable-memory keystream generators. Electronics Letters. 30(11): 850-851.

"Introduction: The variable-memory binary sequence generator (MEM-BSG) [1] consists of three linear feedback shift registers (LFSRs) and a variable memory. Because of its convenience for generating fast sequences of large period and complexity [1], the MEM-BSG appears suitable for cryptographic applications. In this Letter we point out that there exists a correlation between the output sequence of the generator and the sequence generated by one of the registers. We also show why this correlation represents an intrinsic weakness of the MEM-BSG when used as a keystream generator."

1994 -- Golic (Yugoslavia)

Golic, J. 1994. Intrinsic Statistical Weakness of Keystream Generators. Advances in Cryptology -- ASIACRYPT '94. 91-103. Springer-Verlag.

"Abstract: It is shown that an arbitrary binary keystream generator with M bits of memory can be linearly modelled as a non-autonomous linear feedback shift register of length at most M with an additive input sequence of nonbalanced identically distributed binary random variables." "Linear models for clock-controlled shift registers and arbitrary shift register based keystream generators are derived. Several examples including the time-variant memoryless combiner, the basic summation generator, the stop-and-go cascade, and the shrinking generator are presented."

1995 -- Golic et. al. (Australia)

Golic, J., M. Salmansizadeh, A. Clark, A. Khodkar and E. Dawson. 1995. Discrete Optimization and Fast Correlation Attacks. Cryptography: Policy and Algorithms. 186-200. Springer-Verlag.

"Stream ciphers which generate pseudo-random sequences using the output of a number of linear feedback shift registers (LFSRs) combined by some nonlinear function, with or without memory, have long been proposed for use in secure communications. The purpose of nonlinear combiners is to produce a system which can withstand any practical cryptanalytic attack based on low linear complexity of the observed keystream sequence (see [13]) or high linear correlation to individual LFSR sequences (see [14] and [5])."

"This paper considers the immunity of these combiners to fast divide and conquer correlation attacks [9]. The problem is to find the conditions under which it is possible to reconstruct the initial contents of individual shift registers using a segment of the keystream generator output sequence. Correlation attacks are based on the statistical dependence between the observed keystream sequence and a set of shift register sequences [5], [14]. If such an attack outperforms an exhaustive search over the initial contents of the corresponding shift registers, it is then called a fast correlation attack."

1995 -- Klapper and Goresky (U.S.A.)

Klapper, A. and M. Goresky. 1995. Cryptanalysis Based on 2-Adic Rational Approximation. Advances in Cryptology -- CRYPTO '95. 262-273.

"The development of cryptosystems tends to alternate between the design of new systems that resist known attacks, and the design of new attacks against systems." "Occasionally a very general attack is found that can potentially be used against a large class of cryptosystems."

"This approach can be used to attack Massey and Rueppel's summation combiner [16, 19]. In their setup, the outputs from several short maximal period LFSRs . . . with pairwise relatively prime periods are combined using addition with carry."

"However, addition with carry is precisely addition in the 2-adic numbers." ". . . if we combine m-sequences of period 2ⁿ - 1 for n = 7, 11, 13, 15, 16, 17, then the resulting sequence has linear span nearly 2⁷⁹, but the 2-adic span is less than 2¹⁸. Thus 2¹⁹ bits suffice to determine this sequence -- and far fewer unless care is taken in the choice of m-sequences.

Terry Ritter, his current address, and his top page.

Last updated: 1996-08-15

The Story of Combiner Correlation: A Literature Survey

Research Comments from Ciphers By Ritter

Terry Ritter

Contents

1965 -- Before the Beginning

1972 -- State-of-the-Open-Art

1973 -- Geffe (U.S.A.)

1977 -- Pless (U.S.A.)

1979 -- Rubin (U.S.A.)

1984 -- Bruer (Sweden)

1984 -- Siegenthaler (Switzerland)

1984 -- Retter (U.S.A.)

1985 -- Siegenthaler (Switzerland)

1985 -- Rueppel (Switzerland)

1985 -- Siegenthaler (Switzerland)

1985 -- Retter (U.S.A.)

1985 -- Siegenthaler (Switzerland)

1986 -- Maurer (West Germany?)

1986 -- Rueppel (Switzerland)

1986 -- Ciarcia (U.S.A.)

1987 -- Mund, Gollmann and Beth (West Germany)

1988 -- Guo-Zhen and Massey (China and Switzerland)

1988 -- Meier and Staffelbach (Switzerland)

1988 -- Zeng and Huang (China)

1988 -- Pearson (U.S.A.)

1989 -- Forre (Switzerland)

1990 -- Meier and Staffelbach (Switzerland)

1990 -- Mihaljevic and Golic (Yugoslavia)

1990 -- Zeng, Yang and Rao (China and USA)

1990 -- Staffelbach and Meier (Switzerland)

1990 -- Golic and Mihaljevic (Yugoslavia)

1990 -- Ritter (U.S.A.)

1991 -- Chepyzhov and Smeets (USSR / Sweden)

1991 -- Camion et. al. (France)

1991 -- Mihaljevic and Golic (Yugoslavia)>

1991 -- Zeng, Wang, Wei and Rao (U.S.A.)

1992 -- Mihaljevic and Golic (Yugoslavia)

1994 -- MacKay (U.K.)

1994 -- Mihaljevic (Yugoslavia)

1994 -- Menicocci (Italy)

1994 -- Golic (Yugoslavia)

1995 -- Golic et. al. (Australia)

1995 -- Klapper and Goresky (U.S.A.)