Jail Cell RC4

Note: A completed paper regarding Jail Cell RC4 (including all the information below) is available in Microsoft Word document here.

A Java applet implementing the best known attack on Jail Cell RC4 can be viewed here.

Introduction

The purpose of this cipher is the fulfill the requirements for a cipher in the following scenario:

You are locked into a jail cell, and cannot take anything in with you. The jailer gives you paper and a pencil and instructs you to write a message to someone outside, which he will deliver for you. The message must be ready by dawn.
You must encrypt your message such that the intended recipient (who knows the key) will be able to read it, but the jailer cannot.

I am attempting to modify the RC4 cipher to be paper-and-pencil computable. I realize that others have tried to construct secure paper-and-pencil ciphers, but I would be very appreciative of any input you may have. In particular, if you are aware of (or can invent) any cryptanalytical attacks of practical use against this cipher, I would appreciate it if you would send me a description of those attacks.
The keystream generator is the same as standard RC4, but the permutation table is smaller (only 37 entries instead of 256) and the key-scheduling algorithm has been completely changed. A more detailed description and partial analysis follows.

Source code for a C++ program implementing the cipher described above is available here, although the goal is for the cipher to be paper-and-pencil-computable.

Generalized RC4

RC4 is usually defined such that the size of the S-box (N) is always a power of two (2ⁿ). Here it is defined such that N = m, the size of whatever alphabet is used, regardless of whether this is a power of two.
The internal state of the keystream generator for generalized RC4 consists of a permutation table of an alphabet, S[0..m-1] (where m is the size of the alphabet), and two counters, i and j. The counters are both initialized to zero, the initial state of the permutation table is the key. To generate one keystream letter k:

(1) i = i + 1 mod m (2) j = j + i + S[i] mod m (3) swap S[i] and S[j] (4) x = S[i] + S[j] mod m (5) k = S[x]

Each letter of the plaintext is shifted by the corresponding output letter of the keystream.

Security Considerations

This algorithm differs from standard RC4 only in two respects: (1) each keystream value is applied to a plaintext character as a shift, rather than using XOR, and (2) the size of the S-box is variable.
Change (1) is not important to security. Actually, any one-to-one function which maps a plaintext character to a ciphertext character using the key value would suffice, and would be equally secure (it might actually be more secure if the function was not known to the cryptanalyst, but this would violate Kirchoff's assumption).
Change (2), if the size of the S-box (and therefore the key) is made smaller (which is the case for Jail Cell RC4), attacks obviously become more computationally feasible, but it should not make the cipher subject to any new class of attacks. Existing attacks of which the author is aware become computationally infeasible when the alphabet size (m) reaches about 2^5 = 32 characters (S-box size = 32) unless a very large amount of encrypted text is available (along the order of several trillion characters). The proposed S-box size for Jail Cell RC4 is 37 (allowing for characters A-Z, 0-9, and a period).
S-box size might be enlarged to a prime number in the vicinity of 64 by differentiating between capital and lowercase letters (and other minor considerations) if this becomes necessary to security, but this will substantially increase the amount of paper necessary, and more importantly, increase the time necessary to initialize the S-box using the key scheduling algorithm outlined below.

Key Scheduling

The S-box permutation is derived from an array of key values K, where each value is a nonzero alphabetical character or its numerical equivalent. For m = 37, K needs to contain at least 13 values to achieve 64 bits of actual key information.
The first key value K[0] is taken to be the first alphabetical character to be inserted into the S-box. It is inserted into the slot of the S-box designated by the next key value K[1].
Each successive key value indicates the position relative to the last insertion (mod m) where the next character should be inserted. For example, if the most recent character A is inserted into slot 7 (see example at below), and the next key value is 12, then B is inserted into slot (7 + 12 mod 37) = 19. If this slot is occupied (for example, if X is already in slot 19), then the position is displaced by the same key value again (19 + 12 mod 37 = slot 31). If m is prime, and key values of 0 are disallowed, then all key values will be relatively prime to m and therefore all table slots will eventually be probed if no open slot is found.

. . . 6 7 8 . . . 18 19 20 . . . 30 31 32

A X B

----------> O ----------> ^

When the end of the key is reached, key values are reused, starting again with K[0] (but using it as a displacement this time). Note that because zeroes cannot be used in the key, zero cannot be the first character placed, nor can the first character be placed in S[0]. This means that zero will appear in S[0] with 1/36 probability instead of 1/37. This could be avoided if K[0] and K[1] were allowed to be zero, but if this is the case, these key values cannot be reused, since zero is not relatively prime to m = 37.
Because RC4 is a stream cipher and the same S-box configuration cannot be used to send more than one message, K needs to be modified for each message. The modificaiton proceeds as follows: K[0] is incremented by 1 (mod m), and K[last] is rotated to the front of the array (with all other values being moved down the list one slot). This obviously does not rotate through all possible keys, but it does rotate through (m - 1) * (the size of K), which for proposed Jail Cell RC4 is at least (36 * 13) = 468 keys, probably enough for most realistic paper-and-pencil applications.
For a key of length l, the (l - 1)th output and later are guaranteed to be influenced by every word of the key, because the i counter forced the generated to utilize at least x values from the S-box in the first x outputs, and no value is inserted by the first key value (that value determines the first value to be inserted). Therefore, if the first (l - 2) outputs are discarded, the cipher should not be vulnerable to partial-key attacks.
However, it should be noted that if an attacker can recover the full state of the S-box, reconstruction of the original key is trivial, and so all other messages sent by rotating the key as described above will then be readable by the attacker.

Estimated Time for Key Scheduling Algorithm

According to [1], the accuracy of mentally computing a single-digit simple addition or subtraction becomes asymptotic with respect to time at about one second. This is used as an estimate for the time to compute such an operation.
A two-digit modular addition includes two single-digit simple additions, possibly a carry (about half the time), and possibly a two-digit simple subtraction (about half the time). Such an operation is therefore estimated to take, on average, about five seconds to compute.
Looking up a value in a data table or recording a value on paper is estimated to take approximately one second.

Operation Estimated Time

one-digit simple addition 1 second

two-digit modular addition 5 seconds

table look-up 1 second

data recording 1 second

[1] Paul Verhaeghen, Reinhold Kliegl, Ulrich Mayr; "Sequential and coordinative complexity in time-accuracy functions for mental arithmetic," Psychology & Aging Vol. 12(4), Dec., U.S.: American Psychological Assn. 1997, 555-564

Best Case Analysis

If no collisions, then placing each character requires a two-digit modular addition to shift from last value placed, a table look-up to confirm that slot is empty, and a data entry to place value into table.
First placement does not require the addition, because it is added to zero (the identity). Last placement does not require the addition either, because the value is simply placed in the last available slot.

Worst Case Analysis

Each collision during key set-up causes one additional two-digit modular addition and one additional table look-up to be required. If all key values are independently random (which is not quite the case, but approximately so, and this dramatically simplifies the analysis), then the maximum number of collisions for the ith placement is i - 2 (since it cannot collide with itself, the value just placed, or a value that has yet to be placed).
No collisions occur on the last character placed, because it is simply placed in the last open slot. The maximum number of collisions is therefore:

The worst case scenario in practice is not quite this bad, because each placement is not independent of the others. For example, it is not possible to experience the maximum number of collisions for a placement on two consecutive placements. However, the reuse of key values also subjects the placements to a very mild form of secondary clustering, which may partially negate such effects. However, in any case, the worst case scenario cannot be any worse than this, and may be significantly better.

Average-Case Analysis

To estimate average-case collisions, each insertion is modeled as being independent and it is assumed that key values are not reused.
No collisions ever occur on placements #1, #2, or #37.
On placement #3, a maximum of 1 collision is possible, and occurs with probability 1/36.
The average number of collisions for this placement is 1/36.
On placement #4, a maximum of 2 collisions are possible. At least one collision occurs with probability 2/36, and given that at least one collision occurs, the second occurs with probability 1/35.
The average number of collisions for this placement is (2/36) + [(2/36) x (1/35)] = 2/35.
On placement #5, a maximum of 3 collisions are possible. At least one collision occurs with probability 3/36. Given at least one collision occurs, a second occurs with probability 2/35. Given that two occur, the third occurs with probability 1/34.
Average collisions: (3/36) + [(3/36) x (2/35)] + [(3/36) x (2/35) x (1/34)]
= (3/36) x (1 + (2/35) x (1 + (1/34)))
= 3/34
Exhaustive computation reveals that the average number of collisions for a single placement is always equal to the maximum number of collisions (i - 2) divided by 37 minus the maximum number of collisions.
Which indicates that the average number of total collisions in the key set-up is:

Case Operations Estimated Time

Best Case 35 two-digit modular additions
37 table look-ups
37 data entries 249 sec
(4.15 min)

Worst Case 630 two-digit modular additions
632 table look-ups
37 data entries 3,819 sec
(63.65 min)

Average Case 100 two-digit modular additions
102 table look-ups
37 data entries 639 sec
(10.65 min)

While only a very determined cryptographer would be likely to compute the worst-case key by hand, the average case does not seem unreasonable.
If a computer is available when determining the initial key, one might be strategically chosen to limit the number of collisions and reduce computation time. Alternatively, the sender and receiver could agree that if any key set-up is found to entail more than a certain number of collisions, set-up is aborted and the next key in the rotation is used.

Error Creation and Propagation

Since the cipher relies on human computation, there is likely to be a non-negligible rate of introduction of errors into the cipher.
There are three basic types of errors possible when computing RC4:
The first is if either the i counter or the j counter is set to an erroneous value. Because both counters are set each round based, at least in part, upon their values from the previous round, these errors are propagated throughout the rest of the keystream. Such an error causes the entire keystream from that point onward to become invalid.
The second occurs when one or more values in the permutation table become inaccurate. On any round where the i counter points to an erroneous value, the error is transmitted to the value of the j counter, creating an error of the first type, and invalidating the entire keystream from that point onward.
On any round where the j counter points to an erroneous value, that word of the output will be incorrect. Additionally, the incorrect value will be moved to the cell just probed by the i counter, insuring that it will not be pointed to by i (and propagated throughout the rest of the keystream) for at least N rounds.
The third type of error occurs when the output for the current round is miscalculated, or the plaintext character being encrypted is inadvertently shifted by a value other than the output (the intended shift). This type of error corrupts only that letter of the message, and cannot propagate.
Because line (1) only involves a simple increment, it is unlikely that any errors will occur there. However, if an error did occur on line (1), this would create an error of the first type and corrupt the keystream from that point forward.
If an error occurred on line (2) while setting the j counter, this would also create an error of the first type and corrupt the keystream from that point forward.
An error on line (3) would create an error of the second type, causing one or two improper values to be entered into the S-box. This would corrupt the current output, and might corrupt additional outputs later on (see above).
An error on lines (4) or (5), or while applying the keystream word to the plaintext as a shift, would result in an error of the third type, corrupting only that character of the message.
If an error is made while computing the key scheduling algorithm, an insertion may be made in the wrong position, creating an error of the second type. Because each insertion is also used as the starting point for the next insertion, barring a second error that happens

. . .	6	7	8	. . .	18	19	20	. . .	30	31	32
. . .		A		. . .		X		. . .		B
		---------->				O	---------->			^

Operation	Estimated Time
one-digit simple addition	1 second
two-digit modular addition	5 seconds
table look-up	1 second
data recording	1 second

Case	Operations	Estimated Time
Best Case	35 two-digit modular additions 37 table look-ups 37 data entries	249 sec (4.15 min)
Worst Case	630 two-digit modular additions 632 table look-ups 37 data entries	3,819 sec (63.65 min)
Average Case	100 two-digit modular additions 102 table look-ups 37 data entries	639 sec (10.65 min)