RC4 Stream Cipher Explained: Algorithm and Steps

RC4 Stream Cipher Explained: Algorithm and Steps
RC4 (Rivest Cipher 4) is one of the most well-known stream ciphers in the history of cryptography. Although it is deprecated today, RC4 is still widely taught because it clearly demonstrates how stream ciphers, keystream generation, and XOR-based encryption work.
This article explains RC4 from the ground up, including:
- What RC4 is
- Its internal algorithms (KSA & PRGA)
- Step-by-step working
What Is RC4?
RC4 is a symmetric-key stream cipher designed by Ron Rivest in 1987.
Key Characteristics:
- Stream cipher → encrypts data byte-by-byte
- Symmetric → same key for encryption and decryption
- Uses XOR to combine plaintext with a keystream
Core Formula
Encryption:
C_i = P_i ⊕ K_i
Decryption:
P_i = C_i ⊕ K_i
Where:
- P_i = plaintext byte
- K_i = keystream byte
- C_i = ciphertext byte
Why XOR for Decryption?
Because XOR is reversible: (A ⊕ B) ⊕ B = A
- Encryption:
C = P ⊕ K - Decryption:
P = C ⊕ K = (P ⊕ K) ⊕ K = P
High-Level Idea of RC4
RC4 works in two main phases:
1. KSA (Key Scheduling Algorithm) → Initializes and scrambles an internal array using the key
2. PRGA (Pseudo-Random Generation Algorithm) → Generates a keystream byte-by-byte
That keystream is XORed with plaintext to encrypt (or decrypt).
Process Flow:
Secret Key → KSA → Scrambled S Array → PRGA → Keystream → XOR with Plaintext → Ciphertext
Internal Data Structures
RC4 uses:
S array: A permutation of numbers 0–255 (256 bytes)
- Initially:
S[0] = 0, S[1] = 1, ..., S[255] = 255 - After KSA: Permuted based on the key
Key: Variable length (typically 40–256 bits)
Two indices: i and j
- Used to select and swap elements in S array
Step 1: Key Scheduling Algorithm (KSA)
Purpose
To initialize and shuffle the array S[0…255] based on the secret key.
Steps
- Initialize
S[i] = ifori = 0…255 - Use the key to permute S through swapping operations
Pseudocode (KSA)
for i = 0 to 255:
S[i] = i
j = 0
for i = 0 to 255:
j = (j + S[i] + key[i mod key_length]) mod 256
swap(S[i], S[j])
How KSA Works
Initialization:
- Set
Sto identity permutation:[0, 1, 2, ..., 255]
Permutation:
- For each index
ifrom 0 to 255:- Calculate
jusing currentS[i]and key byte - Swap
S[i]andS[j] - This shuffles the array in a key-dependent way
- Calculate
After KSA:
Sbecomes a key-dependent permutation- This array is the internal state of RC4
- Same key always produces the same initial
Sstate
Step 2: Pseudo-Random Generation Algorithm (PRGA)
Purpose
To generate the keystream, one byte at a time.
Steps
- Update indices
iandj - Swap elements in
S - Output one keystream byte
Pseudocode (PRGA)
i = 0
j = 0
while (data to encrypt):
i = (i + 1) mod 256
j = (j + S[i]) mod 256
swap(S[i], S[j])
t = (S[i] + S[j]) mod 256
K = S[t]
output K
How PRGA Works
For each byte needed:
- Increment
i:i = (i + 1) mod 256 - Update
j:j = (j + S[i]) mod 256 - Swap: Exchange
S[i]andS[j] - Compute index:
t = (S[i] + S[j]) mod 256 - Output keystream byte:
K = S[t]
Each iteration produces one keystream byte.
Important: The state array S is modified during PRGA, so each keystream byte depends on the previous state.
What Are We Ensuring? (Key Guarantees)
Every keystream byte depends on the entire past state:
iincreases sequentially — Simple counter that cycles through S arrayjaccumulates values from S — Uses current S state to determine next jSis continuously mutated via swaps — State changes with each iteration
📌 Guarantee:
The current keystream byte depends on all previous swaps, not just the key.
What This Means:
-
First keystream byte depends on:
- Initial S state (from KSA)
- First swap in PRGA
-
Second keystream byte depends on:
- Initial S state (from KSA)
- First swap
- Second swap
-
N-th keystream byte depends on:
- Initial S state (from KSA)
- All previous N swaps in PRGA
This prevents simple prediction.
- ❌ You cannot predict the next keystream byte without knowing the entire current state
- ❌ You cannot compute keystream bytes independently
- ❌ Each byte requires all previous computation
- ✅ Creates pseudo-random appearance
- ✅ Makes cryptanalysis more difficult
Why Generate a Keystream? (Key vs. Keystream)
What happens if we use the key directly? (BAD)
Problem: Key Repetition
If we simply repeat the key as the keystream:
Plaintext = [10, 20, 30, 40, 50, 60]
Key = [3, 1, 4]
Key repeated as keystream:
Keystream = [3, 1, 4, 3, 1, 4] ❌ Pattern visible!
Issues:
- ⚠️ Visible repetition in the keystream
- ⚠️ Short key reused multiple times
- ⚠️ Patterns make cryptanalysis easier
- ⚠️ Vulnerable to known-plaintext attacks
Now: Generate a keystream from the key (GOOD)
Instead of repeating the key, we use an algorithm:
Key → Algorithm (KSA + PRGA) → Keystream
Example:
Same key
Key = [3, 1, 4]
Generated keystream (example output)
Keystream = [91, 203, 17, 88, 240, 61] ✅ No visible repetition
Benefits:
- ✅ No visible repetition — appears random
- ✅ Long keystream from short key
- ✅ Appears unpredictable (statistically)
Critical Warning:
- Same key produces the same keystream
- If the same key is used for multiple messages, the keystream repeats
- This breaks security:
C₁ ⊕ C₂ = P₁ ⊕ P₂(key cancels out) - Always use unique keys or initialization vectors (IVs)
Step 3: Encryption
Encryption is simple:
Ciphertext = Plaintext XOR Keystream
RC4 Pseudocode (All-in-One)
Complete RC4 Algorithm:
RC4(key K, plaintext P): //Key = [1, 2, 3], Plaintext = [10, 20, 30]
// ---------- KSA ----------
for i = 0 to 255:
S[i] = i
j = 0
for i = 0 to 255:
j = (j + S[i] + K[i mod keylen]) mod 256
swap(S[i], S[j])
// ---------- PRGA + Encryption ----------
i = 0
j = 0
for n = 0 to len(P)-1:
i = (i + 1) mod 256
j = (j + S[i]) mod 256
swap(S[i], S[j])
t = (S[i] + S[j]) mod 256
k = S[t]
C[n] = P[n] XOR k
return C //Ciphertext = [11, 23, 28] for keystream = [1, 3, 2]
Conclusion
RC4 Summary:
- ✅ RC4 uses KSA to permute the S array based on the key
- ✅ PRGA generates keystream bytes one at a time
- ✅ Encryption & decryption use XOR with keystream
- ✅ Same key → same keystream → same result
- ⚠️ Deprecated due to security vulnerabilities
While RC4 is no longer secure, studying it provides valuable insights into cryptographic design and the evolution of encryption standards. Modern alternatives like ChaCha20 and AES incorporate lessons learned from RC4's weaknesses, making them significantly more secure for real-world applications.