Part II: Coding Theory
From Shannon's Limits to Practical Codes
Part I established fundamental limits: the source coding theorem bounded compression, and the channel capacity theorem set the ceiling for reliable communication. Part II asks: how do we actually achieve these limits? The answer lies in coding theory—the design of efficient compression schemes and error-correcting codes that approach Shannon's ideals in practice.
We begin with source coding algorithms—Huffman and arithmetic codes—which construct variable-length prefix-free codes that provably compress data close to the entropy bound. We then study error-correcting codes, from the elegant Hamming(7,4) code to modern Reed-Solomon constructions. Finally, we examine turbo codes and LDPC codes, the breakthrough designs that brought us within a fraction of a decibel of the Shannon limit.
Key Question of Coding Theory
Given a channel with capacity \( C \) and a source with entropy \( H \), can we design codes that transmit reliably at rate \( R < C \) while compressing to length approaching \( H \)? Shannon said yes — this part shows how.
Chapters in This Part
Chapter 4: Huffman & Arithmetic Codes
Optimal prefix-free codes, the greedy Huffman algorithm, and arithmetic coding via interval subdivision
Chapter 5: Error-Correcting Codes
Hamming distance, Hamming(7,4) code, parity-check matrices, syndrome decoding, and the Gilbert-Varshamov bound
Chapter 6: Turbo & LDPC Codes
Parallel concatenation, iterative decoding, Tanner graphs, belief propagation, and polar codes approaching the Shannon limit
The Coding Theory Landscape
Source Coding (Compression)
Goal: represent a source \( X \) using as few bits as possible. Shannon's source coding theorem lower-bounds the minimum expected codeword length \( L^* \) (in bits per symbol):
\( L^* \geq H(X) \)
Huffman codes achieve \( H(X) \leq L < H(X) + 1 \) bits per symbol; arithmetic coding applied to long blocks approaches \( H(X) \) arbitrarily closely.
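As a concrete preview of Chapter 4, here is a minimal sketch (not from the text) of the greedy Huffman construction: repeatedly merge the two least probable subtrees, prepending a bit to every codeword inside. For a dyadic source, codeword lengths equal \( -\log_2 p \) and the average length \( L \) meets the entropy bound exactly.

```python
import heapq
from math import log2

def huffman_code(probs):
    """Build a Huffman code for a {symbol: probability} map.

    Greedy algorithm: repeatedly pop the two least probable subtrees,
    merge them, and prepend '0'/'1' to the codewords on each side.
    """
    # Heap entries: (probability, unique tiebreak, {symbol: partial codeword})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

# A dyadic source: probabilities are powers of 1/2, so L == H(X).
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(probs)
H = -sum(p * log2(p) for p in probs.values())          # entropy: 1.75 bits
L = sum(probs[s] * len(w) for s, w in code.items())    # average length: 1.75 bits
print(code)
print(H, L)  # both 1.75
```

For non-dyadic probabilities, \( L \) lands strictly inside the \( [H(X), H(X)+1) \) window from the bound above.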
Channel Coding (Error Correction)
Goal: transmit reliably over a noisy channel of capacity \( C \). The noisy-channel coding theorem guarantees reliable transmission for any rate \( R < C \).
\( P_e \to 0 \text{ as } n \to \infty, \; R < C \)
Turbo and LDPC codes achieve this at SNR within 0.1 dB of the Shannon limit.
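The statement \( P_e \to 0 \) can be made tangible with the simplest possible code, an \( n \)-fold repetition code over a binary symmetric channel with majority-vote decoding. This toy sketch (my illustration, not one of the capacity-approaching codes discussed in this part) shows the error probability collapsing as \( n \) grows; note that its rate \( 1/n \) also collapses, which is exactly the deficiency turbo and LDPC codes overcome by keeping \( R \) near \( C \).

```python
from math import comb

def repetition_error_prob(n, p):
    """P(majority decoding fails) for an n-fold repetition code on a
    BSC with crossover probability p: more than n/2 bits get flipped."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

p = 0.1  # 10% crossover probability
for n in (1, 3, 5, 11, 21):  # odd n so majority vote is unambiguous
    print(n, repetition_error_prob(n, p))
```

For \( n = 3 \) the decoder fails only if 2 or 3 bits flip: \( 3p^2(1-p) + p^3 = 0.028 \), already well below the raw \( p = 0.1 \).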
Historical Timeline
Prerequisites & Notation
From Part I
- Shannon entropy \( H(X) = -\sum_x p(x)\log_2 p(x) \)
- Source coding theorem and typical sequences
- Mutual information \( I(X;Y) \)
- Channel capacity \( C = \max_{p(x)} I(X;Y) \)
New in Part II
- Prefix-free (instantaneous) codes and Kraft inequality
- Generator matrix \( G \) and parity-check matrix \( H \)
- Hamming distance \( d(x,y) = \mathrm{wt}(x \oplus y) \)
- Finite fields \( \mathbb{F}_q \) and polynomial codes
- Factor graphs, sum-product algorithm
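Two of the notational items above, Hamming distance as the weight of an XOR, and the Kraft inequality for prefix-free codes, can be checked in a few lines. A minimal sketch (helper names are my own):

```python
def hamming_distance(x, y):
    """d(x, y) = wt(x XOR y) for equal-length bit sequences."""
    return sum(a ^ b for a, b in zip(x, y))

def kraft_sum(lengths):
    """Kraft inequality: a prefix-free binary code with codeword
    lengths l_i exists iff sum_i 2^{-l_i} <= 1."""
    return sum(2.0 ** -l for l in lengths)

print(hamming_distance((1, 0, 1, 1), (0, 0, 1, 0)))  # 2 (bits differ in positions 1 and 4)
print(kraft_sum([1, 2, 3, 3]))  # 1.0  -> exists, and the code is complete
print(kraft_sum([1, 1, 2]))     # 1.25 -> no prefix-free code with these lengths
```

The lengths \( \{1, 2, 3, 3\} \) saturating the Kraft sum at 1 are exactly the Huffman codeword lengths for a dyadic source such as \( \{1/2, 1/4, 1/8, 1/8\} \).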