Information Theory and Coding ‒ IPG ‐ EPFL

Instructor	Emre Telatar
Office	INR 117
Email	[email protected]
Office Hours	By appointment


Teaching Assistant	Adway Girish
Office	INR 038
Email	[email protected]
Office Hours	By appointment

Lectures	Monday	11h15 – 13h00 (Room: BC 01)
	Tuesday	13h15 – 15h00 (Room: ELA 2)
Exercises	Tuesday	15h15 – 17h00 (Room: ELA 2)

Language:		English
Credits :		8 ECTS

Course Information

See the course information.

Moodle link

Announcements

The final exam will be held on Thursday, January 25th 2024 in INJ 218 at 09h15.
The midterm exam will be held on Tuesday, October 31st in ELA 2 at 13h15.
The first lecture will be held on Tuesday, September 19th in ELA 2 at 13h15.
The weekly homeworks are not graded. The graded homework will be announced explicitly.

Date	Topics Covered	Homework	Solutions	Remarks/ Extra material
Sep 18				Public Holiday
Sep 19	Source coding / Data compression: – injective, uniquely decodable, prefix-free (binary) codes – Kraft sum, Kraft inequalities	HW 1	Soln 1
Sep 25	– (Partial) converse to Kraft inequality – Properties of log (and ln) – Expected codeword length: lower and upper bounds, asymptotic per-letter tightness	HW 2
Sep 26	– “Optimal” prefix-free code design: Huffman algorithm – Entropy and friends (joint, conditional)		Soln 2
Oct 02	– KL divergence, (conditional) mutual information – Inequalities and properties of information measures – Data processing inequality	HW 3
Oct 03	– “Distributions, source codes, regret, divergence” – Entropy rate		Soln 3
Oct 09	– Data compression theorem for stationary sources – ε-typicality – Properties of typical sequences and sets	HW 4
Oct 10	– Source coding via typicality – Joint typicality – Motivating (toy) example for universal data compression		Soln 4
Oct 16	– Lempel-Ziv algorithm – Finite-state, information-lossless machines (competitors to LZ) – Lower bound to lengths of k-distinct words	HW 5		LZ notes
Oct 17	– Proof that LZ beats FSILMs – Are FSILMs powerful? – Modifying LZ to deal with finite sequences – Connections to Learning Theory		Soln 5
Oct 23	– “Memoryful”, memoryless, stationary channels – Communication with feedback – Capacity of a stationary, memoryless channel – Examples: BSC, BEC	HW 6
Oct 24	– Fano’s inequality: relating conditional entropy to probability of error – Fano’s inequality (bis): multi-letter – Channel coding converse theorem (bad news) – Definitions: encoder/decoder pair, rate, error probabilities – Channel coding achievability theorem (good news; proof upcoming) – Source-channel separation		Soln 6
Oct 30	– How to maximize mutual information (as in the definition of capacity) – KKT for general smooth functions (necessary conditions) – KKT for smooth, concave functions (necessary and sufficient conditions) – Examples using KKT – KKT for mutual information maximization – Connection to min-max regret as seen in source coding
Oct 31	Midterm exam			Midterm Solutions
Nov 06	– (Goal) proof of “Good News” Thm (maximum error probability) – Expurgation to get sufficient Thm’ (average error probability) – Non-constructive, probabilistic method to show existence – Proof of Thm’ by the probabilistic method – Alternate decoder: Typicality decoder (vs. maximum likelihood score)	HW 7
Nov 07	– Definitions associated with the “coding theorem” – Channels with cost – Aside on random coding and sphere packing exponents – Coding theorem for channels with cost – Another example of “proof by random construction” — Multiple-Access Channel		Soln 7
Nov 13	– Differential entropy for real, vector-valued random variables – Properties of differential entropy – Relationship between H(.) and h(.) – (Differential) mutual information – Differential entropy of Gaussian vectors – Mutual information of Gaussian channel with Gaussian input	HW 8
Nov 14	– Extremality of h(.) under different conditions – Additive Gaussian noise channel with power constraint – Capacity of additive Gaussian channel with power constraint – Saddlepoint theorem		Soln 8
Nov 20	– Parallel Gaussian channels – Water-filling solution – Hamming distance, minimum distance of a code	HW 9		Coding notes
Nov 21	– Sphere-packing bound – Hamming code (example of a perfect code) – Singleton bound – Gilbert bound		Soln 9
Nov 27	– Proof of Gilbert-Varshamov bound – Linear codes – Reminder: fields, vector spaces over fields – Generator and parity-check matrices – Linear codes achieve BSC capacity	HW 10
Nov 28	– Reed-Solomon codes		Soln 10
Dec 4	– Polar codes – Construction of synthetic channels – BEC example	HW 11		Graded HW
Dec 5	– Complexity of encoding and decoding – ε-good and ε-bad channels – First attempt at error probability		Soln 11
Dec 11	– ε-noble and ε-lucky channels – Refined (successful) attempt at error probability – Rate-distortion setup – Bad news (converse) theorem	HW 12
Dec 12	– Good news theorem and proof – Rate-distortion function – Example: Binary, Hamming distance		Soln 12
Dec 18	– Back to Multiple-Access Channel (first introduced on 7/11/2023) – Convex hull for good news theorem – Bad news theorem and proof	HW 13		Graded HW Soln
Dec 19	– Completing proof of converse (convex hull) – Maximal vs. average error probability criterion – Unsolved problems: Broadcast channel, Interference channel – Secrecy (solved)
Jan 25	Final Exam			Final Solutions

Textbook

T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley 2006.

Additional Reading Material

C. E. Shannon, A Mathematical Theory of Communications, Bell System Technical Journal, 1948.

To Review Basic Probability Theory

K. L. Chung, A Course in Probability Theory, Academic Press.
W. Feller, An Introduction to Probability Theory and Its Applications (vol. 1), Wiley.
G. Grimmett and D. Stirzaker, Probability and Random Processes, Oxford University Press.
A. Papoulis, Probability, Random Variables, and Stochastic Processes, McGraw-Hill.
S. M. Ross, A First Course in Probability, Pearson Education.