Information Theory and Coding

Instructor Emre Telatar
Office INR 117
Email [email protected]
Office Hours By appointment
Teaching Assistant Adway Girish
Office INR 038
Email [email protected]
Office Hours By appointment
Lectures Monday 11h15 – 13h00 (Room: BC 01)
  Tuesday 13h15 – 15h00 (Room: ELA 2)
Exercises Tuesday 15h15 – 17h00 (Room: ELA 2)
Language:   English
Credits :   8 ECTS

Course Information

See the course information.

Moodle link


  • The final exam will be held on Thursday, January 25th 2024 in INJ 218 at 09h15.
  • The midterm exam will be held on Tuesday, October 31st in ELA 2 at 13h15.
  • The first lecture will be held on Tuesday, September 19th in ELA 2 at 13h15.
  • The weekly homeworks are not graded. The graded homework will be announced explicitly.
  Topics Covered     Homework Solutions Remarks/
Extra material
Sep 18             Public Holiday  
Sep 19  

 Source coding / Data compression:
– injective, uniquely decodable, prefix-free (binary) codes
– Kraft sum, Kraft inequalities

    HW 1 Soln 1    
Sep 25  

– (Partial) converse to Kraft inequality
– Properties of log (and ln)
– Expected codeword length: lower and upper bounds, asymptotic per-letter tightness

    HW 2      
Sep 26  

– “Optimal” prefix-free code design: Huffman algorithm
– Entropy and friends (joint, conditional)

      Soln 2    
Oct 02  

– KL divergence, (conditional) mutual information
– Inequalities and properties of information measures
– Data processing inequality


HW 3

Oct 03  

– “Distributions, source codes, regret, divergence”
– Entropy rate

      Soln 3    
Oct 09  

– Data compression theorem for stationary sources
– ε-typicality
– Properties of typical sequences and sets


HW 4

Oct 10  

– Source coding via typicality
– Joint typicality
– Motivating (toy) example for universal data compression

      Soln 4    
Oct 16  

– Lempel-Ziv algorithm
– Finite-state, information-lossless machines (competitors to LZ)
– Lower bound to lengths of k-distinct words


HW 5

  LZ notes  
Oct 17  

– Proof that LZ beats FSILMs
– Are FSILMs powerful?
– Modifying LZ to deal with finite sequences
– Connections to Learning Theory

      Soln 5    
Oct 23  

– “Memoryful”, memoryless, stationary channels
– Communication with feedback
– Capacity of a stationary, memoryless channel
– Examples: BSC, BEC


HW 6

Oct 24  

– Fano’s inequality: relating conditional entropy to probability of error
– Fano’s inequality (bis): multi-letter
– Channel coding converse theorem (bad news)
– Definitions: encoder/decoder pair, rate, error probabilities
– Channel coding achievability theorem (good news; proof upcoming)
– Source-channel separation

      Soln 6    
Oct 30  

– How to maximize mutual information (as in the definition of capacity)
– KKT for general smooth functions (necessary conditions)
– KKT for smooth, concave functions (necessary and sufficient conditions)
– Examples using KKT
– KKT for mutual information maximization
– Connection to min-max regret as seen in source coding



Oct 31  

Midterm exam




Nov 06  

– (Goal) proof of “Good News” Thm (maximum error probability)
– Expurgation to get sufficient Thm’ (average error probability)
– Non-constructive, probabilistic method to show existence
– Proof of Thm’ by the probabilistic method
– Alternate decoder: Typicality decoder (vs. maximum likelihood score)


HW 7

Nov 07  

– Definitions associated with the “coding theorem”
– Channels with cost
– Aside on random coding and sphere packing exponents
– Coding theorem for channels with cost
– Another example of “proof by random construction” — Multiple-Access Channel

      Soln 7    
Nov 13  

– Differential entropy for real, vector-valued random variables
– Properties of differential entropy
– Relationship between H(.) and h(.)
– (Differential) mutual information
– Differential entropy of Gaussian vectors
– Mutual information of Gaussian channel with Gaussian input


HW 8

Nov 14  

– Extremality of h(.) under different conditions
– Additive Gaussian noise channel with power constraint
– Capacity of additive Gaussian channel with power constraint
– Saddlepoint theorem

      Soln 8    
Nov 20  

– Parallel Gaussian channels
– Water-filling solution
– Hamming distance, minimum distance of a code


HW 9


Coding notes

Nov 21  

– Sphere-packing bound
– Hamming code (example of a perfect code)
– Singleton bound​​​
– Gilbert bound

      Soln 9    
Nov 27  

– Proof of Gilbert-Varshamov bound
– Linear codes
– Reminder: fields, vector spaces over fields
– Generator and parity-check matrices
– Linear codes achieve BSC capacity


HW 10



Nov 28  

– Reed-Solomon codes

      Soln 10    
Dec 4  

– Polar codes
– Construction of synthetic channels
– BEC example


HW 11


Graded HW

Dec 5  

– Complexity of encoding and decoding
– ε-good and ε-bad channels
– First attempt at error probability

      Soln 11    
Dec 11  

– ε-noble and ε-lucky channels
– Refined (successful) attempt at error probability
– Rate-distortion setup
– Bad news (converse) theorem


HW 12



Dec 12  

– Good news theorem and proof
– Rate-distortion function
– Example: Binary, Hamming distance

      Soln 12    
Dec 18  

– Back to Multiple-Access Channel (first introduced on 7/11/2023)
– Convex hull for good news theorem
– Bad news theorem and proof


HW 13


Graded HW Soln

Dec 19  

– Completing proof of converse (convex hull)
– Maximal vs. average error probability criterion
– Unsolved problems: Broadcast channel, Interference channel
– Secrecy (solved)

Jan 25  

Final Exam







Additional Reading Material

To Review Basic Probability Theory