Shannon's noiseless coding theorem

Shannon's noiseless coding theorem places an upper and a lower bound on the minimal possible expected length of codewords as a function of the entropy of the input word (which is viewed as a random variable) and of the size of the target alphabet.

Shannon's statement

Let X be a random variable taking values in some finite alphabet Σ1 and let f be a decipherable code from Σ1* to Σ2*, where |Σ2| = a. Let S denote the resulting wordlength of f(X).

If f is optimal in the sense that it has the minimal expected wordlength for X, then

\frac{H(X)}{\log_2 a} \leq \mathbb{E}S < \frac{H(X)}{\log_2 a} +1
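
As a concrete illustration (not part of the original statement), the following Python sketch builds an optimal binary prefix-free code with Huffman's algorithm for an illustrative distribution and checks that its expected wordlength lies within the stated bounds; it assumes a = 2, so that log_2 a = 1, and the distribution p is chosen arbitrarily for the example.

import heapq
import itertools
import math

def entropy_bits(p):
    """Shannon entropy H(X) in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def huffman_lengths(p):
    """Codeword lengths of an optimal binary (a = 2) prefix-free code, via Huffman's algorithm."""
    counter = itertools.count()  # tie-breaker so tuples with equal probability never compare lists
    heap = [(pi, next(counter), [i]) for i, pi in enumerate(p)]
    heapq.heapify(heap)
    lengths = [0] * len(p)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        for i in syms1 + syms2:  # merging two subtrees adds one bit to every codeword below them
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(counter), syms1 + syms2))
    return lengths

p = [0.4, 0.3, 0.2, 0.1]                    # illustrative source distribution
s = huffman_lengths(p)
ES = sum(pi * si for pi, si in zip(p, s))   # expected wordlength of the optimal code
H = entropy_bits(p)
assert H <= ES < H + 1                      # the theorem's bounds with log2(a) = 1
print(f"H(X) = {H:.4f} bits, E[S] = {ES:.4f} bits")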

Proof

Let s_i denote the wordlength of each possible x_i (i = 1, …, n). Define q_i = a^{−s_i}/C, where C is chosen so that ∑_{i=1}^n q_i = 1.

Then

\begin{aligned}
H(X) &= -\sum_{i=1}^n p_i \log_2 p_i \\
&\leq -\sum_{i=1}^n p_i \log_2 q_i \\
&= -\sum_{i=1}^n p_i \log_2 a^{-s_i} + \sum_{i=1}^n p_i \log_2 C \\
&\leq \sum_{i=1}^n s_i p_i \log_2 a \\
&= \mathbb{E}S \, \log_2 a
\end{aligned}

where the second line follows from Gibbs' inequality and the fourth line follows from Kraft's inequality: C = ∑_{i=1}^n a^{−s_i} ≤ 1, so log_2 C ≤ 0 and the term ∑_i p_i log_2 C = log_2 C may be dropped. Dividing by log_2 a gives the lower bound.
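
The two inequalities invoked here can be checked numerically. The short Python sketch below uses an illustrative distribution p and an arbitrary set of binary wordlengths s (neither taken from the article): it forms C and the auxiliary distribution q_i = a^{−s_i}/C, then verifies Gibbs' inequality and C ≤ 1.

import math

a = 2
p = [0.5, 0.25, 0.125, 0.125]        # illustrative source distribution
s = [2, 2, 2, 3]                     # wordlengths of some (suboptimal) binary prefix-free code

C = sum(a ** -si for si in s)        # Kraft sum; at most 1 for any decipherable code
q = [a ** -si / C for si in s]       # auxiliary distribution q_i = a^{-s_i} / C

H = -sum(pi * math.log2(pi) for pi in p)
cross_entropy = -sum(pi * math.log2(qi) for pi, qi in zip(p, q))
assert H <= cross_entropy + 1e-12    # Gibbs' inequality
assert C <= 1 + 1e-12                # Kraft's inequality, hence log2(C) <= 0
print(f"H = {H:.4f}, -sum p_i log2 q_i = {cross_entropy:.4f}, C = {C}")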

For the second inequality we may set

s_i = \lceil -\log_a p_i \rceil

so that

- \log_a p_i \leq s_i < -\log_a p_i + 1

and so

a^{-s_i} \leq p_i

and

\sum_{i=1}^n a^{-s_i} \leq \sum_{i=1}^n p_i = 1

and so by Kraft's inequality there exists a prefix-free code having those wordlengths. Thus the minimal S satisfies

\begin{aligned}
\mathbb{E}S &= \sum_{i=1}^n p_i s_i \\
&< \sum_{i=1}^n p_i \left( -\log_a p_i + 1 \right) \\
&= -\sum_{i=1}^n p_i \frac{\log_2 p_i}{\log_2 a} + 1 \\
&= \frac{H(X)}{\log_2 a} + 1
\end{aligned}
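
The construction used in this half of the proof can also be sketched in Python. The example below (with an illustrative distribution, not from the article) computes the wordlengths s_i = ⌈−log_a p_i⌉ for a = 2, confirms that they satisfy Kraft's inequality, and checks that the resulting expected wordlength obeys the upper bound.

import math

a = 2
p = [0.45, 0.35, 0.15, 0.05]                   # illustrative source distribution
s = [math.ceil(-math.log(pi, a)) for pi in p]  # wordlengths s_i = ceil(-log_a p_i)

kraft_sum = sum(a ** -si for si in s)
assert kraft_sum <= 1                          # so a prefix-free code with these lengths exists
H = -sum(pi * math.log2(pi) for pi in p)
ES = sum(pi * si for pi, si in zip(p, s))
assert ES < H / math.log2(a) + 1               # the upper bound of the theorem
print(f"H(X)/log2(a) = {H / math.log2(a):.4f}, E[S] = {ES:.4f}")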
