**Channel capacity** is the maximum rate at which discrete information can be reliably transmitted over a channel. By the noisy-channel coding theorem, the channel capacity of a given channel is the limiting information transport rate (in units of information per unit time) that can be achieved with vanishingly small error probability.

Information theory, developed by Claude E. Shannon in 1948, defines the notion of channel capacity and provides a mathematical model by which one can compute the maximal amount of information that can be carried by a channel. The key result states that the capacity of the channel, as defined above, is given by the maximum of the mutual information between the input and output of the channel, where the maximization is with respect to the input distribution.

#### Mathematical definition

                                   o---------o
                                   |  Noise  |
                                   o---------o
                                        |
                                        V
    o--------o  M  o---------o  X  o---------o  Y  o---------o  M'  o----------o
    | Source |---->| Encoder |---->| Channel |---->| Decoder |---->| Receiver |
    o--------o     o---------o     o---------o     o---------o     o----------o

Here X represents the space of messages transmitted, and Y the space of messages received during a unit time over our channel. Let *p*(*y*∣*x*) be the conditional probability distribution of Y given X. We will consider *p*(*y*∣*x*) to be an inherent fixed property of our communications channel (representing the nature of the noise of our channel). Then the joint distribution of X and Y, *p*(*x*, *y*) = *f*(*x*) *p*(*y*∣*x*), is completely determined by our channel and by our choice of *f*(*x*), the marginal distribution of messages we choose to send over the channel. Under these constraints, we would like to maximize the amount of information, or the signal, we can communicate over the channel. The appropriate measure for this is the mutual information (also called transinformation) *I*(*X*; *Y*). Its maximum over all choices of *f* is called the channel capacity and is given by:

$$C = \max_{f} I(X; Y).$$
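To make the maximization concrete, here is a minimal Python sketch that evaluates the mutual information for a given input distribution and approximates the capacity of a binary-input channel by a simple grid search. The function names (`mutual_information`, `binary_input_capacity`), the grid-search approach, and the example channel (a binary symmetric channel with crossover probability 0.1) are illustrative assumptions, not part of the definition above.

```python
import math

def mutual_information(f, channel):
    """I(X;Y) in bits, for input distribution f[x] and channel[x][y] = p(y|x)."""
    n_out = len(channel[0])
    # Output marginal: p(y) = sum_x f(x) p(y|x)
    p_y = [sum(f[x] * channel[x][y] for x in range(len(f))) for y in range(n_out)]
    total = 0.0
    for x, fx in enumerate(f):
        for y in range(n_out):
            joint = fx * channel[x][y]  # p(x, y) = f(x) p(y|x)
            if joint > 0:  # zero-probability pairs contribute nothing
                total += joint * math.log2(channel[x][y] / p_y[y])
    return total

def binary_input_capacity(channel, steps=1000):
    """Approximate C = max_f I(X;Y) by a grid search over f = (q, 1 - q)."""
    best_q, best_i = 0.0, 0.0
    for k in range(steps + 1):
        q = k / steps
        i = mutual_information([q, 1 - q], channel)
        if i > best_i:
            best_q, best_i = q, i
    return best_q, best_i

# Assumed example: binary symmetric channel, crossover probability 0.1.
eps = 0.1
bsc = [[1 - eps, eps], [eps, 1 - eps]]
q_star, capacity = binary_input_capacity(bsc)
print(f"maximizing q = {q_star}, capacity = {capacity:.4f} bits per channel use")
```

For this symmetric channel the search settles on the uniform input distribution, and the resulting value is about 0.531 bits per channel use. A grid search only works for small input alphabets; for larger ones an iterative method such as the Blahut–Arimoto algorithm is the usual choice.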

#### Noisy channel coding theorem

The noisy-channel coding theorem states (roughly) that whenever the rate of the source is less than the channel capacity, there is an encoding and decoding scheme that makes the probability of error, *Pr*(*M*′ ≠ *M*), as small as desired, provided the message block M to be transmitted is sufficiently long.
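
As a worked illustration of the rate condition, consider a binary symmetric channel that flips each transmitted bit independently with probability 0.1. The channel and all numbers here are assumptions chosen for the example. The capacity of such a channel, in bits per channel use, is one minus the binary entropy $H_b$ of the flip probability:

$$
C = 1 - H_b(0.1) = 1 + 0.1\log_2 0.1 + 0.9\log_2 0.9 \approx 0.531 .
$$

A code that maps $k = 500$ message bits onto $n = 1000$ channel bits has rate $R = k/n = 0.5 < C$. The theorem therefore guarantees that, for sufficiently long blocks, codes of this rate exist whose error probability is as small as desired. A code of rate $0.6 > C$ falls outside the theorem's guarantee.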