- Unused symbols. Show that the capacity of the channel with probability
transition matrix
is achieved by a distribution that places zero probability on one of input symbols.
What is the capacity of this channel? Give an intuitive reason why that letter is not
used.
- Erasures and errors in a binary channel. Consider a channel with binary inputs
that has both erasures and errors. Let the probability of error be
and the
probability of erasure be
, so the the channel is as illustrated
below:
- Find the capacity of this channel.
- Specialize to the case of the binary symmetric channel (
).
- Specialize to the case of the binary erasure channel (
).
- Channels with dependence between the letters. Consider the following channel
over a binary alphabet that takes in two bit symbols and produces a two bit output, as
determined by the following mapping:
,
,
, and
. Thus if the two bit
sequence 01 is input to the channel, the output is 10 with probability 1. Let
denote the two input symbols and
denote the corresponding output symbols.
- Calculate the mutual information
as a function of the input distribution on
the four possible pairs of inputs.
- Show that the capacity of a pair of transmissions on this channel is 2 bits.
- Show that under the maximizing input distribution,
. Thus the distribution
on the input sequences that achieves capacity does not necessarily maximize the mutual
information between individual symbols and their corresponding outputs.
- Jointly typical sequences. As we did in problem
of Chapter
for the typical set
for a single random variable, we will calculate the jointly typical set for a pair of
random variables connected by a binary symmetric channel, and the probability of error for
jointly typical decoding for such a channel.
We will consider a binary symmetric channel with crossover probability 0.1. As argued
in class, the input distribution that achieves capacity is the uniform distribution. We
will therefore take p(x) = (1/2, 1/2) and therefore the joint distribution p(x,y)
for this channel is given by
The marginal distribution of Y is also (1/2,1/2).
- Calculate H(X), H(Y), H(X,Y) and I(X;Y)
for the joint distribution above.
- Let
be drawn i.i.d. according the Bernoulli(1/2) distribution. Of the
possible sequences of length n, which of them are typical, i.e., member of
for
?
Which are the typical sequences in
?
- The jointly typical set
is defined as the set of sequences that
satisfy equations (8.36-8.38) of EIT. The first two equations correspond to the conditions
that
and
are in
and
respectively.
Consider the last condition, which can be rewritten to state that
. Let k
be the number of places in which the sequence
differs from
(k is a function of
the two sequences). Then we can write
An alternative way at looking at this probability is to look at the binary symmetric
channel as in additive channel
, where Z is a binary random
variable that is equal to 1 with probability p, and is independent of X. In
this case,
Show that the condition that
being jointly typical is equivalent to the
condition that
is typical and
is typical.
- We now calculate the size of
for n=25 and
. As in
problem
of Chapter
, here is a table of
the probabilities and numbers of sequences of with k ones
(Sequences with more than 12 ones are omitted since their total probability is
negligible (and they are not in the typical set).)
What is the size of the set
?
- Now consider random coding for the channel, as in the proof of the channel coding
theorem. Assume that
codewords
are chosen uniformly over
the
possible binary sequences of length n. One of these codewords is
chosen and sent over the channel. The receiver looks at the received sequence and tries to
find a codeword in the code that is jointly typical with the received sequence. As argued
above, this corresponds to finding a codeword
such that
. For a fixed codeword
,
what is the probability that the received sequence
is such that
is
jointly typical?
- Now consider a particular received sequence
, say. Assume that we choose a sequence
at
random, uniformly distributed among all the
possible binary n-sequences. What is
the probability that the chosen sequence is jointly typical with this
? (Hint: this is
the probability of all sequences
such that
.)
- Now consider a code with
codewords of length 12 chosen at random,
uniformly distributed among all the
sequences of length n=25. One of
these codewords, say the one corresponding to i=1, is chosen and sent over the
channel. As calculated in part (e), the received sequence, with high probability, is
jointly typical with the codeword that was sent. What is probability that one or more of
the other codewords (which were chosen at random, independently of the sent codeword) is
jointly typical with the received sequence? (Hint: You could use the union bound but you
could also calculate this probability exactly, using the result of part (f) and the
independence of the codewords)
- Given that a particular codeword was sent, the probability of error (averaged over the
probability distribution of the channel and over the random choice of other codewords) can
be written as
There are two kinds of error: the first occurs if the received sequence
is not jointly
typical with the transmitted codeword, and the second occurs if there is another codeword
jointly typical with the received sequence. Using the result of the previous parts,
calculate this probability of error.
By the symmetry of the random coding argument, this does not depend on which codeword
was sent.
The calculations above show that average probability of error for a random code with
512 codewords of length 25 over the binary symmetric channel of crossover probability 0.1
is about 0.34. This seems quite high, but the reason for this is that the value of
that we
have chosen is too large. By choosing a smaller
, and a larger n in the definitions
of
, we can get the probability of error to be as small as we want, as long
as the rate of the code is less than
.
Also note that the decoding procedure described in the problem is not optimal. The
optimal decoding procedure is maximum likelihood, i.e., to choose the codeword that is
closest to the received sequence. It is possible to calculate the average probability of
error for a random code for which the decoding is based on an approximation to maximum
likelihood decoding, where we decode a received sequence to the unique codeword that
differs from the received sequence in
bits, and declare an error otherwise. The
only difference with the jointly typical decoding described above is that in the case when
the codeword is equal to the received sequence! The average probability of error for this
decoding scheme can be shown to be about 0.285.
- Sum Channel.
- Consider two discrete memoryless channels
and
with capacities
and
respectively. A new channel is defined with input alphabet
, output alphabet
,
and probability transitiion function
if
and
if
.
Thus the channel has both channels available for use, but only one channel can be used at
a time. Show that the capacity of this sum channel is
- Use the above result to calculate the capacity of the following channel
- Geometric interpretation of channel capacity. Consider a channel
,
and assume that the channel input alphabet and output alphabet are of size m, e.g.,
let
. We can write the probability transition function of the channel as a
matrix:
where
is the i-th row of the probability transistion matrix, i.e,
.
Consider an input distribution
for the channel, and let
be the
corresponding output distribution, i.e.,
.
- Show that for this channel
- Let
be another distribution on
. Show that
Use this to argue
and that for any distribution
,
- Combining the first two parts, we have
Now justify the following argument:
(This is a very general result that max min = min max for functions that are convex in
one dimension and concave in the other and constitutes the fundamental theorem of game
theory.)
Combining the previous two equations, we have
A simple geometric interpretaion of this result is as follows: C is the minimum
``radius'' of an information ball that contains all the rows
of the channel transition
matrix, where all distances are measured with relative entropy. The center of this ball is
located at
that is induced by the capacity acheiving input distribution.
- Encoder and decoder as part of the channel:
Consider a Binary Symmetric Channel with crossover probability 0.1. A possible coding
scheme for this channel with two codewords of length 3 is to encode message
as
000 and
as 111. With this coding scheme, we can consider the combination of
encoder, channel and decoder as forming a new BSC, with two inputs
and
and
two outputs
and
.
- Calculate the crossover probabilty of this channel.
- What is the capacity of this channel in bits per transmission of the original channel?
- What is the capacity of the original BSC with crossover probability 0.1?
- Prove a general result that for any channel, considering the encoder, channel and
decoder together as a new channel from messages to estimated messages will not increase
the capacity in bits per transmission of the original channel.
- Codes of length 3 for a BSC and BEC: In Problem
, you calculated the
probability of error for a code with two codewords of length 3 (000 and 111) sent over a
binary symmetric channel with crossover probability
. For this problem, take
.
- Find the best code of length 3 with four codewords for this channel. What is the
probability of error for this code? (Note that all possible received sequences should be
mapped onto possible codewords)
- What is the probability of error if we used all the 8 possible sequences of length 3 as
codewords?
- Now consider a binary erasure channel with erasure probability 0.1. Again, if we used
the two codeword code 000 and 111, then received sequences 00E,0E0,E00,0EE,E0E,EE0 would
all be decoded as 0, and similarly we would decode 11E,1E1,E11,1EE,E1E,EE1 as 1. If we
received the sequence EEE we would not know if it was a 000 or a 111 that was sent - so we
choose one of these two at random, and are wrong half the time.
What is the probability
of error for this code over the erasure channel?
- What is the probability of error for the codes of parts (a) and (b) when used over the
binary erasure channel?
- Channel Capacity: Calculate the capacity of the following channels with
probability transition matrices:
- Capacity of the carrier pigeon channel. Consider a commander of an army
besieged a fort for whom the only means of communication to his allies is a set of carrier
pigeons. Assume that each carrier pigeon can carry one letter (8 bits), and assume that
pigeons are released once every 5 minutes, and that they take exactly 3 minutes to reach
their destination.
- Assuming all the pigeons reach safely, what is the capacity of this link in bits/hour?
- Now assume that the enemies try to shoot down the pigeons, and that they manage to hit a
fraction
of them. Since the pigeons are sent at a constant rate, the receiver knows
when the pigeons are missing. What is the capacity of this link?
- Now assume that the enemy is even more cunning, and every time they shoot down a pigeon,
they send out a dummy pigeon carrying a random letter (chosen uniformly from all 8-bit
letters). What is the capacity of this link in bits/hour?
Set up an appropriate model for the channel in each of the above cases, and indicate
how to go about finding the capacity.
- A channel with two independent looks at Y. Let
and
be
conditionally independent and conditionally identically distributed given X.
- Show
- Conclude that the capacity of the channel
is less than twice the capacity of the channel
- Tall, fat people
Suppose that average height of people in a room is 5 feet.
Suppose the average weight is 100 lbs.
- Argue that no more than
of the population is 15 feet tall.
- Find an upper bound on the fraction of 300 lb, 10 footers in the room.
- Can signal alternatives lower capacity? Show that adding a row to a channel
transition matrix does not decrease capacity.
- Binary multiplier channel
- Consider the channel Y=XZ where X and Z are independent
binary random variables that take on values 0 and 1. Z is Bernoulli(
), i.e.
.
Find the capacity of this channel and the maximizing distribution on X.
- Now suppose the receiver can observe Z as well as Y. What is the capacity?
- Noise alphabets
Consider the channel
, where Y=X+Z, and Z is uniformly distributed
over three distinct integer values
- What is the maximum capacity over all choices of the
alphabet? Give distinct
integer values
and a distribution on
achieving this.
- What is the minimum capacity over all choices for the
alphabet? Give distinct
integer values
and a distribution on
achieving this.
- Narrow channel
Suppose a signal
goes through an intervening transition
:
where
,
, and
. Here p(v|x)
and p(y|v) are arbitrary and the channel has transition probability
.
Show
.
- Noisy typewriter. Consider the channel with
and transition
probabilities p(y|x) given by the following matrix:
- Find the capacity of this channel.
- Define the random variable z=g(y) where
For the following two distributions for x, compute I(X;Z)
- Find the capacity of the channel between x and z, specifically where
,
,
and the transition probabilities p(z|x) are given by
- For the X distribution of part i. of b, does
form a Markov chain?
- Capacity
What are the capacities of the following channels?
- Erasure channel
Let
be a discrete memoryless channel with
capacity C. Suppose this channel is immediately cascaded with an erasure channel
that erases
of its symbols.
Specifically,
and
Determine the capacity of this channel.
- Choice of channels.
Find the capacity C of the union of 2 channels
and
where, at each
time, one can send a symbol over channel 1 or over channel 2 but not both. Assume the
output alphabets are distinct and do not intersect.
- Show
Thus
is the effective alphabet size of a
channel with capacity C.
- Compare with problem
of
Chapter
, where
,
and interpret (a) in terms of the effective number of noise-free symbols.
- Binary multiplier channel.
- Consider the discrete memoryless channel Y=XZ where X and Z
are independent binary random variables that take on values 0 and 1. Let
. Find the
capacity of this channel and the maximizing distribution on X.
- Now suppose the receiver can observe Z as well as Y. What is the capacity?
- Noise alphabets.
Consider the channel
, where Y=X+Z, and Z is uniformly distributed
over three distinct integer values
- What is the maximum capacity over all choices of the
alphabet? Give distinct
integer values
and a distribution on
achieving this.
- What is the minimum capacity over all choices for the
alphabet? Give distinct
integer values
and a distribution on
achieving this.
- Source and channel.
We wish to encode a Bernoulli(
) process
for transmission over a
binary symmetric channel with error probability p.
Find conditions on
and p so that the probability of
error
can be made to go to zero as
.