Channel Capacity and The Channel Coding Theorem, Part I
Information Theory 2013
Lecture 4
Michael Roth
Fano's inequality: Pe ≥ (H(X|Y) − 1) / log |X|.
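For context, a short derivation of this bound (a LaTeX sketch using the weakened form of Fano's inequality, with log |X| in place of log(|X| − 1)):

\begin{align*}
  H(P_e) + P_e \log|\mathcal{X}| &\ge H(X \mid Y) && \text{(Fano's inequality, weakened)} \\
  1 + P_e \log|\mathcal{X}| &\ge H(X \mid Y) && \text{(since } H(P_e) \le 1\text{)} \\
  P_e &\ge \frac{H(X \mid Y) - 1}{\log|\mathcal{X}|}.
\end{align*}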
Motivation and preview
A communicates with B: A induces a state in B. Physical process
gives rise to noise.
Mathematical analog: source W, transmitted sequence X^n, etc.
[Block diagram: Message W → Encoder → X^n → Channel p(y|x) → Y^n → Decoder → Ŵ (estimate of message).]
Channel capacity: C = max_{p(x)} I(X; Y).
Some channels I

Noisy channel with nonoverlapping outputs
[Figure: inputs X ∈ {1, 2}, outputs Y ∈ {1, 2, 3, 4}; input 1 → outputs 1, 2 (each w.p. 1/2), input 2 → outputs 3, 4 (w.p. 1/3 and 2/3).]
• output random, but input uniquely determined.
• C = 1, achieved for uniform X.
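As a numerical illustration of C = max_{p(x)} I(X; Y) for this channel, here is a minimal Python sketch (not from the lecture; the helper name mutual_information and the hard-coded transition matrix simply follow the figure above):

import numpy as np

def mutual_information(p_x, p_y_given_x):
    """I(X;Y) in bits for input distribution p_x and channel matrix p(y|x)."""
    p_xy = p_x[:, None] * p_y_given_x            # joint distribution p(x, y)
    p_y = p_xy.sum(axis=0)                       # output marginal p(y)
    mask = p_xy > 0
    # I(X;Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) )
    outer = p_x[:, None] * p_y[None, :]
    return float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / outer[mask])))

# Noisy channel with nonoverlapping outputs: inputs {1, 2}, outputs {1, 2, 3, 4}.
p_y_given_x = np.array([[1/2, 1/2, 0.0, 0.0],
                        [0.0, 0.0, 1/3, 2/3]])
p_x = np.array([1/2, 1/2])                       # uniform input
print(mutual_information(p_x, p_y_given_x))      # -> 1.0 bit, matching C = 1

With a uniform input the output determines the input exactly, so I(X; Y) = H(X) = 1 bit, which is also the maximum.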
Some channels II
Noisy typewriter
[Figure: inputs A–Z; left panel shows the noisy channel, right panel shows a noiseless subset of inputs (every second letter).]
• input either unchanged or shifted to the next letter (both w.p. 1/2).
• use of every second input: log 13 bits per transmission without error.
• I(X; Y) = H(Y) − H(Y|X) = H(Y) − H(1/2, 1/2) = H(Y) − 1.
• C = max I(X; Y) = log 26 − 1 = log 13.
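A similar check for the noisy typewriter (the cyclic transition matrix below is an illustrative construction, not lecture code):

import numpy as np

# Noisy typewriter: 26 inputs, each received unchanged or shifted to the
# next letter (cyclically), both with probability 1/2.
n = 26
p_y_given_x = np.zeros((n, n))
for x in range(n):
    p_y_given_x[x, x] = 0.5
    p_y_given_x[x, (x + 1) % n] = 0.5

p_x = np.full(n, 1 / n)                 # uniform input
p_y = p_x @ p_y_given_x                 # output marginal (also uniform)
H_Y = -np.sum(p_y * np.log2(p_y))       # = log2(26)
H_Y_given_X = 1.0                       # H(1/2, 1/2) = 1 bit for every input
print(H_Y - H_Y_given_X, np.log2(13))   # both approx. 3.70 bits = log 13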
Some channels III
Binary symmetric channel
[Figure: inputs 0 and 1; each is received correctly w.p. 1 − p and switched w.p. p.]
• simplest channel with errors.
• probability of a switched input is p.
• “all received bits unreliable”.
• C = 1 − H(p), achieved for uniform X.
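A minimal sketch of C = 1 − H(p) for a few crossover probabilities (the function name binary_entropy is illustrative):

import numpy as np

def binary_entropy(p):
    """H(p) = -p log2(p) - (1-p) log2(1-p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

for p in (0.0, 0.11, 0.5):
    print(p, 1 - binary_entropy(p))     # -> 1.0, ~0.5, and 0.0 bits

At p = 0 the channel is noiseless (C = 1 bit); at p = 1/2 the output is independent of the input (C = 0).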
Properties:
• C ≥ 0, since I(X; Y) ≥ 0.
• C ≤ log |X| and C ≤ log |Y|.
• I(X; Y) is a continuous function of p(x).
• I(X; Y) is concave in p(x).
Consequences:
• maximum exists and is finite.
• convex optimization tools can be employed.
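One standard tool that fits the last point is the Blahut–Arimoto algorithm, which exploits the concavity of I(X; Y) in p(x). A minimal Python sketch (not from the lecture; the iteration count and names are assumptions):

import numpy as np

def blahut_arimoto(p_y_given_x, n_iter=200):
    """Approximate C = max over p(x) of I(X;Y), in bits, for channel p(y|x)."""
    n_in = p_y_given_x.shape[0]
    p_x = np.full(n_in, 1.0 / n_in)            # start from a uniform input

    def rel_entropies(p_x):
        # d[x] = D( p(y|x) || p(y) ) with p(y) induced by the current p(x)
        p_y = p_x @ p_y_given_x
        ratio = np.where(p_y_given_x > 0, p_y_given_x / p_y, 1.0)
        return np.sum(p_y_given_x * np.log2(ratio), axis=1)

    for _ in range(n_iter):
        d = rel_entropies(p_x)
        p_x = p_x * 2.0 ** d                   # multiplicative update ...
        p_x /= p_x.sum()                       # ... then renormalize
    # I(X;Y) = sum_x p(x) D( p(y|x) || p(y) ) at the final input distribution
    return float(p_x @ rel_entropies(p_x)), p_x

# Binary symmetric channel with p = 0.11: expect C = 1 - H(0.11) ~ 0.5 bit.
bsc = np.array([[0.89, 0.11],
                [0.11, 0.89]])
print(blahut_arimoto(bsc))

For the BSC the optimum is already the uniform input, so the iteration leaves p(x) unchanged and returns roughly 0.5 bit.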
Preview of the channel coding theorem
Intuitive idea:
[Figure: one typical input sequence X^n fans out to a set of possible output sequences Y^n.]
• for large block lengths every channel looks like the noisy typewriter.
• one (typical) input sequence gives ≈ 2^{nH(Y|X)} output sequences.
• total number of (typical) output sequences ≈ 2^{nH(Y)} must be divided into sets of size 2^{nH(Y|X)}.
• total number of disjoint sets ≤ 2^{n(H(Y)−H(Y|X))} = 2^{nI(X;Y)}.
• can send at most 2^{nI(X;Y)} distinguishable sequences of length n.
• channel capacity as the log of the maximum number of distinguishable sequences.
Definitions I
[Block diagram as before: Message W → Encoder → X^n → Channel p(y|x) → Y^n → Decoder → Ŵ (estimate of message).]

Jointly typical sequences I

The jointly typical set A_ε^(n) consists of all pairs (x^n, y^n) with

|−(1/n) log p(x^n) − H(X)| < ε,
|−(1/n) log p(y^n) − H(Y)| < ε,
|−(1/n) log p(x^n, y^n) − H(X, Y)| < ε,

where p(x^n, y^n) = ∏_{i=1}^n p(x_i, y_i).
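To make the three conditions concrete, here is a small Python sketch that tests whether a given pair of sequences is jointly typical (the joint pmf, the sequences, and the function name are illustrative assumptions; X is uniform and sent through a BSC with crossover 0.11):

import numpy as np

def is_jointly_typical(x_seq, y_seq, p_xy, eps=0.1):
    """Check the three conditions defining A_eps^(n); p_xy[x, y] is the joint
    per-symbol pmf and is assumed strictly positive for simplicity."""
    n = len(x_seq)
    p_x = p_xy.sum(axis=1)
    p_y = p_xy.sum(axis=0)
    H_X = -np.sum(p_x * np.log2(p_x))
    H_Y = -np.sum(p_y * np.log2(p_y))
    H_XY = -np.sum(p_xy * np.log2(p_xy))
    # empirical per-symbol log-probabilities of the observed sequences
    lp_x = sum(np.log2(p_x[x]) for x in x_seq) / n
    lp_y = sum(np.log2(p_y[y]) for y in y_seq) / n
    lp_xy = sum(np.log2(p_xy[x, y]) for x, y in zip(x_seq, y_seq)) / n
    return (abs(-lp_x - H_X) < eps and
            abs(-lp_y - H_Y) < eps and
            abs(-lp_xy - H_XY) < eps)

# Example: uniform X through a BSC with crossover probability 0.11.
p_xy = np.array([[0.445, 0.055],
                 [0.055, 0.445]])               # p(x, y) = p(x) p(y|x)
rng = np.random.default_rng(0)
n = 2000
x = rng.integers(0, 2, n)
y = np.where(rng.random(n) < 0.11, 1 - x, x)    # pass x through the BSC
print(is_jointly_typical(x, y, p_xy))           # typically True for large n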
Jointly typical sequences II
Joint AEP: Let (X^n, Y^n) be sequences of length n drawn i.i.d. according to p(x^n, y^n). Then:
1. Pr{(X^n, Y^n) ∈ A_ε^(n)} → 1 as n → ∞.
2. |A_ε^(n)| ≤ 2^{n(H(X,Y)+ε)}.
3. Pr{(X̃^n, Ỹ^n) ∈ A_ε^(n)} ≤ 2^{−n(I(X;Y)−3ε)} for independent (X̃^n, Ỹ^n) ∼ p(x^n) p(y^n).
[Figure: grid of x^n on one axis and y^n on the other; dots mark the jointly typical pairs.]
• 2^{nH(X)} typical X sequences.
• 2^{nH(Y)} typical Y sequences.
• only 2^{nH(X,Y)} jointly typical sequences.
• one in 2^{nI(X;Y)} pairs is jointly typical.