Part 3 Information and Quantification


Information Theory (CE231)

Ministry of Higher Education and Scientific Research
Al-Furat Al-Awsat Technical University
Engineering Technical College / Najaf
Communication Engineering Department

2nd Class, 2018/2019
Lecturer: Ali M. Alsahlany

Lecture Outline:
• Introduction
• General model of communication system
• Information Source
• Self Information
• Entropy
• Information Rate
• Joint Entropy
• Conditional Entropy
• Mutual Information

Lecture 3: Information and Quantification

Information Theory

"The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point."
(Claude Shannon, 1948)

Information Theory is concerned with the theoretical limitations and potentials of systems that communicate, e.g., "What is the best compression or communication rate we can achieve?"

Communication is sending information from one place and/or time to another place and/or time over a medium that might have errors.

General model of communication system

[Block diagram: Source → Encoder → Channel (with Noise) → Decoder → Receiver]

Source: voice, words, pictures, music.

Channel: telephone line, high-frequency radio link, space communication link, biological organism (sending a message from brain to foot, or from ear to brain).

Noise: some signal with time-varying frequency response, cross-talk, thermal noise, impulsive switch noise, etc. Noise represents our imperfect understanding of the universe; thus we treat it as random, often obeying some rules, such as a probability distribution.

Receiver: the destination of the transmitted information: a person, computer, disk, analog radio or TV, the Internet.


Encoder: the processing done before placing information into the channel.
• First stage: data reduction (keep only the important bits, i.e., remove source redundancy).
• Followed by redundancy insertion tailored to the channel.
• A code is a mechanism for representing the information of one signal by another.
• An encoding is a representation of information in another form.

Decoder:
• Exploits and then removes the redundancy.
• Detects and fixes any transmission errors.
• Restores the information to its original form.

Information Source

Consider a discrete information source.

Assumptions:
• The information source generates symbols from a given alphabet S = {s0, s1, ..., s(K-1)}.
• Each symbol sk has a probability of occurrence P(sk) = Pk, k = 0, 1, ..., K-1, with Σ_{k=0}^{K-1} Pk = 1.
• The symbols are independent.

The amount of information gained from knowing that the source produced the symbol sk is related to Pk as follows:
• If Pk = 1, there is no uncertainty about the occurrence of the event and no gain of information; i.e., there is no need for communication because the receiver already knows everything.
• As Pk decreases, the uncertainty increases; the reception of sk corresponds to some gain of information.

But how much?


Information Source / Self Information

Self information: a function that measures the amount of information gained after observing the symbol sk:

    I(sk) = log_b (1 / P(sk)) = -log_b P(sk),    where log_a(p) = ln(p) / ln(a)

The unit of information depends on the base b of the logarithm: b = 2 gives bits, b = e = 2.718 gives nats, and b = 10 gives Hartleys.

The amount of information (in bits) carried by a symbol is closely related to its probability of occurrence: a low-probability event contains a lot of information, and vice versa.
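To make the definition concrete, here is a minimal Python sketch (not from the lecture; the function name is ours) that evaluates I(s) = -log_b P(s) in bits, nats, or Hartleys:

```python
import math

def self_information(p: float, base: float = 2.0) -> float:
    """Self-information I(s) = -log_b P(s) of a symbol with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log(p) / math.log(base)

# A fair-coin outcome (p = 0.5) carries exactly 1 bit.
print(self_information(0.5))                 # 1.0
# A rarer event (p = 1/8) carries more information: 3 bits.
print(self_information(1 / 8))               # 3.0
# The same event measured in nats: ln(8) ≈ 2.079.
print(self_information(1 / 8, base=math.e))
```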

Properties of Self Information

[Plot: I(sk) versus P(sk), decreasing from large values near P = 0 to zero at P = 1]

• I(sk) ≥ 0: a real, non-negative measure.
• I(sk) is a continuous function of P(sk).
• I(sk) > I(si) if Pk < Pi.

The information obtained from the occurrence of two independent events is the sum of the information obtained from the occurrence of the individual events:

    I(AB) = log_b (1 / P(AB))
          = log_b (1 / (P(A) P(B)))
          = log_b (1 / P(A)) + log_b (1 / P(B))
          = I(A) + I(B)

    ∴ I(AB) = I(A) + I(B)
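For instance, for two independent tosses of a fair coin, P(HH) = (1/2)(1/2) = 1/4, so I(HH) = -log2(1/4) = 2 bits = I(H) + I(H).
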
Example 1: Let H and T be the outcomes of flipping a coin. Calculate the self information for the following cases:

(a) Fair coin with P(H) = P(T) = 0.5:        I(H) = I(T) = 1 bit
(b) Unfair coin with P(H) = 1/8, P(T) = 7/8: I(H) = 3 bits, I(T) = 0.193 bits


Example 2: A source puts out one of five possible messages {m1, ..., m5} during each message interval, with probabilities P1 = 1/2, P2 = 1/4, P3 = 1/8, P4 = 1/16 and P5 = 1/16. What is the information content of each message in bits?

    I(m1) = log2 (1/P(m1)) = -log2 P(m1) = -log2 (1/2) = 1 bit
    I(m2) = -log2 (1/4)  = 2 bits
    I(m3) = -log2 (1/8)  = 3 bits
    I(m4) = -log2 (1/16) = 4 bits
    I(m5) = -log2 (1/16) = 4 bits


Exercise 1: For 128 equally likely and independent messages, find the information content (in bits) of each message.
Solution:

    I(m) = log2 (1/P(m)) = -log2 P(m) = log2 128 = 7 bits

Homework 1: Suppose that in sizing up the data storage requirements for a word-processing system to be used in the production of a book, it is required to calculate the information capacity. The book consists of 450 pages with 500 words per page, each word containing 5 symbols chosen at random from a 37-ary alphabet (26 letters, 10 numerical digits and one blank space). Calculate the information capacity of the book.

Information Source / Entropy

Entropy: the average number of bits per symbol required to describe a source.

For a source containing N independent symbols, its entropy is defined as

    H = Σ_{i=1}^{N} Pi I(si) = Σ_{i=1}^{N} Pi log_b (1 / P(si))
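As a sketch (not part of the lecture; names are illustrative), the same definition in Python, where entropy is the probability-weighted average of the self-information of each symbol:

```python
import math

def entropy(probs, base: float = 2.0) -> float:
    """H = sum_i p_i * log_b(1/p_i) for a list of symbol probabilities."""
    assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(p * math.log(1.0 / p, base) for p in probs if p > 0.0)

# Fair coin (Example 3): 1 bit/symbol.
print(entropy([0.5, 0.5]))                       # 1.0
# The five-message source of Example 2.
print(entropy([1/2, 1/4, 1/8, 1/16, 1/16]))      # 1.875 bits/symbol
```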

Example 3: Calculate the entropy of the outcomes of flipping a fair coin.

    H = P(H) log2 (1/P(H)) + P(T) log2 (1/P(T))
      = 0.5 log2 (1/0.5) + 0.5 log2 (1/0.5) = 0.5 + 0.5 = 1 bit/symbol


Properties of Entropy

• H is a non-negative quantity: H ≥ 0.
• If all a priori probabilities are equally likely (Pi = 1/N for all N symbols), the entropy is maximum and given by H = log_b N.

Proof: if all a priori probabilities are equally likely (Pi = 1/N for all N symbols),

    H = -Σ_{i=1}^{N} Pi log_b Pi = -(1/N) Σ_{i=1}^{N} log_b (1/N)
      = -(1/N) [N log_b (1/N)]
      = -log_b (1/N) = log_b N

    ∴ 0 ≤ H ≤ log_b N

That is, you need log2 N bits to represent a variable that can take one of N values, if N is a power of 2. If these values are equally probable, the entropy equals that number of bits.

If one of the events is more probable than the others, observing that event is less informative. Conversely, rarer events provide more information when observed. Since less probable events are observed more rarely, the net effect is that the entropy of non-uniformly distributed data is less than log2 N.

Entropy is zero when one outcome is certain, so entropy refers to the disorder or uncertainty of a message.

According to Shannon, the entropy is the average of the information contained in each message of the source, irrespective of the meaning of the message.


Example 4: Find and plot the entropy of a binary code in which the probability of occurrence of the symbol 1 is P and of the symbol 0 is 1 - P.

    H = -Σ_{i=1}^{2} Pi log2 Pi = -P log2 P - (1 - P) log2 (1 - P)

Using v log v → 0 as v → 0:

    P = 0   ⇒ H = 0 bit/symbol
    P = 1   ⇒ H = 0 bit/symbol
    P = 1/2 ⇒ H = -(1/2) log2 (1/2) - (1/2) log2 (1/2) = 1/2 + 1/2 = 1 bit/symbol
    P = 1/4 ⇒ H = -(1/4) log2 (1/4) - (3/4) log2 (3/4) = 0.8113 bits/symbol

[Plot: H versus P, rising from 0 at P = 0 to a maximum of 1 bit/symbol at P = 1/2 and falling back to 0 at P = 1]
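A short sketch (illustrative, not from the slides) that tabulates the binary entropy function and reproduces the values above:

```python
import math

def binary_entropy(p: float) -> float:
    """H(P) = -P*log2(P) - (1-P)*log2(1-P), with 0*log2(0) taken as 0."""
    h = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            h -= q * math.log2(q)
    return h

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"P = {p:4.2f}  ->  H = {binary_entropy(p):.4f} bit/symbol")
# Maximum (1 bit) at P = 1/2; 0.8113 at P = 1/4 or 3/4; 0 at P = 0 or 1.
```
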
Example 5: Calculate the average information in bits/character of English, assuming each letter is equally likely.

    H = Σ_{i=1}^{26} (1/26) log2 26 = log2 26 = 4.7 bits/character


Exercise 2: For a uniform distribution P(xi) = 1/N, find the average information.

Solution:

    H(X) = Σ_{i=1}^{N} (1/N) I(xi) = Σ_{i=1}^{N} (1/N) (-log2 (1/N)) = (1/N) · N · log2 N = log2 N

Homework 2: Consider a source transmitting six symbols with the probabilities given below:

    A (1/2)    D (1/16)
    B (1/32)   E (1/4)
    C (1/8)    F (1/32)

Find the average information (entropy).

Homework 3: A source sends two symbols, A and B. If the entropy H(xi) = 0.6 and the self information is 0.3, find the probabilities of A and B.

Information Rate

The information rate R is the average number of bits of information per second and is given by

    Information rate: R = r H

where r is the symbol rate. In units,

    R = r [symbols/second] × H [bits/symbol] = bits/second

Example 6: A PCM source transmits four samples (messages) at a rate of 2 samples/second. The probabilities of occurrence of these 4 samples (messages) are p1 = p4 = 1/8 and p2 = p3 = 3/8. Find the information rate of the source.

Solution:

    H = p1 log2 (1/p1) + p2 log2 (1/p2) + p3 log2 (1/p3) + p4 log2 (1/p4)
      = (1/8) log2 8 + (3/8) log2 (8/3) + (3/8) log2 (8/3) + (1/8) log2 8 = 1.8 bits/message

    R = r H = 2 messages/second × 1.8 bits/message = 3.6 bits/second
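A minimal sketch of the same calculation (illustrative names, not from the slides):

```python
import math

def entropy_bits(probs):
    """Source entropy H in bits/message."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0.0)

def information_rate(message_rate, probs):
    """R = r * H, in bits/second, for a source emitting message_rate messages/second."""
    return message_rate * entropy_bits(probs)

# Example 6: r = 2 messages/second, p = [1/8, 3/8, 3/8, 1/8].
print(information_rate(2, [1/8, 3/8, 3/8, 1/8]))   # ≈ 3.62 bits/second (the lecture rounds H to 1.8, giving 3.6)
```
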
In the example discussed above, there are four samples (levels). These four levels can be coded using binary PCM as shown in the table below:


Message or level Probability Binary digits


Q1 1/8 00
Q2 3/8 01
Q3 3/8 10
Q4 1/8 11

Since one binary digit is capable of conveying at most 1 bit of information, the above coding scheme (2 binary digits per message at 2 messages per second) is capable of conveying 4 bits of information per second. But in Example 6 we obtained an information rate of only 3.6 bits per second. This shows that the information-carrying ability of binary PCM is not completely utilized by the transmission scheme of Example 6. This situation is improved in the next example.

Example 7: In the transmission scheme of Example 6, calculate the information rate if all messages are equally likely.
Solution: Since the messages are equally likely, their probabilities are p1 = p2 = p3 = p4 = 1/4.

    H = log2 4 = 2 bits/message
    R = r H = 2 messages/second × 2 bits/message = 4 bits/second

Just before this example we saw that binary-coded PCM with 2 binary digits per message is capable of conveying 4 bits of information per second. This rate is achieved here because all the messages are equally likely. Thus, with binary PCM coding, the maximum information rate is achieved when all messages are equally likely.

Joint Entropy

The joint entropy represents the amount of information needed on average to specify the values of two discrete random variables. It is the entropy of the pair (X, Y):

    H(X,Y) = -Σ_{j=1}^{m} Σ_{i=1}^{n} p(xi, yj) log2 p(xi, yj)

Example 8: Let X represent whether it is sunny or rainy in a particular town on a given day, and let Y represent whether it is above 70 degrees (hot) or below 70 degrees (cool). Compute the entropy of the joint distribution P(X, Y) given by
P(sunny, hot) = 1/2
P(sunny, cool) = 1/4
P(rainy, hot) = 1/4
P(rainy, cool) = 0
Answer:

    H(X,Y) = -[1/2 log 1/2 + 1/4 log 1/4 + 1/4 log 1/4 + 0 log 0]
           = -[-1/2 - 1/2 - 1/2] = 3/2 bits/symbol
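A small sketch (illustrative, not from the slides) that computes H(X,Y) directly from a joint probability table:

```python
import math

def joint_entropy(p_xy):
    """H(X,Y) = -sum_ij p(x_i, y_j) * log2 p(x_i, y_j) for a 2-D table given as a list of rows."""
    return -sum(p * math.log2(p) for row in p_xy for p in row if p > 0.0)

# Example 8: rows are X = {sunny, rainy}, columns are Y = {hot, cool}.
table = [[1/2, 1/4],
         [1/4, 0.0]]
print(joint_entropy(table))   # 1.5 bits/symbol
```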


Homework 4: For a discrete memoryless channel the joint probability is tabulated as:

    P(x, y) =  0.2    0.1    0.3
               0.0_   0.0_   0.0_

Find H(X), H(Y), and H(X,Y).

Homework 5: Two random variables have the joint probability distribution p(x, y) given by:

    p(x, y)    y = 0    y = 1    y = 2
    x = 0      3/24     2/24     1/24
    x = 1      2/24     5/24     2/24
    x = 2      6/24     1/24     2/24

Find E(X), E(Y), H(X), H(Y), and H(X,Y).

Conditional Entropy

Given a pair of random variables (X, Y), the conditional entropies H(X/Y) and H(Y/X) are defined as

    H(X/Y) = -Σ_{j=1}^{m} Σ_{i=1}^{n} p(xi, yj) log2 p(xi/yj)
    H(Y/X) = -Σ_{j=1}^{m} Σ_{i=1}^{n} p(xi, yj) log2 p(yj/xi)

Example 9: For a discrete memoryless channel the joint probability is tabulated as:

    p(x, y)    y = 0    y = 1
    x = 0      1/2      1/4
    x = 1      0        1/4

Find the joint entropy and the conditional entropy H(Y|X).

Answer:
    H(X,Y) = -1/2 log(1/2) - 1/4 log(1/4) - 0 log(0) - 1/4 log(1/4) = 1.5 bits/symbol
    H(Y/X) = -1/2 log(2/3) - 1/4 log(1/3) - 0 log(0) - 1/4 log(1) = 0.689 bits/symbol
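A sketch (illustrative) computing H(Y|X) from the joint table by first forming the marginal p(x) and the conditionals p(y|x), exactly as in Example 9:

```python
import math

def conditional_entropy_y_given_x(p_xy):
    """H(Y|X) = -sum_ij p(x_i, y_j) * log2 p(y_j | x_i), with rows indexed by x."""
    h = 0.0
    for row in p_xy:
        p_x = sum(row)                        # marginal p(x_i)
        for p in row:
            if p > 0.0:
                h -= p * math.log2(p / p_x)   # p(y_j|x_i) = p(x_i, y_j) / p(x_i)
    return h

table = [[1/2, 1/4],    # x = 0
         [0.0, 1/4]]    # x = 1
print(conditional_entropy_y_given_x(table))   # ≈ 0.689 bits/symbol
```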


Chain Rule:

    H(X,Y) = H(Y/X) + H(X)

H(Y/X) is the average additional information in Y when you already know X.

Example 10: Using the chain rule, find the conditional entropy for the joint distribution of Example 9:

    p(x, y)    y = 0    y = 1
    x = 0      1/2      1/4
    x = 1      0        1/4

Answer:
    H(Y/X) = H(X,Y) - H(X) = H(1/2, 1/4, 0, 1/4) - H(3/4, 1/4) = 1.5 - 0.8113 = 0.6887 bits/symbol
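A quick numeric check of the chain rule on the same table (illustrative sketch; the helper names are ours):

```python
import math

def h(probs):
    """Entropy in bits of a list of probabilities (zeros ignored)."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

p_xy = [1/2, 1/4, 0.0, 1/4]    # joint distribution of Examples 9 and 10
p_x  = [3/4, 1/4]              # marginal of X

print(h(p_xy) - h(p_x))        # H(Y|X) = 1.5 - 0.8113 ≈ 0.6887, matching Example 9
```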


Homework 6: A transmitter produces three symbols A, B, C which are related by the joint probabilities shown below:

    P(k):   P(A) = 11/30,   P(B) = 7/12,   P(C) = 1/20

    P(j/k)      j = A    j = B    j = C
    k = A       0        4/5      1/5
    k = B       1/2      1/2      0
    k = C       1/2      2/5      1/10

Calculate the joint probabilities and the average entropy of the given symbols.

Mutual Information

Mutual information I(X;Y): consider the set of transmitted symbols x1, x2, ..., xn. The channel output may be y1, y2, ..., ym. Theoretically, if the noise and jamming were zero, the set of y's would equal the set of x's and n = m. However, due to noise and jamming there is a conditional probability p(yj/xi).

[Channel diagram: transmitted symbols x1, x2, ..., xn pass through a noisy channel and are received as y1, y2, ..., ym]

Mutual information is the statistical average of I(xi; yj) over all pairs, i = 1, 2, ..., n, j = 1, 2, ..., m:

    I(X;Y) = Σ_{j=1}^{m} Σ_{i=1}^{n} p(xi, yj) I(xi; yj)
           = Σ_{j=1}^{m} Σ_{i=1}^{n} p(xi, yj) log2 [p(xi/yj) / p(xi)]
           = Σ_{j=1}^{m} Σ_{i=1}^{n} p(xi, yj) log2 [p(yj/xi) / p(yj)]    bits

Homework 7: Prove that
1. H(X,Y) = H(X) + H(Y/X).
2. I(X;Y) = H(X) - H(X/Y).


Mutual information I(X;Y) is a measure of the mutual dependence between two variables; it quantifies the amount of information obtained about one random variable from the other.

Example 11: Let X be the data transmitted and Y the data received after passing through the channel.

[Diagram: X → noisy channel p(Y|X) → Y]

• H(X) is the information generated by the source and then sent.
• H(Y) is the received data: a mix of the part of H(X) that came from the transmitter with H(Y|X), the noise from the channel.
• H(Y|X) is the noise added by the channel.
• H(X|Y) is the data lost in the channel that did not arrive.
• I(X;Y) is the intersection between the transmitted and received data, i.e., the data that survived the channel.

    I(X;Y) = H(X) - H(X|Y) = H(X) + H(Y) - H(X,Y)
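A sketch (illustrative, not from the slides) that computes I(X;Y) = H(X) + H(Y) - H(X,Y) from a joint table, here the table used in Examples 9 and 10:

```python
import math

def mutual_information(p_xy):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a joint table given as a list of rows."""
    def h(probs):
        return -sum(p * math.log2(p) for p in probs if p > 0.0)
    p_x = [sum(row) for row in p_xy]           # marginal over columns
    p_y = [sum(col) for col in zip(*p_xy)]     # marginal over rows
    return h(p_x) + h(p_y) - h([p for row in p_xy for p in row])

print(mutual_information([[1/2, 1/4],
                          [0.0, 1/4]]))        # ≈ 0.3113 bits
```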


Example 12 (joint entropy): Let p(x, y) be given by

    p(X, Y)    Y1        Y2
    X1         0.5       0.25
    X2         0         0.125
    X3         0.0625    0.0625

Find:
(a) The marginal entropies H(X) and H(Y).
(b) The system entropy H(X, Y).
(c) The noise and losses entropies H(Y|X) and H(X|Y).
(d) The mutual information between x1 and y2.
(e) The transinformation I(X;Y).
(f) Draw the channel model.

Solution:
a. From p(X, Y):
P(x) = [0.75  0.125  0.125]
P(y) = [0.5625  0.4375]

    H(X) = -Σ_i p(xi) log2 p(xi),    H(Y) = -Σ_j p(yj) log2 p(yj)

    H(X) = -[0.75 log2 0.75 + 0.125 log2 0.125 + 0.125 log2 0.125] = 1.06127 bits/symbol
    H(Y) = -[0.5625 log2 0.5625 + 0.4375 log2 0.4375] = 0.9887 bits/symbol


b.  H(X,Y) = -Σ_{j} Σ_{i} p(xi, yj) log2 p(xi, yj) = 1.875 bits/symbol

c.  H(Y/X) = H(X,Y) - H(X) = 1.875 - 1.06127 = 0.81373 bits/symbol
    H(X/Y) = H(X,Y) - H(Y) = 1.875 - 0.9887 = 0.8863 bits/symbol

d.  I(x1; y2) = log2 [p(x1/y2) / p(x1)], and since p(x1/y2) = p(x1, y2) / p(y2),

    I(x1; y2) = log2 [p(x1, y2) / (p(x1) p(y2))] = log2 [0.25 / (0.75 × 0.4375)] = -0.3923 bits,

    which means that y2 gives ambiguity about x1.

e.  I(X;Y) = H(X) - H(X/Y) = 1.06127 - 0.8863 = 0.1749 bits/symbol

f.  To draw the channel model we must find the p(y/x) matrix from p(x, y):

    p(yj/xi) = p(xi, yj) / p(xi)


    p(Y/X)     Y1                    Y2
    X1         0.5/0.75 = 2/3        0.25/0.75 = 1/3
    X2         0                     0.125/0.125 = 1
    X3         0.0625/0.125 = 1/2    0.0625/0.125 = 1/2

(unit row summation)

[Channel model: X1 → Y1 with probability 2/3 and X1 → Y2 with 1/3; X2 → Y2 with probability 1; X3 → Y1 with 1/2 and X3 → Y2 with 1/2]
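The whole of Example 12 can be checked with a few lines (an illustrative sketch reusing the same formulas):

```python
import math

p_xy = [[0.5,    0.25],
        [0.0,    0.125],
        [0.0625, 0.0625]]

def h(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

p_x = [sum(row) for row in p_xy]              # [0.75, 0.125, 0.125]
p_y = [sum(col) for col in zip(*p_xy)]        # [0.5625, 0.4375]
h_x, h_y = h(p_x), h(p_y)                     # 1.0613, 0.9887
h_xy = h([p for row in p_xy for p in row])    # 1.875

print(h_xy - h_x)                             # H(Y|X) ≈ 0.8137
print(h_xy - h_y)                             # H(X|Y) ≈ 0.8863
print(h_x + h_y - h_xy)                       # I(X;Y) ≈ 0.1749
print(math.log2(0.25 / (0.75 * 0.4375)))      # I(x1; y2) ≈ -0.3923
```
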

Homework 8: For the channel model shown below, find:
1. H(X)
2. H(Y)
3. The noise and losses entropies.

[Channel diagram: inputs including X1 (with p(X1) = 0.6) and X3, outputs Y1, Y2, Y3; labelled transition probabilities include 0.8 (X1 → Y1), 0.7, and 0.1 (X3 → Y3)]


Exercise 3 (joint entropy): Let p(x, y) be given by

    p(X, Y)    y = 0    y = 1
    x = 0      1/3      1/3
    x = 1      0        1/3

Find
(a) H(X), H(Y).
(b) H(X | Y), H(Y | X).
(c) H(X, Y).
(d) H(Y) - H(Y | X).
(e) I(X; Y).
(f) Draw a Venn diagram for the quantities in (a) through (e).

Solution:
(a) H(X) = (2/3) log(3/2) + (1/3) log 3 = 0.918 bits = H(Y).
(b) H(X|Y) = (1/3) H(X|Y=0) + (2/3) H(X|Y=1) = 0.667 bits = H(Y|X).
(c) H(X,Y) = 3 × (1/3) log 3 = 1.585 bits.
(d) H(Y) - H(Y|X) = 0.251 bits.
(e) I(X;Y) = H(Y) - H(Y|X) = 0.251 bits.
(f) [Venn diagram illustrating the relationships among H(X), H(Y), H(X|Y), H(Y|X), H(X,Y), and I(X;Y)]
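Again as an illustrative sketch, the same quantities computed numerically confirm the answers above:

```python
import math

p_xy = [[1/3, 1/3],
        [0.0, 1/3]]

def h(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

p_x = [sum(row) for row in p_xy]             # [2/3, 1/3]
p_y = [sum(col) for col in zip(*p_xy)]       # [1/3, 2/3]
h_xy = h([p for row in p_xy for p in row])   # log2(3) ≈ 1.585

print(h(p_x), h(p_y))              # 0.918, 0.918
print(h_xy - h(p_y))               # H(X|Y) ≈ 0.667
print(h(p_y) - (h_xy - h(p_x)))    # I(X;Y) = H(Y) - H(Y|X) ≈ 0.251
```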
