Information Theory
WHERE WE ARE IN THE SYLLABUS
INTRODUCTION TO INFORMATION THEORY
1. The purpose of communication is to carry information-bearing baseband signals from one place to another over a communication channel.
2. Information theory is a branch of probability theory which deals with the mathematical modelling and analysis of communication systems rather than the physical channels.
3. Information theory was invented by scientists studying the statistical structure of electronic communication systems.
4. Information theory attempts to answer questions such as: how much information does a message carry?
IEEE SOCIETIES – BONUS (1)
IEEE Aerospace and Electronic Systems Society
IEEE Antennas & Propagation Society
IEEE Broadcast Technology Society
IEEE Circuits and Systems Society
IEEE Communications Society
IEEE Components, Packaging & Manufacturing Technology Society
IEEE Computational Intelligence Society
IEEE Computer Society
IEEE Consumer Electronics Society
IEEE Control Systems Society
IEEE Dielectrics & Electrical Insulation Society
IEEE Education Society
IEEE Electromagnetic Compatibility Society
IEEE Electron Devices Society
IEEE Engineering in Medicine and Biology Society
IEEE Geoscience and Remote Sensing Society
IEEE Industrial Electronics Society
IEEE Industry Applications Society
IEEE Information Theory Society
IEEE Instrumentation and Measurement Society
IEEE Intelligent Transportation Systems Society
IEEE Magnetics Society
IEEE Nuclear and Plasma Sciences Society
IEEE Oceanic Engineering Society
IEEE Photonics Society
IEEE Power & Energy Society
IEEE Power Electronics Society
IEEE SOCIETIES – BONUS (2)
IEEE Product Safety Engineering Society
IEEE Professional Communication Society
IEEE Reliability Society
IEEE Robotics and Automation Society
IEEE Signal Processing Society
IEEE Society on Social Implications of Technology
IEEE Systems, Man, and Cybernetics Society
IEEE Ultrasonics, Ferroelectrics, and Frequency Control Society
IEEE Vehicular Technology Society
IEEE INFORMATION THEORY SOCIETY
INFORMATION SOURCE
1. An information source may be viewed as an object that produces random events.
2. An information source can be analogue or discrete (digital).
3. A discrete information source has a finite number of symbols as possible outputs.
[Diagram: the source, modelled as a random generator with symbol set $X = \{x_1, x_2, x_3, \dots, x_N\}$ capable of generating N symbols, outputs a sequence of n symbols selected from X.]
CLASSIFICATION OF INFORMATION SOURCES
1. Information sources can be classified as having memory or being memoryless.
a) Memory source: the current symbol depends on previous symbols.
b) Memoryless source: each symbol produced is independent of previous symbols.
2. A Discrete Memoryless Source (DMS) produces symbols from a discrete, finite alphabet.
AMOUNT OF INFORMATION?
AMOUNT OF INFORMATION IN A DISCRETE MEMORY-LESS SYSTEM (DMS)
1. The amount of information in an event is related to its uncertainty.
a) Messages reporting events with a high probability of occurrence convey relatively little information.
b) If an event is certain, i.e., its probability of occurrence is one, it conveys zero information.
2. A mathematical measure of information should therefore satisfy the following axioms:
a) Information should be proportional to the uncertainty of an outcome.
b) Information contained in independent outcomes should add.
INFORMATION CONTENT OF A SYMBOL (1)
Assume a Discrete Memoryless Source (DMS) denoted by X and having an output alphabet $\{x_1, x_2, x_3, \dots, x_n\}$.
The information content of a symbol $x_i$ is defined as:
$$I(x_i) = \log_b \frac{1}{P(x_i)} = -\log_b P(x_i)$$
where $P(x_i)$ is the probability of occurrence of symbol $x_i$.
Characteristics of information content:
a) $I(x_i) = 0$ for $P(x_i) = 1$
b) $I(x_i) \geq 0$
c) $I(x_i) > I(x_j)$ if $P(x_i) < P(x_j)$
d) $I(x_i, x_j) = I(x_i) + I(x_j)$ if $x_i$ and $x_j$ are independent
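To make the definition concrete, here is a minimal Python sketch (the function name is illustrative; the example probabilities are the ones used in Worked Example 1 below):

```python
import math

def information_content(p, base=2):
    """Self-information I(x) = -log_b P(x) of a symbol with probability p."""
    if not 0 < p <= 1:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(p, base)

# Four-symbol source with probabilities 1/2, 1/4, 1/8, 1/8
for p in (1/2, 1/4, 1/8, 1/8):
    print(f"P = {p:.3f}  ->  I = {information_content(p):.0f} bits")
```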
INFORMATION CONTENT OF A SYMBOL (2)
1. $I(x_i) = \log_b \frac{1}{P(x_i)} = -\log_b P(x_i)$
2. The unit of information depends on the base b: b = 2 gives bits, b = e gives nats, and b = 10 gives hartleys.
INFORMATION CONTENT – WORKED EXAMPLE 1
A DMS X has four symbols $x_1, x_2, x_3, x_4$ with probabilities $P(x_1) = \tfrac{1}{2}$, $P(x_2) = \tfrac{1}{4}$, $P(x_3) = P(x_4) = \tfrac{1}{8}$. Find the information content of each symbol.
INFORMATION CONTENT – WORKED EXAMPLE 1 SOLUTION
$$I(x_i) = -\log_2 P(x_i)$$
$$I(x_1) = -\log_2 \tfrac{1}{2} = 1 \text{ bit}$$
$$I(x_2) = -\log_2 \tfrac{1}{4} = 2 \text{ bits}$$
$$I(x_3) = -\log_2 \tfrac{1}{8} = 3 \text{ bits}$$
$$I(x_4) = -\log_2 \tfrac{1}{8} = 3 \text{ bits}$$
INFORMATION CONTENT – WORKED EXAMPLE 2
A binary PCM system uses two equally likely levels, 0 and 1. Find the information content of each level.
INFORMATION CONTENT - EXAMPLE 2 SOLUTION
There are two levels in a binary PCM system, i.e., $x_0 = 0$ and $x_1 = 1$.
Since the levels occur with equal probability, $P(x_0) = P(x_1) = \tfrac{1}{2}$.
Therefore:
$$I(x_0) = -\log_2 \tfrac{1}{2} = 1 \text{ bit for level 0}$$
$$I(x_1) = -\log_2 \tfrac{1}{2} = 1 \text{ bit for level 1}$$
WHAT IS ENTROPY?
General definition
Entropy is a scientific concept as well as a
measurable physical property that is most
commonly associated with a state of
disorder, randomness, or uncertainty.
$$H(X) = -\sum_{i=1}^{m} P(x_i) \log_2 P(x_i) \quad \text{bits/symbol}$$
[Graph: entropy of a binary source versus symbol probability P, peaking at 1 bit when P = 1/2.]
(a) If $P \neq \tfrac{1}{2}$, H(X) takes the values indicated in the graph above.
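A minimal Python sketch of this formula (the function name is illustrative); the last two lines also preview the bounds discussed on the next slide:

```python
import math

def entropy(probs, base=2):
    """H(X) = sum of P(x) * log_b(1/P(x)); zero-probability terms contribute nothing."""
    return sum(p * math.log(1 / p, base) for p in probs if p > 0)

print(entropy([1.0, 0.0]))   # 0.0 -- a certain event carries no information
print(entropy([0.25] * 4))   # 2.0 -- equally likely symbols: log2(4) bits/symbol
print(entropy([0.5, 0.5]))   # 1.0 -- binary source at P = 1/2
```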
LOWER AND UPPER BOUNDS ON ENTROPY FOR M SYMBOLS
1. Lower bound: H(X) = 0 when only one symbol has probability P(xi) = 1 while P(xj) = 0 for all j ≠ i.
2. Upper bound: $H(X) = \log_2 m$ when $P(x_i) = \frac{1}{m}$ for all i (for a binary source, m = 2, this maximum is 1 bit).
Hence $0 \le H(X) \le \log_2 m$.
ENTROPY FOR INDEPENDENT SOURCES
$$P(x_i, x_j) = P(x_i)P(x_j)$$
Therefore
$$I(x_i x_j) = \log_b \frac{1}{P(x_i x_j)} = \log_b \frac{1}{P(x_i)P(x_j)} = \log_b \frac{1}{P(x_i)} + \log_b \frac{1}{P(x_j)} = I(x_i) + I(x_j)$$
INFORMATION RATE
$$R = rH(X) \quad \text{bits/sec}$$
where r is the symbol rate (symbols per second) and H(X) is the source entropy (bits per symbol).
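As a quick illustration (the source distribution and symbol rate below are illustrative values, not from the slides):

```python
import math

# R = r * H(X): symbol rate times entropy per symbol.
probs = [1/2, 1/4, 1/8, 1/8]                    # illustrative source distribution
H = sum(p * math.log2(1 / p) for p in probs)    # 1.75 bits/symbol
r = 2000                                        # illustrative symbol rate, symbols/sec
print(f"R = {r * H:.0f} bits/sec")              # R = 3500 bits/sec
```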
ENTROPY EXAMPLE 1
ENTROPY EXAMPLE 1 SOLUTION
INFORMATION RATE EXAMPLE 2
INFORMATION RATE EXAMPLE 2 SOLUTION
INFORMATION RATE EXAMPLE 3
A DMS emits four symbols with probabilities 1/8, 1/8, 3/8 and 3/8. Find the source entropy.
INFORMATION RATE EXAMPLE 3 SOLUTION
$$H(X) = \frac{1}{8}\log_2 8 + \frac{1}{8}\log_2 8 + \frac{3}{8}\log_2\frac{8}{3} + \frac{3}{8}\log_2\frac{8}{3} \approx 1.8 \text{ bits/symbol}$$
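The arithmetic can be checked with a few lines of Python:

```python
import math

probs = [1/8, 1/8, 3/8, 3/8]
H = -sum(p * math.log2(p) for p in probs)   # entropy in bits/symbol
print(round(H, 2))                          # 1.81
```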
ENTROPY AND INFORMATION RATE EXAMPLE
An analogue signal band-limited to 10 kHz is sampled at the Nyquist rate, and the samples are quantized into eight levels with probabilities 1/4, 1/5, 1/5, 1/10, 1/10, 1/20, 1/20 and 1/20. Find the entropy and the information rate of the source.
ENTROPY AND INFORMATION RATE EXAMPLE SOLUTION
Entropy:
$$H(X) = -\sum_{i=1}^{8} P(x_i) \log_2 P(x_i)$$
$$H(X) = \frac{1}{4}\log_2 4 + \frac{1}{5}\log_2 5 + \frac{1}{5}\log_2 5 + \frac{1}{10}\log_2 10 + \frac{1}{10}\log_2 10 + \frac{1}{20}\log_2 20 + \frac{1}{20}\log_2 20 + \frac{1}{20}\log_2 20 \approx 2.74 \text{ bits/symbol}$$
The Nyquist rate is $2f_m = 2 \times 10{,}000 = 20{,}000$ samples (symbols) per second.
The information rate is therefore $R = rH(X) = 20{,}000 \times 2.74 \approx 54{,}800$ bits/sec.
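Again, the result can be verified numerically:

```python
import math

probs = [1/4, 1/5, 1/5, 1/10, 1/10, 1/20, 1/20, 1/20]
H = -sum(p * math.log2(p) for p in probs)   # entropy in bits/symbol
r = 20_000                                  # Nyquist rate, symbols/sec
print(round(H, 2))                          # 2.74
print(round(r * H))                         # 54829 bits/sec
```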
DISCRETE MEMORYLESS CHANNEL
1. In terms of information theory, a communication channel is a path or medium through which symbols flow from a source to a destination.
2. A Discrete Memoryless Channel (DMC) is a statistical model of a channel with m input symbols and n output symbols.
3. The a priori probabilities of the input symbols are assumed to be known.
4. Each possible input/output path is represented by a conditional probability $P(y_j \mid x_i)$.
[Diagram: a DMC with inputs $x_1, \dots, x_m$, outputs $y_1, \dots, y_n$, and transition probabilities $P(y_j \mid x_i)$.]
CHANNEL MATRIX
The channel is described by the channel matrix whose element in row i and column j is $P(y_j \mid x_i)$. Since each input to the channel results in some output, each row of the matrix must sum to unity, i.e.
$$\sum_{j=1}^{n} P(y_j \mid x_i) = 1 \quad \text{for all } i$$
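A small Python sketch of this row-sum property (the matrix values are illustrative, not taken from the slides):

```python
# Each row of the channel matrix P(y_j | x_i) must sum to 1.
channel = [
    [0.9, 0.1],   # P(y1 | x1), P(y2 | x1)
    [0.2, 0.8],   # P(y1 | x2), P(y2 | x2)
]
assert all(abs(sum(row) - 1.0) < 1e-9 for row in channel)
```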
LOSSLESS CHANNEL
DETERMINISTIC CHANNEL
1. A deterministic channel is one whose channel matrix has only one non-zero element in each row.
2. Since each row must sum to unity, that single non-zero element must be 1.
3. When a given source symbol is sent, it is therefore certain which output symbol will be received; hence the name deterministic.
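For instance, one possible deterministic channel matrix for three inputs and two outputs (the shape and values are illustrative) is
$$P(Y \mid X) = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}$$
where each row contains a single 1, so the output is fully determined by the input.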
NOISELESS CHANNEL
BINARY SYMMETRIC CHANNEL
A binary symmetric channel (BSC) has two inputs ($x_1 = 0$, $x_2 = 1$) and two outputs ($y_1 = 0$, $y_2 = 1$).
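With crossover probability p (the probability that a transmitted bit is flipped in the channel), the BSC has the standard channel matrix
$$P(Y \mid X) = \begin{bmatrix} 1-p & p \\ p & 1-p \end{bmatrix}$$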