
Source Coding Techniques

1. Huffman Code.

2. Two-pass Huffman Code.

3. Lempel-Ziv Code.

4. Fano code.

5. Shannon Code.

6. Arithmetic Code.
Source Coding Techniques

1. Huffman Code.

2. Two-pass Huffman Code.

3. Lempel-Ziv Code.

4. Fano Code.

5. Shannon Code.

6. Arithmetic Code.
Source Coding Techniques
1. Huffman Code.

With the Huffman code, in the binary case, the two least
probable source output symbols are joined together,
resulting in a new message alphabet with one less symbol:

1 take the two smallest probabilities together: P(i) + P(j)

2 replace symbols i and j by the new symbol
3 go to 1, until only one symbol remains

Application examples: JPEG, MPEG, MP3
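The merge loop above can be sketched with a binary heap (a sketch, not the slides' construction verbatim; tie-breaking in the heap decides which of several equally good codes comes out). The five-symbol alphabet below is the one used in the worked example later in these slides:

```python
import heapq

def huffman(probs):
    """Build a binary Huffman code: repeatedly merge the two least
    probable entries until only one (the root) remains."""
    # heap entries: (probability, tie-breaker, {symbol: partial codeword})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # two smallest probabilities
        p2, _, c2 = heapq.heappop(heap)
        # prepend one bit to every codeword in each merged group
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

# the five-symbol source of the Huffman example below
probs = {"s0": 0.1, "s1": 0.2, "s2": 0.4, "s3": 0.2, "s4": 0.1}
code = huffman(probs)
avg_len = sum(probs[s] * len(w) for s, w in code.items())
```

Whatever the tie-breaking, any Huffman code for this source has average length 2.2, matching both Solution A and Solution B in the example.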


1. Huffman Code.

ADVANTAGES:
• uniquely decodable code
• smallest average codeword length

DISADVANTAGES:
• large code tables add complexity
• sensitive to channel errors
1. Huffman Code.

Huffman is not universal!


It is only valid for one particular type of source!

For COMPUTER DATA, data reduction must be:

lossless → no errors at reproduction


universal → effective for different types of data
Huffman Coding: Example

• Compute the Huffman code for the source shown

Source Symbol   Symbol Probability
sk              pk
s0              0.1
s1              0.2
s2              0.4
s3              0.2
s4              0.1

H(S) = (0.4) log2(1/0.4)
     + 2 × (0.2) log2(1/0.2)
     + 2 × (0.1) log2(1/0.1)
     = 2.12193 ≤ L
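As a quick check, the entropy of this source can be evaluated directly:

```python
import math

# probabilities of the five source symbols s0..s4
p = [0.1, 0.2, 0.4, 0.2, 0.1]

# H(S) = sum over k of p_k * log2(1 / p_k)
H = sum(pk * math.log2(1 / pk) for pk in p)
```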
Solution A
Source Stage I
Symbol
sk
s2 0.4

s1 0.2

s3 0.2

s0 0.1

s4 0.1
Solution A
Source Stage I Stage II
Symbol
sk
s2 0.4 0.4

s1 0.2 0.2

s3 0.2 0.2

s0 0.1 0.2

s4 0.1
Solution A
Source Stage I Stage II Stage III
Symbol
sk
s2 0.4 0.4 0.4

s1 0.2 0.2 0.4

s3 0.2 0.2 0.2

s0 0.1 0.2

s4 0.1
Solution A
Source Stage I Stage II Stage III Stage IV
Symbol
sk
s2 0.4 0.4 0.4 0.6

s1 0.2 0.2 0.4 0.4

s3 0.2 0.2 0.2

s0 0.1 0.2

s4 0.1
Solution A

Assigning 0 to one branch and 1 to the other at every merge, and
reading the bits back from the last stage, gives the codewords:

Source      Stage I   Stage II   Stage III   Stage IV   Code
Symbol sk
s2          0.4       0.4        0.4         0.6        00
s1          0.2       0.2        0.4         0.4        10
s3          0.2       0.2        0.2                    11
s0          0.1       0.2                               010
s4          0.1                                         011
Solution A Cont’d

H(S) = 2.12193

Source      Symbol           Code
Symbol sk   Probability pk   word ck
s0          0.1              010
s1          0.2              10
s2          0.4              00
s3          0.2              11
s4          0.1              011

L = 0.4 × 2 + 0.2 × 2 + 0.2 × 2 + 0.1 × 3 + 0.1 × 3 = 2.2

H(S) ≤ L < H(S) + 1

THIS IS NOT THE ONLY SOLUTION!


Another Solution B

Source      Stage I   Stage II   Stage III   Stage IV   Code
Symbol sk
s2          0.4       0.4        0.4         0.6        1
s1          0.2       0.2        0.4         0.4        01
s3          0.2       0.2        0.2                    000
s0          0.1       0.2                               0010
s4          0.1                                         0011
Another Solution B Cont’d

H(S) = 2.12193

Source      Symbol           Code
Symbol sk   Probability pk   word ck
s0          0.1              0010
s1          0.2              01
s2          0.4              1
s3          0.2              000
s4          0.1              0011

L = 0.4 × 1 + 0.2 × 2 + 0.2 × 3 + 0.1 × 4 + 0.1 × 4 = 2.2

H(S) ≤ L < H(S) + 1
What is the difference between the two solutions?

• They have the same average length
• They differ in the variance of the codeword length

  σ² = ∑k pk ( lk − L )²   (sum over k = 0, …, K−1)

• Solution A: σ² = 0.16
• Solution B: σ² = 1.36
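The two variances quoted above can be checked directly from the codeword lengths of Solutions A and B:

```python
# probabilities of s0..s4 and the codeword lengths of the two solutions
p = [0.1, 0.2, 0.4, 0.2, 0.1]
len_A = [3, 2, 2, 2, 3]   # codes 010, 10, 00, 11, 011
len_B = [4, 2, 1, 3, 4]   # codes 0010, 01, 1, 000, 0011

def stats(p, lengths):
    """Average codeword length and the variance around it."""
    L = sum(pk * lk for pk, lk in zip(p, lengths))
    var = sum(pk * (lk - L) ** 2 for pk, lk in zip(p, lengths))
    return L, var

LA, varA = stats(p, len_A)
LB, varB = stats(p, len_B)
```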
Source Coding Techniques

1. Huffman Code.

2. Two-pass Huffman Code.

3. Lempel-Ziv Code.

4. Fano Code.

5. Shannon Code.

6. Arithmetic Code.
Source Coding Techniques
2. Two-pass Huffman Code.

This method is used when the probabilities of the symbols in the
information source are unknown. We first estimate these
probabilities by counting the occurrences of the symbols in the
given message, and then we construct a Huffman code from the
estimates. This can be summarized in the following two passes.

Pass 1: measure the frequency of occurrence of each character in the message

Pass 2: build a Huffman code from the measured frequencies


Source Coding Techniques
2. Two-pass Huffman Code.

Example

Consider the message: ABABABABABACADABACADABACADABACAD
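The slide leaves the counting to the reader; pass 1 on this message gives the following estimates (a sketch, with pass 2 being ordinary Huffman coding as in the earlier example):

```python
from collections import Counter

message = "ABABABABABACADABACADABACADABACAD"

# Pass 1: count symbol occurrences and estimate probabilities
counts = Counter(message)
total = len(message)
probs = {sym: n / total for sym, n in counts.items()}
# A: 16/32 = 0.5, B: 8/32 = 0.25, C: 4/32 = 0.125, D: 4/32 = 0.125
```

With these estimates, one valid Huffman code for pass 2 is A → 0, B → 10, C → 110, D → 111.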

Source Coding Techniques

1. Huffman Code.

2. Two-pass Huffman Code.

3. Lempel-Ziv Code.

4. Fano Code.

5. Shannon Code.

6. Arithmetic Code.
Lempel-Ziv Coding
• Huffman coding requires knowledge of a
probabilistic model of the source
• This is not necessarily always feasible
• Lempel-Ziv code is an adaptive coding
technique that does not require prior
knowledge of symbol probabilities
• Lempel-Ziv coding is the basis of well-known
ZIP for data compression
Lempel-Ziv Coding History

• GIF, TIFF, V.42bis modem compression standard, PostScript Level 2

• 1977 published by Abraham Lempel and Jacob Ziv

• 1984 LZ-Welch algorithm published in IEEE Computer


• Sperry patent transferred to Unisys (1986)
• GIF file format required use of the LZW algorithm

• Universal

• Lossless
Lempel-Ziv Coding Example
0 0 0 1 0 1 1 1 0 0 1 0 1 0 0 1 0 1…

Codebook 1 2 3 4 5 6 7 8 9
Index

Subsequence 0 1

Representation

Encoding
Lempel-Ziv Coding Example
0 0 0 1 0 1 1 1 0 0 1 0 1 0 0 1 0 1…

Codebook 1 2 3 4 5 6 7 8 9
Index

Subsequence 0 1 00

Representation

Encoding
Lempel-Ziv Coding Example
0 0 0 1 0 1 1 1 0 0 1 0 1 0 0 1 0 1…

Codebook 1 2 3 4 5 6 7 8 9
Index

Subsequence 0 1 00 01

Representation

Encoding
Lempel-Ziv Coding Example
0 0 0 1 0 1 1 1 0 0 1 0 1 0 0 1 0 1…

Codebook 1 2 3 4 5 6 7 8 9
Index

Subsequence 0 1 00 01 011

Representation

Encoding
Lempel-Ziv Coding Example
0 0 0 1 0 1 1 1 0 0 1 0 1 0 0 1 0 1…

Codebook 1 2 3 4 5 6 7 8 9
Index

Subsequence 0 1 00 01 011 10

Representation

Encoding
Lempel-Ziv Coding Example
0 0 0 1 0 1 1 1 0 0 1 0 1 0 0 1 0 1…

Codebook 1 2 3 4 5 6 7 8 9
Index

Subsequence 0 1 00 01 011 10 010

Representation

Encoding
Lempel-Ziv Coding Example
0 0 0 1 0 1 1 1 0 0 1 0 1 0 0 1 0 1…

Codebook 1 2 3 4 5 6 7 8 9
Index

Subsequence 0 1 00 01 011 10 010 100

Representation

Encoding
Lempel-Ziv Coding Example
0 0 0 1 0 1 1 1 0 0 1 0 1 0 0 1 0 1…

Codebook 1 2 3 4 5 6 7 8 9
Index

Subsequence 0 1 00 01 011 10 010 100 101

Representation

Encoding
Lempel-Ziv Coding Example

Information bits:     0 0 0 1 0 1 1 1 0 0 1 0 1 0 0 1 0 1…
Source encoded bits:  0010 0011 1001 0100 1000 1100 1101

Codebook index    1   2   3    4    5     6    7     8     9
Subsequence       0   1   00   01   011   10   010   100   101
Representation            11   12   42    21   41    61    62
Source code               0010 0011 1001  0100 1000  1100  1101
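A sketch of the parsing that fills this table: the codebook is seeded with the single bits 0 and 1, and each new subsequence is the shortest prefix of the remaining input not yet in the codebook. Each is encoded as the index of its prefix (3 bits suffice for this example) followed by its last bit; names below are illustrative:

```python
def lz_parse(bits):
    """Parse `bits` into subsequences not yet in the codebook and
    encode each one as (prefix index in 3 bits) + (last bit)."""
    book = ["0", "1"]            # codebook seeded with the single bits
    encoded = []
    i = 0
    while i < len(bits):
        # grow the current subsequence until it is new
        j = i + 1
        while j <= len(bits) and bits[i:j] in book:
            j += 1
        if j > len(bits):        # tail already in the codebook: stop
            break
        sub = bits[i:j]          # shortest new subsequence
        prefix_index = book.index(sub[:-1]) + 1   # 1-based index
        encoded.append(f"{prefix_index:03b}" + sub[-1])
        book.append(sub)
        i = j
    return book, encoded

book, encoded = lz_parse("000101110010100101")
```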


How Come this is Compression?!
• The hope is:
  • If the bit sequence is long enough, eventually the
    fixed-length code words will be shorter than the
    subsequences they represent.
• When applied to English text:
  • Lempel-Ziv achieves approximately 55% compression
  • Huffman coding achieves approximately 43% compression
Encoding idea: Lempel-Ziv-Welch (LZW)

Assume we have just read a segment w from the text,
and a is the next symbol.

If wa is not in the dictionary:

  • write the index of w in the output file,
  • add wa to the dictionary, and set w ← a.

If wa is in the dictionary:
  • process the next symbol with segment wa.
LZ Encoding example
• address 0: a   address 1: b   address 2: c

Input string: a a b a a c a b c a b c b

Read so far              Action                                    Output   Update
a a                      aa not in dictionary: output 0, add aa    0        aa → 3
a a b                    continue with a; ab not in dictionary     0        ab → 4
a a b a                  continue with b; ba not in dictionary     1        ba → 5
a a b a a c              aa in dictionary, aac not                 3        aac → 6
a a b a a c a            ca not in dictionary                      2        ca → 7
a a b a a c a b c        ab in dictionary, abc not                 4        abc → 8
a a b a a c a b c a b    ca in dictionary, cab not                 7        cab → 9

aabaacabcabcb → LZ Encoder → 0013247
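A sketch of the LZW encoder for this example (the slide's table stops after the output index 7; encoding the remaining tail of the string emits a few more indices, including a flush of the final segment):

```python
def lzw_encode(text, alphabet):
    """LZW: grow segment w while w+a is in the dictionary; otherwise
    emit the index of w, add w+a, and restart from a."""
    dictionary = {s: i for i, s in enumerate(alphabet)}
    out = []
    w = text[0]
    for a in text[1:]:
        if w + a in dictionary:
            w = w + a                           # keep extending the segment
        else:
            out.append(dictionary[w])           # emit index of known segment
            dictionary[w + a] = len(dictionary) # new dictionary entry
            w = a
    out.append(dictionary[w])                   # flush the final segment
    return out, dictionary

out, d = lzw_encode("aabaacabcabcb", "abc")
```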


UNIVERSAL (LZW) decoder

1. Start with the basic symbol set.

2. Read a code c from the compressed file.
   - The address c in the dictionary determines the segment w.
   - Write w to the output file.

3. Add wa to the dictionary, where a is the first letter of the next segment.
LZ Decoding example
• address 0: a   address 1: b   address 2: c

Input   Decoded so far         Update
0       a                      output a; next letter still unknown
0       a a                    output a determines the unknown = a, update aa → 3
1       a a b                  output 1 determines b, update ab → 4
3       a a b a a              ba → 5
2       a a b a a c            aac → 6
4       a a b a a c a b        ca → 7
7       a a b a a c a b c a    abc → 8

0013247 → LZ Decoder → aabaacabcabcb


Exercise
1. Find Huffman code for the following source
Symbol Probability
h 0.1
e 0.1
l 0.4
o 0.25
w 0.05
r 0.05
d 0.05

2. Find LZ code for the following input

0011001111010100010001001
Source Coding Techniques

1. Huffman Code.

2. Two-pass Huffman Code.

3. Lempel-Ziv Code.

4. Fano Code.

5. Shannon Code.

6. Arithmetic Code.
4. Fano Code.

The Fano code is performed as follows:

1. arrange the information source symbols in order of

decreasing probability

2. divide the symbols into two groups of probabilities
as nearly equal as possible

3. each group receives one of the binary symbols

(i.e. 0 or 1) as the first symbol

4. repeat steps 2 and 3 per group as many times as possible

5. stop when there are no more groups to divide
4. Fano Code.
Example 1: 1. arrange the information source symbols in order of
decreasing probability

Symbol Probability Fano Code


A 1/4
B 1/4
C 1/8
D 1/8
E 1/16
F 1/16
G 1/32
H 1/32
I 1/32
J 1/32
4. Fano Code.
Example 1: 2. divide the symbols into two groups of probabilities
as nearly equal as possible

Symbol Probability Fano Code


A 1/4
B 1/4
C 1/8
D 1/8
E 1/16
F 1/16
G 1/32
H 1/32
I 1/32
J 1/32
4. Fano Code.
Example 1: 3. each group receives one of the binary symbols
(i.e. 0 or 1) as the first symbol

Symbol Probability Fano Code


A 1/4 0
B 1/4 0
C 1/8 1
D 1/8 1
E 1/16 1
F 1/16 1
G 1/32 1
H 1/32 1
I 1/32 1
J 1/32 1
4. Fano Code.
Example 1: 4. repeat steps 2 and 3 per group as many times
as possible

Symbol Probability Fano Code


A 1/4 0
B 1/4 0
C 1/8 1
D 1/8 1
E 1/16 1
F 1/16 1
G 1/32 1
H 1/32 1
I 1/32 1
J 1/32 1
4. Fano Code.
Example 1: 2. divide the symbols into two groups of probabilities
as nearly equal as possible

Symbol Probability Fano Code


A 1/4 0
B 1/4 0
C 1/8 1
D 1/8 1
E 1/16 1
F 1/16 1
G 1/32 1
H 1/32 1
I 1/32 1
J 1/32 1
4. Fano Code.
Example 1: 3. each group receives one of the binary symbols
(i.e. 0 or 1) as the first symbol

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1
D 1/8 1
E 1/16 1
F 1/16 1
G 1/32 1
H 1/32 1
I 1/32 1
J 1/32 1
4. Fano Code.
Example 1:

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1
D 1/8 1
E 1/16 1
F 1/16 1
G 1/32 1
H 1/32 1
I 1/32 1
J 1/32 1
4. Fano Code.
Example 1: 2. divide the symbols into two groups of probabilities
as nearly equal as possible

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1
D 1/8 1
E 1/16 1
F 1/16 1
G 1/32 1
H 1/32 1
I 1/32 1
J 1/32 1
4. Fano Code.
Example 1: 3. each group receives one of the binary symbols
(i.e. 0 or 1) as the first symbol

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1 0
D 1/8 1 0
E 1/16 1 1
F 1/16 1 1
G 1/32 1 1
H 1/32 1 1
I 1/32 1 1
J 1/32 1 1
4. Fano Code.
Example 1: 4. repeat steps 2 and 3 per group as many times
as possible

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1 0
D 1/8 1 0
E 1/16 1 1
F 1/16 1 1
G 1/32 1 1
H 1/32 1 1
I 1/32 1 1
J 1/32 1 1
4. Fano Code.
Example 1: 2. divide the symbols into two groups of probabilities
as nearly equal as possible

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1 0
D 1/8 1 0
E 1/16 1 1
F 1/16 1 1
G 1/32 1 1
H 1/32 1 1
I 1/32 1 1
J 1/32 1 1
4. Fano Code.
Example 1: 3. each group receives one of the binary symbols
(i.e. 0 or 1) as the first symbol

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1 0 0
D 1/8 1 0 1
E 1/16 1 1
F 1/16 1 1
G 1/32 1 1
H 1/32 1 1
I 1/32 1 1
J 1/32 1 1
4. Fano Code.
Example 1:

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1 0 0
D 1/8 1 0 1
E 1/16 1 1
F 1/16 1 1
G 1/32 1 1
H 1/32 1 1
I 1/32 1 1
J 1/32 1 1
4. Fano Code.
Example 1: 2. divide the symbols into two groups of probabilities
as nearly equal as possible

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1 0 0
D 1/8 1 0 1
E 1/16 1 1
F 1/16 1 1
G 1/32 1 1
H 1/32 1 1
I 1/32 1 1
J 1/32 1 1
4. Fano Code.
Example 1: 3. each group receives one of the binary symbols
(i.e. 0 or 1) as the first symbol

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1 0 0
D 1/8 1 0 1
E 1/16 1 1 0
F 1/16 1 1 0
G 1/32 1 1 1
H 1/32 1 1 1
I 1/32 1 1 1
J 1/32 1 1 1
4. Fano Code.
Example 1: 2. divide the symbols into two groups of probabilities
as nearly equal as possible

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1 0 0
D 1/8 1 0 1
E 1/16 1 1 0
F 1/16 1 1 0
G 1/32 1 1 1
H 1/32 1 1 1
I 1/32 1 1 1
J 1/32 1 1 1
4. Fano Code.
Example 1: 3. each group receives one of the binary symbols
(i.e. 0 or 1) as the first symbol

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1 0 0
D 1/8 1 0 1
E 1/16 1 1 0 0
F 1/16 1 1 0 1
G 1/32 1 1 1
H 1/32 1 1 1
I 1/32 1 1 1
J 1/32 1 1 1
4. Fano Code.
Example 1: 2. divide the symbols into two groups of probabilities
as nearly equal as possible

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1 0 0
D 1/8 1 0 1
E 1/16 1 1 0 0
F 1/16 1 1 0 1
G 1/32 1 1 1
H 1/32 1 1 1
I 1/32 1 1 1
J 1/32 1 1 1
4. Fano Code.
Example 1: 3. each group receives one of the binary symbols
(i.e. 0 or 1) as the first symbol

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1 0 0
D 1/8 1 0 1
E 1/16 1 1 0 0
F 1/16 1 1 0 1
G 1/32 1 1 1 0
H 1/32 1 1 1 0
I 1/32 1 1 1 1
J 1/32 1 1 1 1
4. Fano Code.
Example 1: 2. divide the symbols into two groups of probabilities
as nearly equal as possible

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1 0 0
D 1/8 1 0 1
E 1/16 1 1 0 0
F 1/16 1 1 0 1
G 1/32 1 1 1 0
H 1/32 1 1 1 0
I 1/32 1 1 1 1
J 1/32 1 1 1 1
4. Fano Code.
Example 1: 3. each group receives one of the binary symbols
(i.e. 0 or 1) as the first symbol

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1 0 0
D 1/8 1 0 1
E 1/16 1 1 0 0
F 1/16 1 1 0 1
G 1/32 1 1 1 0 0
H 1/32 1 1 1 0 1
I 1/32 1 1 1 1
J 1/32 1 1 1 1
4. Fano Code.
Example 1: 2. divide the symbols into two groups of probabilities
as nearly equal as possible

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1 0 0
D 1/8 1 0 1
E 1/16 1 1 0 0
F 1/16 1 1 0 1
G 1/32 1 1 1 0 0
H 1/32 1 1 1 0 1
I 1/32 1 1 1 1
J 1/32 1 1 1 1
4. Fano Code.
Example 1: 3. each group receives one of the binary symbols
(i.e. 0 or 1) as the first symbol

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1 0 0
D 1/8 1 0 1
E 1/16 1 1 0 0
F 1/16 1 1 0 1
G 1/32 1 1 1 0 0
H 1/32 1 1 1 0 1
I 1/32 1 1 1 1 0
J 1/32 1 1 1 1 1
4. Fano Code.
Example 1:
5. stop when there are no more groups to divide

Symbol Probability Fano Code


A 1/4 0 0
B 1/4 0 1
C 1/8 1 0 0
D 1/8 1 0 1
E 1/16 1 1 0 0
F 1/16 1 1 0 1
G 1/32 1 1 1 0 0
H 1/32 1 1 1 0 1
I 1/32 1 1 1 1 0
J 1/32 1 1 1 1 1
4. Fano Code.

Note that: if it is not possible to divide the probabilities
into exactly equally probable groups, we should make the
division as close to equal as possible, as we can see from
the following example.

Example 2:

Symbol   Probability   Fano Code
T        1/3           00
U        1/3           01
V        1/9           10
W        1/9           110
X        1/9           111
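The procedure above is easy to sketch recursively. The split below minimizes the probability imbalance between the two groups, with ties broken toward the earlier split; when several splits are equally good (as in Example 2) the result can differ from the slides, but on Example 1 it reproduces the codes exactly:

```python
def fano(symbols):
    """symbols: list of (name, probability), sorted by decreasing
    probability. Returns {name: codeword}."""
    codes = {name: "" for name, _ in symbols}

    def split(group):
        if len(group) < 2:
            return
        total = sum(p for _, p in group)
        # choose the split point with the smallest probability imbalance
        best, run = None, 0.0
        for i in range(1, len(group)):
            run += group[i - 1][1]
            diff = abs(run - (total - run))
            if best is None or diff < best[0]:
                best = (diff, i)
        k = best[1]
        for name, _ in group[:k]:   # upper group gets a 0
            codes[name] += "0"
        for name, _ in group[k:]:   # lower group gets a 1
            codes[name] += "1"
        split(group[:k])
        split(group[k:])

    split(symbols)
    return codes

src = [("A", 1/4), ("B", 1/4), ("C", 1/8), ("D", 1/8), ("E", 1/16),
       ("F", 1/16), ("G", 1/32), ("H", 1/32), ("I", 1/32), ("J", 1/32)]
codes = fano(src)
```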
Source Coding Techniques

1. Huffman Code.

2. Two-pass Huffman Code.

3. Lempel-Ziv Code.

4. Fano Code.

5. Shannon Code.

6. Arithmetic Code.
5. Shannon Code.

The Shannon code is performed as follows:

1. calculate the series of cumulative probabilities
   qk = p1 + p2 + … + pk−1 ,  k = 1, 2, …, n  (so q1 = 0)

2. calculate the code length for each symbol using
   log2(1/pk) ≤ lk < log2(1/pk) + 1

3. write qk in the form c1 2^−1 + c2 2^−2 + … + clk 2^−lk, where each ci
   is either 0 or 1; the codeword is c1 c2 … clk

5. Shannon Code.
Example 3: 1. calculate the series of cumulative probabilities

Symbol   Probability   qk
A        1/4           0
B        1/4           1/4
C        1/8           1/2
D        1/8           5/8
E        1/16          3/4
F        1/16          13/16
G        1/32          7/8
H        1/32          29/32
I        1/32          15/16
J        1/32          31/32
5. Shannon Code.
Example 3: 2. calculate the code length for each symbol using
log2(1/pk) ≤ lk < log2(1/pk) + 1

Symbol   Probability   qk
A        1/4           0
B        1/4           1/4
C        1/8           1/2
D        1/8           5/8
E        1/16          3/4
F        1/16          13/16
G        1/32          7/8
H        1/32          29/32
I        1/32          15/16
J        1/32          31/32
5. Shannon Code.
Example 3: 2. calculate the code length for each symbol using
log2(1/pk) ≤ lk < log2(1/pk) + 1

For symbol A:  log2(1/(1/4)) ≤ l1 < log2(1/(1/4)) + 1,  i.e.  2 ≤ l1 < 2 + 1

Symbol   Probability   qk
A        1/4           0
B        1/4           1/4
C        1/8           1/2
D        1/8           5/8
E        1/16          3/4
F        1/16          13/16
G        1/32          7/8
H        1/32          29/32
I        1/32          15/16
J        1/32          31/32
5. Shannon Code.
Example 3: 2. calculate the code length for each symbol using
log2(1/pk) ≤ lk < log2(1/pk) + 1

2 ≤ l1 < 2 + 1, so l1 = 2

Symbol   Probability   qk       Length lk
A        1/4           0        2
B        1/4           1/4
C        1/8           1/2
D        1/8           5/8
E        1/16          3/4
F        1/16          13/16
G        1/32          7/8
H        1/32          29/32
I        1/32          15/16
J        1/32          31/32
5. Shannon Code.
Example 3: 2. calculate the code length for each symbol using
log2(1/pk) ≤ lk < log2(1/pk) + 1

Symbol   Probability   qk       Length lk
A        1/4           0        2
B        1/4           1/4      2
C        1/8           1/2      3
D        1/8           5/8      3
E        1/16          3/4      4
F        1/16          13/16    4
G        1/32          7/8      5
H        1/32          29/32    5
I        1/32          15/16    5
J        1/32          31/32    5
5. Shannon Code.
Example 3:
3. write qk in the form c1 2^−1 + c2 2^−2 + … + clk 2^−lk, where each ci is either 0 or 1

Symbol   Probability   qk       Length lk
A        1/4           0        2
B        1/4           1/4      2
C        1/8           1/2      3
D        1/8           5/8      3
E        1/16          3/4      4
F        1/16          13/16    4
G        1/32          7/8      5
H        1/32          29/32    5
I        1/32          15/16    5
J        1/32          31/32    5
5. Shannon Code.
Example 3:
3. write qk in the form c1 2^−1 + c2 2^−2 + … + clk 2^−lk, where each ci is either 0 or 1

For symbol A (l1 = 2):  qk = c1 2^−1 + c2 2^−2
                        0  = c1 2^−1 + c2 2^−2
so c1 = 0, c2 = 0.

Symbol   Probability   qk       Length lk   Shannon Code
A        1/4           0        2
B        1/4           1/4      2
C        1/8           1/2      3
D        1/8           5/8      3
E        1/16          3/4      4
F        1/16          13/16    4
G        1/32          7/8      5
H        1/32          29/32    5
I        1/32          15/16    5
J        1/32          31/32    5
5. Shannon Code.
Example 3:
3. write qk in the form c1 2^−1 + c2 2^−2 + … + clk 2^−lk, where each ci is either 0 or 1

For A:  0 = c1 2^−1 + c2 2^−2  →  c1 = 0, c2 = 0

Symbol   Probability   qk       Length lk   Shannon Code
A        1/4           0        2           00
B        1/4           1/4      2
C        1/8           1/2      3
D        1/8           5/8      3
E        1/16          3/4      4
F        1/16          13/16    4
G        1/32          7/8      5
H        1/32          29/32    5
I        1/32          15/16    5
J        1/32          31/32    5
5. Shannon Code.
Example 3:
3. write qk in the form c1 2^−1 + c2 2^−2 + … + clk 2^−lk, where each ci is either 0 or 1

Symbol   Probability   qk       Length lk   Shannon Code
A        1/4           0        2           00
B        1/4           1/4      2           01
C        1/8           1/2      3           100
D        1/8           5/8      3           101
E        1/16          3/4      4           1100
F        1/16          13/16    4           1101
G        1/32          7/8      5           11100
H        1/32          29/32    5           11101
I        1/32          15/16    5           11110
J        1/32          31/32    5           11111
5. Shannon Code.
Example 3:

Symbol Probability qk Length li Shannon Code


A 1/4 0 2 00
B 1/4 1/4 2 01
C 1/8 1/2 3 100
D 1/8 5/8 3 101
E 1/16 3/4 4 1100
F 1/16 13/16 4 1101
G 1/32 7/8 5 11100
H 1/32 29/32 5 11101
I 1/32 15/16 5 11110
J 1/32 31/32 5 11111
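The three steps can be sketched directly; `Fraction` keeps the cumulative probabilities exact, so the binary expansion of each qk comes out bit for bit as in the table (a sketch; variable names are illustrative):

```python
from fractions import Fraction
from math import ceil, log2

def shannon_code(probs):
    """probs: list of (symbol, probability) in decreasing order.
    Returns {symbol: codeword} by Shannon's construction."""
    codes = {}
    q = Fraction(0)                      # cumulative probability q_k
    for sym, p in probs:
        l = ceil(log2(1 / p))            # log2(1/p) <= l < log2(1/p) + 1
        # first l bits of the binary expansion of q_k
        bits, frac = "", q
        for _ in range(l):
            frac *= 2
            bits += "1" if frac >= 1 else "0"
            frac -= int(frac)
        codes[sym] = bits
        q += p
    return codes

src = [("A", Fraction(1, 4)), ("B", Fraction(1, 4)), ("C", Fraction(1, 8)),
       ("D", Fraction(1, 8)), ("E", Fraction(1, 16)), ("F", Fraction(1, 16)),
       ("G", Fraction(1, 32)), ("H", Fraction(1, 32)), ("I", Fraction(1, 32)),
       ("J", Fraction(1, 32))]
codes = shannon_code(src)
```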
5. Shannon Code.

Note that: from examples 1 and 3 one may conclude that Fano
coding and Shannon coding produce the same code; however,
this is not true in general, as we can see from the
following example.

Example

Symbol   Probability   qk    Length li   Shannon Code   Fano Code
W        0.4           0     2           00             0
X        0.3           0.4   2           01             10
Y        0.2           0.7   3           101            110
Z        0.1           0.9   4           1110           111
Source Coding Techniques

1. Huffman Code.

2. Two-pass Huffman Code.

3. Lempel-Ziv Code.

4. Fano Code.

5. Shannon Code.

6. Arithmetic Code.
6. Arithmetic Code.
Coding

In arithmetic coding a message is encoded as a number from the
interval [0, 1).

The number is found by narrowing the interval according to the
probability of the currently processed letter of the message
being encoded.

This is done by using a set of interval ranges IR determined by the
probabilities of the information source as follows:

IR = { [0, p1), [p1, p1+p2), [p1+p2, p1+p2+p3), … , [p1+…+pn−1, p1+…+pn) }

Putting qj = p1 + p2 + … + pj we can write IR = { [0, q1), [q1, q2), … , [qn−1, 1) }

In arithmetic coding these subintervals also determine the proportional
division of any other interval [L, R) contained in [0, 1) into subintervals
IR[L,R) as follows:
6. Arithmetic Code.
Coding

In arithmetic coding these subintervals also determine the proportional
division of any other interval [L, R) contained in [0, 1) into subintervals
IR[L,R) as follows:

IR[L,R) = { [L, L+(R−L)q1), [L+(R−L)q1, L+(R−L)q2), … , [L+(R−L)qn−1, R) }

Using these definitions, arithmetic encoding is determined by the
following algorithm:

ArithmeticEncoding ( Message )
1. CurrentInterval = [0, 1);
   While the end of the message is not reached
2.   Read letter xi from the message;
3.   Divide CurrentInterval into subintervals IR_CurrentInterval;
4.   CurrentInterval = the subinterval of IR_CurrentInterval corresponding to xi;
   Output any number from CurrentInterval (usually its left boundary);

This output number uniquely encodes the input message.

6. Arithmetic Code.
Coding

Example: Consider the information source

A     B     C     #
0.4   0.3   0.1   0.2

Then the input message ABBC# has the unique encoding number
0.23608, as explained in the next slides.

6. Arithmetic Code.
Coding 2. Read Xi

Example
input message: A B B C #

1. CurrentInterval = [0, 1);

Xi Current interval Subintervals


A [0, 1)
6. Arithmetic Code.
Coding 2. Read Xi

Example
input message: A B B C #

3. Divide CurrentInterval into subintervals IR_CurrentInterval;

Xi Current interval Subintervals


A [0, 1)
IR[0,1) = { [0, 0.4), [0.4, 0.7), [0.7, 0.8), [0.8, 1) }

using qj = p1 + … + pj and subintervals [L+(R−L)qi, L+(R−L)qi+1)

6. Arithmetic Code.
Coding 2. Read Xi

Example
input message: A B B C #

3. Divide CurrentInterval into subintervals IR_CurrentInterval;

Xi Current interval Subintervals


A [0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1)

IR[0,1)= {
[0, 0.4) , [0.4, 0.7),
[0.7, 0.8), [0.8, 1)
}
6. Arithmetic Code.
Coding

Example
input message: A B B C #
No. 1

A B C #
0.4 0.3 0.1 0.2

Xi Current interval Subintervals


A [0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1)
[0, 0.4)
6. Arithmetic Code.
Coding 2. Read Xi

Example
input message: A B B C #

3. Divide CurrentInterval into subintervals IR_CurrentInterval;

Xi Current interval Subintervals


A [0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1)
B [0, 0.4)
IR[0,0.4) = { [0, 0.16), [0.16, 0.28), [0.28, 0.32), [0.32, 0.4) }

using qj = p1 + … + pj and subintervals [L+(R−L)qi, L+(R−L)qi+1)


6. Arithmetic Code.
Coding 2. Read Xi

Example
input message: A B B C #

3. Divide CurrentInterval into subintervals IR_CurrentInterval;

Xi Current interval Subintervals


A [0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1)
B [0, 0.4) [0, 0.16) , [0.16, 0.28), [0.28, 0.32), [0.32, 0.4)

IR[0,0.4)= {
[0, 0.16) , [0.16, 0.28),
[0.28, 0.32), [0.32, 0.4)
}
6. Arithmetic Code.
Coding

Example
input message: A B B C #
No. 2

A B C #
0.4 0.3 0.1 0.2
Xi Current interval Subintervals
A [0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1)
B [0, 0.4) [0, 0.16) , [0.16, 0.28), [0.28, 0.32), [0.32, 0.4)
[0.16, 0.28)
6. Arithmetic Code.
Coding 2. Read Xi

Example
input message: A B B C #

3. Divide CurrentInterval into subintervals IR_CurrentInterval;

Xi Current interval Subintervals


A [0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1)
B [0, 0.4) [0, 0.16) , [0.16, 0.28), [0.28, 0.32), [0.32, 0.4)
B [0.16, 0.28) [ 0.16, 0.208) , [0.208, 0.244), [0.244, 0.256), [0.256, 0.28)
6. Arithmetic Code.
Coding

Example
input message: A B B C #
No. 2

A B C #
0.4 0.3 0.1 0.2
Xi Current interval Subintervals
A [0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1)
B [0, 0.4) [0, 0.16) , [0.16, 0.28), [0.28, 0.32), [0.32, 0.4)
B [0.16, 0.28) [ 0.16, 0.208) , [0.208, 0.244), [0.244, 0.256), [0.256, 0.28)

[0.208, 0.244)
6. Arithmetic Code.
Coding 2. Read Xi

Example
input message: A B B C #

3. Divide CurrentInterval into subintervals IR_CurrentInterval;

Xi Current interval Subintervals


A [0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1)
B [0, 0.4) [0, 0.16) , [0.16, 0.28), [0.28, 0.32), [0.32, 0.4)
B [0.16, 0.28) [ 0.16, 0.208) , [0.208, 0.244), [0.244, 0.256), [0.256, 0.28)

C [0.208, 0.244) [0.208, 0.2224) , [0.2224, 0.2332), [0.2332, 0.2368), [0.2368, 0.244)
6. Arithmetic Code.
Coding

Example
input message: A B B C #
No. 3

A B C #
0.4 0.3 0.1 0.2
Xi Current interval Subintervals
A [0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1)
B [0, 0.4) [0, 0.16) , [0.16, 0.28), [0.28, 0.32), [0.32, 0.4)
B [0.16, 0.28) [ 0.16, 0.208) , [0.208, 0.244), [0.244, 0.256), [0.256, 0.28)

C [0.208, 0.244) [0.208, 0.2224) , [0.2224, 0.2332), [0.2332, 0.2368), [0.2368, 0.244)

[0.2332, 0.2368)
6. Arithmetic Code.
Coding 2. Read Xi

Example
input message: A B B C #

3. Divide CurrentInterval into subintervals IR_CurrentInterval;

Xi Current interval Subintervals


A [0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1)
B [0, 0.4) [0, 0.16) , [0.16, 0.28), [0.28, 0.32), [0.32, 0.4)
B [0.16, 0.28) [ 0.16, 0.208) , [0.208, 0.244), [0.244, 0.256), [0.256, 0.28)

C [0.208, 0.244) [0.208, 0.2224) , [0.2224, 0.2332), [0.2332, 0.2368), [0.2368, 0.244)

# [0.2332, 0.2368) [0.2332, 0.23464) , [0.23464, 0.23572), [0.23572, 0.23608), [0.23608, 0.2368)
6. Arithmetic Code.
Coding 2. Read Xi

Example
input message: A B B C #
No. 3

A B C #
0.4 0.3 0.1 0.2
Xi Current interval Subintervals
A [0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1)
B [0, 0.4) [0, 0.16) , [0.16, 0.28), [0.28, 0.32), [0.32, 0.4)
B [0.16, 0.28) [ 0.16, 0.208) , [0.208, 0.244), [0.244, 0.256), [0.256, 0.28)

C [0.208, 0.244) [0.208, 0.2224) , [0.2224, 0.2332), [0.2332, 0.2368), [0.2368, 0.244)

# [0.2332, 0.2368) [0.2332, 0.23464) , [0.23464, 0.23572), [0.23572, 0.23608), [0.23608, 0.2368)

[0.23608, 0.2368)
6. Arithmetic Code.
Coding 2. Read Xi

Example
input message: A B B C #

# is the end of the input message → Stop → Return current interval [0.23608, 0.2368)

Xi Current interval Subintervals


A [0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1)
B [0, 0.4) [0, 0.16) , [0.16, 0.28), [0.28, 0.32), [0.32, 0.4)
B [0.16, 0.28) [ 0.16, 0.208) , [0.208, 0.244), [0.244, 0.256), [0.256, 0.28)

C [0.208, 0.244) [0.208, 0.2224) , [0.2224, 0.2332), [0.2332, 0.2368), [0.2368, 0.244)

# [0.2332, 0.2368) [0.2332, 0.23464) , [0.23464, 0.23572), [0.23572, 0.23608), [0.23608, 0.2368)

[0.23608, 0.2368)
6. Arithmetic Code.
Coding

Example
input message: A B B C #

# is the end of the input message → Stop → Return current interval [0.23608, 0.2368)

Return the lower bound of the current interval as the codeword
of the input message.

Input message   Codeword

ABBC#           0.23608
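The interval narrowing above can be sketched with exact rational arithmetic, so the boundaries come out exactly as on the slides (a sketch; names are illustrative):

```python
from fractions import Fraction

def arithmetic_encode(message, probs):
    """Narrow [L, R) by the subinterval of each letter; return the
    left boundary of the final interval as the codeword."""
    # cumulative boundaries q_0 = 0, q_1, ..., q_n = 1
    symbols = list(probs)
    q = [Fraction(0)]
    for s in symbols:
        q.append(q[-1] + Fraction(probs[s]))
    L, R = Fraction(0), Fraction(1)
    for x in message:
        i = symbols.index(x)
        width = R - L
        L, R = L + width * q[i], L + width * q[i + 1]
    return L                   # usually the left boundary is output

probs = {"A": "0.4", "B": "0.3", "C": "0.1", "#": "0.2"}
code = arithmetic_encode("ABBC#", probs)
```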
6. Arithmetic Code.
Decoding

Arithmetic decoding is determined by the following algorithm:

ArithmeticDecoding ( Codeword )
0. CurrentInterval = [0, 1);
   While (1)
1.   Divide CurrentInterval into subintervals IR_CurrentInterval;
2.   Determine the subinterval i of CurrentInterval to which
     the Codeword belongs;
3.   Output letter xi corresponding to this subinterval;
4.   If xi is the symbol ‘#’
       Return;
5.   CurrentInterval = subinterval i of IR_CurrentInterval;
6. Arithmetic Code.
Decoding

Example Consider the information source

Symbol Probability
A 0.4
B 0.3
C 0.1
# 0.2

Then the input codeword 0.23608 can be decoded to the message
ABBC#, as explained in the next slides.


6. Arithmetic Code.
Decoding

Example input codeword: 0.23608

0. CurrentInterval = [0, 1);

Current interval Subintervals Output


[0, 1)
6. Arithmetic Code.
Decoding

Example input codeword: 0.23608

1. Divide CurrentInterval into subintervals IR_CurrentInterval;

Current interval Subintervals Output


[0, 1)      IR[0,1) = { [0, 0.4), [0.4, 0.7), [0.7, 0.8), [0.8, 1) }

using qj = p1 + … + pj and subintervals [L+(R−L)qi, L+(R−L)qi+1)
6. Arithmetic Code.
Decoding

Example input codeword: 0.23608

Current interval Subintervals Output


[0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1)

IR[0,1)= {
[0, 0.4) , [0.4, 0.7),
[0.7, 0.8), [0.8, 1)
}
6. Arithmetic Code.
Decoding

Example input codeword: 0.23608

2. Determine the subinterval i of CurrentInterval to which the Codeword belongs;

0 ≤ 0.23608 < 0.4

Current interval Subintervals Output


[0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1)
6. Arithmetic Code.
Decoding

Example input codeword: 0.23608

2. Determine the subinterval i of CurrentInterval to which the Codeword belongs;

0 ≤ 0.23608 < 0.4

Current interval Subintervals Output


[0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1)
6. Arithmetic Code.
Decoding

Example input codeword: 0.23608


3. Output letter xi corresponding to this subinterval;
No. 1

A B C #
No. 1
0.4 0.3 0.1 0.2
Current interval Subintervals Output
[0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1) A
6. Arithmetic Code.
Decoding

Example input codeword: 0.23608

4. If xi is the symbol ‘#’

Current interval Subintervals Output


[0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1) A
6. Arithmetic Code.
Decoding

Example input codeword: 0.23608

4. If xi is the symbol ‘#’ NO

Current interval Subintervals Output


[0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1) A
6. Arithmetic Code.
Decoding

Example input codeword: 0.23608

5. CurrentInterval = subinterval i of IR_CurrentInterval;

Current interval Subintervals Output


[0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1) A
6. Arithmetic Code.
Decoding

Example input codeword: 0.23608

5. CurrentInterval = subinterval i of IR_CurrentInterval;

Current interval Subintervals Output


[0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1) A
[0, 0.4)
6. Arithmetic Code.
Decoding

Example input codeword: 0.23608

Similarly we repeat the algorithm steps 1 to 5 until the output symbol = ‘#’

Current interval Subintervals Output


[0, 1) [0, 0.4) , [0.4, 0.7), [0.7, 0.8), [0.8, 1) A
[0, 0.4) [0, 0.16) , [0.16, 0.28), [0.28, 0.32), [0.32, 0.4) B
[0.16, 0.28) [ 0.16, 0.208) , [0.208, 0.244), [0.244, 0.256), [0.256, 0.28) B
[0.208, 0.244) [0.208, 0.2224) , [0.2224, 0.2332), [0.2332, 0.2368), [0.2368, 0.244) C
[0.2332, 0.2368) [0.2332, 0.23464) , [0.23464, 0.23572), [0.23572, 0.23608), [0.23608, 0.2368) #

4. xi is the symbol ‘#’ → Yes → Stop


Return the output message: A B B C #
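The decoding loop can be sketched the same way, again with exact rationals so the codeword 0.23608 falls exactly on the lower boundary of the final ‘#’ subinterval (a sketch; names are illustrative):

```python
from fractions import Fraction

def arithmetic_decode(codeword, probs, end="#"):
    """Repeatedly find the subinterval containing the codeword, output
    its letter, and make that subinterval the current interval."""
    symbols = list(probs)
    q = [Fraction(0)]
    for s in symbols:
        q.append(q[-1] + Fraction(probs[s]))
    L, R = Fraction(0), Fraction(1)
    out = ""
    while True:
        width = R - L
        for i, s in enumerate(symbols):
            lo, hi = L + width * q[i], L + width * q[i + 1]
            if lo <= codeword < hi:     # subinterval containing codeword
                out += s
                L, R = lo, hi
                break
        if out[-1] == end:              # '#' marks the end of the message
            return out

probs = {"A": "0.4", "B": "0.3", "C": "0.1", "#": "0.2"}
message = arithmetic_decode(Fraction("0.23608"), probs)
```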
