Source Coding Techniques
1. Huffman Code.
2. Two-Pass Huffman Code.
3. Lempel-Ziv Code.
4. Fano Code.
5. Shannon Code.
6. Arithmetic Code.
1. Huffman Code.
With the binary Huffman code, the two least probable source symbols are combined, yielding a new source alphabet with one fewer symbol. This step is repeated until only two symbols remain, and the codewords are then read off the resulting tree.
ADVANTAGES:
• uniquely decodable code
• smallest average codeword length
DISADVANTAGES:
• large code tables add complexity
• sensitive to channel errors
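The merging procedure described above can be sketched in Python. This is a minimal illustration, not the slides' exact construction; `huffman_code` is an illustrative name.

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a binary Huffman code; probs maps symbol -> probability."""
    # Each heap entry is (probability, tiebreaker, symbols-in-group).
    # The unique tiebreaker keeps comparisons well-defined on equal probabilities.
    tick = count()
    heap = [(p, next(tick), [s]) for s, p in probs.items()]
    heapq.heapify(heap)
    codes = {s: "" for s in probs}
    while len(heap) > 1:
        p1, _, g1 = heapq.heappop(heap)   # two least probable groups
        p2, _, g2 = heapq.heappop(heap)
        for s in g1:                      # prepend one bit to every symbol
            codes[s] = "0" + codes[s]     # in each merged group
        for s in g2:
            codes[s] = "1" + codes[s]
        heapq.heappush(heap, (p1 + p2, next(tick), g1 + g2))
    return codes

probs = {"s0": 0.1, "s1": 0.2, "s2": 0.4, "s3": 0.2, "s4": 0.1}
codes = huffman_code(probs)
avg = sum(probs[s] * len(codes[s]) for s in probs)  # average codeword length
```

For the five-symbol source of the example that follows, any valid Huffman code gives average length 2.2, although the individual codewords depend on how ties are broken.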
1. Huffman Code.
Example: a five-symbol source with probabilities
p(s0) = 0.1, p(s1) = 0.2, p(s2) = 0.4, p(s3) = 0.2, p(s4) = 0.1

Solution A
Symbol sk  Stage I  Stage II  Stage III  Stage IV  Code
s2         0.4      0.4       0.4        0.6       00
s1         0.2      0.2       0.4        0.4       10
s3         0.2      0.2       0.2                  11
s0         0.1      0.2                            010
s4         0.1                                     011

At each stage the two smallest probabilities are merged into one; the bits 0 and 1 assigned at each merge, read from the last stage back to the first, give the codewords.
Solution A Cont’d
H(S) = 2.12193 bits/symbol

Symbol sk  Probability pk  Codeword ck
s0         0.1             010
s1         0.2             10
s2         0.4             00
s3         0.2             11
s4         0.1             011

L̄ = 0.4 × 2 + 0.2 × 2 + 0.2 × 2 + 0.1 × 3 + 0.1 × 3 = 2.2
H(S) ≤ L̄ < H(S) + 1
Another Solution B
H(S) = 2.12193 bits/symbol

Symbol sk  Probability pk  Codeword ck
s0         0.1             0010
s1         0.2             01
s2         0.4             1
s3         0.2             000
s4         0.1             0011

L̄ = 0.4 × 1 + 0.2 × 2 + 0.2 × 3 + 0.1 × 4 + 0.1 × 4 = 2.2
H(S) ≤ L̄ < H(S) + 1
What is the difference between the two solutions?
• They have the same average codeword length
• They differ in the variance of the codeword lengths

σ² = Σ_{k=0}^{K−1} pk (lk − L̄)²

• Solution A: σ² = 0.16
• Solution B: σ² = 1.36
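The figures for Solutions A and B can be checked directly from the codeword lengths (a small sketch; `stats` is an illustrative name):

```python
p = [0.1, 0.2, 0.4, 0.2, 0.1]   # probabilities of s0..s4
len_a = [3, 2, 2, 2, 3]         # Solution A: 010, 10, 00, 11, 011
len_b = [4, 2, 1, 3, 4]         # Solution B: 0010, 01, 1, 000, 0011

def stats(p, lengths):
    """Average codeword length and variance of the lengths."""
    avg = sum(pk * lk for pk, lk in zip(p, lengths))
    var = sum(pk * (lk - avg) ** 2 for pk, lk in zip(p, lengths))
    return avg, var

avg_a, var_a = stats(p, len_a)   # 2.2, 0.16
avg_b, var_b = stats(p, len_b)   # 2.2, 1.36
```

Both codes achieve L̄ = 2.2, but Solution A's smaller variance makes its output rate more uniform, which matters for fixed-rate transmission with a finite buffer.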
2. Two-pass Huffman Code.
In the first pass the encoder reads the data and counts the symbol frequencies; in the second pass it builds a Huffman code from the measured statistics and encodes the data. The code table (or the frequency counts) must be sent to the decoder along with the encoded message.
Lempel-Ziv Coding
• Huffman coding requires knowledge of a
probabilistic model of the source
• This is not necessarily always feasible
• Lempel-Ziv coding is an adaptive technique that does not require prior
knowledge of the symbol probabilities
• Lempel-Ziv coding is the basis of the well-known ZIP data-compression format
Lempel-Ziv Coding History
• Universal: works without a probabilistic model of the source
• Lossless: the original sequence is recovered exactly
Lempel-Ziv Coding Example
Information bits: 0 0 0 1 0 1 1 1 0 0 1 0 1 0 0 1 0 1…

Codebook index   1   2   3     4     5     6     7     8     9
Subsequence      0   1   00    01    011   10    010   100   101
Representation           11    12    42    21    41    61    62
Encoded blocks           0010  0011  1001  0100  1000  1100  1101

The codebook is seeded with the single bits 0 (index 1) and 1 (index 2). The sequence is parsed into the shortest subsequences not seen before; each new subsequence is represented by the index of its prefix followed by its last (innovation) bit, and encoded as the prefix index in binary followed by that bit.
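The parsing just described can be sketched as follows (a minimal illustration; `lz_parse` is an illustrative name, and the 3-bit prefix field matches this particular example's codebook size):

```python
def lz_parse(bits):
    """Parse a binary string into the shortest subsequences not yet seen."""
    book = {"0": 1, "1": 2}       # codebook seeded with the two single bits
    phrases, blocks = [], []
    i = 0
    while i < len(bits):
        j = i + 1
        while j <= len(bits) and bits[i:j] in book:
            j += 1                # extend until the subsequence is new
        phrase = bits[i:j]
        if phrase not in book:    # register the new subsequence
            book[phrase] = len(book) + 1
            phrases.append(phrase)
            # encoded block: codebook index of the prefix (3 bits here),
            # followed by the innovation bit
            blocks.append(format(book[phrase[:-1]], "03b") + phrase[-1])
        i = j
    return phrases, blocks

phrases, blocks = lz_parse("000101110010100101")
```

Running this on the example sequence reproduces the subsequence and encoded-block rows of the table above.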
LZ Encoding example
Rule: if wa is in the dictionary, continue with segment wa; otherwise output the address of w, add wa to the dictionary, and continue with segment a.

Initial dictionary — address 0: a, address 1: b, address 2: c
Input string: a a b a a c a b c a b c b

Input read                 Output  Dictionary update
a a                        0       aa → 3
a a b                      0       ab → 4
a a b a                    1       ba → 5
a a b a a c                3       aac → 6
a a b a a c a              2       ca → 7
a a b a a c a b c          4       abc → 8
a a b a a c a b c a b      7       cab → 9
a a b a a c a b c a b c    1       bc → 10
a a b a a c a b c a b c b  2       cb → 11
(end of input)             1

Output bit string: 0011001111010100010001001
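The dictionary scheme above is essentially LZW. A minimal sketch, assuming the rule stated above (`lzw_encode` is an illustrative name):

```python
def lzw_encode(msg, alphabet):
    """LZW: grow segments while they stay in the dictionary."""
    d = {ch: i for i, ch in enumerate(alphabet)}  # address 0: a, 1: b, 2: c
    w = msg[0]
    out = []
    for a in msg[1:]:
        if w + a in d:          # wa already in the dictionary:
            w = w + a           # keep extending the segment
        else:
            out.append(d[w])    # emit the address of w
            d[w + a] = len(d)   # add wa as a new entry
            w = a               # restart from the unmatched letter
    out.append(d[w])            # flush the last segment
    return out

codes = lzw_encode("aabaacabcabcb", "abc")
```

The emitted addresses for the example input are 0, 0, 1, 3, 2, 4, 7, 1, 2, 1, matching the output column of the table.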
4. Fano Code.
List the symbols in order of decreasing probability, then repeatedly split the list into two groups of as nearly equal total probability as possible, appending 0 to the codewords of one group and 1 to the other, until every group holds a single symbol.
Example 2: a symbol X with probability 1/9 receives codeword 111.
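Fano's splitting procedure can be sketched as follows (a minimal illustration; `fano_code` is an illustrative name, and the four-symbol source below is the one reused later to compare against the Shannon code):

```python
def fano_code(symbols):
    """symbols: list of (symbol, probability), sorted by decreasing p."""
    codes = {s: "" for s, _ in symbols}

    def split(group):
        if len(group) <= 1:
            return
        total = sum(p for _, p in group)
        acc, best, best_diff = 0.0, 1, float("inf")
        for i in range(1, len(group)):   # choose the split point that
            acc += group[i - 1][1]       # best balances the two halves
            diff = abs(2 * acc - total)
            if diff < best_diff:
                best_diff, best = diff, i
        for s, _ in group[:best]:
            codes[s] += "0"
        for s, _ in group[best:]:
            codes[s] += "1"
        split(group[:best])
        split(group[best:])

    split(symbols)
    return codes

codes = fano_code([("W", 0.4), ("X", 0.3), ("Y", 0.2), ("Z", 0.1)])
```

For these probabilities the first split is {W} vs {X, Y, Z} (0.4 vs 0.6), giving the codewords 0, 10, 110, 111.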
5. Shannon Code.
2. Calculate the code length for each symbol from log2(1/pk) ≤ lk < log2(1/pk) + 1, i.e. lk = ⌈log2(1/pk)⌉.

Symbol  pk    qk     lk
A       1/4   0      2
B       1/4   1/4    2
C       1/8   1/2    3
D       1/8   5/8    3
E       1/16  3/4    4
F       1/16  13/16  4
G       1/32  7/8    5
H       1/32  29/32  5
I       1/32  15/16  5
J       1/32  31/32  5

(qk is the cumulative probability of the symbols preceding sk.)
5. Shannon Code.
Example 3:
3. Write qk in the form c1 2⁻¹ + c2 2⁻² + … + c_lk 2⁻lk, where each ci is either 0 or 1; the codeword of sk is c1 c2 … c_lk.
For A: qk = 0 = 0·2⁻¹ + 0·2⁻², so c1 = 0, c2 = 0 and the codeword is 00.

Symbol  pk    qk     lk  Codeword
A       1/4   0      2   00
B       1/4   1/4    2   01
C       1/8   1/2    3   100
D       1/8   5/8    3   101
E       1/16  3/4    4   1100
F       1/16  13/16  4   1101
G       1/32  7/8    5   11100
H       1/32  29/32  5   11101
I       1/32  15/16  5   11110
J       1/32  31/32  5   11111
5. Shannon Code.
Note that the Shannon code is in general not optimal. Example (the last column shows the Huffman code for the same source):

Symbol  pk   qk   lk  Shannon code  Huffman code
W       0.4  0    2   00            0
X       0.3  0.4  2   01            10
Y       0.2  0.7  3   101           110
Z       0.1  0.9  4   1110          111
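The Shannon construction (codeword = first lk bits of the binary expansion of qk) can be checked on this source (a minimal sketch; `shannon_code` is an illustrative name):

```python
import math

def shannon_code(probs):
    """probs: list of (symbol, p) sorted by decreasing probability."""
    codes = {}
    q = 0.0                                   # cumulative probability qk
    for s, p in probs:
        l = math.ceil(math.log2(1 / p))       # log2(1/p) <= l < log2(1/p) + 1
        # the first l bits of q's binary expansion form the codeword
        codes[s] = format(int(q * 2 ** l), "0{}b".format(l))
        q += p
    return codes

codes = shannon_code([("W", 0.4), ("X", 0.3), ("Y", 0.2), ("Z", 0.1)])
```

This reproduces the Shannon-code column of the table: 00, 01, 101, 1110, for an average length of 2.4 bits against Huffman's 1.9.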
6. Arithmetic Code.
Coding
Partition [0, 1) according to the symbol probabilities:
IR = { [0, p1), [p1, p1 + p2), [p1 + p2, p1 + p2 + p3), …, [p1 + … + pn−1, p1 + … + pn) }

ArithmeticEncoding ( Message )
1. CurrentInterval = [0, 1);
While the end of the message is not reached
  2. Read letter xi from the message;
  3. Divide CurrentInterval into subintervals IR_CurrentInterval;
  4. CurrentInterval = the subinterval of IR_CurrentInterval that corresponds to xi;
Output any number from CurrentInterval (usually its left boundary);

Symbol probabilities:
A    B    C    #
0.4  0.3  0.1  0.2
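The encoding loop can be sketched directly from the algorithm above (a minimal illustration with floating-point intervals; `arithmetic_encode` is an illustrative name):

```python
def arithmetic_encode(message, probs):
    """probs: list of (symbol, p) in fixed order; message ends with '#'."""
    low, high = 0.0, 1.0
    for x in message:
        width = high - low
        q = 0.0
        for s, p in probs:                 # locate x's subinterval and
            if s == x:                     # shrink the current interval to it
                low, high = low + width * q, low + width * (q + p)
                break
            q += p
    return low, high    # any number in [low, high) encodes the message

probs = [("A", 0.4), ("B", 0.3), ("C", 0.1), ("#", 0.2)]
low, high = arithmetic_encode("ABBC#", probs)
```

For the message A B B C # this narrows to the interval [0.23608, 0.2368), as the worked example below shows step by step.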
Example
Input message: A B B C #

Xi  Current interval  Subintervals IR_CurrentInterval
A   [0, 1)            [0, 0.4), [0.4, 0.7), [0.7, 0.8), [0.8, 1)
B   [0, 0.4)          [0, 0.16), [0.16, 0.28), [0.28, 0.32), [0.32, 0.4)
B   [0.16, 0.28)      [0.16, 0.208), [0.208, 0.244), [0.244, 0.256), [0.256, 0.28)
C   [0.208, 0.244)    [0.208, 0.2224), [0.2224, 0.2332), [0.2332, 0.2368), [0.2368, 0.244)
#   [0.2332, 0.2368)  [0.2332, 0.23464), [0.23464, 0.23572), [0.23572, 0.23608), [0.23608, 0.2368)

Reading each letter Xi selects the corresponding subinterval as the new current interval; after # the current interval is [0.23608, 0.2368).
6. Arithmetic Code.
Coding
# is the end of the input message → stop and return the current interval [0.23608, 0.2368)
Any number in this interval encodes the message, e.g.
ABBC# → 0.23608
6. Arithmetic Code.
Decoding
ArithmeticDecoding ( Codeword )
0. CurrentInterval = [0, 1);
While (1)
  1. Divide CurrentInterval into subintervals IR_CurrentInterval;
  2. Determine the subinterval_i of CurrentInterval to which Codeword belongs;
  3. Output letter xi corresponding to this subinterval;
  4. If xi is the symbol ‘#’, return;
  5. CurrentInterval = subinterval_i;
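The decoding loop can be sketched as follows. Exact rational arithmetic (`fractions.Fraction`) is used here because the codeword 0.23608 sits exactly on a subinterval boundary, where floating-point round-off could misclassify it; `arithmetic_decode` is an illustrative name.

```python
from fractions import Fraction as F

def arithmetic_decode(codeword, probs):
    """probs: list of (symbol, p); the '#' symbol ends the message."""
    low, high = F(0), F(1)
    out = []
    while not out or out[-1] != "#":
        width = high - low
        q = F(0)
        for s, p in probs:                 # find the subinterval
            lo = low + width * q           # containing the codeword
            hi = low + width * (q + p)
            if lo <= codeword < hi:
                out.append(s)              # emit its symbol and
                low, high = lo, hi         # zoom into that subinterval
                break
            q += p
    return "".join(out)

probs = [("A", F(4, 10)), ("B", F(3, 10)), ("C", F(1, 10)), ("#", F(2, 10))]
decoded = arithmetic_decode(F(23608, 100000), probs)
```

Decoding 0.23608 retraces the encoder's interval sequence and recovers the message A B B C #.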
6. Arithmetic Code.
Decoding
Symbol  Probability
A       0.4
B       0.3
C       0.1
#       0.2

q_j = Σ_{i=1}^{j} p_i

IR[0,1) = { [0, 0.4), [0.4, 0.7), [0.7, 0.8), [0.8, 1) }
6. Arithmetic Code.
Decoding — step No. 1
A: 0.4, B: 0.3, C: 0.1, #: 0.2

Current interval  Subintervals                                Output
[0, 1)            [0, 0.4), [0.4, 0.7), [0.7, 0.8), [0.8, 1)  A
6. Arithmetic Code.
Decoding
Similarly, we repeat steps 1 to 5 of the algorithm until the output symbol is ‘#’.