0% found this document useful (0 votes)
73 views22 pages

Greedy Algorithms: Part - 4: Huffman Coding

The document discusses the longest increasing subsequence (LIS) problem in dynamic programming. It can be solved using a bottom-up dynamic programming approach by calculating the length of the longest increasing subsequence ending at each index i (LIS[i]) and taking the maximum value. LIS[i] is calculated as 1 plus the maximum of LIS[j] for all j < i where the array element at j is smaller than i. This allows building up the solutions from smaller subproblems.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
73 views22 pages

Greedy Algorithms: Part - 4: Huffman Coding

The document discusses the longest increasing subsequence (LIS) problem in dynamic programming. It can be solved using a bottom-up dynamic programming approach by calculating the length of the longest increasing subsequence ending at each index i (LIS[i]) and taking the maximum value. LIS[i] is calculated as 1 plus the maximum of LIS[j] for all j < i where the array element at j is smaller than i. This allows building up the solutions from smaller subproblems.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 22

Greedy Algorithms

Part - 4 : Huffman Coding


Huffman Codes

• Widely used technique for data compression

• Assume the data to be a sequence of characters

• Looking for an effective way of storing the data

• Binary character code

– Uniquely represents a character by a binary string

2
Two Ways to Compress data

• Fixed-Length Codewords

• Variable-Length Codewords

3
Example
E.g.:

Suppose you have to compress a data file containing


100,000 characters

The data file only has 6 characters (a, b, c, d, e, f)

a b c d e f
Frequency (thousands) 45 13 12 16 9 5

4
Fixed-Length Codes
E.g.: Data file containing 100,000 characters
a b c d e f
Frequency (thousands) 45 13 12 16 9 5

• 3 bits needed

• a = 000, b = 001, c = 010, d = 011, e = 100, f = 101

• Requires: 100,000 X 3 = 300,000 bits

5
Huffman Codes
• Idea:
– Variable Length Codes

– Use the frequencies of occurrence of characters to


build a optimal way of representing each character

6
Constructing a Huffman Code
• Idea:
– Arranging the characters in Ascending order of their
frequencies

f: 5 e: 9 c: 12 b: 13 d: 16 a: 45

– Add Two minimum frequencies and place them to the


right position

7
Example
f: 5 e: 9 c: 12 b: 13 d: 16 a: 45 c: 12 b: 13 14 d: 16 a: 45
0 1
f: 5 e: 9

14 d: 16 25 a: 45 25 30 a: 45
0 1 0 1 0 1 0 1
f: 5 e: 9 c: 12 b: 13 c: 12 b: 13 14 d: 16
0 1
f: 5 e: 9

a: 45 55 0 100 1
0 1
a: 45 55
25 30 0 1
0 1 0 1
25 30
c: 12 b: 13 14 d: 16 0 1 0 1
0 1 c: 12 b: 13 d: 16
14
f: 5 e: 9 0 1
f: 5 e: 9 8
Variable-Length Codes
E.g.: Data file containing 100,000 characters

a b c d e f
Frequency (thousands) 45 13 12 16 9 5
Variable length
0 101 100 111 1101 1100
Codewords

• Assign short codewords to frequent characters and


long codewords to infrequent characters

• ((45 * 1) + (13 * 3) + (12 * 3) + (16 * 3) + (9 * 4) + (5 * 4)) X 1,000


= 224,000 bits 9
Comparison Ration

a b c d e f
Frequency (thousands) 45 13 12 16 9 5
Fixed Length
000 001 010 011 100 101
Codewords
Variable length
0 101 100 111 1101 1100
Codewords
300000 – 224000
X 100
300000

= 25.33%

Saves 20% to 90% cost. 10


Building a Huffman Code
Alg.: HUFFMAN(C) Running time: O(nlgn)
1. n  C 
2. QC O(n)
3. for i  1 to n – 1
4. do allocate a new node z
5. left[z]  x  EXTRACT-MIN(Q)
O(nlgn)
6. right[z]  y  EXTRACT-MIN(Q)
7. f[z]  f[x] + f[y]
8. INSERT (Q, z)
9. return EXTRACT-MIN(Q)
11
Encoding and Decoding
0 100 1 Character Codeword
a: 45 55
0 1 a 0
25 30 b 101
0 1 0 1
c: 12 b: 13 14 d: 16 c 100
0 1
d 111
f: 5 e: 9
e 1101
f 1100

Encode : abcde

Decode: 101100011111011100
12
Dynamic Programming

Introduction
Dynamic Programming
• An algorithm design technique for optimization problems
(similar to divide and conquer)

• Dynamic Programming solves every subproblems just once


and store the answer in a table; next time when the result is
needed , it simply use the data from the table.

• Examines the previously solved subproblems and combines


their solutions to give the best solution for the whole problem.

14
Dynamic Programming

Part – 1: Longest increasing subsequence(LIS)


Longest increasing subsequence(LIS)

The Longest Increasing Subsequence (LIS) problem
is to find the length of the longest subsequence of a
given sequence such that all elements of the
subsequence are sorted in increasing order.

• For example,
{10, 22, 9, 33, 21, 50, 41, 60, 80}

LIS is {10, 22, 33, 50, 60, 80}


Length of LIS = 6
16
Increasing subsequences

• e.g. You have given the following array

9 2 5 3 7 11 8 10 13 6

Now Find The increasing subsequences.

17
Increasing subsequences
• e.g. 9 2 5 3 7 11 8 10 13 6
2 3 7
5 7 10 13 are increasing subsequences.
2 5 7 8 10 13
2 3 7 8 10 13
We want to find a longest one.
So LIS = 2 3 7 8 10 13 OR 2 5 7 8 10 13
And Length of LIS = 6

9 7 11 are not increasing subsequences.


5 3 11 13
18
More Example
• Input : arr[] = {3, 10, 2, 1, 20}
• Output : Length of LIS = 3
• The longest increasing subsequence is 3, 10, 20

• Input : arr[] = {3, 2}


• Output : Length of LIS = 1
• The longest increasing subsequences are {3} and {2}

• Input : arr[] = {50, 3, 10, 7, 40, 80}


• Output : Length of LIS = 4
• The longest increasing subsequence is {3, 7, 40, 80}

19
A naive approach for LIS
• Let LIS[i] be the length of a longest increasing
subsequence ending at position i.
LIS[i] = 1 + max j = 0..i-1{L[j] | aj < ai}
(use a dummy a0 = minimum, and L[0]=0)
i = 1; j = 0 to i-1

Index 0 1 2 3 4 5 6 7 8 9

Input 9 2 5 3 7 11 8 10 13 6

Length

Prev

20
A naive approach for LIS

Index 0 1 2 3 4 5 6 7 8 9

Input 9 2 5 3 7 11 8 10 13 6

Length 1 1 2 2 3 4 4 5 6 3

Prev -1 -1 1 1 2 4 4 6 7 2

The longest increasing subsequence 2, 5, 7, 8, 10, 13


Length of LIS = 6

21
Psudocode or LIS
LIS( int arr[], int n )
• for (i = 0; i < n; i++ )
• lis[i] = 1;

• /* Compute optimized LIS values in bottom up manner */
• for (i = 1; i < n; i++ )
• for (j = 0; j < i; j++ )
• if ( arr[i] > arr[j] && lis[i] < lis[j] + 1)
• lis[i] = lis[j] + 1;

• /* Pick maximum of all LIS values */
• for (i = 0; i < n; i++ )
• if (max < lis[i])
• max = lis[i];

• return max;
22

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy