
CoSc 4082: Computer Vision and Image Processing

Hawassa University, Bensa Daye Campus

Chapter Six:

Image Compression
Kassawmar Mandefro
Department of Computer Science
April, 2024



Introduction and Overview

 The field of image compression continues to grow at a rapid pace.

 As we look to the future, the need to store and transmit images will
only continue to increase faster than the available capability to
process all the data.
Cont…
 Image compression refers to the process of reducing the size
of a digital image file while attempting to minimize the loss of
image quality.
 The primary goal of image compression is to efficiently store
or transmit images using less storage space or bandwidth.
 Image compression involves reducing the size of image data
files, while retaining necessary information.
 Retaining necessary information depends upon the
application.
 Image segmentation methods, which are primarily a data
reduction process, can be used for compression.
Cont…
 Compression algorithm development starts with
applications to two-dimensional (2-D) images.
 After the 2-D methods are developed, they are often
extended to video (motion imaging).
 However, we will focus on image compression of single
frames of image data.
 Applications that require image compression are:
1. Internet
2. Businesses
3. Multimedia
4. Satellite imaging
5. Medical imaging

Cont…
 There are two primary types of image compression
methods:
1. Lossless compression methods:
 These methods reduce the file size of an image without
sacrificing any image quality.
 These methods exploit redundancy within the image data to
eliminate unnecessary or redundant information while
preserving all the original image details.
 These techniques aim to preserve all the original pixel values
and details of the image.
 Lossless compression is used in formats such as PNG and GIF.
 Examples of lossless compression methods include:
• Run-Length Encoding (RLE)
• Huffman coding
• Arithmetic coding
• Lempel-Ziv-Welch (LZW) compression
Cont…
2. Lossy compression methods:
 These methods reduce the file size of an image by discarding
some image data that are less important.
 This results in some loss of image quality, which may or may
not be noticeable depending on the compression ratio and the
specific characteristics of the image.
 These techniques are commonly used in scenarios where
reducing file size is prioritized over preserving every detail of
the image, such as in multimedia applications, web-based image
sharing, and video streaming.
 Examples of lossy compression methods include:
• JPEG, which is widely used for photographic images, and
• MPEG (Moving Picture Experts Group), used for compressing
video sequences.
Data redundancy
 Compression algorithms are developed by taking
advantage of the redundancy that is inherent in image
data.
 Data redundancy refers to the presence of repetitive
information within a dataset, leading to inefficiency in
storage, transmission, or processing.
 Four primary types of redundancy that can be found in
images are:
1. Coding
2. Interpixel
3. Interband
4. Psychovisual redundancy
Cont…
1. Coding redundancy
 Occurs when the data used to represent the image is not
utilized in an optimal manner.
 It is due to inefficient encoding of data. For instance, if a
certain pattern or information is repeated in the image, it can
be compressed or encoded more efficiently.
2. Interpixel redundancy
 This redundancy refers to correlations between neighboring
pixels within an image.
 Occurs because adjacent pixels tend to be highly correlated;
in most images the brightness levels do not change rapidly
but change gradually (see the short sketch after this list).
Cont…
3. Interband redundancy
 In color images, different color channels (e.g., red, green, blue) may
contain redundant information.
 Occurs in color images due to the correlation between bands
within an image – if we extract the red, green and blue bands they
look similar.
4. Psychovisual redundancy
 Not all information in an image is equally important for human
interpretation. Some information is more important to the
human visual system than others.
 This redundancy is related to the characteristics of human visual
perception.
 Compression algorithms can remove or reduce information that
humans are less likely to notice, based on principles of human
visual perception.
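As a small illustration of interpixel redundancy, the sketch below (the image is synthetic, chosen only for this example) measures how strongly each pixel in a smooth image predicts its right-hand neighbor; this high correlation is exactly what compression algorithms exploit:

```python
import numpy as np

# Synthetic 64x64 image: a smooth horizontal gradient plus mild noise.
rng = np.random.default_rng(0)
img = np.tile(np.linspace(0, 255, 64), (64, 1)) + rng.normal(0, 5, (64, 64))

left = img[:, :-1].ravel()    # each pixel
right = img[:, 1:].ravel()    # its right-hand neighbor
corr = np.corrcoef(left, right)[0, 1]
print(f"correlation between adjacent pixels: {corr:.3f}")   # close to 1.0
```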
Cont…
 Reducing redundancy is a fundamental goal in image
compression and various other image processing tasks.
 It is essential for efficient storage, transmission, and
processing of image and video data.
 Various compression and processing techniques are
designed to exploit different types of redundancy to
achieve optimal performance in terms of compression
ratio, image quality, and computational complexity.
Cont…
 The key in image compression algorithm development is to
determine the minimal data required to retain the
necessary information.
 The compression is achieved by taking advantage of the
redundancy that exists in images.
 If the redundancies are removed prior to compression, for
example with a decorrelation process, a more effective
compression can be achieved.
Cont…
 To determine which information can be removed and which
information is important, the image fidelity criteria are used.

 Image fidelity criteria are measures used to assess the
quality of an image representation compared to the
original or reference image.
 It should be noted that the information required is application
specific and that, with lossless schemes, there is no need for
fidelity criteria.
Compression System Model

 The compression system model consists of two parts:

1. The compressor
2. The decompressor

 The compressor consists of a preprocessing stage and an
encoding stage, whereas the decompressor consists of
a decoding stage followed by a postprocessing stage.
Cont…

Fig: Compression System Model


Cont…
 Before encoding, preprocessing is performed to prepare the
image for the encoding process, and consists of any number
of operations that are application specific.

 After the compressed file has been decoded,
postprocessing can be performed to eliminate some of the
potentially undesirable artifacts brought about by the
compression process.

 The compressor can be broken into the following stages:
1. Data reduction: Image data can be reduced by gray level
and/or spatial quantization, or can undergo any desired
image improvement (for example, noise removal) process.
Cont…
2. Mapping: Involves mapping the original image data into
another mathematical space where it is easier to
compress the data.

3. Quantization: Involves taking potentially continuous data
from the mapping stage and putting it in discrete form.
4. Coding: Involves mapping the discrete data from the
quantizer onto a code in an optimal manner.
 A compression algorithm may consist of all the stages,
or it may consist of only one or two of the stages (a small
end-to-end sketch follows below).
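To make the stage ordering concrete, here is a minimal sketch of how the mapping, quantization, and coding stages can compose. The function names and parameter choices are mine, chosen only for illustration, not a standard implementation:

```python
import numpy as np

def map_stage(row):
    # Mapping: differential coding keeps the first value, then the
    # differences between adjacent pixels (decorrelates the data).
    return np.concatenate(([int(row[0])], np.diff(row.astype(np.int16))))

def quantize_stage(values, step=4):
    # Quantization: round to discrete levels; this stage is NOT reversible.
    return np.round(values / step).astype(np.int16)

def code_stage(values):
    # Coding: a stand-in; a real compressor would use Huffman or
    # arithmetic codes here instead of raw bytes.
    return values.tobytes()

row = np.array([100, 102, 103, 103, 104, 110, 109], dtype=np.uint8)
compressed = code_stage(quantize_stage(map_stage(row)))
```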
Cont…

Fig: The Compressor


Cont…
 The decompressor can be broken down into following stages:
1. Decoding: Takes the compressed file and reverses the
original coding by mapping the codes to the original,
quantized values.
2. Inverse mapping: Involves reversing the original mapping
process.
3. Postprocessing: Involves enhancing the look of the final image.
 This may be done to reverse any preprocessing, for example,
enlarging an image that was shrunk in the data reduction
process.
 In other cases the postprocessing may be used simply to
enhance the image by reducing any artifacts from the
compression process itself.
Cont…

Fig: The Decompressor


Cont…
• The development of a compression algorithm is highly
application specific.

• In the preprocessing stage of compression, processes
such as enhancement, noise removal, or quantization are
applied.
• The goal of preprocessing is to prepare the image for the
encoding process by eliminating any irrelevant information,
where irrelevant is defined by the application.
• For example, many images that are for viewing purposes only
can be preprocessed by eliminating the lower bit planes,
without losing any useful information.
• The mapping process is important because image data tends
to be highly correlated.
Cont…
 Specifically, if the value of one pixel is known, it is highly likely
that the adjacent pixel value is similar.
 By finding a mapping equation that decorrelates the data
this type of data redundancy can be removed.

 Differential coding: A method of reducing data redundancy by
finding the difference between adjacent pixels and encoding
those values (a small sketch follows below).
 The principal components transform can also be used, which
provides a theoretically optimal decorrelation.
 Color transforms are used to decorrelate data between image
bands.
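A small sketch (the example values are my own) showing the decorrelation effect of differential coding on one image row, and that the mapping is exactly reversible:

```python
import numpy as np

row = np.array([50, 51, 53, 53, 54, 56, 55, 57], dtype=np.int16)
diffs = np.diff(row)                  # [1, 2, 0, 1, 2, -1, 2]
print(row.min(), row.max())           # wide range of raw values
print(diffs.min(), diffs.max())       # differences cluster near zero,
                                      # so short code words suffice
# The mapping is reversible: a cumulative sum restores the original row.
restored = np.concatenate(([row[0]], row[0] + np.cumsum(diffs)))
assert np.array_equal(restored, row)
```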
Cont…
 As the spectral domain can also be used for image
compression, the first stage may include mapping into
the frequency domain, where the energy in the image is
compacted into primarily the lower frequency components.
 These methods are all reversible, that is, information
preserving, although not all mapping methods are reversible.
Cont…
 Quantization may be necessary to convert the data into digital
form (BYTE data type), depending on the mapping equation
used.

 This is because many of these mapping methods result in
floating point data, which requires multiple bytes for
representation and is therefore not very efficient if the goal
is data reduction.
 It is important to note that the quantization process is not
reversible, so it does not appear in the decompression model,
and some information may be lost during quantization.
 This is because quantization involves rounding or truncating
the original values (a small sketch follows below).
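A minimal sketch (the values and step size are illustrative) of why quantization loses information: after rounding to discrete levels, the original floating point values cannot be recovered exactly:

```python
import numpy as np

mapped = np.array([12.37, -3.91, 0.42, 7.08])       # hypothetical mapping output
step = 0.5
quantized = np.round(mapped / step).astype(np.int8)  # one byte per value
dequantized = quantized * step                       # best possible reconstruction
print(quantized)     # [ 25  -8   1  14]
print(dequantized)   # [12.5 -4.   0.5  7. ]  (not equal to `mapped`)
```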
Cont…
 The coder in the coding stage provides a one-to-one mapping;
each input is mapped to a unique output by the coder.
 The code can be an equal length code, where all the code
words are the same size, or an unequal length code with
variable length code words.
 In most cases, an unequal length code is the most efficient
for data compression, but it requires more overhead in the
coding and decoding stages.
Types of data compression

 Lossless Compression: The main mechanisms include:
o Dictionary-based methods such as Lempel-Ziv-Welch (LZW), and
run-length methods such as Run-Length Encoding (RLE).
o Entropy coding methods like Huffman coding.
o Arithmetic coding.
 Lossy Compression: Key mechanisms include:
o Transform coding: Examples include the Discrete Cosine Transform
(DCT) used in JPEG compression and the Discrete Wavelet Transform
(DWT) used in JPEG2000.
o Quantization: Reducing the precision of data representation based on
perceptual considerations.
LOSSLESS COMPRESSION METHODS
 No loss of data; the decompressed image is exactly the
same as the uncompressed image.
 Required for medical images or any images used in courts.
 Lossless compression methods typically provide only about a
10% reduction in file size for complex images.
 However, lossless compression methods can provide significant
compression for simple images.
 Lossless compression techniques may also be used for both
preprocessing and postprocessing in image compression
algorithms to obtain the extra 10% compression.
Huffman coding
 Huffman coding is a widely used method for lossless data
compression.
 It was developed by David A. Huffman in 1952 while he was a
student at MIT (Massachusetts Institute of Technology).
 It is particularly effective for compressing data with
non-uniform probability distributions, such as image data
where certain pixel values occur more frequently than others.
 The basic idea behind Huffman coding is to assign variable-length
codes to symbols based on their frequencies, with more frequent
symbols being assigned shorter codes, and less frequent symbols
being assigned longer codes.
 The resulting code is prefix-free, which ensures that the encoded
data can be uniquely decoded.
Cont..
 Huffman coding is widely used in various applications,
including image and video compression (e.g., within the JPEG
compression standard), file compression algorithms (e.g., ZIP),
and communication systems.

 It offers efficient compression ratios, especially for data with
skewed symbol frequencies, where some symbols occur much
more frequently than others.
 However, constructing the Huffman tree and storing it along
with the encoded data can add some overhead, particularly
for small datasets or when the symbol frequencies are not
known in advance.
Cont…
The main steps in using Huffman coding for data compression:
 Frequency Analysis: Determine the frequency of occurrence of
symbols in the input data.
 Building the Huffman Tree: The symbols are then organized into a
binary tree called the Huffman tree. The frequency of each internal
node is the sum of the frequencies of its children.
 Generate Huffman Codes: Once the Huffman tree is constructed,
code words are assigned to each symbol by traversing the tree.
 Encoding: Replace each symbol in the input data with its
corresponding Huffman code.
 Decoding: To decode the encoded data, the Huffman tree used for
encoding is reconstructed.
• The original data can then be reconstructed from the decoded
symbols.
Cont…
 The Huffman algorithm can be described in five steps:
1. Find the gray level probabilities for the image by finding
the histogram.
2. Order the input probabilities (histogram magnitudes)
from smallest to largest.
3. Combine the smallest two by addition.
4. GOTO step 2, until only one probability is left.
5. By working backward along the tree, generate code by
alternating assignment of 0 and 1.
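The five steps map directly onto a compact implementation. The sketch below was written for these notes (heap tie-breaking may produce different, but equally optimal, codes than the worked example that follows):

```python
import heapq
from collections import Counter

def huffman_codes(message):
    # Steps 1-2: symbol frequencies, kept ordered smallest-first by a heap.
    heap = [(count, i, {sym: ""})
            for i, (sym, count) in enumerate(Counter(message).items())]
    heapq.heapify(heap)
    next_id = len(heap)
    # Steps 3-4: repeatedly combine the two smallest until one node is left.
    while len(heap) > 1:
        count1, _, codes1 = heapq.heappop(heap)
        count2, _, codes2 = heapq.heappop(heap)
        # Step 5: working backward along the tree, prefix 0 and 1.
        merged = {sym: "0" + code for sym, code in codes1.items()}
        merged.update({sym: "1" + code for sym, code in codes2.items()})
        heapq.heappush(heap, (count1 + count2, next_id, merged))
        next_id += 1
    return heap[0][2]

message = "BCCABBDDAECCBBAEDDCC"
codes = huffman_codes(message)
encoded = "".join(codes[sym] for sym in message)
print(codes)
print(len(encoded))   # 45 bits, matching the worked example below
```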
Cont…
Example
Message = BCCABBDDAECCBBAEDDCC
Length = 20
• To send this message without encoding, using 8-bit ASCII: 20 * 8 = 160 bits.

Character | ASCII code | Binary format
A         | 65         | 01000001
B         | 66         | 01000010
C         | 67         | 01000011
D         | 68         | 01000100
E         | 69         | 01000101
Cont…
Fixed size (equal length) code
Message = BCCABBDDAECCBBAEDDCC
Encoded: 001 010 010 000 ...
• First arrange the characters in alphabetical order and assign
equal length (3-bit) codes.
• After encoding: 20 * 3 = 60 bits, so the message size is 60 bits.

Character | Count | Code
A         | 3     | 000
B         | 5     | 001
C         | 6     | 010
D         | 4     | 011
E         | 2     | 100
Total     | 20    |
Cont…
Variable size (unequal length) code
Message = BCCABBDDAECCBBAEDDCC
Encoded: 10 11 11 001 10 10 01 01 ...
• Order the characters by frequency in increasing order and build the
Huffman tree (leaf counts 2, 3, 4, 5, 6 for E, A, D, B, C; combined
nodes 5, 9, and 11 under the root of 20).
• To get the code for each character, traverse the tree from the root
to that character's leaf, reading off 0s and 1s.

Character | Count | Code | Bits
A         | 3     | 001  | 3*3 = 9
B         | 5     | 10   | 5*2 = 10
C         | 6     | 11   | 6*2 = 12
D         | 4     | 01   | 4*2 = 8
E         | 2     | 000  | 2*3 = 6
Total     | 20    |      | Msg size = 45 bits
Cont…
To decode
Encoded Message = 10 11 11 001 10 10 01 01 ...
• Decode the encoded message by walking the same Huffman tree
(root 20, with leaves E, A, D, B, C as above) from the root to a
leaf for each code word.
• Following the tree we generate:
Decoded Message (original input) = BCC...
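Continuing the Huffman sketch above, decoding can be done by matching prefixes in the bit string, which is equivalent to walking the tree from root to leaf:

```python
# `codes` and `encoded` come from the Huffman sketch earlier.
inverse = {code: sym for sym, code in codes.items()}
decoded, buffer = [], ""
for bit in encoded:
    buffer += bit
    if buffer in inverse:            # a complete code word: a leaf reached
        decoded.append(inverse[buffer])
        buffer = ""
print("".join(decoded))              # BCCABBDDAECCBBAEDDCC
```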
Run-Length Coding
 Run-length coding (RLC) works by counting adjacent
pixels with the same gray level value called the run-
length, which is then encoded and stored.

 RLC works best for binary (two-valued) images.
 RLC can also work with complex images that have
been preprocessed by thresholding to reduce the
number of gray levels to two.
Cont…
 RLC can be implemented in various ways, but the first step is
to define the required parameters.

 Horizontal RLC (counting along the rows) or vertical RLC
(counting along the columns) can be used.
 In basic horizontal RLC, the number of bits used for the
encoding depends on the number of pixels in a row.
 If the row has 2^n pixels, then the required number of bits is n,
so that a run that is the length of the entire row can be
encoded.
 Example: see the sketch below.
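A minimal horizontal RLC sketch (the row values are my own example) for one row of a binary image:

```python
def rle_encode_row(row, first=0):
    # Store run lengths of alternating values, assuming the row starts
    # with `first` (a zero-length first run is emitted otherwise).
    runs, current, count = [], first, 0
    for pixel in row:
        if pixel == current:
            count += 1
        else:
            runs.append(count)           # close the run, switch value
            current, count = pixel, 1
    runs.append(count)
    return runs

row = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1]     # 10 pixels
print(rle_encode_row(row))                # [3, 2, 4, 1]
# With 2^n pixels per row, each run length fits in n bits
# (e.g. a 16-pixel row needs 4-bit run counts).
```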
Arithmetic coding
 Arithmetic coding is another widely used method for lossless
data compression, offering higher compression efficiency than
Huffman coding, especially for data with non-uniform
probability distributions.
 Developed by IBM researchers Jorma Rissanen and Glen G.
Langdon Jr. in the late 1970s, arithmetic coding encodes entire
sequences of symbols into a single floating-point number
between 0 and 1.
 The fractional part of this number represents the encoded
data, and it is gradually refined as each symbol is processed.
Cont…
 Arithmetic coding transforms input data into a
single floating point number between 0 and 1.
 As each input symbol (pixel value) is read the
precision required for the number becomes greater.
 As the images are very large and the precision of
digital computers is finite, the entire image must be
divided into small subimages to be encoded.
 Arithmetic coding uses the probability distribution of
the data (histogram), so it can theoretically achieve the
maximum compression specified by the entropy.
Cont…
 It works by successively subdividing the interval between
0 and 1, based on the placement of the current pixel value
in the probability distribution (a small sketch follows below).
 In practice, this technique may be used as part of an image
compression scheme, but it is impractical to use alone.
 It is one of the options available in the JPEG standard.
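A sketch of the interval subdivision idea (the symbol probabilities are illustrative; a practical coder uses integer arithmetic and renormalization to avoid the finite-precision problem noted above):

```python
def arithmetic_encode(message, probs):
    # Assign each symbol a sub-interval of [0, 1) proportional to its
    # probability, e.g. A: [0.0, 0.5), B: [0.5, 0.8), C: [0.8, 1.0).
    ranges, low = {}, 0.0
    for sym, p in probs.items():
        ranges[sym] = (low, low + p)
        low += p
    lo, hi = 0.0, 1.0
    for sym in message:
        span = hi - lo
        s_lo, s_hi = ranges[sym]
        lo, hi = lo + span * s_lo, lo + span * s_hi   # narrow the interval
    return (lo + hi) / 2    # any number in [lo, hi) identifies the message

x = arithmetic_encode("AAB", {"A": 0.5, "B": 0.3, "C": 0.2})
print(x)                    # 0.1625: one float encodes the whole message
```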
Bit-plane coding
 Bit-plane coding decomposes an image into multiple bit-
planes, where each bit-plane represents a different level of
significance in the image data.

 In this technique, the most significant bit-plane contains the
coarsest information, while the least significant bit-plane
contains the finest details.
 This method is particularly useful in image compression
algorithms like JPEG 2000, where it enables efficient coding
of image details at different resolutions.
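A minimal sketch of bit-plane decomposition (the image values are synthetic; real coders such as JPEG 2000 add context modelling on top of the planes):

```python
import numpy as np

img = np.array([[200, 55], [128, 7]], dtype=np.uint8)
planes = [(img >> b) & 1 for b in range(8)]   # planes[7] = most significant
print(planes[7])   # [[1 0] [1 0]]  coarsest information (like a threshold)
print(planes[0])   # [[0 1] [0 1]]  finest detail, often noise-like
# The image is exactly the sum of its weighted planes:
assert np.array_equal(sum(p << b for b, p in enumerate(planes)), img)
```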
Lossy Compression Methods

 Lossy compression methods are required to achieve
high compression ratios with complex images.
 They provide choices between image quality and
degree of compression, which allows the compression
algorithm to be customized to the application.