Unit 8
IMAGE COMPRESSION
The information that has less relative importance than other information in normal visual processing is said to be psychovisually redundant. It can be eliminated without significantly impairing the quality of image perception. The existence of psychovisual redundancies should not come as a surprise, because human perception of the information in an image normally does not involve quantitative analysis of every pixel value. In general, an observer searches for distinguishing features such as edges or textural regions and mentally combines them into recognizable groupings. The brain then correlates these groupings with prior knowledge in order to complete the image interpretation process. Psychovisual redundancy is fundamentally different from the other redundancies: unlike coding and interpixel redundancy, it is associated with real or quantifiable visual information, and its elimination is possible only because that information is not essential for normal visual processing. Since the elimination of psychovisually redundant data results in a loss of quantitative information, the process is commonly referred to as quantization. Generally, quantization means the mapping of a broad range of input values to a limited number of output values. As it is an irreversible operation, quantization results in lossy data compression.
Applications (primarily the transmission and storage of information):
Broadcast television.
Remote sensing via satellite.
Military communications via aircraft.
Radar and sonar.
Teleconferencing.
Computer communications.
Facsimile transmission.
Medical images.
8.2.4 FIDELITY CRITERIA: Removal of psychovisually redundant data results in a loss of real or quantitative visual information. Because information of interest may be lost, a reproducible means of quantifying the nature and extent of information loss is highly desirable. Two general classes of criteria are used as the basis for such an assessment:
Objective fidelity criteria.
Subjective fidelity criteria.
Objective Fidelity Criteria: When the level of information loss can be expressed as a function of the original or input image and the compressed and subsequently decompressed output image, it is said to be based on an objective fidelity criterion. A good example is the root-mean-square error between an input and output image. Let f(x, y) represent an input image and let f^(x, y) denote an estimate or approximation of f(x, y) that results from compressing and subsequently decompressing the input. For any value of x and y, the error between f(x, y) and f^(x, y) can be defined as

e(x, y) = f^(x, y) - f(x, y) (8.1)

so that the total error between the two images is

Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} |f^(x, y) - f(x, y)| (8.2)

where the images are of size M x N. The root-mean-square error e_rms between f(x, y) and f^(x, y) is then the square root of the squared error averaged over the M x N array, or
e_rms = [ (1/MN) Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} [f^(x, y) - f(x, y)]^2 ]^{1/2} (8.3)
A closely related objective fidelity criterion is the mean-square signal-to-noise ratio of the compressed-decompressed image. If f^(x, y) is considered to be the sum of the original image f(x, y) and a noise signal e(x, y), the mean-square signal-to-noise ratio of the output image, denoted SNR_ms, is
SNR_ms = Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} f^(x, y)^2 / Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} [f^(x, y) - f(x, y)]^2 (8.4)
The rms value of the signal-to-noise ratio, denoted SNR_rms, is obtained by taking the square root of SNR_ms.
Subjective Fidelity Criteria: Although objective fidelity criteria offer a simple and convenient mechanism for evaluating information loss, most decompressed images ultimately are viewed by humans. Consequently, measuring image quality by the subjective evaluations of human observers often is more appropriate. This can be accomplished by showing a typical decompressed image to an appropriate cross-section of viewers. Each viewer assigns a grade to the decompressed image with respect to the original. These grades may be drawn from a scale {-3, -2, -1, 0, 1, 2, 3} representing the subjective evaluations {much worse, worse, slightly worse, the same, slightly better, better, much better} respectively. The scale can, of course, be divided into coarser or finer bins. Finally, based on the grades assigned by all examiners, an overall grade is assigned to the decompressed image; this grade gives an idea of the subjective quality.
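As a concrete illustration, the objective criteria above can be computed directly. The following Python sketch evaluates e_rms and SNR_ms as defined in Eqs. (8.1) to (8.4); the two 2x2 "images" are hypothetical sample values, not data from this chapter.

```python
# Objective fidelity metrics, Eqs. (8.1)-(8.4), for two small images
# given as nested lists of pixel values (hypothetical sample data).

def rmse(f, f_hat):
    """Root-mean-square error between input image f and approximation f_hat."""
    M, N = len(f), len(f[0])
    total = sum((f_hat[x][y] - f[x][y]) ** 2
                for x in range(M) for y in range(N))
    return (total / (M * N)) ** 0.5

def snr_ms(f, f_hat):
    """Mean-square signal-to-noise ratio of the decompressed image."""
    num = sum(v ** 2 for row in f_hat for v in row)           # signal power
    den = sum((f_hat[x][y] - f[x][y]) ** 2                    # noise power
              for x in range(len(f)) for y in range(len(f[0])))
    return num / den

f     = [[100, 102], [98, 101]]   # original image (hypothetical)
f_hat = [[101, 100], [98, 103]]   # decompressed approximation (hypothetical)

e_rms = rmse(f, f_hat)
snr   = snr_ms(f, f_hat)
```

Taking the square root of `snr` would give SNR_rms, exactly as described above.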
Fig.8.1: A general compression system model: f(x, y) → source encoder → channel encoder → channel → channel decoder → source decoder → f^(x, y).
The above figure shows a compression system consisting of two distinct structural blocks: an encoder and a decoder. An input image f(x, y) is fed into the encoder, which creates a set of symbols from the input data. After transmission over the channel, the encoded representation is fed to the decoder, where a reconstructed output image f^(x, y) is generated. In general, f^(x, y) may or may not be an exact replica of f(x, y). If it is, the system is error free or information preserving; if not, some level of distortion is present in the reconstructed image. Both the encoder and decoder shown in the figure consist of two relatively independent functions or sub-blocks. The encoder is made up of a source encoder, which removes input redundancies, and a channel encoder, which increases the noise immunity of the source encoder's output. The decoder includes a channel decoder followed by a source decoder. If the channel between the encoder and decoder is noise free, the channel encoder and decoder are omitted, and the general encoder and decoder become the source encoder and decoder, respectively.
8.3.1 THE SOURCE ENCODER AND DECODER: The source encoder is responsible for reducing or eliminating any coding, interpixel, or psychovisual redundancies in the input image. The specific application and associated fidelity requirements dictate the best encoding approach to use in any given situation. Normally, the approach can be modelled by a series of three independent operations. As shown in the figure below, each operation is designed to reduce one of the three redundancies; fig (b) depicts the corresponding source decoder.
Fig.8.2: Functional block diagram of a general image compression system. In the first stage of the source encoding process, the mapper transforms the input data into a format designed to reduce interpixel redundancies in the input image. This operation generally is reversible and may or may not reduce directly the amount of data required to represent the image. Run-length coding is an example of a mapping that directly results in data compression in this initial stage of the overall encoding process. The representation of an image by a set of transform coefficients is an example of the opposite case: the mapper transforms the image into an array of coefficients, making its interpixel redundancies more accessible for compression in later stages of the encoding process. The second stage, the quantizer block, reduces the accuracy of the mapper's output in accordance with some pre-established fidelity criterion. This stage reduces the psychovisual redundancies of the input image. This operation is irreversible, so it must be omitted when error-free compression is desired. In the final stage, the symbol coder creates a fixed- or variable-length code to represent the quantizer output and maps the output in accordance with the code. The term symbol coder distinguishes this coding operation from the overall source encoding process. A variable-length code reduces coding redundancy. Fig (a) shows the source encoding process as successive operations, but all three operations are not necessarily included in every compression system. The source decoder shown in fig (b) contains only two components: a symbol decoder and an inverse mapper.
These blocks perform, in reverse order, the inverse operations of the source encoder's symbol encoder and mapper blocks. Because quantization results in irreversible information loss, an inverse quantizer block is not included in the general source decoder model.
8.3.2 THE CHANNEL ENCODER AND DECODER: These blocks are designed to reduce the impact of channel noise by inserting a controlled form of redundancy into the source encoded data. As the output of the source encoder contains little redundancy, it would be highly sensitive to transmission noise without the addition of this controlled redundancy. One of the most useful channel coding techniques is the Hamming code, devised by R. W. Hamming.
Hamming Code:
Number of parity bits: 2^P >= X + P + 1, where X is the number of message bits and P is the number of parity bits.
Example: for X = 4 and P = 2, X + P + 1 = 7 but 2^P = 4, so the condition fails. For X = 4 and P = 3, X + P + 1 = 8 and 2^P = 8, so the condition is satisfied. Therefore the total number of bits is 4 + 3 = 7, a 7-bit code.
Location of parity bits: parity bits are located in the positions numbered by ascending powers of 2, i.e., 2^0, 2^1, 2^2, ... = 1, 2, 4, 8, 16, ...
Bit location:            7   6   5   4   3   2   1
Binary location number: 111 110 101 100 011 010 001
Bit designation: positions 7, 6, 5, and 3 hold the information bits, while positions 4, 2, and 1 hold the parity bits P4, P2, and P1.
Each parity bit checks all the bit locations that have a 1 in the same position of their binary location numbers.
If a single bit is received in error, re-evaluating the parity checks at the receiver yields a nonzero result whose binary value is the location code of the erroneous bit, which can then be complemented to correct it.
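A minimal Python sketch of this (7,4) Hamming scheme, assuming even parity and at most a single-bit error, with bit positions numbered 1 to 7 as in the table above:

```python
# Hamming (7,4): 4 data bits at positions 3, 5, 6, 7; parity bits at 1, 2, 4.

def hamming74_encode(d):
    """d: list of 4 data bits -> 7-bit code word (positions 1..7, even parity)."""
    c = [0] * 8                      # c[1..7]; c[0] unused for 1-based indexing
    c[3], c[5], c[6], c[7] = d
    for p in (1, 2, 4):              # each parity bit covers the positions whose
        for i in range(1, 8):        # binary location number has that bit set
            if i != p and (i & p):
                c[p] ^= c[i]
    return c[1:]

def hamming74_correct(code):
    """Return (corrected code word, error position or 0 if none)."""
    c = [0] + list(code)
    syndrome = 0
    for p in (1, 2, 4):              # re-evaluate each parity check
        s = 0
        for i in range(1, 8):
            if i & p:
                s ^= c[i]
        if s:
            syndrome += p            # failed checks sum to the error location
    if syndrome:
        c[syndrome] ^= 1             # flip the erroneous bit
    return c[1:], syndrome

word = hamming74_encode([1, 0, 1, 1])
word_err = word[:]
word_err[4] ^= 1                     # corrupt bit position 5 (index 4)
fixed, pos = hamming74_correct(word_err)
```

Running this, `pos` comes out as 5, the exact position that was corrupted, and `fixed` matches the original code word.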
1(c) Every other node forms a node at the higher level, and the same probability is associated with it.
1(d) Repeat steps 1(b) and 1(c) until a single node, the root node, is reached. We have thus constructed a binary tree; a speciality of this tree is that at each level only one node splits into two nodes.
Step 2: Assign binary code words to each gray level depending on its probability.
2(a) Assign a null code to the root node.
2(b) Whenever a node splits into two nodes, the code corresponding to the right child is constructed by appending a 1 to the code at that node, and for the left child a 0 is appended.
2(c) The code at any node that has a single child is assigned unchanged to the child node.
2(d) Repeat steps 2(b) and 2(c) until the terminal nodes are reached.
The binary code word assigned to each terminal node is then used to represent the corresponding gray level. It is clear that the assigned code words are of unequal length.
Example: six gray levels with probabilities {0.1, 0.4, 0.06, 0.1, 0.04, 0.3}.
Fig.8.4: Huffman code assignment procedure. Note: the pixel intensities are arranged in order of decreasing probability: 0.4, 0.3, 0.1, 0.1, 0.06, 0.04.
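One common way to implement the assignment procedure above is with a priority queue that repeatedly merges the two least probable nodes. The sketch below uses the example probabilities; the symbol names a1 to a6 are hypothetical labels for the six gray levels.

```python
import heapq

def huffman(probs):
    """probs: dict symbol -> probability. Returns dict symbol -> code string."""
    # heap entries: (probability, unique tie-breaker, {symbol: code-so-far})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    n = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)   # merge the two least probable nodes,
        p1, _, c1 = heapq.heappop(heap)   # prepending 0/1 to their subtrees
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (p0 + p1, n, merged))
        n += 1
    return heap[0][2]

probs = {"a1": 0.1, "a2": 0.4, "a3": 0.06, "a4": 0.1, "a5": 0.04, "a6": 0.3}
codes = huffman(probs)
avg_len = sum(probs[s] * len(codes[s]) for s in probs)  # bits per symbol
```

The most probable gray level (0.4) receives a 1-bit code and the least probable ones receive the longest codes, giving an average length of 2.2 bits per symbol for this example; the resulting code is prefix-free, so it decodes unambiguously.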
The root node probability will be 1.
8.4.3 ARITHMETIC CODING: Arithmetic coding generates non-block codes; a one-to-one correspondence between source symbols and code words does not exist. Instead, an entire sequence of source symbols is assigned a single arithmetic code word. The code word itself defines an interval of real numbers between 0 and 1. As the number of symbols in the message increases, the interval used to represent it becomes smaller, and the number of bits required to represent the interval becomes larger. Each symbol of the message reduces the size of the interval in accordance with its probability of occurrence. Because the technique does not require that each source symbol translate into an integral number of code symbols, as the length of the sequence being coded increases the resulting arithmetic code approaches the bound established by the noiseless coding theorem. Two factors cause coding performance to fall short of this bound: the addition of the end-of-message indicator that is needed to separate one message from another, and the use of finite-precision arithmetic. Practical implementations of arithmetic coding address the latter problem by introducing a scaling strategy and a rounding strategy. The scaling strategy renormalizes each subinterval to the [0, 1] range before subdividing it in accordance with the symbol probabilities. The rounding strategy guarantees that the truncations associated with finite-precision arithmetic do not prevent the coding subintervals from being represented accurately.
Example: source symbol probabilities 0.4, 0.5, and 0.1.
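Both directions can be sketched in a few lines of Python using the probabilities above; the symbol names a1 to a3 are hypothetical labels. Exact rational arithmetic (`fractions`) is used here purely to sidestep the finite-precision problem just described; practical coders use the scaling and rounding strategies instead.

```python
from fractions import Fraction as F

# Symbols with probabilities 0.4, 0.5, 0.1 occupy these sub-intervals of [0, 1)
RANGES = {"a1": (F(0), F(2, 5)),
          "a2": (F(2, 5), F(9, 10)),
          "a3": (F(9, 10), F(1))}

def encode(message):
    """Narrow [0, 1) once per symbol; any number in the final interval codes it."""
    low, high = F(0), F(1)
    for sym in message:
        s_low, s_high = RANGES[sym]
        width = high - low
        low, high = low + width * s_low, low + width * s_high
    return low, high

def decode(code, n):
    """Recover n symbols by locating the code and rescaling back to [0, 1)."""
    out = []
    for _ in range(n):
        for sym, (s_low, s_high) in RANGES.items():
            if s_low <= code < s_high:
                out.append(sym)
                code = (code - s_low) / (s_high - s_low)  # eliminate its effect
                break
    return out

low, high = encode(["a2", "a2", "a2", "a3"])
```

For this message the final interval is [0.8125, 0.825), matching the worked example that follows, and decoding any number in that interval recovers the original symbol sequence.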
Fig.8.5: Arithmetic coding example. The three source symbols, with probabilities 0.4, 0.5, and 0.1, are assigned the sub-intervals [0, 0.4), [0.4, 0.9), and [0.9, 1) of [0, 1). Encoding a message consisting of the second symbol three times followed by the third symbol:
Initial: [0.4, 0.9]
[0.4 + (0.9 - 0.4)*0.4, 0.4 + (0.9 - 0.4)*0.9] = [0.6, 0.85]
[0.6 + (0.85 - 0.6)*0.4, 0.6 + (0.85 - 0.6)*0.9] = [0.7, 0.825]
[0.7 + (0.825 - 0.7)*0.9, 0.7 + (0.825 - 0.7)*1] = [0.8125, 0.825]
Any number in the final interval, e.g. 0.8125, serves as the code.
Decoding:
First: 0.8125 lies in [0.4, 0.9), giving the second symbol; eliminating its effect, code = (0.8125 - 0.4)/0.5 = 0.825.
Here 0.5 = 0.9 - 0.4 is the range of the symbol's sub-interval, taken from the tabular information.
Second: 0.825 lies in [0.4, 0.9), again giving the second symbol; code = (0.825 - 0.4)/0.5 = 0.85.
Third: 0.85 lies in [0.4, 0.9), giving the second symbol; code = (0.85 - 0.4)/0.5 = 0.9.
Fourth: 0.9 lies in [0.9, 1), giving the third symbol; code = (0.9 - 0.9)/0.1 = 0, where 0.9 is the minimum value of that sub-interval, and decoding terminates.
8.4.4 LZW CODING: Lempel-Ziv-Welch (LZW) coding reduces an image's interpixel redundancies. It assigns fixed-length code words to variable-length sequences of source symbols. It uses the principle of Shannon's first theorem, which states that the nth extension of a zero-memory source can be coded with fewer average bits per source symbol than the non-extended source itself. LZW is the main compression method used in the graphics interchange format (GIF), the tagged image file format (TIFF), and the portable document format (PDF). For 8-bit monochrome images, the values 0-255 form the first 256 words in the dictionary. As sequential image pixels are processed, each previously unseen pixel sequence is assigned the next free code; for example, code 256 might be assigned to the pixel pair 255-255, so with a 512-word dictionary the 8 + 8 bits of such a pair are sent as a single 9-bit code.
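The dictionary-building encoder just described can be sketched as follows; the 4x4 image (pixel values 39 and 126 repeating row by row) is a hypothetical example chosen so the runs recur.

```python
# LZW encoding of a sequence of 8-bit pixels. The dictionary is seeded with
# the 256 single-pixel values (codes 0-255); new sequences get codes 256+.

def lzw_encode(pixels):
    """Return the list of dictionary codes for a sequence of 8-bit pixels."""
    dictionary = {(v,): v for v in range(256)}
    next_code = 256
    out, current = [], ()
    for p in pixels:
        candidate = current + (p,)
        if candidate in dictionary:
            current = candidate              # keep extending the match
        else:
            out.append(dictionary[current])  # emit code for the longest match
            dictionary[candidate] = next_code
            next_code += 1
            current = (p,)
    if current:
        out.append(dictionary[current])
    return out

# Hypothetical 4x4 image, flattened row by row: each row is 39, 39, 126, 126
row = [39, 39, 126, 126] * 4
codes = lzw_encode(row)
```

Here the 16 pixels (16 x 8 = 128 bits) compress to 10 codes; at 9 bits per code that is 90 bits, because repeated pixel pairs and longer sequences are replaced by single dictionary codes.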
Fig.8.7: LZW coding example.
Total bits (original) = 4*4*8 = 128 bits. Total bits (compressed) = 10*9 = 90 bits.
Note: compression ratio = 128/90 ≈ 1.42.
8.4.5 BIT PLANE CODING: Bit plane coding reduces an image's interpixel redundancies by processing the image's bit planes individually. The concept behind bit plane coding is to decompose a multilevel image into a series of binary images and compress each binary image. Bit plane decomposition: a simple method of decomposing the image into a collection of binary images is to separate the m coefficients of its base-2 polynomial representation into m 1-bit planes. The 0th-order bit plane is generated by collecting the a0 bit of each pixel, while the (m-1)st-order bit plane contains the a_{m-1} bits. The disadvantage is that small changes in gray level can have a significant impact on the complexity of the bit planes; for example, the adjacent gray levels 127 (01111111) and 128 (10000000) differ in every bit plane. An alternative decomposition is to first represent the image by an m-bit gray code.
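The standard binary-reflected gray code can be computed as g = a XOR (a >> 1), and the 127/128 case above checks out directly; this is a minimal sketch, not tied to any particular image.

```python
def to_gray(n):
    """Standard binary-reflected gray code: g = a XOR (a >> 1)."""
    return n ^ (n >> 1)

# 127 = 01111111 and 128 = 10000000: every one of the 8 binary planes changes
binary_diff = bin(127 ^ 128).count("1")

# ...but their gray codes differ in only a single bit plane
gray_diff = bin(to_gray(127) ^ to_gray(128)).count("1")
```

`binary_diff` is 8 while `gray_diff` is 1, which is exactly why gray-coded bit planes tend to be less complex.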
Successive gray code words differ in only one bit position, so small changes in gray level are less likely to affect all m bit planes, and gray-coded bit planes are less complex than the corresponding binary bit planes.
8.4.6 LOSSLESS PREDICTIVE CODING: This approach is based on eliminating the interpixel redundancies of closely spaced pixels by extracting and coding only the new information in each pixel. The new information of a pixel is defined as the difference between the actual and predicted values of that pixel.
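A first-order sketch of this idea in Python; the previous-pixel predictor (order m = 1 with coefficient 1) is an assumption chosen for illustration, not the only possible predictor.

```python
# Lossless predictive coding along one scan line with f^_n = f_{n-1}:
# only the first pixel and the prediction errors e_n are coded.

def predictive_encode(row):
    """Return (first pixel, list of prediction errors e_n = f_n - f^_n)."""
    errors = [row[n] - row[n - 1] for n in range(1, len(row))]
    return row[0], errors

def predictive_decode(first, errors):
    """Invert the encoder: f_n = e_n + f^_n."""
    row = [first]
    for e in errors:
        row.append(row[-1] + e)
    return row

row = [100, 102, 103, 103, 104, 107]     # hypothetical scan line
first, errors = predictive_encode(row)   # errors stay small: [2, 1, 0, 1, 3]
```

Because neighbouring pixels are similar, the errors cluster near zero and can be coded with short variable-length codes, while decoding reproduces the row exactly.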
Fig.8.8: A lossless predictive coding model: (a) encoder; (b) decoder. As each successive pixel of the input image, fn, is introduced to the encoder, the predictor generates the anticipated value of that pixel based on some number of past inputs. The output of the predictor is then rounded to the nearest integer, denoted f^n, and used to form the difference or prediction error en = fn - f^n, which is coded using a variable-length code to generate the next element of the compressed data stream. The decoder reconstructs en from the received variable-length code words and performs the inverse operation fn = en + f^n. Various local, global, and adaptive methods can be used to generate f^n. In most cases, the prediction is formed by a linear combination of the m previous pixels, i.e.,

f^n = round[ Σ_{i=1}^{m} α_i f_{n-i} ]

where m is the order of the linear predictor and the α_i, for i = 1, 2, ..., m, are prediction coefficients.
8.4.7 RUN LENGTH ENCODING: Run length encoding (RLE) is a very simple form of data compression in which runs of data (i.e., sequences in which the same data value occurs in many consecutive data elements) are stored as a single data value and count, rather than as the original run. This is most useful on data that contains many such runs, for example simple graphic images such as line drawings. Example:
Fig.8.9: Image frame. Consider a screen containing lines on a constant white background: there will be many long runs of white pixels in the blank space and many short runs of black pixels. Take a hypothetical single scan line, with B representing a black pixel and W representing white:
Code: WWBBBBBWWWWWBBBBBWWWWWWWWWWB
If we apply a simple run length code to the above hypothetical scan line we get the following:
Code: 2W5B5W5B10W1B
Interpret this as two whites, five blacks, five whites, five blacks, ten whites, one black. The run length code represents the original 28 characters in only 12. The actual format used for the storage of images is generally binary rather than ASCII characters like this, but the principle remains the same: any binary data file can be compressed with this method, and file format specifications often dictate repeated bytes in files as padding space. Suppose, however, we have a string of English text such as MYDOGHASFLEAS. It would be encoded as 1M1Y1D1O1G1H1A1S1F1L1E1A1S: we have represented 13 bytes with 26 bytes, a compression ratio of 0.5. The data is actually expanded by a factor of two, so a better method is required. One such method represents unique strings of data as the original strings and run length encodes only repetitive data. This is done with a special prefix character to flag runs; runs are then represented as the special character followed by the count followed by the data. Using + as the special prefix character, the string ABCDDDDDDDDDEEEEEEEEE (21 bytes) encodes as ABC+9D+9E (9 bytes), a compression ratio of 21/9 ≈ 2.3. Since it takes three bytes to encode a run of data, it makes sense to encode only runs of length 3 or longer; otherwise we expand our data. When the special prefix character itself is found in the source data, it must be encoded as a run of length 1. Since this expands that character by a factor of 3, we want to pick a character that occurs infrequently as our prefix character [59]. The MacPaint image file format uses run length encoding, combining the prefix character with the count byte. It has two types of data strings with corresponding prefix bytes: one encodes runs of repetitive data, the other encodes strings of unique data.
Fig.8.10: MacPaint encoding format. Each data string consists of a count byte followed by data: a repeat string stores the count in bits 6-0 with the MSB set, followed by the single data byte to repeat, while a literal string stores the count in bits 6-0 with the MSB clear, followed by 1 to 72 bytes of unique data. The MSB of the prefix byte thus determines whether the string that follows is repeating data or unique data. If the bit is set, that byte stores the count (in two's complement) of how many times to repeat the next data byte. If the bit is not set, that byte plus one is the number of following bytes that are unique and can be copied verbatim to the output. Only seven bits are used for the count. The width of an original MacPaint image is 576 pixels, so runs are therefore limited to 72 bytes. The PCX file format run length encodes the separate planes of an image. It sets the two MSBs of a byte if there is a run, leaving 6 bits and limiting the count to 63. Other image file formats that use
run length encoding include RLC and GEM. The TIFF and TGA file format specifications allow for optional run length encoding of image data. Run length encoding works very well for images with solid backgrounds, like cartoons; for natural images it does not work as well. Also, because run length encoding capitalizes on characters repeating more than 3 times, it does not work well with English text. A method that would achieve better results is one that uses fewer bits to represent the most frequently occurring data and more bits for data that occurs less frequently. This is the idea behind Huffman coding.
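The prefix-character scheme described above can be sketched in a few lines; this minimal version ignores escaping of the prefix character itself and assumes ASCII text input.

```python
# Prefix-character RLE: unique characters pass through unchanged; runs of 3 or
# more become "+", count, value. Escaping "+" in the source data is omitted
# here for simplicity, as is any limit on the count field.

def rle_encode(s, prefix="+"):
    out, i = [], 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1                       # scan to the end of the current run
        run = j - i
        if run >= 3:                     # runs shorter than 3 would expand
            out.append(f"{prefix}{run}{s[i]}")
        else:
            out.append(s[i] * run)       # copy short runs verbatim
        i = j
    return "".join(out)

encoded = rle_encode("ABCDDDDDDDDDEEEEEEEEE")
```

For the 21-character example string this produces the 9-character `ABC+9D+9E`, while strings with no long runs pass through untouched rather than doubling in size.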
Fig.8.11: A lossy predictive coding model: (a) encoder; (b) decoder. Here f^n is the prediction formed by a linear combination of the m previous pixels. This closed-loop configuration prevents error build-up at the decoder's output. Delta modulation is a simple but well-known form of lossy predictive coding. It uses a one-step delay function as the predictor and a 1-bit quantizer, giving a 1-bit representation of the signal.
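Delta modulation can be sketched directly; the step size delta = 4 and the sample values below are hypothetical choices for illustration.

```python
# Delta modulation: one-step-delay predictor plus a 1-bit quantizer that
# emits +delta or -delta; the predictor tracks the reconstructed signal,
# mirroring the closed-loop configuration described above.

def delta_modulate(samples, delta=4.0):
    """Return (bits, reconstruction); bits[n] is 1 for +delta, 0 for -delta."""
    bits, recon = [], []
    prediction = samples[0]          # assume the decoder knows the start value
    for s in samples:
        e = s - prediction
        q = delta if e >= 0 else -delta      # 1-bit quantizer
        prediction = prediction + q          # one-step-delay predictor
        bits.append(1 if q > 0 else 0)
        recon.append(prediction)
    return bits, recon

samples = [10, 12, 18, 20, 20, 14]   # hypothetical input signal
bits, recon = delta_modulate(samples)
```

The trace shows both characteristic distortions of delta modulation: the reconstruction lags behind the fast rise from 12 to 20 (slope overload), and it oscillates around the flat region at 20 (granular noise).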
Fig.8.11: A block transform coding system: (a) encoder; (b) decoder. This coding achieves greater compression than predictive methods, and any distortion due to quantization and channel errors gets distributed, during the inverse transformation, over the entire image; predictive coding, however, has much lower complexity both in terms of memory and number of operations.
8.5.3 WAVELET CODING: Wavelet coding is based on the idea that the coefficients of a transform that de-correlates the pixels of an image can be coded more efficiently than the original pixels themselves. If the wavelet packs most of the important visual information into a small number of coefficients, the remaining coefficients can be quantized coarsely or truncated to zero with little image distortion.
Fig.8.11: A wavelet coding system: (a) encoder; (b) decoder. The above figure shows the wavelet coding system. To encode an image, an analyzing wavelet and a minimum decomposition level, J - P, are selected and used to compute the image's discrete wavelet transform. If the wavelet has a complementary scaling function, the fast wavelet transform can be used. In either case, the computed transform converts a large portion of the original image to horizontal, vertical, and diagonal decomposition coefficients with zero mean and Laplacian-like distributions. Since many of the computed coefficients carry little visual information, they can be quantized and coded to minimize inter-coefficient and coding redundancy. The quantization can be adapted to exploit any positional correlations across the P decomposition levels. Decoding is accomplished by inverting the encoding operations, with the exception of quantization, which cannot be reversed exactly. Because wavelet transforms are both computationally efficient and inherently local, subdivision of the original image is unnecessary. The removal of the subdivision step eliminates the blocking artifact that characterizes DCT-based approximations at higher compression ratios.
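The transform-quantize-reconstruct cycle above can be sketched in one dimension with the Haar analyzing wavelet (an assumption chosen for simplicity; any wavelet with a scaling function would do). Small detail coefficients are truncated to zero, which is the irreversible quantization step, and the signal is rebuilt from what survives; the signal values and the threshold are hypothetical.

```python
import math

def haar_forward(signal):
    """One decomposition level: (approximation, detail) coefficient lists."""
    s = math.sqrt(2)
    approx = [(signal[2 * i] + signal[2 * i + 1]) / s
              for i in range(len(signal) // 2)]
    detail = [(signal[2 * i] - signal[2 * i + 1]) / s
              for i in range(len(signal) // 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Perfectly inverts haar_forward when no coefficients are altered."""
    s = math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out += [(a + d) / s, (a - d) / s]
    return out

signal = [10.0, 10.0, 12.0, 12.0, 80.0, 82.0, 12.0, 10.0]  # hypothetical row
approx, detail = haar_forward(signal)

# Quantization step: truncate small detail coefficients to zero
detail_q = [d if abs(d) >= 2.0 else 0.0 for d in detail]
recon = haar_inverse(approx, detail_q)
```

Without quantization the reconstruction is exact; with the small details zeroed, each pixel pair is replaced by its average, so the smooth regions survive almost unchanged while only fine detail is lost. For an image, the same filtering is applied along rows and columns, yielding the horizontal, vertical, and diagonal coefficient sub-bands described above.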