1.6. Digitization
3- Chapter Three: Image Restoration
3.1 Introduction
3.2. Noise
4- Chapter Four: Image Compression
4.1 Introduction
4.2 Compression System Model
4.3. Lossless Compression Methods
5- Chapter Five: Transformation
5.1 Colors Transforms
5.2. Discrete Transforms
5.3 Wavelet Transform
Introduction to Computer Vision
and Image Processing
What Is Digital Image Processing?
An image may be defined as a two-dimensional function f(x, y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point.
When x, y, and the intensity values of f are all finite, discrete quantities, we call the image a digital image.
Note that a digital image is a representation of a two-dimensional image as a finite set of digital values, composed of a finite number of elements, each of which has a particular location and value. These elements are called picture elements, image elements, or pixels. Pixel is the term used most widely to denote the elements of a digital image.
nerves that contains the pathways for visual information to travel from
the receiving sensor (the eye) to the processor (the brain).
1.1 Computer Imaging
Computer imaging can be defined as the acquisition and processing of visual information by computer. The computer representation of an image requires the equivalent of many thousands of words of data, and this massive amount of data is a primary reason for the development of many subareas within the field of computer imaging, such as image compression and segmentation. Another important aspect of computer imaging involves the ultimate "receiver" of the visual information: in some cases the human visual system, and in others the computer itself.
Computer imaging can be separated into two primary categories: computer vision and image processing. (In computer vision applications the processed images are output for use by a computer, whereas in image processing applications the output images are for human consumption.) These two categories are not totally separate and distinct; the boundaries that separate them are fuzzy, but this division allows us to explore the differences between the two and to understand how they fit together (Figure 1.1). Computer imaging can thus be separated into two different but overlapping areas.
3. The field of law enforcement and security is an active area for computer vision system development, with applications ranging from automatic identification of fingerprints to DNA analysis.
4. Infrared imaging.
1. Image restoration.
2. Image enhancement.
3. Image compression.
1.3.1 Image Restoration
Figure: a. original (degraded) image; b. restored image.
Figure (1.3) Image Enhancement: a. image with poor contrast; b. image enhanced by contrast stretching.
1.3.3 Image Compression
1.4. Image Processing Applications
Image processing systems are used in many and various types of environments, such as:
1. Medicine: the medical community has many important applications for image processing involving various types of diagnostic imaging, for example Magnetic Resonance Imaging (MRI), CT scanning, and X-ray imaging, which allow the medical professional to look into the human body without the need to cut it open.
2. Computer-Aided Design (CAD), which uses tools from image processing and computer graphics, allows the user to design a new building or spacecraft and explore it from the inside out.
3. Virtual reality is one application that exemplifies future possibilities.
4. Machine/robot vision: making a robot able to see things, identify them, and identify obstacles.
1.6. Digitization
Digitization is the process of transforming a standard video signal into a digital image. This transformation is necessary because the standard video signal is in analog (continuous) form, and the computer requires a digitized or sampled version of that continuous signal. The analog video signal is turned into a digital image by sampling the continuous signal at a fixed rate. In the figure below we see one line of a video signal being sampled (digitized) by instantaneously measuring the voltage of the signal (its amplitude) at fixed intervals in time.
The value of the voltage at each instant is converted into a number that is stored, corresponding to the brightness of the image at that point. Note that the brightness of the image at that point depends on both the intrinsic properties of the object and the lighting conditions in the scene.
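As a rough sketch of this sampling-and-quantization step (Python; the waveform, sampling rate, and function names here are illustrative assumptions, not from the text):

    import math

    def sample_line(signal, duration, num_samples, v_min=0.0, v_max=1.0):
        # Sample a continuous signal (a function of time) at fixed
        # intervals and quantize each voltage into an 8-bit value 0..255.
        samples = []
        for k in range(num_samples):
            t = k * duration / num_samples             # fixed sampling interval
            v = min(max(signal(t), v_min), v_max)      # instantaneous voltage, clipped
            samples.append(round((v - v_min) / (v_max - v_min) * 255))
        return samples

    # One video line modeled as a hypothetical sinusoidal voltage.
    line = sample_line(lambda t: 0.5 + 0.5 * math.sin(2 * math.pi * 5 * t),
                       duration=1.0, num_samples=64)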
When we have the data in digital form, we can use software to process the data. The digital image is a 2-D array of pixel values. For an image matrix of size N × N (the matrix dimension):

    Ng = 2^m        (1)

where Ng denotes the number of gray levels and m is the number of bits per pixel in the digital image matrix.
Example: If we have 6 bits per pixel in a 128 × 128 image, find the number of gray levels needed to represent it, then find the number of bits in this image.
Solution:
Ng = 2^6 = 64 gray levels.
Number of bits = 128 × 128 × 6 = 98,304 bits.
1.7 Image Resolution
Pixels are the building blocks of every digital image. Clearly defined squares of light and color data are stacked up next to one another both horizontally and vertically. Each picture element (pixel for short) has a dark-to-light value from 0 (solid black) to 255 (pure white); that is, there are 256 defined values. A gradient is the gradual transition from one value to another in sequence.
Resolution has to do with the ability to separate two adjacent pixels as distinct; when we can, we say that we can resolve the two. The concept of resolution is closely tied to the concept of spatial frequency. There are three types of image resolution:
1- Vertical resolution: the number M of rows in the image (image scan lines).
2- Horizontal resolution: the number N of columns in the image.
3- Spatial frequency resolution: represented by the product (M × N) and closely tied to the concept of spatial frequency, which refers to how rapidly the signal changes in space. Consider a signal that alternates between two brightness values, 0 and maximum. If we use this signal for one line (row) of an image and then repeat the line down the entire image, we get an image of vertical stripes. If we increase this frequency, the stripes get closer and closer together until they finally blend together, as shown in the figure below. Note that the higher the resolution, the more detail (higher spatial frequencies) we can see.
Figure (1.4) Resolution and spatial frequency.
A display with 240 pixel columns and 320 pixel rows would
generally be said to have a resolution of 240×320.
Resolution can also be used to refer to the total number of pixels in a digital camera image. For example, a camera that can create images of 1600 × 1200 pixels will sometimes be referred to as a 2-megapixel camera, since 1600 × 1200 = 1,920,000 pixels, or roughly 2 million pixels. A megapixel (that is, a million pixels) is a unit of image sensing capacity in a digital camera. In general, the more megapixels in a camera, the better the resolution when printing an image at a given size.
Below is an illustration of how the same image might appear at
different pixel resolutions, if the pixels were poorly rendered as sharp
squares (normally, a smooth image reconstruction from pixels would
be preferred, but for illustration of pixels, the sharp squares make the
point better).
An image that is 2048 pixels in width and 1536 pixels in height has a
total of 2048×1536 = 3,145,728 pixels or 3.1 megapixels. One could
refer to it as 2048 by 1536 or a 3.1-megapixel image.
1.8. Image Representation
1.8.3. Color image.
Color image can be modeled as three band monochrome image data,
where each band of the data corresponds to a different color. The
actual information stored in the digital image data is brightness
information in each spectral band. When the image is displayed, the
corresponding brightness information is displayed on the screen by
picture elements that emit light energy corresponding to that particular
color. Typical color images are represented as red, green, and blue or
RGB images. Using the 8-bit monochrome standard as a model, the
corresponding color image would have 24 bit/pixel – 8 bit for each
color bands (red, green and blue). The following figure we see a
representation of a typical RGB color image.
IR(r,c) IG(r,c) IB(r,c)
19
Figure (1.8): A color pixel vector consists of the red, green, and blue pixel values (R, G, B) at one given row/column pixel coordinate (r, c).
The RGB color model is used in color CRT monitors; in this model, red, green, and blue are added together to give the resultant color white.
For many applications, RGB color information is transformed into a mathematical space that decouples the brightness information from the color information.
The lightness is the brightness of the color, and the hue is what we normally think of as "color" (e.g., green, blue, red, or orange). The saturation is a measure of how much white is in the color (e.g., pink is red with more white, so it is less saturated than a pure red). Most people relate to this method of describing color.
Example: "a deep, bright orange" would have a large intensity ("bright"), a hue of "orange", and a high value of saturation ("deep"). We can picture this color in our minds, but if we defined this color in terms of its RGB components, R = 245, G = 110, and B = 20, most people would have no idea how this color appears. Modeling the color information creates a more people-oriented way of describing colors.
1.9. Multispectral images.
Multispectral images typically contain information outside the normal human perceptual range. This may include infrared, ultraviolet, X-ray, acoustic, or radar data. These are not images in the usual sense, because the information represented is not directly visible by the human visual system. Sources of these types of images include satellite systems, underwater sonar systems, and medical diagnostic imaging systems.
1.10. Digital Image File Format
Why do we need so many different types of image file formats?
• The short answer is that there are many different types of images and applications with varying requirements.
• A more complete answer also considers market share, proprietary information, and a lack of coordination within the imaging industry.
Many image types can be converted to another type by easily available image conversion software. A field related to computer imaging is computer graphics.
The differences between vector and raster graphics include:
3- Because vector graphics are not made of pixels, the images can be scaled to be very large without losing quality. Raster graphics, on the other hand, become "blocky," since each pixel increases in size as the image is made larger.
Most file format types fall into the category of bitmap images. In general, these types of images contain both header information and the raw pixel data. The header information contains information regarding:
1. The number of rows (height).
2. The number of columns (width).
3. The number of bands.
4. The number of bits per pixel.
5. The file type.
6. Additionally, with some of the more complex file formats, the header may contain information about the type of compression used and other parameters necessary to create the image, I(r, c).
independently of the display device (such as a graphics adapter), especially on Microsoft Windows and OS/2 operating systems.
The BMP file format is capable of storing 2D digital images of arbitrary width, height, and resolution, both monochrome and color, in various color depths, and optionally with data compression, alpha channels, and color profiles. BMP is a historic (but still commonly used) file format for the Windows operating system. BMP images can range from black and white (1 bit per pixel) up to 24-bit color (16.7 million colors). While the images can be compressed, this is rarely done in practice and won't be discussed in detail here.
Structure
A BMP file consists of either 3 or 4 parts, as shown in the following diagram.
Header
The header consists of the following fields. Note that we are assuming
short int of 2 bytes, int of 4 bytes, and long int of 8 bytes.
Information
The image info data that follows is 40 bytes in length; its structure is given below. The fields of most interest are the image width and height, the number of bits per pixel (which should be 1, 4, 8, or 24), the number of planes (assumed to be 1 here), and the compression type (assumed to be 0 here).
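As a sketch of how these fields might be read in practice (Python; this assumes the standard 14-byte BMP file header followed by the 40-byte info header, both little-endian, and reads only the fields named above):

    import struct

    def read_bmp_info(path):
        with open(path, "rb") as f:
            # 14-byte file header: 'BM' signature, file size, two
            # reserved fields, and the offset to the pixel data.
            sig, size, _r1, _r2, offset = struct.unpack("<2sIHHI", f.read(14))
            if sig != b"BM":
                raise ValueError("not a BMP file")
            # First 20 bytes of the 40-byte info header: header size,
            # width, height, planes, bits per pixel, compression type.
            hsize, width, height, planes, bpp, comp = struct.unpack(
                "<IiiHHI", f.read(20))
        return {"width": width, "height": height, "planes": planes,
                "bits_per_pixel": bpp, "compression": comp,
                "data_offset": offset}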
The compression types supported by BMP are listed below:
0 - no compression
1 - 8-bit run length encoding
2 - 4-bit run length encoding
3 - RGB bitmap
2. TIFF (Tagged Image File Format) and GIF (Graphics Interchange Format):
These are among the most popular and flexible of the current public-domain raster file formats, and they are used on the World Wide Web (WWW). GIF files are limited to a maximum of 8 bits/pixel and allow for a type of compression called LZW. The GIF image header is 13 bytes long and contains basic information.
2. Data Reduction:
Involves either reducing the data in the spatial domain or transforming it into another domain, called the frequency domain, and then extracting features for the analysis process.
3. Feature Analysis:
The features extracted by the data reduction process are examined and evaluated for their use in the application.
After preprocessing we can perform segmentation on the image in the spatial domain or convert it into the frequency domain via a mathematical transform. After these processes we may choose to filter the image. This filtering process further reduces the data and allows us to extract the features that we may require for analysis.
The first two pixels in the first row are averaged, (8+4)/2 = 6, and this number is inserted between those two pixels. This is done for every pixel pair in each row, and then for every pixel pair in each column:

[ 8    6      4    6      8 ]
[ 6    6      6    6      6 ]
[ 4    6      8    6      4 ]
[ 6  5.5≈6    5  5.5≈6    6 ]
[ 8    5      2    5      8 ]
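A minimal sketch of this row-then-column averaging (Python with numpy; the function name is illustrative, and values are kept as floats where the text rounds 5.5 up to 6):

    import numpy as np

    def first_order_hold(img):
        # Insert the average of each horizontally adjacent pair of pixels.
        img = np.asarray(img, dtype=float)
        out = np.empty((img.shape[0], 2 * img.shape[1] - 1))
        out[:, ::2] = img
        out[:, 1::2] = (img[:, :-1] + img[:, 1:]) / 2
        # Then insert the average of each vertically adjacent pair.
        full = np.empty((2 * out.shape[0] - 1, out.shape[1]))
        full[::2, :] = out
        full[1::2, :] = (out[:-1, :] + out[1:, :]) / 2
        return full

    print(first_order_hold([[8, 4, 8], [4, 8, 4], [8, 2, 8]]))  # the 5x5 result above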
Solution:
a. First-Order Hold.
Size of original image: 3×3.
Size of the resultant image: 5×5.
(150+50)/2 = 100    (50+30)/2 = 40    (40+100)/2 = 70
(100+80)/2 = 90     (120+60)/2 = 90   (60+60)/2 = 60
Image with rows expanded:
b. Zero-Order Hold
The original image size is 3x3.
After enlargement, the size will be 6x6
The first-order hold can also be implemented with the following convolution mask:
[ 1/4  1/2  1/4 ]
[ 1/2   1   1/2 ]
[ 1/4  1/2  1/4 ]
In this process, the output image must be put in a separate image array, called a buffer, so that the existing values are not overwritten during the convolution process.
If we call the convolution mask M(r, c) and the image I(r, c), the convolution is a weighted sum of the mask coefficients and the underlying image values; one common form of the equation is:

    g(r, c) = Σx Σy M(x, y) · I(r − x, c − y)

where the sums range over the mask coordinates.
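A minimal sketch of this masked, buffered convolution (Python with numpy; the loop bounds skip the outer rows and columns, and the symmetric masks used in these notes make the mask flip of true convolution immaterial):

    import numpy as np

    def convolve(image, mask):
        image = np.asarray(image, dtype=float)
        mask = np.asarray(mask, dtype=float)
        k = mask.shape[0] // 2
        out = np.zeros_like(image)      # separate buffer, never overwritten
        for r in range(k, image.shape[0] - k):
            for c in range(k, image.shape[1] - k):
                region = image[r - k:r + k + 1, c - k:c + k + 1]
                # Weighted sum of the pixel and its neighbors.
                out[r, c] = np.sum(region * mask)
        return out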
Example: Enlarge the following sub-image to three times its original size using the K-times zooming method.
Solution:
The size of the resultant image will be K(N−1)+1, where K = 3.
So the 2×3 sub-image becomes 3(2−1)+1 = 4 rows and 3(3−1)+1 = 7 columns (4 × 7).
For the columns (pair by pair):
1: 160−130 = 30; K = 3, so 30/3 = 10.
   K−1 = 3−1 = 2 values are inserted: 130+10×1 = 140, 130+10×2 = 150.
2: 190−160 = 30; 30/3 = 10: 160+10×1 = 170, 160+10×2 = 180.
3: 190−160 = 30; 30/3 = 10: 160+10×1 = 170, 160+10×2 = 180.
4: 220−190 = 30; 30/3 = 10: 190+10×1 = 200, 190+10×2 = 210.
The same procedure is repeated for the rows. The resultant image is:
Example
If you have a sub-image of size 2 × 2, zoom the sub-image to twice (2 times) its size using the zero-order hold method.

I = [ 1  2 ]
    [ 3  4 ]

Row-wise zooming:
I' = [ 1  1  2  2 ]
     [ 3  3  4  4 ]

Column-wise zooming:
I'' = [ 1  1  2  2 ]
      [ 1  1  2  2 ]
      [ 3  3  4  4 ]
      [ 3  3  4  4 ]
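A sketch of zero-order hold in code (Python with numpy; np.repeat replicates each pixel k times along an axis, exactly matching the row-wise then column-wise steps above):

    import numpy as np

    def zero_order_hold(img, k=2):
        # Replicate each pixel k times along rows, then along columns.
        img = np.asarray(img)
        return np.repeat(np.repeat(img, k, axis=0), k, axis=1)

    print(zero_order_hold([[1, 2], [3, 4]]))
    # [[1 1 2 2]
    #  [1 1 2 2]
    #  [3 3 4 4]
    #  [3 3 4 4]]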
Example
Enlarge the following sub-image to three times its original size using the general (K-times) zooming method.

I = [ 15  30  15 ]
    [ 30  15  30 ]   (2×3)

Solution:
K = 3, K−1 = 2.
Row-wise zooming (see the sketch after the tables below):
- Take the first two adjacent pixels, which are 15 and 30.
- Subtract 15 from 30: 30 − 15 = 15.
- Divide 15 by K: 15/3 = 5; call this OP.
- Add OP (5) to the lower number: 15 + 5 = 20.
- Add OP (5) to 20 again: 20 + 5 = 25.
Now repeat this step for the next two adjacent pixels; the result is shown in the first table.
After inserting the values, you have to sort the inserted values in ascending order, so there remains a symmetry between them, as shown in Table 2.
Table 1 (values inserted):
15 20 25 30 20 25 15
30 20 25 15 20 25 30

Table 2 (inserted values sorted):
15 20 25 30 25 20 15
30 25 20 15 20 25 30

After repeating the same procedure column-wise, the final 4×7 image is:
15 20 25 30 25 20 15
20 21 21 25 21 21 20
25 22 22 20 22 22 25
30 25 20 15 20 25 30
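A sketch of the K-times method for a single row (Python; the helper name is illustrative, and integer truncation follows the worked example; stepping downward when the second pixel is smaller makes the inserted values come out already sorted):

    def k_times_row(row, k):
        out = [row[0]]
        for a, b in zip(row, row[1:]):
            step = abs(b - a) // k          # "OP" in the example above
            sign = 1 if b >= a else -1
            for i in range(1, k):           # insert K-1 values per pair
                out.append(a + sign * step * i)
            out.append(b)
        return out

    print(k_times_row([15, 30, 15], 3))     # [15, 20, 25, 30, 25, 20, 15]

Column-wise zooming applies the same helper to each column of the row-zoomed image.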
Example: A white square ANDed with an image will allow only the portion of the image coincident with the square to appear in the output image, with the background turned black; and a black square ORed with an image will allow only the part of the image corresponding to the black square to appear in the output image, but will turn the rest of the image white.
AND (∧)
This operation can be used to find the similarity between the white regions of two different images (it requires two images):
g(x, y) = a(x, y) ∧ b(x, y)
Example: A logical AND is performed on two images; suppose the two corresponding pixel values are (111)10 in one image and (88)10 in the second image. The corresponding bit strings are shown below.
Exclusive OR (XOR)
This operator can be used to find the differences between the white regions of two different images (it requires two images):
g(x, y) = a(x, y) ⊕ b(x, y)
NOT
This operator can be performed on gray-level images. It is applied to only one image, and the result of the operation is the negative of the original image:
g(x, y) = 255 − f(x, y)
Continuing the AND example above, the corresponding bit strings are:
(111)10     = 01101111
AND (88)10  = 01011000
Result:       01001000 = (72)10
Example: Find the output image for the AND logical operation between the following sub-images (all values decimal):

A = [ 50  11 ]      B = [ 10   60 ]
    [ 18  22 ]          [ 33  130 ]
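A sketch of these logical operations on gray-level data (Python with numpy; numpy's bitwise operators act on the corresponding pixels' bit strings, as in the (111) AND (88) example above):

    import numpy as np

    A = np.array([[50, 11], [18, 22]], dtype=np.uint8)
    B = np.array([[10, 60], [33, 130]], dtype=np.uint8)

    and_img = A & B        # bitwise AND, e.g. 50 & 10 = 2
    or_img = A | B         # bitwise OR
    xor_img = A ^ B        # exclusive OR
    not_img = 255 - A      # NOT: the negative of the image

    print(and_img)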
2.5 Image Restoration: Image restoration methods are used to improve the appearance of an image by application of a restoration process that uses a mathematical model of the image degradation.
In a typical image, the noise can be modeled with one of the following distributions:
1. Gaussian ("normal") distribution.
2. Uniform distribution.
3. Salt-and-pepper distribution.
Figure: Image Noise.
2.5.2. Noise Removal using Spatial Filters:
Spatial filtering is typically done to:
1. Remove various types of noise in digital images.
2. Perform some type of image enhancement.
[These filters are called spatial filters to distinguish them from frequency domain filters.]
The three types of filters are:
1. Mean filters.
2. Median filters (order filters).
3. Enhancement filters.
Mean and median filters are used primarily to conceal or remove noise, although they may also be used for special applications. For instance, a mean filter adds a "softer" look to an image, while an enhancement filter highlights edges and details within the image.
Spatial filters are implemented with convolution masks. Because a convolution mask operation provides a result that is a weighted sum of the values of a pixel and its neighbors, it is called a linear filter. The overall effect of a convolution mask can be predicted from its general pattern. For example:
• If the coefficients of the mask sum to one, the average brightness of the image will be retained.
• If the coefficients of the mask sum to zero, the average brightness will be lost and a dark image will be returned.
• If the coefficients of the mask alternate between positive and negative, the mask is a filter that returns edge information only.
• If the coefficients of the mask are all positive, it is a filter that will blur the image.
The mean filters are essentially averaging filters. They operate on local groups of pixels called neighborhoods and replace the center pixel with an average of the pixels in this neighborhood. This replacement is done with a convolution mask such as the following 3×3 arithmetic mean filter (a smoothing or low-pass filter):

[ 1/9  1/9  1/9 ]
[ 1/9  1/9  1/9 ]
[ 1/9  1/9  1/9 ]
Note that the coefficients of this mask sum to one, so the image brightness will be retained, and the coefficients are all positive, so it will tend to blur the image. This type of mean filter smooths out local variations within an image, so it is essentially a low-pass filter. A low-pass filter can therefore be used to attenuate image noise that is composed primarily of high-frequency components.
The median filter is a nonlinear filter (order filter). These filters are based on a specific type of image statistics called order statistics. Typically, these filters operate on a small sub-image, or "window", and replace the center pixel value (similar to the convolution process). Order statistics is a technique that arranges all the pixels in sequential order: given an N×N window W, the pixel values can be ordered from smallest to largest,

    I1 ≤ I2 ≤ I3 ≤ … ≤ IN

where I1, I2, I3, …, IN are the intensity values of the subset of pixels in the image.
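A minimal sketch of a median (order) filter built directly on this definition (Python with numpy; a 3×3 window is assumed, and the outer rows and columns are left unchanged, as noted below):

    import numpy as np

    def median_filter(image, size=3):
        image = np.asarray(image)
        k = size // 2
        out = image.copy()
        for r in range(k, image.shape[0] - k):
            for c in range(k, image.shape[1] - k):
                window = image[r - k:r + k + 1, c - k:c + k + 1]
                ordered = np.sort(window, axis=None)    # order statistics
                out[r, c] = ordered[ordered.size // 2]  # middle value
        return out

Selecting ordered[0] or ordered[-1] instead of the middle value gives the minimum and maximum filters described below.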
Example: apply a 3×3 median filter to the following window:

[ 5  5  6 ]
[ 3  4  5 ]
[ 3  4  7 ]

Solution:
1. Sort the values in order of size: (3, 3, 4, 4, 5, 5, 5, 6, 7).
2. Select the middle value, in this case 5.
3. The middle value 5 is then placed in the center location.
Note: the outer rows and columns are not replaced. In practice, these "wasted" rows and columns are often filled with zeros (or cropped off the image). For example, with a 3×3 mask we lose one outer row and column; with a 5×5 mask we lose two rows and columns.
The maximum and minimum filters are two order filters that can be used for elimination of salt-and-pepper noise.
The maximum filter selects the largest value within an ordered window of pixel values and replaces the central pixel with the lightest one.
The minimum filter selects the smallest value within an ordered window of pixel values and replaces the central pixel with the darkest one in the ordered window.
The minimum filters work best for salt-type noise, and the maximum filters work best for pepper-type noise.
More generally, order filters can be defined to select any specific pixel rank within the ordered set. For example, we may find for a certain type of pepper noise that selecting the second-highest value works better than selecting the maximum value.
2.6. Image Quantization
Image quantization is the process of reducing the image data by removing some of the detail information, by mapping groups of data points to a single point. This can be done in two ways:
1. Gray-level reduction (reducing the pixel values I(r, c)).
2. Spatial reduction (reducing the spatial coordinates (r, c)).
The simplest method of gray-level reduction is thresholding.
The Procedure:
- We select a threshold gray level,
- and set everything above that value equal to "1" and everything below the threshold equal to "0".
NOTE: This effectively turns a gray-level image into a binary (two-level) image.
Application: thresholding is often used as a preprocessing step in the extraction of object features such as shape, area, or perimeter. More generally, gray-level reduction is the process of taking the data and reducing the number of bits per pixel. This can be done very efficiently by masking the lower bits via an AND operation; the number of bits that are masked determines the number of gray levels available.
Example:
We want to reduce 8-bit data containing 256 possible gray-level values down to 32 possible gray-level values.
This can be done by ANDing each 8-bit value with the bit string 11111000. This is equivalent to dividing by eight (2³), corresponding to the lower three bits that we are masking, and then shifting the result left three times. Gray levels 0-7 in the image are mapped to 0, gray levels in the range 8-15 are mapped to 8, and so on.
We can see that by masking the lower three bits we reduce 256 gray levels to 32 gray levels: 256 ÷ 8 = 32.
The general case requires us to mask k bits, where 2^k is divided into the original gray-level range to get the desired quantized range. Using this method, we can reduce the number of gray levels to any power of 2: 2, 4, 8, 16, 32, 64, or 128.
• For example, to quantize to 128 gray levels we AND each 8-bit value with the bit string 11111110 (masking k = 1 bit, 2^1 = 2).
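A sketch of this masking operation in code (Python with numpy; the mask is built from k, the number of low bits to drop, so k = 3 gives 11111000 and 32 gray levels):

    import numpy as np

    def quantize(image, k):
        # Keep the top (8 - k) bits: 2**(8 - k) gray levels remain.
        mask = 0xFF & ~((1 << k) - 1)       # k = 3 -> 11111000
        return np.asarray(image, dtype=np.uint8) & mask

    # quantize(img, 3): gray levels 0-7 -> 0, 8-15 -> 8, ... (32 levels)
    # quantize(img, 1): mask 11111110, leaving 128 levels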
The vertical axis represents brightness, and the horizontal axis shows the spatial coordinate.
A rapid change in brightness characterizes an ideal edge. In figure (b) we see the representation of a real edge, which changes gradually. This gradual change is a minor form of blurring caused by:
• imaging devices,
• the lenses,
• or the lighting,
and it is typical of real-world (as opposed to computer-generated) images.
2.7.1. Edge Detection Masks
1- Sobel Operator: The Sobel edge detection masks look for edges in both the horizontal and vertical directions and then combine this information into a single metric. The two masks are as follows:

Row mask:              Column mask:
[ -1  -2  -1 ]         [ -1  0  1 ]
[  0   0   0 ]         [ -2  0  2 ]
[  1   2   1 ]         [ -1  0  1 ]
The Procedure:
These masks are each convolved with the image. At each pixel location we now have two numbers: S1, corresponding to the result from the row mask, and S2, from the column mask. S1 and S2 are used to compute the edge metric:
    Edge Magnitude (EM) = √(S1² + S2²)
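A sketch of the whole Sobel procedure (Python with numpy; both masks are applied at each interior pixel and combined into the magnitude, with the border left at zero):

    import numpy as np

    ROW_MASK = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])
    COL_MASK = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])

    def sobel_magnitude(image):
        image = np.asarray(image, dtype=float)
        out = np.zeros_like(image)
        for r in range(1, image.shape[0] - 1):
            for c in range(1, image.shape[1] - 1):
                region = image[r - 1:r + 2, c - 1:c + 2]
                s1 = np.sum(region * ROW_MASK)   # row mask response
                s2 = np.sum(region * COL_MASK)   # column mask response
                out[r, c] = np.hypot(s1, s2)     # sqrt(S1^2 + S2^2)
        return out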
Example: Determine the edges of the following sub-image using the Sobel edge detector.

[ 1  2  2  1 ]
[ 1  1  0  3 ]
[ 2  4  1  5 ]
[ 2  1  2  0 ]

Row mask:              Column mask:
[ -1  -2  -1 ]         [ -1  0  1 ]
[  0   0   0 ]         [ -2  0  2 ]
[  1   2   1 ]         [ -1  0  1 ]

Solution:
The Procedure:
These masks are each convolved with the image. At each pixel location we find two numbers: P1, corresponding to the result from the row mask, and P2, from the column mask. P1 and P2 are used to determine two metrics, the edge magnitude and the edge direction (the angle of orientation of the edge), defined analogously to the Sobel case:

    Edge Magnitude = √(P1² + P2²),  Edge Direction = tan⁻¹(P1/P2)
3- Kirsch Compass Masks: the Kirsch edge detection masks are called compass masks because they are defined by taking a single mask and rotating it to the eight major compass orientations: north, northeast, east, southeast, south, southwest, west, and northwest. The masks are defined as follows:

K1:              K2:              K3:
[ -3  -3   5 ]   [ -3   5   5 ]   [  5   5   5 ]
[ -3   0   5 ]   [ -3   0   5 ]   [ -3   0  -3 ]
[ -3  -3   5 ]   [ -3  -3  -3 ]   [ -3  -3  -3 ]

K4:              K5:              K6:
[  5   5  -3 ]   [  5  -3  -3 ]   [ -3  -3  -3 ]
[  5   0  -3 ]   [  5   0  -3 ]   [  5   0  -3 ]
[ -3  -3  -3 ]   [  5  -3  -3 ]   [  5   5  -3 ]

K7:              K8:
[ -3  -3  -3 ]   [ -3  -3  -3 ]
[ -3   0  -3 ]   [ -3   0   5 ]
[  5   5   5 ]   [ -3   5   5 ]
These masks differ from the Laplacian type previously described in that the center coefficients have been decreased by one. If we are only interested in edge information, the sum of the coefficients should be zero; if we want to retain most of the underlying information, the coefficients should sum to a number greater than zero. Consider an extreme example in which the center coefficient value will depend most
subtracts each of the pixels next to the center of the n×n area (where n is usually 3) from the center pixel. The result is the maximum of the absolute values of these subtractions. Subtraction in a homogeneous region produces zero, indicating an absence of edges; a high maximum of the subtractions indicates an edge. This is a quick operator, since it performs only subtraction (eight operations per pixel) and no multiplication. This operator then requires thresholding.
This operator finds the absolute value of the difference between opposite pixels: the upper left minus the lower right, the upper right minus the lower left, left minus right, and top minus bottom. The result is the maximum absolute value. As in the homogeneity case, this operator requires thresholding. However, it is quicker than the homogeneity operator, since it uses four integer subtractions per pixel as against eight subtractions in the homogeneity operator.
Example: Shown below is how the two operators detect the edge. Consider an image block with center pixel intensity 5:

[ 1  2  3 ]
[ 4  5  6 ]
[ 7  8  9 ]

The output of the homogeneity operator is:
max{ |5−1|, |5−2|, |5−3|, |5−4|, |5−6|, |5−7|, |5−8|, |5−9| } = 4
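A sketch of the homogeneity operator with its thresholding step (Python with numpy; the threshold value is an assumption to be tuned per image):

    import numpy as np

    def homogeneity(image, threshold=0):
        image = np.asarray(image, dtype=int)
        out = np.zeros_like(image)
        for r in range(1, image.shape[0] - 1):
            for c in range(1, image.shape[1] - 1):
                window = image[r - 1:r + 2, c - 1:c + 2]
                # Differences with the eight neighbors (the center term is 0).
                diff = np.abs(window - image[r, c]).max()
                out[r, c] = diff if diff >= threshold else 0
        return out

    print(homogeneity([[1, 2, 3], [4, 5, 6], [7, 8, 9]])[1, 1])   # 4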
3.2. Noise
Noise is any undesired information that contaminates an image. Noise appears in images from a variety of sources. The digital image acquisition process, which converts an optical image into a continuous electrical signal that is then sampled, is the primary process by which noise appears in digital images.
At every step in the process there are fluctuations caused by natural phenomena that add a random value to the exact brightness value for a given pixel. In a typical image, the noise can be modeled with one of the following distributions:
1. Gaussian ("normal") distribution.
2. Uniform distribution.
3. Salt-and-pepper distribution.
Note that the coefficients of this mask sum to one, so the image brightness will be retained, and the coefficients are all positive, so it will tend to blur the image. This type of mean filter smooths out local variations within an image.
4- Mean filter
g(1,1) = 1/9 (100+30+20+15+130+220+30+200+40) = 87.2 ≈ 87
g(1,2) = 1/9 (30+20+40+130+220+30+200+40+120) = 92.2 ≈ 92
g(1,3) = 1/9 (20+40+15+220+30+40+40+120+20) = 60.5 ≈ 61
g(2,1) = 1/9 (15+130+220+30+200+40+15+50+20) = 80
g(2,2) = 1/9 (130+220+30+200+40+120+50+20+60) = 96.6 ≈ 97
g(2,3) = 1/9 (220+30+40+40+120+20+20+60+50) = 66.6 ≈ 67
Solution:
g(1,1) = (100×0)+(30×−1)+(20×0)+(15×−1)+(130×5)+(220×−1)+(30×0)+(200×−1)+(40×0) = 185
g(1,2) = (30×0)+(20×−1)+(40×0)+(130×−1)+(220×5)+(30×−1)+(200×0)+(40×−1)+(120×0) = 880 → 255
g(1,3) = (20×0)+(40×−1)+(15×0)+(220×−1)+(30×5)+(40×−1)+(40×0)+(120×−1)+(20×0) = −270 → 0
g(2,1) = (15×0)+(130×−1)+(220×0)+(30×−1)+(200×5)+(40×−1)+(15×0)+(50×−1)+(20×0) = 750 → 255
g(2,2) = (130×0)+(220×−1)+(30×0)+(200×−1)+(40×5)+(120×−1)+(50×0)+(20×−1)+(60×0) = −360 → 0
g(2,3) = (220×0)+(30×−1)+(40×0)+(40×−1)+(120×5)+(20×−1)+(20×0)+(60×−1)+(50×0) = 450 → 255

For the second filter mask:
g(1,1) = (100×0)+(30×1)+(20×0)+(15×0)+(130×1)+(220×0)+(30×0)+(200×−1)+(40×0) = −40 → 0
g(1,2) = (30×0)+(20×−1)+(40×0)+(130×−1)+(220×5)+(30×−1)+(200×0)+(40×−1)+(120×0) = 880 → 255
g(1,3) = (20×0)+(40×−1)+(15×0)+(220×−1)+(30×5)+(40×−1)+(40×0)+(120×−1)+(20×0) = −200 → 0
g(2,1) = (15×0)+(130×−1)+(220×0)+(30×−1)+(200×5)+(40×−1)+(15×0)+(50×−1)+(20×0) = −380 → 0
g(2,2) = (130×0)+(220×−1)+(30×0)+(200×−1)+(40×5)+(120×−1)+(50×0)+(20×−1)+(60×0) = 700 → 0
g(2,3) = (220×0)+(30×−1)+(40×0)+(40×−1)+(120×5)+(20×−1)+(20×0)+(61×−1)+(50×0) = 499 → 255

(Results above 255 are clipped to 255, and negative results are set to 0.)
EXAMPLE (4-1):
The original image is 256 × 256 pixels, single-band (gray-scale), 8 bits per pixel. This file is 65,536 bytes (64K). After compression, the image file is 6,554 bytes.
The compression ratio is:
C = 65,536/6,554 = 9.999 ≈ 10, which can also be written as 10:1.
This is called a "10 to 1 compression" or a "10 times compression," or it can be stated as "compressing the image to 1/10 its original size."
Another way to state the compression is to use the terminology of bits per pixel: for an N × N image, bits per pixel = (number of bits in the compressed file) / N².
EXAMPLE (4-2):
Using the preceding example, with a compression ratio of 65,536/6,554 bytes, we want to express this as bits per pixel. This is done as follows:
- First: find the number of pixels in the image: 256 × 256 = 65,536 pixels.
- Second: find the number of bits in the compressed image file: (6,554 bytes) × (8 bits/byte) = 52,432 bits.
- Third: find the bits per pixel by taking the ratio: 52,432/65,536 = 0.8 bits/pixel.
The amount of data required for digital images is enormous. For example, a single 512 × 512, 8-bit image requires 2,097,152 bits for storage. If we wanted to transmit this image over the World Wide Web, it would probably take minutes for transmission, which is too long for most people to wait.
EXAMPLE (4-3)
To transmit an RGB (color) 512 × 512, 24-bit (8 bits/pixel/color) image via modem at 28.8 kbaud (kilobits/second) would take about:
(512 × 512 × 24) / 28,800 = 6,291,456 / 28,800 ≈ 218 seconds ≈ 3.6 minutes.
EXAMPLE (4-4)
To transmit a digitized color 35mm slide scanned at 3,000 × 2,000 pixels and 24 bits, at 28.8 kbaud, would take about:
(3,000 × 2,000 × 24) / 28,800 = 144,000,000 / 28,800 = 5,000 seconds ≈ 83 minutes.
2- Lossy methods: these are called lossy because they allow a loss in the actual image data, so the original uncompressed image cannot be recreated exactly from the compressed file. For complex images these techniques can achieve compression ratios of 100 or 200 and still retain high-quality visual information; for simple images, or where lower-quality results are acceptable, even higher ratios can be attained.
Compression algorithms are developed by taking advantage of the redundancy that is inherent in image data. Three primary types of redundancy can be found in images:
1) coding redundancy,
2) interpixel redundancy,
3) psychovisual redundancy.
EXAMPLE (4-5):
Let L = 8, meaning that there are 3 bits/pixel in the original image. Now, let's say that the number of pixels at each gray-level value is equal (they have the same probability), that is:
p0 = p1 = … = p7 = 1/8
Now, we can calculate the entropy as follows:
Entropy = −Σ pi log2(pi) = −8 × (1/8) × log2(1/8) = 3 bits/pixel
This tells us that the theoretical minimum for lossless coding for this image is 3 bits/pixel. In other words, there is no code that will provide better results than the one currently used (called the natural code, since 000 = 0, 001 = 1, 010 = 2, …, 111 = 7). This example illustrates that the image with the most random distribution of gray levels, a uniform distribution, has the highest entropy.
EXAMPLE (4-6):
Let L = 8, so we have a natural code with 3 bits/pixel in the original image. Now let's say that the entire image has a gray level of 2, so
p2 = 1;  p0 = p1 = p3 = p4 = p5 = p6 = p7 = 0
And the entropy is:
Entropy = −1 × log2(1) = 0 bits/pixel
This tells us that the theoretical minimum for coding this image is 0 bits/pixel, because the gray-level value is known to be 2. To code the entire image, we need only one value. This is called the certain event; it has a probability of 1.
The two preceding examples illustrate the range of the entropy:
0 ≤ Entropy ≤ log2(L)
The examples also illustrate the information theory perspective on information and randomness. The more randomness that exists in an image, the more evenly distributed the gray levels, and the more bits per pixel are required to represent the data.
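A sketch of the entropy computation behind both examples (Python; probabilities with p = 0 contribute nothing to the sum):

    import math

    def entropy(probabilities):
        # Entropy = -sum(p * log2(p)) over gray levels with p > 0.
        return -sum(p * math.log2(p) for p in probabilities if p > 0)

    print(entropy([1 / 8] * 8))                   # 3.0 bits/pixel (Example 4-5)
    print(entropy([0, 0, 1.0, 0, 0, 0, 0, 0]))    # 0.0 bits/pixel (Example 4-6)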
EXAMPLE (4-7):
We have an image with 2 bits/pixel, giving four possible gray levels. The image is 10 rows by 10 columns.
Step 1: we find the histogram for the image. This is shown in Figure (5.2-a), where we see that gray level 0 has 20 pixels, gray level 1 has 30 pixels, gray level 2 has 10 pixels, and gray level 3 has 40 pixels. These counts are converted into probabilities by normalizing to the total number of pixels in the image.
Step 2: the probabilities are ordered as in Figure (5.2-b).
Step 3: we combine the smallest two by addition.
Step 4: repeat steps 2 and 3, where we reorder (if necessary) and add the two smallest probabilities, as in Figure (5.2-d). This step is repeated until only two values remain.
is assigned to the 0.6 branch, and a 1 to the 0.4 branch. In Figure (5.3-b) the assigned 0 and 1 are brought back along the tree, and wherever a branch occurs the code is put on both branches. Now (Figure 5.3-c) we assign the 0 and 1 to the branches labeled 0.3, appending to the existing code. Finally (Figure 5.3-d), the codes are brought back one more level, and where the branch splits another assignment of 0 and 1 occurs (at the 0.1 and 0.2 branch). Now we have the Huffman code for this image, as shown in Table (4.2-1).
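A sketch of the same merge-the-two-smallest construction in code (Python; it builds codes by prepending one bit per merge rather than drawing the tree, and uses the Example 4-7 probabilities):

    import heapq

    def huffman_codes(probs):
        # Heap of (probability, [symbols in this subtree]).
        heap = [(p, [sym]) for sym, p in probs.items()]
        codes = {sym: "" for sym in probs}
        heapq.heapify(heap)
        while len(heap) > 1:
            p1, syms1 = heapq.heappop(heap)    # two smallest probabilities
            p2, syms2 = heapq.heappop(heap)
            for s in syms1:                    # one branch gets a 0 bit,
                codes[s] = "0" + codes[s]
            for s in syms2:                    # the other a 1 bit
                codes[s] = "1" + codes[s]
            heapq.heappush(heap, (p1 + p2, syms1 + syms2))
        return codes

    print(huffman_codes({0: 0.2, 1: 0.3, 2: 0.1, 3: 0.4}))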
EXAMPLE (4-9): Find the Huffman codes for the following symbols with the following probabilities.
With only two levels (0 and 1), all that is required in coding is to send the length of each run of identical pixels.
- The first step is to define the required parameters. We can use either horizontal RLC, counting along the rows, or vertical RLC, counting along the columns. In basic horizontal RLC the number of bits used for the coding depends on the number of pixels in a row: if the row has 2^n pixels, then the required number of bits is n, so that a run the length of the entire row can be coded.
EXAMPLE (4-10)
A 256 × 256 image requires 8 bits, since 2^8 = 256.
EXAMPLE (4-11)
A 512 × 512 image requires 9 bits, since 2^9 = 512.
- The next step is to define a convention for the first RLC number in a row: does it represent a run of 0's or 1's? Defining the convention for the first RLC number to represent 0's, we can look at the following example.
EXAMPLE (4-12)
The image is an 8 × 8 binary image, which requires 3 bits for each run-length coded word. We apply horizontal RLC to this image:
Note that in the second and seventh rows the first RLC number is 0, since we are using the convention that the first number corresponds to the number of zeros in a run.
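A sketch of horizontal RLC for one row under this convention (Python; a row beginning with 1's therefore starts with a zero-length run):

    def rlc_row(row):
        runs, current, count = [], 0, 0    # first number counts 0's
        for pixel in row:
            if pixel == current:
                count += 1
            else:
                runs.append(count)
                current, count = pixel, 1
        runs.append(count)
        return runs

    print(rlc_row([1, 1, 0, 0, 0, 1, 1, 1]))   # [0, 2, 3, 3]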
Gray-level run-length code: In this method, two parameters are used to characterize the run. The pair (G, L) corresponds to the gray-level value G and the run length L. The method starts at the first pixel in the upper left corner, looks at that pixel and its following neighbors along the line, and determines how many following pixels have the same brightness as the first one. According to that, a new (G, L) code pair is produced.
example, the color (0, 0, 255) is a pure blue tone. Its complementary color is (255−0, 255−0, 255−255), or (255, 255, 0), which is a pure yellow tone. Blue and yellow are complementary colors, and they are mapped to opposite corners of the cube. The same is true for red and cyan, green and magenta, and black and white. Adding a color to its complement gives white. Notice that the components of the colors at the corners of the cube have either zero or full intensity. As we move from one corner to another along the same edge of the cube, only one of the components changes value. For example, as we move from the green to the yellow corner, the red component changes from 0 to 255 while the other two components remain the same; along the way we get all the available tones from green to yellow (256 in all). Similarly, as one moves from the yellow to the red corner, the only component that changes is the green, and one gets all the available shades from yellow to red. This range of similar colors is called a gradient.
Although we can specify more than 16 million colors, we can't have more than 256 shades of gray. The reason is that a gray tone, including the two extremes (black and white), is made up of equal values of all three primary colors. This can be seen from the RGB cube as well: gray shades lie on the cube's diagonal that goes from black to white. As we move along this path, all three basic components change value, but they are always equal. The value (128, 128, 128) is a mid-gray tone, but the value (129, 128, 128) isn't a gray tone, although it is too close for the human eye to distinguish. That's why it is wasteful to store grayscale pictures using 16-million-color true-color file formats; a 256-color file format stores a grayscale image just as well.
The HSV image may be computed from an RGB image using different transformations. The simplest form of the HSV transformation is:

    H = tan⁻¹[ √3 (G − B) / ((R − G) + (R − B)) ]
    S = 1 − min(R, G, B)/V
    V = (R + G + B)/3

However, the hue (H) becomes undefined when the saturation S = 0.
The most popular form of the HSV transformation is shown next, where the r, g, b values are first obtained by normalizing each pixel such that:
    r = R/(R+G+B),  g = G/(R+G+B),  b = B/(R+G+B)
Accordingly, the H, S, and V values can be computed as:
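(The detailed formulas for the popular form fall on a page missing from this copy.) As a sketch, the simple form quoted above can be implemented as follows (Python; atan2 is used so the hue stays defined in every quadrant, and the function name is illustrative):

    import math

    def rgb_to_hsv_simple(R, G, B):
        V = (R + G + B) / 3
        S = 1 - min(R, G, B) / V if V > 0 else 0
        # H is undefined when S == 0 (a pure gray has no hue).
        H = math.degrees(math.atan2(math.sqrt(3) * (G - B),
                                    (R - G) + (R - B)))
        return H, S, V

    print(rgb_to_hsv_simple(245, 110, 20))   # the "deep, bright orange" of Chapter 1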
mapped data from one color space to another color space with a one-to-one correspondence between a pixel in the input and the output.
Here, by contrast, we are mapping the image data from the spatial (time) domain to the spectral (frequency) domain, where all the pixels in the input (spatial domain) contribute to each value in the output (frequency domain), as illustrated in figures (6.3) and (6.4) below:
Figure (5.4): All pixels in the input image contribute to each value in the output image, whereas a color transform uses a single-pixel to single-pixel mapping.
The Fourier transform is the most well-known and the most widely used transform. It was developed by Jean Baptiste Joseph Fourier (1768-1830) to explain the distribution of temperature and heat conduction. Since that time the Fourier transform has found numerous uses, including vibration analysis in mechanical engineering, circuit analysis in electrical engineering, and here in computer imaging. In mathematics, a Fourier series decomposes periodic functions or periodic signals into the sum of a (possibly infinite) set of simple oscillating functions, namely sines and cosines (or complex exponentials).
This transform allows for the decomposition of an image into a weighted sum of 2-D sinusoidal terms. Assuming an N×N image, the equation for the 2-D discrete Fourier transform is:

    F(u, v) = (1/N) Σr Σc I(r, c) e^(−j2π(ur + vc)/N)

where the sums range over r, c = 0, 1, …, N−1.
In this case, F(u, v) is also complex, with the real part corresponding to the cosine terms and the imaginary part corresponding to the sine terms.
The magnitude of a sinusoid is simply its peak value, while the phase data contain information about where objects are in an image. After we perform the transform, if we want to get our original image back, we need to apply the inverse transform. The inverse Fourier transform is given by:

    I(r, c) = (1/N) Σu Σv F(u, v) e^(j2π(ur + vc)/N)
where F(u, v) is the transform data and I(r, c) is the recovered image.
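A sketch of the transform pair in code (Python with numpy; numpy's fft2 carries no normalization and ifft2 divides by N², so both are rescaled here to match the symmetric 1/N convention used above):

    import numpy as np

    def dft2(image):
        image = np.asarray(image, dtype=float)
        return np.fft.fft2(image) / image.shape[0]      # forward, 1/N factor

    def idft2(F):
        return np.real(np.fft.ifft2(F) * F.shape[0])    # inverse, 1/N overall

    img = np.random.rand(8, 8)
    F = dft2(img)
    magnitude, phase = np.abs(F), np.angle(F)   # peak value and phase
    assert np.allclose(idft2(F), img)           # the transform is invertible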
Haar Wavelet:
The Haar transform works mainly on pairs of data items from the signal being processed and executes two calculation steps. The first step is:

    Li = 1/√2 (X2i + X2i+1)

which results in an approximation (average) of the two data items. The second is:

    Hi = 1/√2 (X2i − X2i+1)

which results in the detail (difference) of the two data items, where:
X is the signal data, with length N;
L is the low sub-band, with length N/2;
H is the high sub-band, with length N/2;
i = 0, 1, …, N/2 − 1.
The inverse Haar transform formulas are:

    X2i = 1/√2 (Li + Hi)
    X2i+1 = 1/√2 (Li − Hi)
The Haar wavelet transform has a number of attractive features (see the sketch after this list):
- It is conceptually simple and fast.
- It is memory efficient, since it can be calculated in place without a temporary array.
- It is exactly reversible, without the edge effects that are a problem with other wavelet transforms.
Figure (6.6): example of applying 3 levels of the wavelet transform.
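A sketch of one level of the 1-D Haar transform and its inverse (Python; applying it to the rows and then the columns of an image yields the LL, HL, LH, HH sub-bands. Note that the worked example below uses the unscaled (a+b)/2 and (a−b)/2 variant rather than the 1/√2 scaling):

    import math

    def haar_level(x):
        s = 1 / math.sqrt(2)
        L = [s * (x[2 * i] + x[2 * i + 1]) for i in range(len(x) // 2)]  # averages
        H = [s * (x[2 * i] - x[2 * i + 1]) for i in range(len(x) // 2)]  # differences
        return L, H

    def haar_inverse(L, H):
        s = 1 / math.sqrt(2)
        x = []
        for l, h in zip(L, H):
            x += [s * (l + h), s * (l - h)]    # X_2i, X_2i+1
        return x

    L, H = haar_level([137, 134, 129, 131])
    assert [round(v) for v in haar_inverse(L, H)] == [137, 134, 129, 131]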
Solution:
- To find the low (L) part, we take each pair of adjacent points in a row, add them, and divide by 2. Note: we keep only the integer part of the result and drop the fraction.
(137+134)/2 = 135.5 ≈ 135     (129+131)/2 = 130
(135+141)/2 = 138             (133+132)/2 = 132.5 ≈ 132
(138+134)/2 = 136             (134+131)/2 = 132.5 ≈ 132
(135+129)/2 = 132             (133+131)/2 = 132
- To find the high (H) part, we take each pair of adjacent points, subtract them, and divide by 2.
(137−134)/2 = 1.5 ≈ 1         (129−131)/2 = −1
(135−141)/2 = −3              (133−132)/2 = 0.5 ≈ 0
(138−134)/2 = 2               (134−131)/2 = 1.5 ≈ 1
(135−129)/2 = 3               (133−131)/2 = 1
- Notice that the image has been split into two parts in this first stage. In the second stage we repeat the same operation on the resulting image to split it into four parts, but this time taking the pairs of points column-wise, as follows:

(LL)                          (HL)
(135+138)/2 = 136.5 ≈ 136     (1+(−3))/2 = −1
(136+132)/2 = 134             (2+3)/2 = 2.5 ≈ 2
(130+132)/2 = 131             (−1+0)/2 = −0.5 ≈ 0
(132+133)/2 = 132             (1+1)/2 = 1

(LH)                          (HH)
(135−138)/2 = −1.5 ≈ −1       (1−(−3))/2 = 2
(136−132)/2 = 2               (2−3)/2 = −0.5 ≈ 0
(130−132)/2 = −1              (−1−0)/2 = −0.5 ≈ 0
(132−133)/2 = 0               (1−1)/2 = 0