DIP 15EC72 Notes
Module – 5
Segmentation: Point, Line, and Edge Detection, Thresholding, Region-Based
Segmentation, Segmentation Using Morphological Watersheds.
Representation and Description: Representation, Boundary descriptors. (RBT Levels: L1, L2, L3)
[Text: Chapter 10: Sections 10.2, to 10.5 and Chapter 11: Sections 11.1 and 11.2]
Course Outcomes: At the end of the course students should be able to:
Understand image formation and the role the human visual system plays in the perception of
gray and color image data.
Apply image processing techniques in both the spatial and frequency (Fourier) domains.
Design image analysis techniques in the form of image segmentation and evaluate the
methodologies for segmentation.
Conduct independent study and analysis of Image Enhancement techniques.
Question paper pattern:
The question paper will have ten questions.
Each full question consists of 16 marks.
There will be 2 full questions (with a maximum of Three sub questions) from each
module.
Each full question will have sub questions covering all the topics under a module.
The students will have to answer 5 full questions, selecting one full question from each
module.
Text Book:
Digital Image Processing - Rafael C. Gonzalez and Richard E. Woods, PHI, 3rd Edition, 2010.
Reference Books:
1. Digital Image Processing- S.Jayaraman, S.Esakkirajan, T.Veerakumar, Tata McGraw Hill
2014.
2. Fundamentals of Digital Image Processing-A. K. Jain, Pearson 2004.
Module – 1 (RBT Levels: L1, L2)
Digital Image Fundamentals: What is Digital Image Processing?, Origins of Digital
Image Processing, Examples of Fields that Use DIP, Fundamental Steps in Digital
Image Processing, Components of an Image Processing System, Elements of Visual
Perception, Image Sensing and Acquisition, Image Sampling and Quantization, Some
Basic Relationships Between Pixels, Linear and Nonlinear Operations.
[Text: Digital Image Processing - Rafael C. Gonzalez and Richard E. Woods,
Chapter 1 and Chapter 2: Sections 2.1 to 2.5, 2.6.2]
An image may be defined as a two-dimensional function, f (x,y), where x and y are spatial
(plane) coordinates. The amplitude of f at any pair of coordinates (x,y) is called the intensity
or gray level of the image at that point.
When x, y, and f are all finite, discrete quantities, we call the image a digital image.
The field of digital image processing refers to processing digital images by means of a digital
computer.
A digital image is composed of a finite number of elements, each of which has a
location and value. These elements are called pixels.
Unlike humans, who are limited to the visual band of the electromagnetic (EM) spectrum,
imaging machines cover almost the entire EM spectrum, ranging from gamma rays to radio
waves.
There is no general agreement regarding where image processing stops and other
related areas, such as image analysis and computer vision, start.
Although there are no clear-cut boundaries in the continuum from image processing at
one end to computer vision at the other, one useful paradigm is to consider three types of
processes in this continuum:
A low-level process is characterized by the fact that both its inputs and outputs are
images.
A mid-level process is characterized by the fact that its inputs generally are images,
but its outputs are attributes extracted from those images.
The higher-level processes include object recognition, image analysis, and performing
the cognitive functions associated with vision.
One of the first applications of digital images was in the newspaper industry, when pictures
were first sent by submarine cable between London and New York.
Introduction of the Bartlane cable picture transmission system in the early 1920s reduced
the time to transport a picture across the Atlantic from more than one week to less than three
hours.
Examples of fields that use DIP
The history of digital image processing is tied to the development of the digital computer.
The first computers powerful enough to carry out meaningful image processing tasks
appeared in the early 1960s.
Other than the processing intended for human interpretation, another important area of
applications of digital image processing is in solving problems dealing with machine
perception.
Typical problems in machine perception that routinely utilize image processing
techniques are automatic character recognition, industrial machine vision, military
reconnaissance, processing of fingerprints, and many other tasks.
The continuing decline in the ratio of computer price to performance and the expansion of
networking and communication bandwidth via World Wide Web and the Internet have
created unprecedented opportunities for continued growth of digital image processing.
One of the simplest ways to develop a basic understanding of the extent of image
processing applications is to categorize images according to their source. The principal
energy source for images in use today is the electromagnetic (EM) energy spectrum.
Electromagnetic waves can be conceptualized as propagating sinusoidal waves of varying
wavelengths, or they can be thought of as a stream of massless particles traveling in a
wavelike pattern and moving at the speed of light. Each massless particle contains a certain
amount (or bundle) of energy.
Gamma-Ray Imaging
Major uses of imaging based on gamma rays
include nuclear medicine and astronomical
observations. Images are produced from
emissions collected by gamma ray detectors.
X-Ray Imaging
X-rays are among the oldest sources of EM radiation used for imaging.
Another major area of visual processing is remote sensing, which includes several
bands in the visual and infrared regions of the spectrum. Table 1.1 shows the so-called
thematic bands in NASA's LANDSAT satellite.
The primary function of LANDSAT is to obtain and transmit images of the Earth from
space for purposes of monitoring environmental conditions of the planet.
Figure 1.10 shows one image for each of the spectrum bands in Table 1.1.
Weather observation and prediction also are major applications of multi-spectrum imaging
from satellites.
Image acquisition is the first process shown in the figure. Note that acquisition could be as
simple as being given an image that is already in digital form. Generally, the image
acquisition stage involves preprocessing, such as scaling.
Image enhancement is among the simplest and most appealing areas of digital image
processing. Basically, the idea behind enhancement techniques is to bring out detail that is
obscured, or simply to highlight certain features of interest in an image. A familiar example
of enhancement is when we increase the contrast of an image because "it looks better." It is
important to keep in mind that enhancement is a very subjective area of image processing.
Image restoration is an area that also deals with improving the appearance of an image.
Morphological processing deals with tools for extracting image components that are
useful in the representation and description of shape.
Segmentation procedures partition an image into its constituent parts or objects. In
general, autonomous segmentation is one of the most difficult tasks in digital image
processing. A rugged segmentation procedure brings the process a long way toward
successful solution of imaging problems that require objects to be identified individually. On
the other hand, weak or erratic segmentation algorithms almost always guarantee eventual
failure. In general, the more accurate the segmentation, the more likely recognition is to
succeed.
Representation and description almost always follow the output of a segmentation
stage, which usually is raw pixel data, constituting either the boundary of a region (i.e., the
set of pixels separating one image region from another) or all the points in the region itself. In
either case, converting the data to a form suitable for computer processing is necessary. The
first decision that must be made is whether the data should be represented as a boundary or as
a complete region. Boundary representation is appropriate when the focus is on external
shape characteristics, such as corners and inflections. Regional representation is appropriate
when the focus is on internal properties, such as texture or skeletal shape. In some
applications, these representations complement each other. Choosing a representation is only
part of the solution for transforming raw data into a form suitable for subsequent computer
processing. A method must also be specified for describing the data so that features of
interest are highlighted. Description, also called feature selection, deals with extracting
attributes that result in some quantitative information of interest or are basic for
differentiating one class of objects from another.
Recognition is the process that assigns a label (e.g., "vehicle") to an object based on
its descriptors. We conclude our coverage of digital image processing with the development
of methods for recognition of individual objects.
As recently as the mid-1980s, numerous models of image processing systems being sold
throughout the world were rather substantial peripheral devices that attached to equally
substantial host computers. Late in the 1980s and early in the 1990s, the market shifted to image
processing hardware in the form of single boards designed to be compatible with industry
standard buses and to fit into engineering workstation cabinets and personal computers. In
addition to lowering costs, this market shift also served as a catalyst for a significant number of
new companies whose specialty is the development of software written specifically for image
processing.
Although large-scale image processing systems still are being sold for massive imaging
applications, such as processing of satellite images, the trend continues toward miniaturizing and
blending of general-purpose small computers with specialized image processing hardware. Figure
3 shows the basic components comprising a typical general-purpose system used for digital image
processing. The function of each component is discussed in the following paragraphs, starting
with image sensing.
With reference to sensing, two elements are required to acquire digital images. The first is
a physical device that is sensitive to the energy radiated by the object we wish to image. The
second, called a digitizer, is a device for converting the output of the physical sensing device into
digital form. For instance, in a digital video camera, the sensors produce an electrical output
proportional to light intensity. The digitizer converts these outputs to digital data.
Specialized image processing hardware usually consists of the digitizer just mentioned, plus
hardware that performs other primitive operations. This unit is sometimes called a
front-end subsystem, and its most distinguishing characteristic is speed. In other words, this
unit performs functions that require fast data throughputs (e.g., digitizing and averaging video
images at 30 frames/s) that the typical main computer cannot handle.
The computer in an image processing system is a general-purpose computer and can
range from a PC to a supercomputer. In dedicated applications, sometimes specially
designed computers are used to achieve a required level of performance, but our interest here
is on general-purpose image processing systems. In these systems, almost any well-equipped
PC-type machine is suitable for offline image processing tasks.
Software for image processing consists of specialized modules that perform specific
tasks. A well-designed package also includes the capability for the user to write code that, as
a minimum, utilizes the specialized modules. More sophisticated software packages allow the
integration of those modules and general-purpose software commands from at least one
computer language.
Image displays in use today are mainly color (preferably flat screen) TV monitors.
Monitors are driven by the outputs of image and graphics display cards that are an integral
part of the computer system. Seldom are there requirements for image display applications
that cannot be met by display cards available commercially as part of the computer system. In
some cases, it is necessary to have stereo displays, and these are implemented in the form of
headgear containing two small displays embedded in goggles worn by the user.
Hardcopy devices for recording images include laser printers, film cameras, heat-
sensitive devices, inkjet units, and digital units, such as optical and CD-ROM disks. Film
provides the highest possible resolution, but paper is the obvious medium of choice for
written material. For presentations, images are displayed on film transparencies or in a digital
medium if image projection equipment is used. The latter approach is gaining acceptance as
the standard for image presentations.
Networking is almost a default function in any computer system in use today.
Because of the large amount of data inherent in image processing applications, the key
consideration in image transmission is bandwidth. In dedicated networks, this typically is not
a problem, but communications with remote sites via the Internet are not always as efficient.
Fortunately, this situation is improving quickly as a result of optical fiber and other
broadband technologies.
• There is a distribution of discrete light receptors over the retina surface. There are two
types: cones and rods.
• Cones (6-7 million) are mainly around the central part called fovea and sensitive to
color
• Rods (75-150 million) are distributed wider and are sensitive to low illumination
levels
In an ordinary photographic camera, the converse of the eye is true: the lens has a fixed focal
length, and focusing at various distances is achieved by varying the distance between the lens
and the imaging plane (in the eye, the lens-to-retina distance is fixed and focusing is achieved
by varying the shape, and hence the focal length, of the lens).
Interesting Fact!!!
How do we obtain the dimension of the image formed on the retina?
Sol:
Suppose a person is looking at a tree 15 m high at a distance of 100 m, and let h denote the
height of the tree in the retinal image (the distance between the lens centre and the retina is
about 17 mm).
From the geometry of similar triangles,
15/100 = h/17
Therefore, h = 2.55 mm.
A small value of ΔI_c/I (the Weber ratio) means that a small percentage change in intensity is
discriminable, representing "good" brightness discrimination. A plot of log(ΔI_c/I) as a
function of log I has the general shape shown in Figure 2.6.
In Figure 2.8, all the center squares have exactly the same intensity, though they appear to
the eye to become darker as the background gets lighter.
Other examples of human perception phenomena are optical illusions, in which the eye fills
in non-existing information or wrongly perceives geometrical properties of objects.
In 1666, Sir Isaac Newton discovered that when a beam of sunlight is passed through a glass
prism, the emerging beam of light consists of a continuous spectrum of colors from violet
at one end to red at the other.
As Figure 2.10 shows, the range of colors we perceive in visible light represents a very
small portion of the electromagnetic spectrum. Wavelength λ and frequency ν are related by
the expression
λ = c/ν
where c is the speed of light (2.998 × 10^8 m/s). The energy of the various components of the
electromagnetic spectrum is given by E = hν, where h is Planck's constant.
The range of measured values of monochromatic light from black to white is usually
called the gray scale, and monochromatic images are frequently referred to as gray-scale
images.
Figure 2.12 shows the three principal sensor arrangements used to transform illumination
energy into digital images.
Incoming energy is transformed into a voltage by the combination of input electrical power
and sensor material that is responsive to the particular type of energy being detected.
The output voltage waveform is the response of the sensor(s), and a digital quantity is
obtained from each sensor by digitizing its response.
One arrangement for imaging with a single sensor places a laser source coincident with the
sensor. Moving mirrors are used to control the outgoing beam in a scanning pattern and to
direct the reflected laser signal onto the sensor.
Which means that reflectance is bounded by 0 (total absorption) and 1 (total reflectance).
To create a digital image, we need to convert the continuous sensed data into digital form.
This involves two processes: sampling and quantization.
When a sensing array is used for image acquisition, there is no motion and the number of
sensors in the array establishes the limits of sampling in both directions.
In some discussions, we use a more traditional matrix notation to denote a digital image as its
elements:
Due to storage and quantizing hardware considerations, the number of intensity levels
typically is an integer power of 2:
L = 2^k
where k is the number of bits used to represent each intensity level.
We assume that the discrete levels are equally spaced and that they are integers in the
interval [ 0,L −1] .
We define the dynamic range of an imaging system to be the ratio of the maximum
measurable intensity to the minimum detectable intensity level in the system. As a rule, the
upper limit is determined by saturation and the lower limit by noise.
Closely associated with the concept of dynamic range is image contrast, which is defined as
the difference in intensity between the highest and lowest intensity levels in an image. The
number of bits required to store a digitized image is
b = M × N × k. When M = N, this becomes b = N²k.
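As a quick check of this arithmetic, the following short Python sketch (illustrative values only) computes the storage requirement of a 1024 × 1024, 8-bit image:

```python
# b = M * N * k bits are needed to store an M x N image with k bits per pixel.
M, N, k = 1024, 1024, 8            # illustrative values
b = M * N * k                      # total number of bits
print(b)                           # 8388608 bits
print(b / 8 / 1024 / 1024)         # = 1.0 megabyte
```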
[Figures: examples of spatial resolution for varying N, showing down-sampled and up-sampled versions of the same image.]
Experiment:
Pixel replication:
Pixel replication is applicable when we want to increase the size of an image an
integer number of times.
We can duplicate each column. This doubles the image size in the horizontal
direction.
Then, we duplicate each row of the enlarged image to double the size in the vertical
direction.
Image shrinking is done in a similar manner as just described for zooming. The equivalent
process of pixel replication is row-column deletion. For example, to shrink an image by one-
half, we delete every other row and column.
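A minimal NumPy sketch of 2x zooming by pixel replication and 2x shrinking by row-column deletion (the function names are illustrative, not from the text):

```python
import numpy as np

def zoom_by_replication(img):
    """Double the image size: duplicate every column, then every row."""
    img = np.repeat(img, 2, axis=1)   # horizontal doubling
    return np.repeat(img, 2, axis=0)  # vertical doubling

def shrink_by_deletion(img):
    """Halve the image size by deleting every other row and column."""
    return img[::2, ::2]

f = np.array([[10, 20],
              [30, 40]], dtype=np.uint8)
print(zoom_by_replication(f))                       # 4 x 4 enlarged image
print(shrink_by_deletion(zoom_by_replication(f)))   # back to the original 2 x 2
```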
4-neighbours: N4(p) is the set of the four horizontal and vertical neighbours of a pixel p.
Diagonal neighbours: ND(p) is the set of the four diagonal neighbours of p.
8-neighbours: N8(p) = N4(p) ∪ ND(p).
Let V be the set of gray-level values used to define adjacency, e.g. V = {1}.
4-adjacency: Two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).
8-adjacency: Two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).
m-adjacency (mixed adjacency): Two pixels p and q with values from V are m-adjacent if
1. q is in N4(p), or
2. q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
A (digital) path (or curve) from pixel p with coordinates (x, y) to pixel q with coordinates
(s, t) is a sequence of distinct pixels with coordinates (x0, y0), (x1, y1), ..., (xn, yn), where
(x0, y0) = (x, y), (xn, yn) = (s, t), and pixels (xi, yi) and (xi-1, yi-1) are adjacent for
1 ≤ i ≤ n; n is the length of the path.
Connected set:
Let S represent a subset of pixels in an image. Two pixels p and q are said to be
connected in S if there exists a path between them consisting entirely of pixels in S.
Boundary: The boundary (also called border or contour) of a region R is the set of pixels in
the region that have one or more neighbors that are not in R.
Distance measures: For pixels p, q, and z, with coordinates (x, y), (s, t), and (v, w)
respectively, D is a distance function (metric) if D(p, q) ≥ 0 (with D(p, q) = 0 iff p = q),
D(p, q) = D(q, p), and D(p, z) ≤ D(p, q) + D(q, z).
Linear operations: For any two images f and g and any two scalars a and b, an operator H is
linear if H(af + bg) = aH(f) + bH(g); otherwise it is nonlinear.
Example 1. Consider the image segment shown. Let V={0, 1} and compute the lengths of the
shortest 4-, 8-, and m-path between p and q. If a particular path does not exist between
these two points, explain why.
Sol.
• When V = {0,1}, 4-path does not exist between p and q because it is impossible to get
from p to q by traveling along points that are both 4-adjacent and also have values from
V. Fig. a shows this condition; it is not possible to get to q.
• The shortest 8-path is shown in Fig. b its length is 4.
• The length of the shortest m- path (shown dashed) is 5.
• Both of these shortest paths are unique in this case.
Example 2:
Example 3: Define 4-, 8- and m-adjacency. Compute the lengths of the shortest 4-, 8- and m-
path between p and q in the image segment shown in Fig. by considering V = {2, 3, 4}
(p) (q) : 3 – 2 – 3 – 4 – 2
Module – 2 (RBT Levels: L1, L2, L3)
Spatial Domain: Some Basic Intensity Transformation Functions, Histogram
Processing, Fundamentals of Spatial Filtering, Smoothing Spatial Filters, Sharpening
Spatial Filters.
Frequency Domain: Preliminary Concepts, The Discrete Fourier Transform (DFT)
of Two Variables, Properties of the 2-D DFT, Filtering in the Frequency Domain,
Image Smoothing and Image Sharpening Using Frequency Domain Filters, Selective
Filtering.
[Text: Digital Image Processing - Rafael C. Gonzalez and Richard E. Woods,
Chapter 3: Sections 3.2 to 3.6 and Chapter 4: Sections 4.2, 4.5 to 4.10]
Spatial domain refers to the image plane itself, and image processing methods in this
category are based on direct manipulation of pixels in an image.
Two principal categories of spatial processing are intensity transformations and spatial
filtering.
Intensity transformations operate on single pixels of an image for the purpose of contrast
manipulation and image thresholding.
Spatial filtering deals with performing operations, such as image sharpening, by working
in a neighbourhood of every pixel in an image.
Generally, spatial domain techniques are more efficient computationally and require less
processing resources to implement. The spatial domain processes can be denoted by the
expression
g(x, y) = T[ f (x, y)]
Where f(x, y) is the input image, g(x, y) is the output image, and T is an operator on f
defined over a neighbourhood of point (x, y). The operator can apply to a single image or to a
set of images.
Typically, the neighbourhood is rectangular, centered on (x, y), and much smaller than the
image.
Example:
Suppose that the neighbourhood is a square of size 3×3 and the operator T is defined
as "compute the average intensity of the neighbourhood."
At an arbitrary location in an image, say (10, 15), the output g(10, 15) is computed as
the sum of f(10, 15) and its 8 neighbours, divided by 9.
The origin of the neighbourhood is then moved to the next location and the procedure
is repeated to generate the next value of the output image g.
The smallest possible neighbourhood is of size 1×1.
Image Negatives
The negative of an image with intensity levels in the range [0, L -1] is obtained by using the
negative transformation shown in Figure 3.3, which is given by
s = L -1- r
The negative transformation can be used to enhance white or gray detail embedded in dark
regions of an image.
Log Transformations
The general form of the log transformations is
s = c log(1+ r)
where c is a constant, and r ≥ 0
The log transformation maps a narrow range of low intensity values in the input into a
wider range of output levels. We use a transformation of this type to expand the values of
dark pixels in an image while compressing the higher-level values.
The opposite is true of the inverse log transformation.
Figure 3.5(a) shows a Fourier spectrum with values in the range 0 to 1.5 × 10^6.
Figure 3.5(b) shows the result of applying the log transformation (with c = 1) to the spectrum
values, which rescales them to the range 0 to 6.2, and displaying the result on an 8-bit system.
Power-law (gamma) transformations have the form s = c r^γ, where c and γ are positive
constants. Unlike the log function, changing the value of γ yields a family of possible
transformations. As shown in Figure 3.6, the curves generated with values of γ > 1 have
exactly the opposite effect as those generated with values of γ < 1.
The process used to correct these power-law response phenomena is called gamma
correction.
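The following NumPy sketch implements the three point transformations discussed above (negative, log, and power-law/gamma); the constants c and gamma used here are arbitrary illustrative choices:

```python
import numpy as np

def negative(r, L=256):
    """s = (L - 1) - r"""
    return (L - 1) - r.astype(np.int32)

def log_transform(r, c=1.0):
    """s = c * log(1 + r): expands dark values, compresses bright ones."""
    return c * np.log1p(r.astype(np.float64))

def gamma_transform(r, c=1.0, gamma=0.5, L=256):
    """s = c * r^gamma, with r normalized to [0, 1] and s rescaled to [0, L-1]."""
    r_norm = r.astype(np.float64) / (L - 1)
    return (L - 1) * c * np.power(r_norm, gamma)

r = np.array([[0, 64], [128, 255]], dtype=np.uint8)
print(negative(r))
print(np.round(gamma_transform(r, gamma=0.4)))
```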
Contrast stretching
One of the simplest piecewise linear functions is a contrast stretching transformation.
Contrast-stretching transformation is a process that expands the range of intensity
levels in an image so that it spans the full intensity range of the recording medium or display
device.
Intensity-level slicing
Highlighting a specific range of intensities in an image often is of interest. The
process, often called intensity-level slicing, can be implemented in several ways, though most
are variations of two basic themes. One approach is to display in one value (say, white) all the
values in the range of interest and in another (say, black) all other intensities, as shown in
Figure 3.11 (a).
Another approach is based on the transformation in Figure 3.11(b), which brightens (or
darkens) the desired range of intensities but leaves all other intensities levels in the image
unchanged.
Figure 3.12 (b) shows the result of using a transformation of the form in Figure 3.11 (a),
with the selected band near the top of the scale, because the range of interest is brighter than
the background.
Figure 3.12 (c) shows the result of using the transformation in Figure 3.11 (b) in which a
band of intensities in the mid-gray region around the mean intensity was set to black, while
all other intensities were unchanged.
Bit-plane slicing
Instead of highlighting intensity-level ranges, we could highlight the contribution made to
total image appearance by specific bits.
Figure 3.13 shows an 8-bit image, which can be considered as being composed of eight 1-bit
planes, with plane 1 containing the lowest-order bit of all pixels in the image and plane 8 all
the highest-order bits.
Note that each bit plane is a binary image. For example, all pixels in the border have values 1
1 0 0 0 0 1 0, which is the binary representation of decimal 194. Those values can be viewed
in Figure 3.14 (b) through (i). Decomposing an image into its bit planes is useful for
analysing the relative importance of each bit in the image.
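A short NumPy sketch that extracts the eight bit planes of an 8-bit image (plane 1 holds the lowest-order bit and plane 8 the highest-order bit, as in the text):

```python
import numpy as np

def bit_planes(img):
    """Return binary images: index 0 = plane 1 (LSB), ..., index 7 = plane 8 (MSB)."""
    return [(img >> b) & 1 for b in range(8)]

img = np.array([[194, 0],
                [255, 128]], dtype=np.uint8)
planes = bit_planes(img)
# 194 = 11000010 in binary, so its plane-1 bit is 0 and its plane-8 bit is 1
print(planes[0][0, 0], planes[7][0, 0])   # 0 1
```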
2.2 Histogram Processing
The histogram of a digital image with intensity levels in the range [0, L -1] is a discrete
function h(rk) = nk , where rk is the kth intensity value and nk is the number of pixels in the
image with intensity rk.
It is common practice to normalize a histogram by dividing each of its components by
the total number of pixels in the image, denoted by MN, where M and N are the row and
column dimensions of the image.
A normalized histogram is given by p(r_k) = n_k / MN, for k = 0, 1, ..., L − 1.
Example:
Figure 3.16 shows the pollen image of Figure 3.10 in four basic intensity
characteristics (dark, light, low contrast, and high contrast), along with the histograms
corresponding to these images.
We consider continuous intensity values and let the variable r denote the intensities of an
image. We assume that r is in the range [0, L − 1].
We focus on transformations (intensity mappings) of the form
s = T(r), 0 ≤ r ≤ L − 1
that produce an output intensity level s for every pixel in the input image having intensity r.
Assume that
(a) T(r) is a monotonically increasing function in the interval 0 ≤ r ≤ L − 1, and
(b) 0 ≤ T(r) ≤ L − 1 for 0 ≤ r ≤ L − 1.
In some formulations to be discussed later, we use the inverse
r = T⁻¹(s), 0 ≤ s ≤ L − 1
From Figure 3.17 (a), we can see that it is possible for multiple values to map to a single
value and still satisfy these two conditions, (a) and (b). That is, a monotonic transformation
function can perform a one-to-one or many-to-one mapping, which is perfectly fine when
mapping from r to s.
However, there will be a problem if we want to recover the values of r uniquely from
the mapped values.
As Figure 3.17 (b) shows, requiring that T(r) be strictly monotonic guarantees that the
inverse mappings will be single valued. This is a theoretical requirement that allows us to
derive some important histogram processing techniques.
Prove that the result of applying the transformation s = T(r) = (L − 1) ∫₀^r p_r(w) dw to all
intensity levels r is a set of intensities s with a uniform PDF, independent of the form of the
PDF of the r's.
Solution:
The intensity levels in an image may be viewed as random variables in the interval [0, L − 1].
A fundamental descriptor of a random variable is its probability density function (PDF).
If p_r(r) and p_s(s) denote the PDFs of r and s, then for a monotonic transformation
p_s(s) = p_r(r) |dr/ds|                                            (1)
With s = T(r) = (L − 1) ∫₀^r p_r(w) dw, we have ds/dr = (L − 1) p_r(r), so
p_s(s) = p_r(r) · 1 / [(L − 1) p_r(r)] = 1 / (L − 1), 0 ≤ s ≤ L − 1  (2)
which is a uniform PDF.
Example 2:
For a given 4X4 image having 0 – 9 gray scales, perform histogram equalization and draw the
histogram of image before and after equalization. 4X4 image is shown in Fig.
Solution:
Given image:        Equalized image:
2 3 3 2             3 6 6 3
4 2 4 3             8 3 8 6
3 2 3 5             6 3 6 9
2 4 2 4             3 8 3 8
(Mapping obtained from the equalization transformation: 2 → 3, 3 → 6, 4 → 8, 5 → 9.)
[Histograms of the image before and after equalization are plotted over the occupied gray
levels 2, 3, 4, 5 and 3, 6, 8, 9 respectively.]
• Obtain pr(rj) from the input image and then obtain the values of sk, round the value to
the integer range [0, L-1].
s_k = T(r_k) = (L − 1) Σ_{j=0..k} p_r(r_j) = [(L − 1)/MN] Σ_{j=0..k} n_j, k = 0, 1, ..., L − 1
• Use the specified PDF and obtain the transformation function G(z_q); round the values
to the integer range [0, L−1]:
G(z_q) = (L − 1) Σ_{i=0..q} p_z(z_i) = s_k
• Mapping from s_k to z_q:
z_q = G⁻¹(s_k)
Note: Refer to the Problems in Class Work
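As one possible cross-check of Example 2, the sketch below implements the discrete equalization transformation s_k = round((L − 1) Σ p_r(r_j)) in NumPy and reproduces the mapping 2→3, 3→6, 4→8, 5→9 for the 4×4 image:

```python
import numpy as np

def histogram_equalize(img, L):
    """Map each level r_k to s_k = round((L - 1) * cumulative histogram)."""
    hist = np.bincount(img.ravel(), minlength=L)   # n_k
    p = hist / img.size                            # p_r(r_k) = n_k / MN
    s = np.round((L - 1) * np.cumsum(p)).astype(int)
    return s[img]

f = np.array([[2, 3, 3, 2],
              [4, 2, 4, 3],
              [3, 2, 3, 5],
              [2, 4, 2, 4]])
print(histogram_equalize(f, L=10))   # matches the equalized image of Example 2
```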
Procedure: use the histogram statistics (mean and variance) for enhancement.
Mean value of r:
m = Σ_{i=0..L−1} r_i p(r_i) = (1/MN) Σ_{x=0..M−1} Σ_{y=0..N−1} f(x, y)
Variance:
σ² = μ₂(r) = Σ_{i=0..L−1} (r_i − m)² p(r_i) = (1/MN) Σ_{x=0..M−1} Σ_{y=0..N−1} [f(x, y) − m]²
Arithmetic operations
Image subtraction
Image Averaging:
• Let g(x, y) be the noisy image formed by the addition of noise η(x, y) to an original image
f(x, y), i.e. g(x, y) = f(x, y) + η(x, y).
• If the noise is uncorrelated and has zero average value at each pair of coordinates, then
averaging K noisy images gives ḡ(x, y) with E{ḡ(x, y)} = f(x, y) and σ²_ḡ = (1/K) σ²_η.
Note:
1. As K increases, the variability (noise) at each pixel decreases.
2. ḡ(x, y) approaches f(x, y) as the number of noisy images K used in the averaging
process increases.
The mechanics of linear spatial filtering of an image with a filter (mask) of size m×n are
given by
g(x, y) = Σ_{s=−a..a} Σ_{t=−b..b} w(s, t) f(x + s, y + t)
where m = 2a + 1 and n = 2b + 1.
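A straightforward (unoptimized) NumPy sketch of this filtering operation, using zero padding at the borders; the border-handling choice is an assumption, not specified in the text:

```python
import numpy as np

def spatial_filter(f, w):
    """Linear spatial filtering of image f with an m x n kernel w (correlation, zero padding)."""
    m, n = w.shape
    a, b = m // 2, n // 2                          # m = 2a + 1, n = 2b + 1
    fp = np.pad(f.astype(np.float64), ((a, a), (b, b)))
    g = np.zeros(f.shape, dtype=np.float64)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            g[x, y] = np.sum(w * fp[x:x + m, y:y + n])
    return g

f = np.arange(25, dtype=np.float64).reshape(5, 5)
w = np.ones((3, 3)) / 9.0                          # 3 x 3 averaging (smoothing) mask
print(spatial_filter(f, w))
```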
Foundation:
∂²f/∂x² = f(x + 1) + f(x − 1) − 2f(x)
The second-order isotropic derivative operator is the Laplacian; for a function (image) f(x, y),
∇²f = ∂²f/∂x² + ∂²f/∂y²
∂²f/∂x² = f(x + 1, y) + f(x − 1, y) − 2f(x, y)
∂²f/∂y² = f(x, y + 1) + f(x, y − 1) − 2f(x, y)
∇²f = f(x + 1, y) + f(x − 1, y) + f(x, y + 1) + f(x, y − 1) − 4f(x, y)
Image sharpening using the Laplacian: g(x, y) = f(x, y) + c[∇²f(x, y)], where c = −1 when the
Laplacian mask has a negative centre coefficient.
Unsharp masking / highboost filtering: g(x, y) = f(x, y) + k[f(x, y) − f̄(x, y)], where f̄ is a
blurred (smoothed) version of f. If k = 1 this is unsharp masking; k > 1 gives highboost
filtering.
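A compact sketch of both sharpening methods using SciPy's ndimage routines; the 3×3 Laplacian kernel and the box blur used for the mask are common choices assumed here, not taken from the text:

```python
import numpy as np
from scipy.ndimage import convolve, uniform_filter

def laplacian_sharpen(f):
    """g = f + c * laplacian(f), with c = -1 for a kernel whose centre coefficient is -4."""
    lap_kernel = np.array([[0,  1, 0],
                           [1, -4, 1],
                           [0,  1, 0]], dtype=np.float64)
    lap = convolve(f.astype(np.float64), lap_kernel, mode='nearest')
    return f - lap

def unsharp_mask(f, k=1.0, size=3):
    """g = f + k * (f - blurred); k = 1 is unsharp masking, k > 1 is highboost filtering."""
    blurred = uniform_filter(f.astype(np.float64), size=size, mode='nearest')
    return f + k * (f - blurred)

f = np.arange(25, dtype=np.float64).reshape(5, 5)
print(laplacian_sharpen(f))
print(unsharp_mask(f, k=2.0))   # highboost
```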
The gradient magnitude M(x, y) = √(g_x² + g_y²) ≈ |g_x| + |g_y| can be approximated with
simple differences of neighbouring pixels, e.g.
M(x, y) ≈ |z₈ − z₅| + |z₆ − z₅|
where |C| = √(R² + I²) is the length of the vector extending from the origin of the complex
plane to the point (R, I), and θ is the angle between the vector and the real axis.
Central to the study of linear systems and the Fourier transform is the concept of an
impulse and its sifting property. A unit impulse of a continuous variable t located at t = 0 ,
denoted δ(t) , is defined as
General Form:
2.6.5 Convolution
2-D impulse exhibits the sifting property under integration, and given by
Sampling in 2-D can be modeled using the sampling function (2-D impulse train),
where T and Z are the separations between samples along the t-axis and z-axis.
Aliasing in Images:
There are two principal manifestations of aliasing in images:
• Spatial aliasing, which is due to under-sampling;
• Temporal aliasing, which is related to time intervals between images in a sequence of
images.
Assignment
Equation indicates that rotating f (x,y) by an angle θ0 will rotate F(u,v) by the same angle.
Conversely, rotating F(u,v) will rotate f (x,y) by the same angle.
2.8.3 Periodicity
The 2-D Fourier transform and its inverse are infinitely periodic in the u and v directions:
Given a digital image f(x, y) of size M×N, the basic filtering equation in the frequency domain
is g(x, y) = ℑ⁻¹[H(u, v) F(u, v)], where ℑ⁻¹ is the inverse DFT, F(u, v) is the DFT of the input
image, and H(u, v) is the filter transfer function.
Highpass filter:
– Enhances sharp details
– But causes a reduction in contrast
• A highpass filter is obtained from a given lowpass filter using the equation
H_HP(u, v) = 1 − H_LP(u, v)
Homomorphic Filtering:
STAGE 1: Take a natural logarithm of both sides to decouple i(x, y) and r(x, y)
components and apply transforms.
STAGE 2: Use the Fourier transform to transform the image into frequency domain:
Where Fi(u, v) and Fr(u, v) are the Fourier transforms of lni(x, y) and lnr(x, y)
respectively.
STAGE 3: High pass the Z(u, v) by means of a filter function H(u, v) in frequency
domain, and get a filtered version S(u, v) as the following:
STAGE 4: Take an inverse Fourier transform to get the filtered image in the spatial
domain:
STAGE 5: The filtered enhanced image g(x, y) can be obtained by using the
following equations:
– Because the natural log (ln) was taken in stage 1, the filtered enhanced image is
g(x, y) = e^(s(x, y)) = i₀(x, y) · r₀(x, y).
[Illumination and reflectance are not separable, but their approximate locations in the
frequency domain may be located.]
Process specific bands of frequencies or small regions of the rectangle, which are called
bandreject or bandpass filters.
Notch Filters
Notch reject filters are constructed as products of highpass filters whose centers have
been translated to the centers of the notches
Module – 3
Restoration: Noise models, Restoration in the Presence of Noise Only using Spatial
Filtering and Frequency Domain Filtering, Linear, Position- Invariant Degradations,
Estimating the Degradation Function, Inverse Filtering, Minimum Mean Square Error
(Wiener) Filtering, Constrained Least Squares Filtering.
[Text: Digital Image Processing - Rafael C. Gonzalez and Richard E. Woods
Chapter 5: Sections 5.2, to 5.9]
3.1 Restoration
The degraded image in the spatial domain is modelled as g(x, y) = h(x, y) ★ f(x, y) + η(x, y),
where h(x, y) is the spatial representation of the degradation function and "★"
indicates convolution. Therefore, we have the frequency domain representation
G(u, v) = H(u, v)F(u, v) + N(u, v).
The principal sources of noise in digital images arise during image acquisition and/or
transmission.
In the spatial domain, we are interested in the parameters that define the spatial
characteristics of noise, and whether the noise is correlated with the image.
Frequency properties refer to the frequency content of noise in the Fourier sense.
Noise is independent of spatial coordinates and it is uncorrelated with respect to the
image itself.
Gaussian noise
Because of its mathematical tractability in both the spatial and frequency domains, Gaussian
(normal) noise models are used frequently in practice.
The probability density function (PDF) of a Gaussian random variable, z, is given by
p(z) = [1/(√(2π) σ)] e^(−(z − z̄)²/(2σ²))
where z represents intensity, z̄ is the mean (average) value of z, and σ is its standard
deviation.
Rayleigh noise
The probability density function of Rayleigh noise is given by
where a > 0 and b is a positive integer. The mean and variance of this density are given by
Exponential noise
The PDF of exponential noise is given by
Where a > 0. The mean and variance of this density are given by
Uniform noise
The PDF of uniform noise is constant over an interval [a, b] and zero elsewhere.
Impulse (salt-and-pepper) noise
The PDF of bipolar impulse noise takes value P_a at intensity a and P_b at intensity b.
If b > a, intensity b appears as a light dot in the image. Conversely, intensity a will appear
like a dark dot.
If either Pa or Pb is zero, the impulse noise is called unipolar.
If neither Pa nor Pb is zero, and especially if they are approximately equal, the
impulse noise values will resemble salt-and-pepper granules randomly distributed over the
image.
Periodic Noise
Periodic noise in an image arises typically from electrical or electromechanical interference
during image acquisition.
The periodic noise can be reduced significantly via frequency domain filtering,
The parameters of periodic noise can be estimated by inspection of the Fourier spectrum of
the image.
Periodic noise tends to produce frequency spikes, which are detectable even by visual
analysis.
In simplistic cases, it is also possible to infer the periodicity of noise components
directly from the image.
Automated analysis is possible if the noise spikes are either exceptionally
pronounced, or when knowledge is available about the general location of the frequency
components of the interference.
It is often necessary to estimate the noise probability density functions for a particular
imaging arrangement.
When images already generated by a sensor are available, it may be possible to
estimate the parameters of the probability density functions from small patches of reasonably
constant background intensity.
The simplest use of the data from the image strips is for calculating the mean and
variance of the intensity levels. Let S denote a strip and p_S(z_i), i = 0, 1, 2, ..., L − 1, denote
the probability estimates of the intensities of the pixels in S; then the mean and variance of
the pixels in S are
z̄ = Σ_{i=0..L−1} z_i p_S(z_i)   and   σ² = Σ_{i=0..L−1} (z_i − z̄)² p_S(z_i)
Note:
The shape of the histogram identifies the closest probability density function match.
The Gaussian probability density function is completely specified by these two
parameters.
For the other shapes discussed previously, we can use the mean and variance to solve
the parameters a and b.
Impulse noise is handled differently because the estimate needed is of the actual
probability of occurrence of the white and black pixels.
3.3 Restoration in the Presence of Noise Only using Spatial Filtering and
Frequency Domain
When the only degradation present in an image is noise,
g(x,y) = h(x,y)★f (x,y) + η(x,y) and G(u,v) = H(u,v)F(u,v) + N(u,v)
Become
g(x,y) = f (x,y) + η (x,y) and G(u,v) = F(u,v) + N(u,v)
Since the noise terms are unknown, subtracting them from g(x,y ) or G(u,v ) is not a
realistic option.
In the case of periodic noise, it usually is possible to estimate N (u, v) from the
spectrum of G(u, v) .
A geometric mean filter achieves smoothing comparable to the arithmetic mean filter, but it
tends to lose less image detail in the process.
Harmonic mean filter: works well for salt noise and for some other types of noise like
Gaussian noise, but fails for pepper noise.
Contraharmonic mean filter
The contra-harmonic mean filter yields a restored image based on the expression
The positive-order filter did a better job of cleaning the background, at the expense of
slightly thinning and blurring the dark areas.
The opposite was true of the negative-order filter.
In general, the arithmetic and geometric mean filters are suited for random noise like
Gaussian or uniform noise.
The contraharmonic mean filter is well suited for impulse noise, with the
disadvantage that it must be known whether the noise is dark or light in order to select Q.
Order-statistic filters are spatial filters whose response is based on ordering (ranking) the
values of the pixels contained in the image area encompassed by the filter.
Median filter
The best-known order-statistic filter is the median filter, which will replace the value
of a pixel by the median of the intensity levels in the neighbourhood of that pixel:
The median filters are particularly effective in the presence of both bipolar and
unipolar impulse noise.
The max filter is useful for finding the brightest points in an image, while the min
filter can be used for finding the darkest points in an image.
Midpoint filter
The midpoint filter computes the midpoint between the maximum and minimum
values in the area encompassed by the filter:
The midpoint filter works best for random distributed noise, like Gaussian or uniform noise.
When d = 0 , the alpha-trimmed mean filter is reduced to the arithmetic mean filter.
If d = mn -1 , the alpha-trimmed mean filter becomes a median filter.
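A minimal NumPy sketch of the order-statistic filters described above (median, max, min, midpoint, and alpha-trimmed mean); the edge-replicated border and the 3×3 window are arbitrary choices here:

```python
import numpy as np

def order_statistic_filter(g, m=3, n=3, kind='median', d=2):
    """kind: 'median', 'max', 'min', 'midpoint', or 'alpha' (alpha-trimmed mean, parameter d)."""
    a, b = m // 2, n // 2
    gp = np.pad(g.astype(np.float64), ((a, a), (b, b)), mode='edge')
    out = np.zeros(g.shape, dtype=np.float64)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            w = np.sort(gp[x:x + m, y:y + n].ravel())
            if kind == 'median':
                out[x, y] = w[w.size // 2]
            elif kind == 'max':
                out[x, y] = w[-1]
            elif kind == 'min':
                out[x, y] = w[0]
            elif kind == 'midpoint':
                out[x, y] = 0.5 * (w[0] + w[-1])
            elif kind == 'alpha':                  # drop the d/2 lowest and d/2 highest values
                out[x, y] = w[d // 2: w.size - d // 2].mean()
    return out

g = np.array([[10, 10, 255],
              [10,  0,  10],
              [10, 10,  10]], dtype=np.float64)
print(order_statistic_filter(g, kind='median'))    # removes the salt (255) and pepper (0) values
```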
The simplest statistical measures of a random variable are its mean and variance,
which are reasonable parameters for an adaptive filter.
The mean gives a measure of average intensity in the region over which the mean is
computed, and the variance gives a measure of contrast in that region.
The response of a filter, which operates on a local region Sxy, at any point (x,y) is to be
based on four quantities:
(a) g(x,y) , the value of the noisy image at (x,y) ;
(b) σ_η², the variance of the noise corrupting f(x,y) to form g(x,y);
(c) mL , the local mean of the pixels in Sxy ;
(d) σL2 , the local variance of the pixels in Sxy .
1. If σ_η² is zero, the filter should return simply the value of g(x, y) (the zero-noise case).
2. If the local variance σ_L² is high relative to σ_η², the filter should return a value close to
g(x, y); a high local variance typically is associated with edges, which should be preserved.
3. If the two variances are equal, we want the filter to return the arithmetic mean value of the
pixels in Sxy.
This condition occurs when the local area has the same properties as the overall
image, and local noise is to be reduced simply by averaging.
Based on these assumptions, an adaptive expression for obtaining f̂(x, y) may be written as
f̂(x, y) = g(x, y) − (σ_η²/σ_L²) [g(x, y) − m_L]
The only quantity that needs to be estimated is the variance of the overall noise, σ_η²; the
other parameters can be computed from the pixels in Sxy.
The median filter discussed previously performs well if the spatial density of the
impulse noise is not large (Pa and Pb are less than 0.2 ).
The adaptive median filtering can handle impulse noise with probabilities larger than
these. Unlike other filters, the adaptive median filter changes the size of Sxy during operation,
depending on certain conditions.
Let:
zmin = minimum intensity value in Sxy
zmax = maximum intensity value in Sxy
zmed = median of intensity values in Sxy
zxy = intensity value at coordinates (x,y)
Smax = maximum allowed size of Sxy
Purpose:
To remove salt-and-pepper (impulse) noise;
To provide smoothing of other noise that may not be impulsive; and
To reduce the distortion of object boundaries.
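A sketch of the two-stage adaptive median algorithm described above (the window grows from 3×3 up to S_max); this is a plain, unoptimized implementation under the usual stage-A/stage-B logic:

```python
import numpy as np

def adaptive_median_filter(g, s_init=3, s_max=7):
    """Adaptive median filtering: grow the window while zmed is an impulse (stage A);
    keep z_xy if it is not an impulse, otherwise output zmed (stage B)."""
    pad = s_max // 2
    gp = np.pad(g.astype(np.float64), pad, mode='edge')
    out = np.zeros(g.shape, dtype=np.float64)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            s = s_init
            while True:
                r = s // 2
                win = gp[x + pad - r: x + pad + r + 1, y + pad - r: y + pad + r + 1]
                zmin, zmax, zmed = win.min(), win.max(), np.median(win)
                zxy = gp[x + pad, y + pad]
                if zmin < zmed < zmax:                               # stage A: zmed is not an impulse
                    out[x, y] = zxy if zmin < zxy < zmax else zmed   # stage B
                    break
                s += 2                                               # grow the window
                if s > s_max:
                    out[x, y] = zmed                                 # window limit reached
                    break
    return out
```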
Periodic noise can be analyzed and filtered effectively by using frequency domain
techniques.
Bandreject Filters
Rejects (attenuates) band of frequencies and allows the rest
Figure shows perspective plots of ideal, Butterworth, and Gaussian bandreject filters,
One of the principal applications of bandreject filtering is for noise removal in applications
where the general location of the noise component(s) in the frequency domain is
approximately known.
Illustration:
The noise components can be seen as symmetric pairs of bright dots in the Fourier spectrum
shown in Figure (b)
Since the components lie on an approximate circle about the origin of the transform,
a circularly symmetric bandreject filter is a good choice.
Bandpass Filters
Notch Filters
A notch filter rejects/passes frequencies in predefined neighbourhoods about a center
frequency. Figure shows plots of ideal, Butterworth, and Gaussian notch (reject) filters.
Restoration is done by Placing the notch filter at the location of spike (noise).
The transfer function HNP (u,v) of a notch pass filter is obtained from a corresponding
notch reject filter transfer function, HNR(u,v) , by using the equation
HNP (u,v) = 1 - HNR(u,v)
When several interference components are present, the methods discussed previously are not
always acceptable because they may remove too much image information in the filtering
process.
Optimum notch filtering minimizes local variances of the restored estimate f̂(x, y).
Step 1: The first step in Optimum notch filtering is to find the principal frequency
components and placing notch pass filters at the location of each spike in G(u,v), yielding
H(u, v). The Fourier transform of the interference pattern is thus given by
N(u, v)=HNP(u, v) G(u, v)
where G(u, v)is the Fourier transform of the corrupted image.
Step 2: The corresponding interference pattern in the spatial domain is obtained with the
inverse Fourier transform
Step 4: The effect of components not present in the estimate of η (x,y) can be minimized by
subtracting from g(x,y) a weighted portion of η (x,y) to obtain an estimate of f (x,y) :
Note: Weighting function can be chosen according to need; one approach minimizes the local
variance
The input-output relationship shown in Figure before the restoration can be expressed as
g (x, y) = H [f(x, y)] + η(x, y) (1)
First, we assume that η(x, y) = 0, so that g(x, y) = H[f(x, y)].
Linearity Property:
H is Linear if:
H [af1(x, y) + bf2(x, y)] = a H [ f1(x, y)] + b H [ f2(x, y)] (2)
where a and b are scalars and f1(x, y) and f2(x, y) are any two input images. If a = b = 1, then
equation (2) becomes
H[f1(x, y) + f2(x, y)] = H[f1(x, y)] + H[f2(x, y)] (3)
which is called the property of additivity.
If b = 0, equation (2) becomes H[af1(x, y)] = aH[f1(x, y)] (4).
This is called the property of homogeneity. It says that the response to a constant multiple of
any input is equal to the response to that input multiplied by the same constant.
Position- Invariant:
An operator having the input-output relationship
g(x,y) = H [ f (x,y)]
is said to be position (space) invariant if
H [ f (x – α, y - β)] = g(x – α, y - β) (5)
For any f (x,y) and any α & β . Eq (5) indicates that the response at any point in the
image depends only on the value of the input at that point, not on its position.
Using the sifting property of the impulse, f(x, y) can be expressed as
f(x, y) = ∫∫ f(α, β) δ(x − α, y − β) dα dβ (6)
Then, with η(x, y) = 0,
g(x, y) = H[f(x, y)] = H[ ∫∫ f(α, β) δ(x − α, y − β) dα dβ ] (7)
and, by additivity extended to integrals,
g(x, y) = ∫∫ H[f(α, β) δ(x − α, y − β)] dα dβ (8)
Since f(α, β) is independent of x and y, using the homogeneity property, it follows that
g(x, y) = ∫∫ f(α, β) H[δ(x − α, y − β)] dα dβ = ∫∫ f(α, β) h(x, α, y, β) dα dβ (9)
where the term h(x, α, y, β) = H[δ(x − α, y − β)] is called the impulse response of H.
In other words, if η(x, y) = 0, then h(x, α, y, β) is the response of H to an impulse at (x, y).
Equation (9) is called the superposition (or Fredholm) integral of the first kind, and is a
fundamental result at the core of linear system theory.
If H is position invariant, we have
H[δ(x − α, y − β)] = h(x − α, y − β)
This tells us that knowing the impulse response of a linear, position-invariant system allows
us to compute its response, g, to any input f. The result is simply the convolution of the
impulse response and the input function.
In the presence of additive noise, Equation (9) becomes,
Assuming that the values of the random noise η (x,y) are independent of position, we have
Based on the convolution theorem, we can express above equation in the frequency domain
as
A linear, spatially invariant degradation system with additive noise can be modelled in the
spatial domain as the convolution of the degradation function with an image, followed by the
additive of noise. The same process can be expressed in the frequency domain.
There are three principal ways to estimate the degradation function used in image restoration.
Estimation by Image Observation
Estimation by Experimentation
Estimation by Modeling
3.6.1 Estimation by Image Observation
For example, suppose that a radial plot of Hs(u, v) has the approximate shape of a Gaussian
curve. Then we can construct a function H(u, v) on a larger scale, but having the same basic
shape. This estimation is a laborious process used only in very specific circumstances.
3.6.2 Estimation by Experimentation
If equipment similar to the equipment used to acquire the degraded image is available, it is
possible in principle to obtain an accurate estimate of the degradation.
Images similar to the degraded image can be acquired with various system settings
until they are degraded as closely as possible to the image we wish to restore.
Then the idea is to obtain the impulse response on the degradation by imaging an
impulse (small dot of light) using the same system settings.
An impulse is simulated by a bright dot of light, as bright as possible to reduce the
effect of noise to negligible values. Since the Fourier transform of an impulse is a constant, it
follows that H(u, v) = G(u, v)/A, where G(u, v) is the Fourier transform of the observed
(degraded) impulse image and A is the strength of the impulse.
3.6.3 Estimation by Modeling
Degradation modeling has been used for years. In some cases, the model can even take into
account environmental conditions that cause degradations.
For example, a degradation model proposed by Hufnagel and Stanley is based on the
physical characteristics of atmospheric turbulence:
H(u, v) = e^(−k(u² + v²)^(5/6))
where k is a constant that depends on the nature of the turbulence. The figure shows examples
of using this equation with different values of k.
The total exposure at any point of the recording medium is obtained by integrating the
instantaneous exposure over the time interval when the imaging system shutter is open. If the
T is the duration of the exposure, the blurred image g(x, y) is
Since the term inside the outer brackets is the Fourier transform of the displaced
function f [x - x0(t), y - y0(t)], we have
By defining
H(u, v) = ∫₀^T e^(−j2π[u x₀(t) + v y₀(t)]) dt
we can rewrite the result as
G(u, v) = H(u, v) F(u, v)
Therefore, the simplest approach to restoration is direct inverse filtering:
F̂(u, v) = G(u, v) / H(u, v) (1)
Substituting G(u, v) = H(u, v)F(u, v) + N(u, v) gives
F̂(u, v) = F(u, v) + N(u, v) / H(u, v) (2)
Case 1: We cannot recover the undegraded image exactly because N(u,v) is not known.
Case 2: If the degradation function H(u,v) has zero or very small values, so the second term
of Eq (2) could easily dominate the estimate of ̂ .
One approach to get around the zero or small-value problem is to limit the filter
frequencies to values near the origin. As we know that H(0,0) is usually the highest value of
H(u, v) in the frequency domain.
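The sketch below illustrates this idea: direct inverse filtering with the frequencies restricted to a radius D0 about the centered origin. The Gaussian lowpass used as H(u, v) is only an assumed, illustrative degradation:

```python
import numpy as np

def radially_limited_inverse_filter(g, H, D0):
    """F_hat = G / H inside a circle of radius D0 about the centered origin, 0 outside."""
    M, N = g.shape
    G = np.fft.fftshift(np.fft.fft2(g))
    U, V = np.meshgrid(np.arange(M) - M // 2, np.arange(N) - N // 2, indexing='ij')
    D = np.sqrt(U**2 + V**2)
    F_hat = np.where(D <= D0, G / H, 0.0)
    return np.real(np.fft.ifft2(np.fft.ifftshift(F_hat)))

# Illustrative degradation: centered Gaussian lowpass H(u, v) applied to a test image
M = N = 64
U, V = np.meshgrid(np.arange(M) - M // 2, np.arange(N) - N // 2, indexing='ij')
H = np.exp(-(U**2 + V**2) / (2 * 15.0**2))
f = np.zeros((M, N)); f[24:40, 24:40] = 1.0
g = np.real(np.fft.ifft2(np.fft.ifftshift(H * np.fft.fftshift(np.fft.fft2(f)))))
f_hat = radially_limited_inverse_filter(g, H, D0=20)
```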
The Wiener filter minimizes the mean square error e² = E{(f − f̂)²}. (1)
Assumptions:
1. The noise and the image are uncorrelated.
2. One or the other has zero mean.
3. The intensity levels in the estimate are a linear function of the levels in the degraded
image.
Based on the above assumptions, and with the minimum of the error function in equation (1),
the restored image can be obtained in the frequency domain by the expression
F̂(u, v) = [ (1/H(u, v)) · |H(u, v)|² / (|H(u, v)|² + K) ] G(u, v)
where K is a constant that approximates the noise-to-signal power ratio.
Note: The value of K was chosen interactively to yield the best visual result.
Note: If the noise or K value is zero, then the Wiener filter reduces to the inverse filter.
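A minimal NumPy sketch of the parametric Wiener filter, written equivalently with the complex conjugate as F̂(u,v) = [H*(u,v) / (|H(u,v)|² + K)] G(u,v); the Gaussian H and the value of K below are illustrative assumptions (K would normally be chosen interactively, as noted above):

```python
import numpy as np

def wiener_filter(g, H, K=0.01):
    """Parametric Wiener filter; H must use the same (unshifted) frequency layout as fft2(g)."""
    G = np.fft.fft2(g)
    F_hat = (np.conj(H) / (np.abs(H)**2 + K)) * G   # with K = 0 this reduces to G / H
    return np.real(np.fft.ifft2(F_hat))

# Illustrative degradation: Gaussian blur plus additive Gaussian noise
M = N = 64
U, V = np.meshgrid(np.fft.fftfreq(M) * M, np.fft.fftfreq(N) * N, indexing='ij')
H = np.exp(-(U**2 + V**2) / (2 * 15.0**2))
rng = np.random.default_rng(0)
f = np.zeros((M, N)); f[24:40, 24:40] = 1.0
g = np.real(np.fft.ifft2(H * np.fft.fft2(f))) + 0.01 * rng.standard_normal((M, N))
f_hat = wiener_filter(g, H, K=0.01)
```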
If one considers the restored image to be signal and the difference between this image and the
original to be noise, we can define a signal-to-noise ratio in the spatial domain as
SNR = Σ_x Σ_y f̂(x, y)² / Σ_x Σ_y [f(x, y) − f̂(x, y)]²
Module – 4 (RBT Levels: L1, L2, L3)
Color Image Processing: Color Fundamentals, Color Models, Pseudocolor Image
Processing.
Wavelets: Background, Multi-resolution Expansions.
Morphological Image Processing: Preliminaries, Erosion and Dilation, Opening and
Closing, The Hit-or-Miss Transforms, Some Basic Morphological Algorithms.
[Text: Chapter 6: Sections 6.1 to 6.3, Chapter 7: Sections 7.1 and 7.2, Chapter 9: Sections 9.1 to 9.5]
Characterization of light:
Chromaticity:
Chromaticity diagram
Intensity Slicing:
Fig. shows a plane at f(x, y)=li to slice the image function into two levels. If a
different color is assigned to each side of the plane shown in Fig., any pixel whose gray scale
is above the plane will be coded with one color, and any pixel below the plane will be coded
with the other.
The result is a two color image whose relative appearance can be controlled by
moving the slicing plane up and down the gray level axis. The idea of planes is useful
primarily for a geometric interpretation of the intensity-slicing technique. When more levels
are used, the mapping function takes on a staircase form.
1.1 Preliminaries:
"Morphology" – a branch of biology that deals with the form and structure of animals
and plants.
"Mathematical morphology" – a tool for extracting image components that are
useful in the representation and description of region shape.
The language of mathematical morphology is set theory.
It provides a unified and powerful approach to numerous image processing problems.
In binary images , the set elements are members of the 2-D integer space – Z2. where
each element (x, y) is a coordinate of a black (or white) pixel in the image.
Used to extract image components that are useful in the representation and description of
region shape, such as
Boundaries extraction
Skeletons
Convex hull
Morphological filtering
Thinning
Pruning
Basic Concepts in Set Theory
Subset: A ⊆ B means every element in set A is also in set B.
Union: A ∪ B is the set of all elements belonging to A, to B, or to both.
Intersection: A ∩ B is the set of all elements belonging to both A and B.
Disjoint / mutually exclusive: A ∩ B = ∅.
Complement: Aᶜ = {w | w ∉ A}.
Difference: A − B = {w | w ∈ A, w ∉ B} = A ∩ Bᶜ.
Reflection: B̂ = {w | w = −b, for b ∈ B}.
Translation: (A)z = {c | c = a + z, for a ∈ A}.
Logic operations are just a special case of binary set operations: AND corresponds to
intersection, OR to union, and NOT to complementation.
Here the reflection is with respect to a specific origin, such as a point center in the shape, e.g.,
the center of the shape.
Example:
Erosion
Dilation
Opening
Closing
Hit-or-Miss transform
Erosion: A ⊖ B = {z | (B)z ⊆ A}. This equation indicates that the erosion of A by B is the set
of all points z such that B, translated by z, is contained in A.
Dilation: A ⊕ B = {z | (B̂)z ∩ A ≠ ∅}. This equation is based on obtaining the reflection of B
about its origin and shifting this reflection by z; the dilation of A by B is the set of all
displacements z such that B̂ and A overlap by at least one element. Based on this
interpretation, the equation can also be rewritten as A ⊕ B = {z | [(B̂)z ∩ A] ⊆ A}.
Usefulness:
Duality:
Dilation and erosion are duals of each other with respect to set complementation and
reflection. That is,
(A ⊖ B)ᶜ = Aᶜ ⊕ B̂
or
(A ⊕ B)ᶜ = Aᶜ ⊖ B̂
In other words, dilating the "foreground" is the same as eroding the "background", but the
structuring element reflects between the two. Likewise, eroding the foreground is the same as
dilating the background.
So, strictly speaking, we don't really need both dilate and erode: with one or the other, and
with set complement and reflection of the structuring element, we can achieve the same
functionality. Hence, dilation and erosion are duals.
Proof:
We must show that (A ⊖ B)ᶜ = Aᶜ ⊕ B̂.
By the definition of erosion,
(A ⊖ B)ᶜ = {z | (B)z ⊆ A}ᶜ
If the set (B)z is contained in A, then (B)z ∩ Aᶜ = ∅; therefore
(A ⊖ B)ᶜ = {z | (B)z ∩ Aᶜ = ∅}ᶜ
But the complement of the set of z's satisfying (B)z ∩ Aᶜ = ∅ is the set of z's such that
(B)z ∩ Aᶜ ≠ ∅. Therefore,
(A ⊖ B)ᶜ = {z | (B)z ∩ Aᶜ ≠ ∅} = Aᶜ ⊕ B̂
Hence proved.
Opening:
A ∘ B = (A ⊖ B) ⊕ B
Remember that erosion finds all the places where the structuring element fits inside the
image, but it only marks these positions at the origin of the element.
By following an erosion by a dilation, we "fill back in" the full structuring element at
places where the element fits inside the object.
So, an opening can be considered to be the union of all translated copies of the structuring
element that can fit inside the object. Openings can be used to remove small objects,
protrusions from objects, and connections between objects.
Smooth the contour of an image, breaks narrow isthmuses, eliminates thin protrusions
Closing: A • B = (A ⊕ B) ⊖ B
Whereas opening removes all pixels where the structuring element won‘t fit inside the
image foreground, closing fills in all places where the structuring element will not fit in the
image background.
Smooth the object contour, fuse narrow breaks and long thin gulfs, eliminate small holes,
and fill in gaps.
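A short sketch of binary erosion, dilation, opening, and closing using SciPy's ndimage routines; the 3×3 square structuring element and the tiny test image are arbitrary choices:

```python
import numpy as np
from scipy.ndimage import binary_erosion, binary_dilation, binary_opening, binary_closing

A = np.zeros((9, 9), dtype=bool)
A[2:7, 2:7] = True                    # a 5 x 5 square object
A[4, 0] = True                        # an isolated "noise" pixel
B = np.ones((3, 3), dtype=bool)       # structuring element

eroded  = binary_erosion(A, structure=B)
dilated = binary_dilation(A, structure=B)
opened  = binary_opening(A, structure=B)   # removes the isolated pixel and small protrusions
closed  = binary_closing(A, structure=B)   # fills small holes and gaps

# Duality check for this example: (A eroded by B)^c equals A^c dilated by B
# (B is symmetric here, so its reflection equals B itself).
print(np.array_equal(~eroded, binary_dilation(~A, structure=B)))   # True here
```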
Properties
Opening
(i) A°B is a subset (subimage) of A
(ii) If C is a subset of D, then C °B is a subset of D °B
(iii) (A °B) °B = A °B
Closing
(i) A is a subset (subimage) of A•B
(ii) If C is a subset of D, then C •B is a subset of D •B
(iii) (A •B) •B = A •B
Steps (hit-or-miss transform, with structuring element pair B = (X, W − X)):
1. Erode A by X.
2. Erode Aᶜ by (W − X), where W is a small window enclosing X.
3. The intersection of the two results, A ⊛ B = (A ⊖ X) ∩ [Aᶜ ⊖ (W − X)], detects the
location of the object X.
Extract image components that are useful in the representation and description of shape:
Boundary extraction
Region filling
Extract of connected components
Convex hull
Thinning
Thickening
Skeleton
Pruning
Boundary Extraction:
First, erode A by B, then take the set difference between A and the erosion:
β(A) = A − (A ⊖ B)
The thickness of the contour depends on the size of the structuring element B.
Region Filling
This algorithm is based on a set of dilations, complementation and intersections.
p is a point inside the boundary, with the value 1 (the seed X₀ = {p}).
X_k = (X_{k−1} ⊕ B) ∩ Aᶜ, k = 1, 2, 3, ...; the iteration stops when X_k = X_{k−1}, and the
filled region is X_k ∪ A.
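A compact sketch of this region-filling iteration using SciPy's binary dilation; the 4-connected structuring element B and the small square boundary are illustrative choices:

```python
import numpy as np
from scipy.ndimage import binary_dilation

def fill_region(A, seed):
    """A: boolean image whose True pixels form a closed boundary; seed: a (row, col) inside it."""
    B = np.array([[0, 1, 0],
                  [1, 1, 1],
                  [0, 1, 0]], dtype=bool)              # 4-connected structuring element
    X = np.zeros_like(A)
    X[seed] = True
    while True:
        X_next = binary_dilation(X, structure=B) & ~A  # X_k = (X_{k-1} dilated by B) intersect A^c
        if np.array_equal(X_next, X):                  # stop when X_k = X_{k-1}
            return X | A                               # filled region together with its boundary
        X = X_next

A = np.zeros((7, 7), dtype=bool)
A[1, 1:6] = A[5, 1:6] = A[1:6, 1] = A[1:6, 5] = True   # a square boundary
print(fill_region(A, (3, 3)).astype(int))
```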
Convex Hull
A is said to be convex if a straight line segment joining any two points in A lies
entirely within A
The convex hull H of set S is the smallest convex set containing S
Convex deficiency is the set difference H-S
Useful for object description
This algorithm iteratively applies the hit-or-miss transform to A with the first of the four
structuring elements, takes the union with A, and repeats the procedure with each of the
remaining structuring elements until convergence.
Thinning
The thinning of a set A by a structuring element B can be defined in terms of the hit-or-miss
transform:
A ⊗ B = A − (A ⊛ B) = A ∩ (A ⊛ B)ᶜ
Thickening
The structuring elements used for thickening have the same form as in thinning, but
with all 1‘s and 0‘s interchanged.
A separate algorithm for thickening is often used in practice, Instead the usual
procedure is to thin the background of the set in question and then complement the
result.
In other words, to thicken a set A, we form C=Ac, thin C and then form Cc.
Depending on the nature of A, this procedure may result in some disconnected points.
Therefore thickening by this procedure usually require a simple post-processing step
to remove disconnected points.
We will notice in the next example that the thinned background forms a boundary for
the thickening process, this feature does not occur in the direct implementation of
thickening
This is one of the reasons for using background thinning to accomplish thickening.
Skeleton
The notion of a skeleton S(A) of a set A is intuitively defined. From the figure we deduce
that:
a) If z is a point of S(A) and (D)z is the largest disk centered at z and contained in A (one
cannot find a larger disk, not necessarily centered at z, that contains (D)z and is included in
A), then (D)z is called a maximum disk.
b) The disk (D)z touches the boundary of A at two or more different places.
The skeleton can be expressed in terms of erosions and openings:
S(A) = ∪_{k=0..K} S_k(A), with S_k(A) = (A ⊖ kB) − (A ⊖ kB) ∘ B
where (A ⊖ kB) indicates k successive erosions of A by B, and K is the last iterative step
before A erodes to an empty set; in other words, K = max{k | (A ⊖ kB) ≠ ∅}.
4.2 Wavelets:
4.2.1 Background
Unlike the Fourier transform, which decomposes a signal into a sum of sinusoids, the
wavelet transform decomposes a signal (image) into small waves of varying frequency and
limited duration. The advantage is that we also know when (where) each frequency appears.
Wavelets have many applications in image compression, transmission, and analysis.
We will examine wavelets from a multi-resolution point of view and begin with an overview
of imaging techniques involved in multi-resolution theory.
Small objects are viewed at high resolutions; large objects require only a coarse resolution.
Images have locally varying statistics resulting in combinations of edges, abrupt features and
homogeneous regions.
Image Pyramids:
Approximation pyramid: a collection of decreasing-resolution approximations of the
original image, obtained by repeatedly filtering and down-sampling.
Prediction pyramid: a prediction of each higher-resolution level is obtained by up-sampling
(inserting zeros into) the previous lower-resolution level and interpolating (filtering); the
prediction residuals form the prediction-residual pyramid.
Subband Coding:
Consider the two-band subband coding and decoding system as shown in figure (a). The
system is composed of two filter banks, each containing two FIR filters (h0(n), h1(n) & g0(n),
g1(n)).
Figure a
Figure b
Analysis filter bank includes h0(n) & h1(n) to break f(n) into two half-length
sequences flp(n) & fhp(n). Filters h0(n) & h1(n) are half-band filters whose idealized
characteristics are H0(w) and H1(w) are as shown in figure (b).
h0(n) low pass filter, output flp(n) is called approximation of f(n).
h1(n) high pass filter, output fhp(n) is called detail part of f(n).
Synthesis filter bank includes g0(n) & g1(n), which combine flp(n) & fhp(n) to produce f̂(n).
The goal in subband coding is to select the h0(n), h1(n), g0(n) & g1(n) filters so that f̂(n) = f(n).
The resulting system is then said to employ perfect reconstruction filters.
Approximation subband
Vertical subband
Horizontal subband
Diagonal subband
Of special interest in subband coding are filters that move beyond biorthogonality and are
required to be orthonormal.
(The subscript "even" in the corresponding design equations means that the filter length must
be even.)
The Haar transform is due to Alfred Haar [1910]. Its basis functions are the simplest known
orthonormal wavelets. The Haar transform is both separable and symmetric, and can be
expressed in matrix form as
T = HFHT
Where F is an N*N image matrix, H is an N*N transformation matrix, T is the resulting N*N
transform.
For the Haar transform, transformation matrix H contains the Haar basis functions,
h(z).
They are defined over the continuous, closed interval for z ∈ [0,1] for k=0,1,2,…,N-
1, where N = 2n.
To generate H, we define the integer k such that
k = 2^p + q − 1, where 0 ≤ p ≤ n − 1, and q = 0 or 1 for p = 0, while 1 ≤ q ≤ 2^p for p ≠ 0.
For the above pairs of p and q, a value for k is determined and the Haar basis
functions are computed.
The ith row of a NxN Haar transformation matrix contains the elements of
hk(z) for z=0/N, 1/N, 2/N,…, (N-1)/N.
For instance, for N = 4, p, q and k have the following values:
k = 0: p = 0, q = 0;  k = 1: p = 0, q = 1;  k = 2: p = 1, q = 1;  k = 3: p = 1, q = 2.
The rows of H2 are the simplest filters of length 2 that may be used as analysis filters
h0(n) and h1(n) of a perfect reconstruction filter bank.
Moreover, they can be used as scaling and wavelet vectors (defined in what follows) of
the simplest and oldest wavelet transform.
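A brief NumPy sketch that builds the N×N Haar transformation matrix directly from the basis functions h_k(z), using k = 2^p + q − 1 as defined above, and applies T = H F Hᵀ:

```python
import numpy as np

def haar_matrix(N):
    """N x N Haar matrix (N a power of 2); row k holds h_k(z) at z = 0/N, 1/N, ..., (N-1)/N."""
    H = np.zeros((N, N))
    H[0, :] = 1.0 / np.sqrt(N)                   # h_0(z) = 1/sqrt(N)
    z = np.arange(N) / N
    for p in range(int(np.log2(N))):
        for q in range(1, 2**p + 1):
            k = 2**p + q - 1                     # row index determined by p and q
            pos = (z >= (q - 1) / 2**p) & (z < (q - 0.5) / 2**p)
            neg = (z >= (q - 0.5) / 2**p) & (z < q / 2**p)
            H[k, pos] = 2**(p / 2) / np.sqrt(N)
            H[k, neg] = -(2**(p / 2)) / np.sqrt(N)
    return H

H4 = haar_matrix(4)
F = np.arange(16, dtype=np.float64).reshape(4, 4)
T = H4 @ F @ H4.T                                # Haar transform T = H F H^T
print(np.allclose(H4 @ H4.T, np.eye(4)))         # rows are orthonormal: True
```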
Module – 5 (RBT Levels: L1, L2, L3)
Segmentation: Point, Line, and Edge Detection, Thresholding, Region-Based
Segmentation, Segmentation Using Morphological Watersheds.
Representation and Description: Representation, Boundary descriptors.
[Text: Chapter 10: Sections 10.2, to 10.5 and Chapter 11: Sections 11.1 and 11.2]
5.1 Segmentation:
5.2.1 Representation
Outline:
Segmentation: Point, Line, and Edge Detection,
Thresholding, Region-Based Segmentation,
Segmentation Using Morphological Watersheds.
Preview
• Segmentation subdivides an image into its constituent regions or objects.
• The goal is usually to find individual objects in an image.
• Mask operation
Example:
– Interested in lines of -45 degree
– Run the corresponding mask
– All other lines are eliminated
• Edges are characterized by grey-level transitions.
First derivative can be used to detect the
presence of an edge (if a point is on a
ramp)
|∇f(x, y) − ∇f(x₀, y₀)| ≤ E, where E is a nonnegative magnitude threshold
|α(x, y) − α(x₀, y₀)| ≤ A, where A is a nonnegative angle threshold
Both the magnitude and the angle criteria should be satisfied for two edge pixels to be linked.
Example: find rectangular
shapes similar to license
plate
• Find gradients
2. Global Processing – the Hough Transform
• Determine if points lie on a curve of specified shape
• Consider a point (xi, yi) and the general line equation
• Write the equation for a second point (xj, yj) and find the
intersection point (a’, b’) on the parametric space
Computational aspects of the Hough transform:
• Subdivision of the parametric space into accumulator cells
• The cell at (i,j) with accumulator values A(i,j) corresponds to
(ai,bj)
• For every point (x_k, y_k), vary a from cell to cell and solve for b using b = −x_k a + y_k.
• If the slope value a_p yields the intercept b_q, then increment the accumulator:
A(p, q) = A(p, q) + 1.
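A minimal sketch of this accumulator-cell procedure, using the (a, b) slope-intercept parameterization for clarity (in practice the normal ρ-θ representation is preferred because the slope a is unbounded for vertical lines); the points and cell ranges below are illustrative:

```python
import numpy as np

def hough_accumulate(points, a_values, b_values):
    """Vote A(p, q) for lines y = a*x + b passing through the given (x, y) points."""
    A = np.zeros((len(a_values), len(b_values)), dtype=int)
    b_min, b_step = b_values[0], b_values[1] - b_values[0]
    for (x, y) in points:
        for p, a in enumerate(a_values):
            b = y - a * x                              # solve b = -a*x + y for this slope cell
            q = int(round((b - b_min) / b_step))
            if 0 <= q < len(b_values):
                A[p, q] += 1                           # increment the accumulator
    return A

pts = [(0, 1), (1, 3), (2, 5), (3, 7), (5, 2)]         # four collinear points (y = 2x + 1) plus an outlier
a_vals = np.linspace(-3, 3, 13)
b_vals = np.linspace(-5, 5, 21)
A = hough_accumulate(pts, a_vals, b_vals)
p, q = np.unravel_index(A.argmax(), A.shape)
print(a_vals[p], b_vals[q], A[p, q])                   # approx. a = 2.0, b = 1.0 with 4 votes
```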
y_i = a x_i + b
Boundary Characteristics for Histogram Thresholding
• Consider only pixels lying on and near edges
• Use gradient or Laplacian to preliminary process the
image
Thresholds based on several variables
• Color or multispectral histograms
• Thresholding is based on finding clusters in multi-
dimensional space
• Example: face detection
• Different color models
Region based Segmentation
Basic formulation
• Every pixel must be in a region
• Points in a region must be connected
• Regions must be disjoint
• Logical predicate for one region and for distinguishing between
regions
Region growing: groups pixels or subregions into larger regions based on predefined
similarity criteria, starting from a set of seed points.
Region splitting: if the predicate P(R) is not TRUE for a region, the method requires that the
region be split.
• The main problem with region splitting is determining
where to split a region.
• One method to divide a region is to use a quadtree
structure.
• Quadtree: a tree in which nodes have exactly four
descendants
• Split into four disjoint quadrants any region R_i for which P(R_i) = FALSE.
• Merge any adjacent regions R_j and R_k for which P(R_j ∪ R_k) = TRUE.
Segmentation by Morphological Watersheds