Digital Image Processing_4 Unit_ASRao

The document discusses image segmentation, the process of partitioning an image into meaningful regions based on measurements such as color, texture, and motion. It outlines segmentation methods, including edge, point, and line detection, along with their applications in object detection and tracking, and explains the use of gradient and Laplacian operators for edge detection, emphasizing the role of Gaussian smoothing.

Allanki Sanyasi Rao

AMIE; M.Tech; (Ph.D); MISTE; MIETE


Associate Professor & HOD
Dept. of ECE
Introduction
 The purpose of image segmentation is to partition an image into meaningful regions with respect to a particular application.

 The segmentation is based on measurements taken from the image, such as gray level, color, texture, depth or motion.

 Usually image segmentation is an initial and vital step in a series of processes aimed at overall image understanding.

 Segmentation attempts to partition the pixels of an image into groups that strongly correlate with the objects in the image.

 Image segmentation divides an image into regions that are connected and have some similarity within the region and some difference between adjacent regions.

 The goal is usually to find individual objects in an image.

 Applications of image segmentation include:

 Detecting and analyzing objects in an image or video to measure their size, shape, and other characteristics.
 Finding and tracking individual objects in a video, so that they can be compressed and stored more efficiently.
 Using a laser sensor, a robot can measure the distance of objects around it, creating a 3D map of its surroundings. This helps the robot plan its path and move around safely.
Example 1
 Segmentation based on gray scale
 Segmenting purely on gray level can misidentify objects, since distinct objects may share similar gray values.
Example 2
 Segmentation based on texture
 Enables object surfaces with varying patterns of gray to be segmented.
Example 3
 Segmentation based on motion
 Separating moving objects from the background is tricky because the motion itself must first be estimated (even if not 100% accurately); the segmentation is then based on that motion estimate rather than on the raw movement.
Example 4
 Segmentation based on depth

 This example shows a range image, obtained with a laser range finder. A segmentation based on the range (the object's distance from the sensor) is useful in guiding mobile robots.
 Segmentation algorithms generally are based on one of two basic properties of intensity values:
 Discontinuity
 Similarity

Discontinuity: partition an image based on abrupt changes in intensity (such as edges) – Edge-based Segmentation
Similarity: partition an image into regions that are similar according to a set of predefined criteria – Region-based Segmentation
Image Segmentation fundamentals

 Segmentation subdivides an image into its constituent regions or objects, until the objects of interest in an application have been isolated.

 Segmentation partitions image R into subregions R1, R2, R3, …, Rn such that:

- R1 ∪ R2 ∪ … ∪ Rn = R
- Each Ri is a connected set, i = 1, 2, 3, …, n
- Ri ∩ Rj = ∅ for all i, j where i ≠ j
- Q(Ri) = TRUE for every i
- Q(Ri ∪ Rj) = FALSE for any two adjacent regions

Q is a predicate function that takes a region Ri as input and returns a Boolean value (TRUE/FALSE).
 There are three kinds of discontinuities of intensity: Points, Lines & Edges.

 The common way to find them is to run a mask through the image, using the image sharpening techniques:
- first-order derivatives produce thicker edges
- second-order derivatives (the Laplacian operation) have a strong response to fine detail, such as thin lines and isolated points, and to noise

 The most common way to look for discontinuities is to scan a small mask over the image. The mask determines which kind of discontinuity to look for.
Point Detection
 Steps for Point Detection:

 Apply a Laplacian filter to the image to obtain the response R(x,y).
 Create a binary image by thresholding |R(x,y)| ≥ T,

where T is a non-negative threshold.

Laplacian Mask

FIGURE: (a) Laplacian kernel used for point detection. (b) X-ray image of a turbine blade with a porosity manifested by a single black pixel. (c) Result of convolving the kernel with the image. (d) Result was a single point (shown enlarged).
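As a minimal sketch of these two steps, assuming NumPy and SciPy are available (the function name and the threshold convention are illustrative, not from the slides):

```python
import numpy as np
from scipy.ndimage import convolve

def detect_points(image, T):
    """Point detection: Laplacian filtering followed by thresholding |R| >= T."""
    # 8-neighbour Laplacian kernel, as in the figure above.
    kernel = np.array([[-1, -1, -1],
                       [-1,  8, -1],
                       [-1, -1, -1]])
    R = convolve(image.astype(float), kernel)
    # Flag a point wherever the absolute response meets the threshold T.
    return (np.abs(R) >= T).astype(np.uint8)
```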
Line Detection
 Slightly more complex than point detection is finding a one-pixel-wide line in an image.

 A special mask is needed to detect a special type of line.

 Examples:
- Horizontal mask has high response when a line is passed through the
middle row of the mask with a constant background.
 Apply every mask to the image.
 Let R1, R2, R3, R4 denote the responses of the horizontal, +45 degree, vertical and -45 degree masks, respectively.
 If, at a certain point in the image,
|Ri| > |Rj|, for all j ≠ i,
that point is said to be more likely associated with a line in the direction of mask i.
 To detect lines in an image, one can use a special filter called a mask that
looks for lines in a specific direction. By applying the mask to the image
and taking the absolute value of the result, potential lines can be
identified.
 Finally, by setting a threshold, or a minimum value, you can determine what is considered
a line, and detect all lines in the image that match the direction defined by the mask.
 The points that are left are the strongest responses, which, for lines one pixel thick,
correspond closest to the direction defined by the mask.
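A hedged NumPy/SciPy sketch of this procedure follows; the kernels are the standard four directional line masks, and the helper name line_responses is illustrative:

```python
import numpy as np
from scipy.ndimage import convolve

# Line-detection masks for the four principal directions.
MASKS = [
    np.array([[-1, -1, -1], [ 2,  2,  2], [-1, -1, -1]]),  # horizontal
    np.array([[ 2, -1, -1], [-1,  2, -1], [-1, -1,  2]]),  # +45 degrees
    np.array([[-1,  2, -1], [-1,  2, -1], [-1,  2, -1]]),  # vertical
    np.array([[-1, -1,  2], [-1,  2, -1], [ 2, -1, -1]]),  # -45 degrees
]

def line_responses(image):
    """Apply all four masks; per pixel, the direction with the largest |Ri|
    is the most likely line direction, and its response is thresholded."""
    f = image.astype(float)
    R = np.stack([np.abs(convolve(f, m)) for m in MASKS])
    return R.argmax(axis=0), R.max(axis=0)   # (direction index, strength)
```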
Edge Detection
 Segmentation by finding pixels on a region boundary.
 Edges found by looking at neighboring pixels.
 Region boundary formed by measuring gray value differences between
neighboring pixels
 An edge is a set of connected pixels that lie on the boundary between
two regions.
 An edge is a "local" concept, whereas a region boundary, owing to the way it is defined, is a more global idea.

Figure: Models (ideal representations) of a step, a ramp, and a roof edge, and their corresponding intensity profiles.
Figure: A 1508 X 1970 image showing (zoomed) actual ramp, step and roof
edge profiles. The profiles are from dark to light, in the areas indicated by
the short line segments shown in the small circles. The ramp and step
profiles span 9 pixels and 2 pixels. The base of the roof edge is 3 pixels.
FIGURE: (a) Image. (b) Horizontal intensity profile that includes the isolated point indicated by the arrow. (c) Subsampled profile; the dashes were added for clarity. The numbers in the boxes are the intensity values of the dots shown in the profile, from which the first and second derivatives were obtained.
 First column: 8-bit images with values in the range [0, 255], and gray-level profiles of a ramp edge corrupted by random Gaussian noise of mean 0 and σ = 0.0, 0.1, 1.0 and 10.0, respectively.

 Second column: first-derivative images and gray-level profiles.

 Third column: second-derivative images and gray-level profiles.
First Order derivative (Gradient Operator)
 First-order derivatives:
The gradient of an image f(x,y) at location (x,y) is defined as the vector:

∇f = [Gx, Gy]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ

where Gx is the gradient along the x-direction and Gy is the gradient along the y-direction.

The magnitude, M(x, y), of this gradient vector at a point (x, y) is given by:

M(x, y) = ‖∇f‖ = √(Gx² + Gy²)

The direction of the gradient vector at a point (x, y) is given by:

α(x, y) = tan⁻¹(Gy / Gx)

The gradient is a non-linear operator.
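As a small illustration, assuming SciPy's Sobel derivative filters as the gradient estimator (any Gx/Gy operator from the following slides would do):

```python
import numpy as np
from scipy.ndimage import sobel

def gradient_mag_dir(image):
    """Gradient magnitude M(x, y) and direction alpha(x, y) of an image."""
    f = image.astype(float)
    Gx = sobel(f, axis=1)        # partial derivative along x (columns)
    Gy = sobel(f, axis=0)        # partial derivative along y (rows)
    M = np.hypot(Gx, Gy)         # sqrt(Gx**2 + Gy**2)
    alpha = np.arctan2(Gy, Gx)   # tan^-1(Gy/Gx), quadrant-aware
    return M, alpha
```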
Roberts cross-gradient operators

Prewitt operators

 A mask detects horizontal edges by finding vertical gradients, and vertical edges by finding horizontal gradients. By applying these masks to an image, we can find the strength and direction of edges at each location.
Sobel operators

 The Sobel edge operator reduces noise in an image by smoothing and averaging along the edges. It is preferred over the Prewitt operator because it reduces false edges caused by noise.

 Sobel operators use 3×3 masks to calculate the partial derivatives, giving more weight to the center point.

A 3×3 region of an image (the z's are intensity values):

z1 z2 z3
z4 z5 z6
z7 z8 z9

For a 3×3 (Prewitt) mask:

Gx = (z7 + z8 + z9) − (z1 + z2 + z3)
Gy = (z3 + z6 + z9) − (z1 + z4 + z7)

 Masks of size 3×3 are preferred, using either the differences along the x- and y-directions as

∇f ≈ |(z7 + z8 + z9) − (z1 + z2 + z3)| + |(z3 + z6 + z9) − (z1 + z4 + z7)|

or using the cross differences along the diagonals as

∇f ≈ |(z1 + z2 + z4) − (z6 + z8 + z9)| + |(z2 + z3 + z6) − (z4 + z7 + z8)|

 Prewitt masks for detecting diagonal edges

 Sobel masks for detecting diagonal edges
FIGURE: (a) Image of size 834 × 1114 pixels, with intensity values scaled to the range [0,1]. (b) |Gx|, the component of the gradient in the x-direction, obtained using the Sobel kernel to filter the image. (c) |Gy|, the component in the y-direction. (d) The gradient image, |Gx| + |Gy|.

∇f ≈ |Gx| + |Gy|
Figure: Same sequence as previous, but with the original image smoothed with a 5×5 averaging kernel prior to edge detection.
Diagonal Edge Detection
Second Order derivative (Laplacian Operator)
The Laplacian of a 2D function f(x,y) is defined as

∇²f = ∂²f/∂x² + ∂²f/∂y²

∂²f/∂x² = f(x+1, y) + f(x−1, y) − 2f(x, y)

∂²f/∂y² = f(x, y+1) + f(x, y−1) − 2f(x, y)

∇²f = f(x+1, y) + f(x−1, y) + f(x, y+1) + f(x, y−1) − 4f(x, y)

Two forms are used in practice:
 The Laplacian is a linear operator.

 The discrete form of the Laplacian of f(x,y), taking the 4-neighbours into account, is obtained by summing the discrete forms of the partial derivatives along the x- and y-directions. For a 3×3 region with intensities z1 … z9 (z5 at the center):

∇²f = 4z5 − (z2 + z4 + z6 + z8)

or, taking all the 8-neighbours into account, by also summing the discrete partial derivatives along the diagonals:

∇²f = 8z5 − (z1 + z2 + z3 + z4 + z6 + z7 + z8 + z9)

The corresponding 3×3 masks are:
 The Laplacian generally is not used in its original form for edge detection for the following reasons: (i) the second-order derivative is unacceptably sensitive to noise, and (ii) the magnitude of the Laplacian produces double edges.

 The Laplacian is therefore often combined with a Gaussian smoother given by the function:

h(r) = −e^(−r² / 2σ²)

 The Laplacian of h is given by

∇²h(r) = −[(r² − σ²) / σ⁴] e^(−r² / 2σ²)

which is called the Laplacian of Gaussian (LoG).

 The Laplacian of a Gaussian is sometimes called the Mexican hat function. It can also be computed by smoothing the image with the Gaussian smoothing mask, followed by application of the Laplacian mask.
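A brief sketch of the LoG idea using SciPy's gaussian_laplace (which performs the Gaussian smoothing and the Laplacian in one pass); the sigma and the simple magnitude threshold below are illustrative, since a full implementation would look for zero crossings:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def log_response(image, sigma=2.0):
    """Laplacian-of-Gaussian response: smooth with a Gaussian of the given
    sigma and apply the Laplacian in a single filtering pass."""
    return gaussian_laplace(image.astype(float), sigma=sigma)

def log_edges(image, sigma=2.0, frac=0.05):
    # Demonstration threshold on |LoG|; in practice, edges are usually
    # taken at the zero crossings of the response instead.
    r = log_response(image, sigma)
    return np.abs(r) > frac * np.abs(r).max()
```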
FIGURE: (a)–(g) Sobel gradient and related results (image panels on the slides).
Sobel Edge Detector
 The Sobel edge detector is one of the best-known edge detectors.
 With the Sobel edge detector, the image is processed in the X and Y directions respectively.
 This results in the formation of a new image, which is the sum of the X and Y edges of the image.
 This approach works by calculating the gradient of the image intensity at every pixel within the image.

 The Sobel filter has two kernels (3×3 matrices):

 One of them corresponds to the x (horizontal) direction and the other to the y (vertical) direction.
 These two kernels are convolved with the image under process, from which the edge points are calculated.
 The kernel values shown are fixed for the Sobel filter and cannot be altered.
 The Gaussian filter plays a vital role in the entire process.
 The fundamental idea behind the Gaussian filter is that the center has more weight than the rest.
 The general approach for detecting the edges is with the first-order or second-order derivatives as shown.
 The next step is to convolve Gx (Sobel X) and Gy (Sobel Y) over the input image, which enables us to calculate the value for one pixel at a time.
 Then a shift happens and the move is made towards the right, i.e., to the next column, until the column end.

 A similar process is followed for row shifting from top to bottom. Remember: for columns it is left-to-right hovering, whereas for rows it is top-to-bottom hovering with Gx and Gy.
Convolution

 The matrix shown on the slide is a 5×4 image, now being convolved with the Gx (Sobel X) operator. In the 5×4 matrix, we take the first 3×3 submatrix, in which the center value (i) is computed.
 On convolution, the resultant matrix is obtained.
 The Sobel-X values are convolved with the original image matrix values. The resultant values are:

Gx = 1·a + 0·b + (−1)·c + 2·d + (−2)·e + 1·f + 0·g + (−1)·h
   = 1·100 + 0·100 + (−1)·50 + 2·100 + (−2)·50 + 1·100 + 0·100 + (−1)·50
   = 200
 Going by the same calculations we did for Gx, Gy can be computed:

Gy = (−1)·a + (−2)·b + (−1)·c + 0·d + 0·e + 1·f + 2·g + 1·h
   = (−1)·100 + (−2)·100 + (−1)·50 + 0·100 + 0·50 + 1·100 + 2·100 + 1·50
   = 0

 The above calculations help in visualizing the edges indirectly. Along the X-axis there are changes in the cells, but along the Y-axis there are no changes, hence the result Gy was 0.

 The resulting gradient approximation can be calculated with

G = √(Gx² + Gy²)

 G is compared against a threshold to decide whether the point in question is an edge or not.
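The worked example can be checked with a few lines of NumPy; the patch below assumes the a–h neighbourhood has columns 100, 100, 50 (consistent with the sums above), and the kernel sign convention matches the slide:

```python
import numpy as np

# 3x3 patch consistent with the worked example (a..h around the centre).
patch = np.array([[100, 100, 50],
                  [100, 100, 50],
                  [100, 100, 50]], dtype=float)

Kx = np.array([[ 1, 0, -1], [ 2, 0, -2], [ 1, 0, -1]], dtype=float)
Ky = np.array([[-1, -2, -1], [ 0, 0,  0], [ 1, 2,  1]], dtype=float)

Gx = (patch * Kx).sum()    # -> 200.0: intensity changes along x
Gy = (patch * Ky).sum()    # -> 0.0: no change along y
G = np.hypot(Gx, Gy)       # -> 200.0, compared against the threshold
```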
Figure: Input image, gradient magnitude, and detected edges.
Prewitt Edge Detector
 The Prewitt edge detector is similar to the Sobel edge detector, but with a minor change: the Prewitt operator gives values that are symmetric around the center, whereas the Sobel operator gives extra weight to the points lying close to (x,y).

 The Prewitt x and y operator values are shown on the slide.
Figure: Input image, Prewitt gradient, and detected edges.
Laplacian Edge Detection
 The Laplacian edge detector uses only one kernel, where the Sobel edge detector needs two. The Laplacian edge detector calculates the second-order derivative in a single pass.

 The following kernels are used:

 One kernel focuses on the horizontal and vertical directions alone.

 The other kernel focuses on all the directions, including the diagonals.
 There are, however, some disadvantages to this approach:
- It produces double edges around intensity transitions.
- It is very highly sensitive to noise.
Canny Edge Detection
 This is one of the most important and complex edge detectors. Canny is not like other traditional edge detectors; it is not just masking or hovering over the input image matrix.

 Canny edge detection is a multi-stage algorithm to detect a wide range of edges in images.

 It was developed by John F. Canny in 1986 and is known for its optimal edge detection capabilities.

 The Canny technique is not just a plain edge detection technique; it has an additional feature: it also suppresses noise while detecting the edges flawlessly.
Steps to follow for Canny Edge Detection:
1. Conversion to Gray Scale
 Let us take a sample image and proceed with the conversion: the input RGB image is converted to a grayscale image.

2. Noise Reduction using Gaussian Blurring
 This operator helps tremendously in removing the noise in the input image. The noise-removed image enables further processing to be smooth and flawless. The σ value has to be set appropriately for better results.
3. Intensity Gradient Calculation
 A Sobel filter is used in this step. Recall what an edge is: a sudden intensity change between neighbouring pixels.

 The Sobel operator is applied over the input image. The resultant Sobel-filtered image is referred to as the gradient magnitude of the image.
 We preferred the Sobel operator, and it is the general approach, but it is not mandatory: any gradient operator can be used, and the result should be the gradient magnitude of the image.

 The resulting gradient approximation can be calculated with

G = √(Gx² + Gy²)

 G is compared against a threshold to decide whether the point in question is an edge or not.

 The edge direction is given by

θ = tan⁻¹(Gy / Gx)

4. Non-Maximum Suppression
 This is the next step in the sequence. The gradient magnitude operators discussed in the previous stage normally produce thick edges, but the final image is expected to have thin edges.

 Non-maximum suppression derives thin edges from thicker ones through the following steps:
 We already have the edge direction available. The subsequent step is to relate the identified edge direction to a DIRECTION that can be sketched in the image, i.e., ideally a prediction of how the edges run.
 An example is always handy; we take a 3×3 matrix as a reference for the scenario being discussed.
The possible directions of movement could be:
- North to South
- East to West
- Both diagonals
 The center cell is the region of interest for us; this is an important point.

 There can be only 4 possible directions for any pixel:
- 0 degrees - 45 degrees - 90 degrees - 135 degrees

 Hence the edge has to be oriented to one of these four directions.

 This is a kind of approximation: if the orientation angle is observed to be 5 degrees, it is taken as 0 degrees; similarly, if it is 43 degrees, it is made 45 degrees.

 For ease of understanding, a semicircle with color shading is drawn, representing 180 degrees (the actual scenario covers 360 degrees).
 Any edge falling in the yellow range is set to 0 degrees (i.e., 0 to 22.5 degrees and 157.5 to 180 degrees are set to 0 degrees).
 Any edge falling in the green range is set to 45 degrees (i.e., 22.5 to 67.5 degrees is set to 45 degrees).
 Any edge falling in the blue range is set to 90 degrees (i.e., 67.5 to 112.5 degrees is set to 90 degrees).
 Any edge falling in the red range is set to 135 degrees (i.e., 112.5 to 157.5 degrees is set to 135 degrees).

 After this process, the direction of each edge is mapped to one of the 4 directions mentioned. The input image now looks like the one presented below, where the directions of the edges are appropriately mapped.
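These ranges translate directly into a small quantization routine; a sketch in NumPy (the function name is illustrative):

```python
import numpy as np

def quantize_direction(alpha_deg):
    """Map gradient directions (degrees) onto {0, 45, 90, 135} using the
    colour ranges described above; angles are first folded into [0, 180)."""
    a = np.mod(alpha_deg, 180.0)
    q = np.zeros_like(a)
    q[(a >= 22.5) & (a < 67.5)] = 45
    q[(a >= 67.5) & (a < 112.5)] = 90
    q[(a >= 112.5) & (a < 157.5)] = 135
    # remaining angles (0-22.5 and 157.5-180) stay mapped to 0 degrees
    return q
```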

 Now comes the 4th step in the process.
 The edge directions are all determined, and non-maximum suppression is applied.
 Non-maximum suppression, as the name suggests, suppresses to zero those pixels that cannot be considered part of an edge. This enables the system to generate a thin line in the output image, as shown below.
 These results are obtained before thresholding; as expected, the next stage performs thresholding and smoothing.
5. Thresholding – A must-do process
 As one can see from the previous stage's results, non-maximum suppression has not provided excellent results.

 There is still some noise. The image even raises the concern that some of the edges shown may not really be edges, and some edges could be missed in the process. Hence, there has to be a process to address this challenge: thresholding.
 We go with double thresholding; in this process we set two thresholds, one high and one low.

 Assume the high threshold value is 0.8: any pixel with a value above 0.8 is treated as a strong edge.
- The lower threshold can be 0.2: any pixel below this value is not an edge at all, so set all such pixels to 0.
 Now comes the next question: what about the values in between?
- They may or may not be edges. They are referred to as weak edges. There has to be a process to determine which of the weak edges are actual edges, so as not to miss them.

6. Edge Tracking

 It is important now to understand which of the weaker edges are actual edges.
 A simple approach is followed: weak edges connected to strong edges are called strong/actual edges and retained. Weak edges not connected to strong ones are removed.
7. The Final Cleansing
 All the remaining weak edges can be removed, and the process is complete. Once this process is done, we get the final output image as the result.
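For reference, the whole pipeline is available in OpenCV, where cv2.Canny performs steps 3–7 internally; the file names, kernel size, sigma and the thresholds 50/150 below are illustrative choices, not values from the slides:

```python
import cv2

img = cv2.imread("input.jpg")                    # hypothetical input file
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)     # step 1: grayscale
blurred = cv2.GaussianBlur(gray, (5, 5), 1.4)    # step 2: Gaussian blur
# Steps 3-7 (gradient, non-maximum suppression, double thresholding and
# edge tracking by hysteresis) happen inside cv2.Canny.
edges = cv2.Canny(blurred, 50, 150)              # low/high thresholds
cv2.imwrite("edges.jpg", edges)
```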
 Sets of pixels from edge-detection algorithms seldom define a boundary completely, because of noise, breaks in the boundary, etc.

 Therefore, edge-detection algorithms are typically followed by linking and other detection procedures, designed to assemble edge pixels into meaningful boundaries.

 Edge linking means joining the edges.

 An edge-detection algorithm (Roberts, Sobel, Prewitt, LoG, etc.) enhances the edges. When implemented, there are normally breaks in the lines. For this reason, these algorithms are generally followed by linking procedures that assemble edge pixels into meaningful edges.

Input Image → Gradient Operation → Edge Linking → A good output image
 There are two basic approaches to edge linking:
(1) Local Processing: the simplest approach, linking pixels in a small neighborhood.
(2) Global Processing via the Hough Transform: here we attempt to link edge pixels that lie on specified curves. The Hough transform is designed to detect lines, using the parametric representation of a line.
Local Processing

 Local processing is a simple method for edge linking.

 Analyze the characteristics of pixels in a small neighborhood Sxy (say, 3×3 or 5×5) about every edge pixel (x, y) in an image that has undergone edge detection.

 All points that share common properties are linked together. These properties are:
 Strength/magnitude of the gradient
 Direction of the gradient

 Adjacent edge points with similar magnitude and direction are linked.
The magnitude, M(x, y), of the gradient vector at a point (x, y) is given by its Euclidean vector norm:

M(x, y) = ‖∇f‖ = √(Gx² + Gy²)

The direction of the gradient vector at a point (x, y) is given by:

α(x, y) = tan⁻¹(Gy / Gx)

 Pixels (s, t) and (x, y) are similar and linked if:

|M(s, t) − M(x, y)| ≤ E, where E is a positive magnitude threshold
|α(s, t) − α(x, y)| ≤ A, where A is a positive angle threshold

 Two neighboring pixels are linked if both the magnitude and direction criteria are satisfied.
 Linked points are assigned a different gray level.
 Algorithm Steps:
1. Compute ∇f, M(x, y) and α(x, y).

2. Form a binary image, g(x, y), whose value at any pair of coordinates (x, y) is given by:

g(x, y) = 1 if M(x, y) > TM and α(x, y) ∈ [A − TA, A + TA]; 0 otherwise

where TM is a threshold, A is a specific angle direction, and ±TA defines a "band" of acceptable directions about A.

3. Scan the rows of g and fill all gaps in each row that do not exceed a specified length, K.

4. To detect gaps in any other direction θ, rotate g by this angle and apply the horizontal scanning procedure of step 3. Rotate the result back by −θ.
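A compact sketch of steps 1–3, assuming SciPy's Sobel filters for the gradient; the function name and the gap-filling loop are illustrative (step 4 would repeat this after rotating g):

```python
import numpy as np
from scipy.ndimage import sobel

def link_edges(image, TM, A_deg, TA_deg, K):
    """Local-processing edge linking: threshold on magnitude and on a band
    of directions about A, then fill horizontal gaps of at most K pixels."""
    f = image.astype(float)
    Gx, Gy = sobel(f, axis=1), sobel(f, axis=0)
    M = np.hypot(Gx, Gy)
    alpha = np.degrees(np.arctan2(Gy, Gx))
    g = ((M > TM) & (np.abs(alpha - A_deg) <= TA_deg)).astype(np.uint8)
    for row in g:                        # step 3: fill short gaps per row
        on = np.flatnonzero(row)
        for a, b in zip(on[:-1], on[1:]):
            if 1 < b - a <= K + 1:       # gap of (b - a - 1) <= K zeros
                row[a:b] = 1
    return g
```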
In this example, the license plate candidate can be found after the edge-linking process.
Global Processing via the Hough Transform
 Points are linked by determining whether they lie on a curve of specified shape.

 Motivation:
 Given a point (xi, yi), many lines pass through this point as yi = a·xi + b, with different a and b.
 Find all lines determined by every pair of points, and the subsets of points that are close to particular lines.

FIGURE: (a) xy-plane. (b) Parameter space.
Hough Transform

 The Hough transform is a process that converts the xy-plane to an ab-plane (parameter space).

 Considering b = −a·xi + yi in the ab-plane:

 A point (xi, yi) in the image space is mapped to many points (a, b) in the parameter space, all of which lie on a line.
 (xj, yj) is likewise mapped to the points on the line b = −a·xj + yj. The lines b = −a·xi + yi and b = −a·xj + yj intersect in the parameter space at (a′, b′), the parameters of the line through both image points.
The Procedure of the Hough Transform
Step 1:
Subdivide the ab-plane into accumulator cells. Let A(i, j) be the cell at (i, j), where amin ≤ ai ≤ amax and bmin ≤ bj ≤ bmax, initialized to A(i, j) = 0.

Step 2:
For every edge point (xk, yk), compute b = −xk·ap + yk for each allowed value ap.

Step 3:
Round b off to the nearest allowed value bq, and let A(p, q) = A(p, q) + 1.

 Refer to our class notes for more details on this topic.
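A sketch of these three steps in NumPy, using the slope-intercept (a, b) parameterization of the slides; in practice the rho-theta form is preferred because the slope a is unbounded for vertical lines. The names and the uniform cell-spacing assumption are illustrative:

```python
import numpy as np

def hough_accumulate(points, a_vals, b_vals):
    """Accumulate votes A(p, q): each edge point (xk, yk) votes along the
    line b = -xk * a + yk in the (a, b) parameter plane.
    b_vals is assumed to be uniformly spaced."""
    A = np.zeros((len(a_vals), len(b_vals)), dtype=int)   # step 1: cells
    b_min, b_step = b_vals[0], b_vals[1] - b_vals[0]
    for xk, yk in points:
        for p, a in enumerate(a_vals):                    # step 2
            b = -xk * a + yk
            q = int(round((b - b_min) / b_step))          # step 3: round
            if 0 <= q < len(b_vals):
                A[p, q] += 1
    return A   # peaks in A correspond to lines through many edge points
```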
 Suppose that an image f(x,y) is composed of light objects on a dark background, and the figure shows the histogram of the image.

 Then the objects can be extracted by comparing pixel values with a threshold T.

 One way to extract the objects from the background is to select a threshold T that separates object from background.
 Any point (x,y) for which f(x,y) > T is called an object point; otherwise the point is called a background point.

 When T is a constant applicable over an entire image, the above process is called global thresholding.

 When the value of T changes over an image, the process is referred to as variable thresholding; it is also sometimes termed local or regional thresholding.

 In that case, the value of T at any point (x,y) in an image depends on properties of a neighborhood of (x,y).

 If T depends on the spatial coordinates (x,y) themselves, then variable thresholding is often referred to as dynamic or adaptive thresholding.
Multilevel Thresholding
 It is also possible to extract objects that have a specific intensity range using multiple thresholds.

Figure: Image with a dark background and two light objects.

 A point (x,y) belongs to one object class if T1 < f(x,y) ≤ T2, to the other object class if f(x,y) > T2, and to the background if f(x,y) ≤ T1.
 Segmentation problems requiring multiple thresholds are best solved using region-growing methods.

 Thresholding can be viewed as

T = T[x, y, p(x,y), f(x,y)]

where f(x,y) is the gray level at (x,y) and p(x,y) denotes some local property of this point, for example the average gray level of a neighborhood.
 A thresholded image g(x,y) is defined as

g(x, y) = 1 if f(x, y) > T; 0 if f(x, y) ≤ T

where 1 denotes object and 0 denotes background.

When T = T[f(x,y)], the threshold is global.

When T = T[p(x,y), f(x,y)], the threshold is local.

When T = T[x, y, p(x,y), f(x,y)], the threshold is dynamic or adaptive.
Role of Noise in Image Thresholding

Role of Illumination in Image Thresholding

f(x, y) = i(x, y) · r(x, y)

 Non-uniform illumination may change the histogram in a way that makes it impossible to segment the image using a single global threshold.

z(x, y) = ln f(x, y) = ln i(x, y) + ln r(x, y) = i′(x, y) + r′(x, y)

Solution:

g(x, y) = k · i(x, y)

- Then, for any image f(x, y) = i(x, y) · r(x, y), divide by g(x, y). This yields:

f(x, y) / g(x, y) = [i(x, y) · r(x, y)] / [k · i(x, y)]

h(x, y) = r(x, y) / k

If r(x,y) can be segmented using a single threshold T, then h(x,y) can also be segmented using a single threshold of value T/k.
Basic Global Thresholding
Based on visual inspection of the histogram:

1. Select an initial estimate for T.

2. Segment the image using T. This produces two groups of pixels: G1, consisting of all pixels with gray-level values > T, and G2, consisting of pixels with gray-level values ≤ T.
3. Compute the average gray-level values μ1 and μ2 for the pixels in regions G1 and G2.
4. Compute a new threshold value:

Tnew = (μ1 + μ2) / 2

5. If |Tnew − T| > z0 (a predefined tolerance z0), set T = Tnew and go to step 2; else stop.

Note the clear valley of the histogram and the segmentation between object and background.

Try the global thresholding concept on the following image:

5 3 9
2 1 7
8 4 2
Basic Global Thresholding – Remarks
 Works well in situations where there is a reasonably clear valley between the modes of the histogram related to objects and background.
 z0 is used to control the number of iterations.

 The initial threshold must be chosen greater than the minimum and less than the maximum intensity level in the image.

 The average intensity of the image is a good initial choice for T.

 Single-value thresholding only works for bimodal histograms.

Be careful: if you get the threshold wrong, the results can be disastrous.
Local Thresholding

 In images where background intensities dominate, finding a clear threshold value for segmentation is challenging, because the object's pixel histogram is often negligible compared to the background's.
 To solve this, consider only pixels near the object's boundary, or between the object and background. By plotting a histogram of these pixels, a bimodal distribution can be observed.
 Focusing on boundary pixels offers two advantages:
1. Background and object pixel probabilities are nearly equal.
2. Object and background regions have similar areas, making the histogram symmetrical.

 With a symmetrical histogram, thresholding becomes straightforward. However, identifying the object boundary is crucial for this approach.

 To identify the object boundary, the concepts of gradient and Laplacian are used. The gradient helps determine the boundary, while the Laplacian (second derivative) indicates whether a point is on the darker or brighter side of the edge.

 The gradient reveals whether a pixel is on an edge or not.
 The Laplacian indicates the transition of the gray level at the edge.

 In this case, a positive Laplacian value indicates the pixel belongs to the darker object, while a negative value indicates it belongs to the brighter background. The gradient provides edge information, representing the maximum rate of change in intensity (as shown in the figure).
If we assume an image in which:
 The object is dark.
 The background is white.

 If the gradient value is greater than or equal to some threshold T, we assume that this point is an edge point, since near an edge the gradient value is high.
 If the gradient value is less than the threshold T, we assume that this point is not an edge point, nor even within a region near the edge.
 The Laplacian is positive for object pixels and negative for background pixels.
 Local thresholding is a technique used in image segmentation to separate objects from the background by applying a threshold value that varies locally across the image.

 The fundamental concept of local thresholding is to calculate a threshold value for each pixel, or for a small region of pixels, based on the local characteristics of the image.

Key Concepts:

1. Local Window: a small window or neighborhood of pixels is considered for calculating the local threshold value. The size of the window can vary, but it is typically small (e.g., 3×3, 5×5, or 7×7 pixels).
2. Local Statistics: the local threshold value is calculated based on local statistics, such as the mean, median, or standard deviation of the intensity values within the local window.

3. Threshold Calculation: the local threshold value is calculated using a formula based on the local statistics. For example, the threshold might be set to the mean or median intensity value within the local window.

4. Pixel-wise Thresholding: each pixel is assigned a threshold value based on its local window and statistics. If the pixel's intensity value is above the threshold, it is classified as an object pixel; otherwise, it is classified as a background pixel.
Numerical Example:

Suppose we have a 5×5 image with the following intensity values:

10 20 30 40 50
20 30 40 50 60
30 40 50 60 70
40 50 60 70 80
50 60 70 80 90

We want to apply local thresholding using a 3×3 window, calculating the threshold value as the mean intensity value within the window.

 For the top-left pixel (10), the 3×3 window is:

10 20 30
20 30 40
30 40 50
The mean intensity value is (10+20+30+20+30+40+30+40+50)/9 = 30, so the threshold value for this pixel is 30. Since the pixel's intensity value (10) is below the threshold, it is classified as a background pixel.

 For the center pixel (50), the 3×3 window is:

30 40 50
40 50 60
50 60 70

The mean intensity value is (30+40+50+40+50+60+50+60+70)/9 = 50, so the threshold value for this pixel is 50. Since the pixel's intensity value (50) is equal to the threshold, it is classified as an object pixel.

By applying local thresholding to each pixel, we can segment the image into objects and background.
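The same computation can be vectorized with a uniform (mean) filter; note that SciPy handles the image borders by reflection, so border pixels may differ slightly from the hand-worked example:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_mean_threshold(image, size=3):
    """Local thresholding with the local mean as the per-pixel threshold;
    pixels at or above their local mean are classified as object (1)."""
    f = image.astype(float)
    T = uniform_filter(f, size=size)      # local mean at every pixel
    return (f >= T).astype(np.uint8)

img = np.array([[10, 20, 30, 40, 50],
                [20, 30, 40, 50, 60],
                [30, 40, 50, 60, 70],
                [40, 50, 60, 70, 80],
                [50, 60, 70, 80, 90]])
print(local_mean_threshold(img))          # centre pixel (50 >= 50) -> 1
```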
Adaptive (or Dynamic) Thresholding
 Global thresholding often fails in the case of uneven illumination.
 The solution is to divide the image into sub-images and determine a T for each sub-image. Since the threshold used for each pixel depends on its sub-image, this type of thresholding is adaptive.

FIGURE: (a) Original image. (b) Result of global thresholding. (c) Image subdivided into individual sub-images. (d) Result of adaptive thresholding.

Figure: Zoomed version of the portions shown in the red square boxes in the previous slide. How do we solve this problem?
Answer: Subdivision

 Further subdivision can improve the quality of adaptive thresholding.

 But this comes at the cost of additional computational complexity and time.
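OpenCV ships an adaptive-threshold routine that implements this idea per pixel rather than per sub-image; the file names, block size 11 and offset C = 2 below are illustrative:

```python
import cv2

gray = cv2.imread("uneven_illumination.png", cv2.IMREAD_GRAYSCALE)
# Threshold = mean of the 11x11 neighbourhood minus C (= 2), per pixel.
binary = cv2.adaptiveThreshold(gray, 255,
                               cv2.ADAPTIVE_THRESH_MEAN_C,
                               cv2.THRESH_BINARY, 11, 2)
cv2.imwrite("adaptive.png", binary)
```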
Figure: Comparison of the thresholding algorithms. (a) Original image. (b) Result of the adaptive thresholding algorithm. (c) Result of the global thresholding algorithm.

Figure: Original image; single threshold; multiple thresholds.
 Edge detection and thresholding-based detection sometimes do not give good segmentation results.
 Region-based segmentation algorithms find regions directly, instead of the boundaries dividing the regions.

 Region-based segmentation relies on the connectivity of similar pixels in a region.

 Region-based segmentation is a technique in image processing where an image is divided into regions or areas of similar pixels, based on characteristics such as intensity, color, or texture.

 There are two main approaches to region-based segmentation:
 Region Growing
 Region Splitting & Merging
- Region Splitting - Region Merging - Region Splitting & Merging
Basic Formulation:
 Segmentation partitions image R into subregions R1, R2, R3, …, Rn such that:

(a) ∪ᵢ₌₁ⁿ Ri = R
(b) Ri is a connected region, i = 1, 2, 3, …, n
(c) Ri ∩ Rj = ∅ for all i and j, i ≠ j
(d) Q(Ri) = TRUE for i = 1, 2, 3, …, n
(e) Q(Ri ∪ Rj) = FALSE for adjacent i ≠ j

Q(Ri) is a logical predicate property defined over the points in set Ri.

Example: Q(Ri) = TRUE if all pixels in Ri have the same gray level.
Region Growing

 Region-growing algorithms work on the principle of similarity.

 A region is coherent if all the pixels of that region are homogeneous with respect to some characteristics, such as color, intensity, texture, or other statistical properties.

 Thus the idea is to pick a pixel inside a region of interest as a starting point (also known as a seed point) and allow it to grow.

 The seed point is compared with its neighbors; if the properties match, they are merged together.

 This process is repeated until the regions converge to an extent that no further merging is possible.
Region Growing Algorithm
 It is a process of grouping pixels or subregions to obtain a bigger region present in an image. Let S(x,y) denote the seed-point image.

 Find all connected components in S(x,y) and reduce each connected component to one pixel. Label all such pixels 1; all other pixels in S are labeled 0.
 Form an image fQ such that, at each point (x,y), fQ(x,y) = 1 if the input image satisfies a given predicate Q at those coordinates, and fQ(x,y) = 0 otherwise.
 Let g be the image formed by appending to each seed point in S all the 1-valued points in fQ that are 4- or 8-connected to that seed point.
 Label each connected component in g with a different region label (e.g., integers or letters). This is the segmented image.
Q1: Apply region growing on the following image with the initial point at (2,2) and threshold value 2. Use 4-connectivity.

(rows and columns indexed 0–3)
0 1 2 0
2 5 6 1
1 4 7 3
0 2 5 1
Q2: Apply region growing on the following image with seed value 6 and threshold value 3.

5 6 6 7 6 7 6 6
6 7 6 7 5 5 4 7
6 6 4 4 3 2 5 6
5 4 5 4 2 3 4 6
0 3 2 3 3 2 4 7
0 0 0 0 2 2 5 6
1 1 0 1 0 3 4 4
1 0 1 0 2 3 5 4
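A sketch of 4-connected region growing for these exercises; here the similarity test compares each candidate with the seed's intensity (some formulations compare with the neighbouring pixel or with the running region mean instead):

```python
import numpy as np
from collections import deque

def region_grow(image, seed, thresh):
    """Grow a 4-connected region from `seed`, absorbing neighbours whose
    intensity differs from the seed value by at most `thresh`."""
    f = image.astype(float)
    seed_val = f[seed]
    grown = np.zeros(f.shape, dtype=np.uint8)
    grown[seed] = 1
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < f.shape[0] and 0 <= nx < f.shape[1]
                    and not grown[ny, nx]
                    and abs(f[ny, nx] - seed_val) <= thresh):
                grown[ny, nx] = 1
                queue.append((ny, nx))
    return grown

img = np.array([[0, 1, 2, 0], [2, 5, 6, 1], [1, 4, 7, 3], [0, 2, 5, 1]])
print(region_grow(img, (2, 2), 2))   # grows over the values {7, 6, 5, 5}
```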
Advantages and Disadvantages of Region Growing

 Advantages
 Region-growing methods can correctly separate regions that have the same properties we define.

 Region-growing methods can provide original images that have clear edges with good segmentation results.

 The concept is simple: we only need a small number of seed points to represent the property we want, then grow the region.
 Disadvantages
 Computationally expensive.

 It is a local method with no global view of the problem.

 Sensitive to noise.

 Unless the image has had a threshold function applied to it, a continuous path of points related by color may exist which connects any two points in the image.
Region Splitting

 The entire image is initially assumed to be a single region. A homogeneity (similarity) test is then applied, where pixels that are similar are grouped together. If the conditions are not met, the region is split into four quadrants; otherwise the region is left as it is.

 Splitting and subdivision continue until some stopping criterion is fulfilled, usually at a stage where no further splitting is possible.

 This process is repeated for each quadrant until all regions meet the required homogeneity criteria. If the regions become too small, the division process is stopped.

 In terms of graph theory, we call each region a node.

 This technique has a convenient representation in the form of a quadtree structure.

 Quadtree: a tree in which each node has exactly four descendants.
Region Merging

 Region merging is the opposite of region splitting.

 Here we start at the pixel level and consider each pixel as a homogeneous region.

 At any level of merging, we check whether four adjacent regions satisfy the homogeneity property. If yes, they are merged to form a bigger region; otherwise the regions are left as they are.
 This is repeated until no further regions require merging.
Region Splitting and Merging

 Splitting or merging might not produce good results when applied separately. Better results can be obtained by interleaving merge and split operations.

 The split-and-merge procedure is as follows:

 Start with a large region (possibly the entire image).
 Split into four disjoint quadrants any region Ri for which Q(Ri) = FALSE.
 Merge any adjacent regions Rj and Rk for which Q(Rj ∪ Rk) = TRUE (the quadtree structure may not be preserved).
 Stop when no further merging or splitting is possible.
Q1: Apply splitting and merging on the following image with threshold value equal to 3.

5 6 6 6 7 7 6 6
6 7 6 7 5 5 4 7
6 6 4 4 3 2 5 6
5 4 5 4 2 3 4 6
0 3 2 3 3 2 4 7
0 0 0 0 2 2 5 6
1 1 0 1 0 3 4 4
1 0 1 0 2 3 5 4
Allanki Sanyasi Rao
AMIE; M.Tech; (Ph.D); MISTE; MIETE
Associate Professor & HOD
Dept. of ECE
Low Level Processing
 Input & output are both images.
 Image preprocessing for noise reduction, contrast enhancement, image sharpening.

Mid Level Processing
 Input: image.
 Output: image attributes – edges, contours, recognition of objects, etc.
 Ex: segmentation, edges of an image, identification.

High Level Processing
 Input: attributes extracted from images.
 Image analysis and vision-related cognitive operations.
 Ex: automatic character recognition, missile recognition, computer vision, autonomous navigation.

Segmentation vs Morphological Processing
Introduction
What is Morphology?
 Morphology is generally concerned with the shape and properties of objects. It is used for segmentation and feature extraction.
 The word Morphology commonly denotes a branch of biology that deals with the form and structure of animals and plants.
 The fundamental use of morphological processing is to remove imperfections in the structure of images.
 Binary images may contain numerous imperfections. In particular, the binary regions produced by simple thresholding are distorted by noise and texture.
 Morphological image processing pursues the goal of removing these imperfections.
 Mathematical morphology is a tool for extracting image components such as regions, boundaries, skeletons, etc.
 We can also use morphological techniques for pre- or post-processing images.

Preliminaries
 We will use set theory to formalize the operations in morphological image processing.

 Sets in mathematical morphology represent objects in an image.

 The set of all white pixels in a binary image is a complete morphological description of the object in the image.
 In the figure, we can represent the object by the set B.
 Each member of the set B is a coordinate pair z = (x, y), representing a white pixel.

 The concepts of set reflection and translation are used extensively in morphology.

 Reflection:

B̂ = { w | w = −b, for b ∈ B }

 Translation:

(B)z = { c | c = b + z, for b ∈ B }
Structuring Element (SE)

 An SE is a shape mask used in the basic morphological operations.

 You can think of it as similar to the spatial filter matrices.

 The origin of a structuring element must also be specified; if not specified, assume it is at the center of gravity.

 SEs can be of any shape and size that is digitally representable.

 The number of pixels added to or removed from the objects in an image depends on the size and shape of the structuring element used to process the image.
 The structuring element is positioned at all possible locations in the image and compared with the corresponding neighborhood of pixels.
 Some operations test whether the element "fits" within the neighborhood, while others test whether it "hits" or intersects the neighborhood.

Figure: Probing of an image with a structuring element.
Operators by Graphical Examples

Figure: (a) Two sets A and B. (b) The union of A and B. (c) The intersection of A and B. (d) The complement of A. (e) The difference between A and B.

Logical Operators for Binary Images

Figure: Some logic operations between binary images.
Common Morphological Operations

 Dilation: adds pixels to the boundaries of objects in an image.
 Fills in holes.
 Smoothens object boundaries.
 Adds an extra outer ring of pixels onto the object boundary, i.e., the object becomes slightly larger.
 Dilation expands the connected sets of 1s of a binary image.
 It can be used for:
1. Expanding shapes
2. Filling holes, gaps and gulfs
ASRao 129
 Suppose A and B are sets of pixels. The dilation of A by B is

A ⊕ B = ∪(x∈B) (A)x

 Replace every pixel in A with a copy of B (and vice versa):

 For every pixel x in B,
 translate A by x,
 then take the union of all these translations.

 Mathematically, the dilation of a set A by B, denoted A ⊕ B, is defined as

A ⊕ B = { z | (B̂)z ∩ A ≠ ∅ }
Figure: Before dilation; after dilation.

Figure: Example of dilation using three different rectangular structuring elements.

Structuring Elements for Dilation (Examples)
Figure: Before dilation; after dilation.

Practice problem

Solution
 Erosion reduces the number of pixels on the object boundary: it "shrinks" or "thins" objects in a binary image.
 The number of pixels removed depends on the size of the structuring element.
 Mathematically, the erosion of a set A by B, denoted A ⊖ B, is defined as

A ⊖ B = { z | (B)z ∩ Aᶜ = ∅ }

i.e., the set of all points z such that B, translated by z, is contained in A.

Typical Uses of Erosion
 Removes isolated noisy pixels.
 Smoothens the object boundary.
 Removes the outer layer of object pixels, i.e., the object becomes slightly smaller.
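Both operations are available in SciPy; a small demonstration on a 3×3 square with a 3×3 structuring element (origin at the centre), using toy data:

```python
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

A = np.zeros((7, 7), dtype=bool)
A[2:5, 2:5] = True                   # a 3x3 square of 1s

SE = np.ones((3, 3), dtype=bool)     # 3x3 structuring element

dilated = binary_dilation(A, structure=SE)  # grows to a 5x5 square
eroded = binary_erosion(A, structure=SE)    # only the centre pixel survives
```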
Example

 First, write all the zeros as they are.

 Place the SE at each white pixel and check for a perfect fit.

 When we place the origin at the center (second row, 3rd column element), the intersection is not perfectly matched: there are two zeros there, so we make this center element zero. The same procedure is followed for the remaining pixels.

After the erosion process, the final result is as follows:
Figure: Before erosion; after erosion.
Structuring Element Decomposition
 Structuring element decomposition is a technique used in morphological processing to break down a complex structuring element into simpler components. This is useful for improving the efficiency and flexibility of morphological operations.
 Imagine having a large, complex shape (structuring element) for image processing. Breaking it down into smaller, simpler shapes can achieve the same result. This approach is similar to dividing a large task into smaller, more manageable steps.
Examples:
1. Decomposing a rectangle: a large rectangle can be decomposed into smaller rectangles or even lines.
2. Decomposing a circle: a circle can be decomposed into smaller circles or even lines and arcs.
Figure: (a) Decomposition of structuring element B = B1 ⊕ B2; (b) input and output to decomposition windows B1 and B2.
 Suppose we have a structuring element that is a 5×5 square:

1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1

 We want to decompose this structuring element into smaller components. One possible decomposition is into four 3×3 squares:

1 1 1    1 1 1    1 1 1    1 1 1
1 1 1    1 1 1    1 1 1    1 1 1
1 1 1    1 1 1    1 1 1    1 1 1

 This decomposition can be used to perform morphological operations, such as erosion or dilation, more efficiently.
Benefits:
Improved efficiency: decomposing a structuring element can reduce the number of operations required to perform morphological processing.

Increased flexibility: decomposition allows more complex structuring elements to be used, enabling a wider range of morphological operations.
Combining Dilation and Erosion

 Combining dilation and erosion is a fundamental concept in morphological image processing. It involves applying dilation and erosion operations in a specific order to achieve a desired outcome.

Opening Operation:
An opening operation is erosion followed by dilation. The process involves:
1. Eroding the image with a structuring element, to remove small objects and noise.
2. Dilating the eroded image with the same structuring element, to restore the original shape.
Closing Operation:
A closing operation is dilation followed by erosion. The process involves:
1. Dilating the image with a structuring element, to fill small gaps and holes.
2. Eroding the dilated image with the same structuring element, to restore the original shape.
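A short SciPy demonstration of both combinations; the toy image, with one isolated noise pixel and one small hole, is illustrative:

```python
import numpy as np
from scipy.ndimage import binary_opening, binary_closing

noisy = np.zeros((9, 9), dtype=bool)
noisy[2:7, 2:7] = True               # the main object
noisy[0, 0] = True                   # isolated noise pixel
noisy[4, 4] = False                  # small hole inside the object

SE = np.ones((3, 3), dtype=bool)
opened = binary_opening(noisy, SE)   # erosion then dilation: noise removed
closed = binary_closing(noisy, SE)   # dilation then erosion: hole filled
```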
Closing & Opening Processes

Differences between Dilation/Erosion and Opening/Closing
 Erosion and dilation clean the image but leave objects either smaller or larger than their original size.
 Opening and closing perform the same cleaning function as erosion and dilation, but the object size remains the same.
 Closing is a process in which a dilation operation is performed first, followed by an erosion operation.
Example:

Original Image

Structuring Element

After Dilation

Final Image after Erosion
 Opening is a process in which an erosion operation is performed first, followed by a dilation operation.
Example:

Original Image

Structuring Element

After Erosion

Final Image after Dilation
 In morphology, the hit-or-miss transform is an operation that detects a given pattern in a binary image, using a structuring element containing 1s, 0s and blanks for don't-cares.

Figure: Probing of an image with a structuring element.

 The hit-and-miss algorithm can be used to thin and skeletonize a shape in a binary image.

 Hit-or-miss transform: A ⊛ B = (A ⊖ B1) ∩ (Aᶜ ⊖ B2)

 With B = (B1 (object), B2 (background)), it locates all pixels that match the B1 structure (i.e., a hit) but do not match that of B2 (i.e., a miss).

Output Image (see slide)
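SciPy exposes this transform directly; the example below detects isolated foreground pixels, with B1 the required foreground and B2 the required background (cells omitted from both act as don't-cares):

```python
import numpy as np
from scipy.ndimage import binary_hit_or_miss

A = np.array([[0, 0, 0, 0, 0],
              [0, 1, 0, 0, 0],
              [0, 0, 0, 1, 0],
              [0, 0, 0, 0, 0]], dtype=bool)

B1 = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=bool)  # foreground
B2 = np.array([[1, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=bool)  # background

hits = binary_hit_or_miss(A, structure1=B1, structure2=B2)
print(hits.astype(int))   # 1s exactly at the two isolated pixels
```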
Example: The hit-or-miss transform is an iterative process containing repeated steps to thin the shape by the hit-and-miss method. In each iteration, different structuring elements are used to identify the edge pixels to be removed.
