Laboratory 3. Basic Image Segmentation Techniques
Segmentation subdivides an image into constituent regions or objects. The level of detail to which
the subdivision is carried depends on the problem being solved. Segmentation of nontrivial images is one
of the most difficult tasks in image processing. Segmentation accuracy determines the eventual success or
failure of more complex computer vision algorithms used in various analysis procedures.
The OpenCV function that implements simple thresholding is cv2.threshold, with parameters:
Fundamentals of Image Processing and Computer Vision – Laboratory 3
src is the input array or image (multiple-channel, 8-bit or 32-bit floating point).
thresh is the threshold value.
maxval is the maximum value to use with the THRESH_BINARY and THRESH_BINARY_INV
thresholding types.
type is the thresholding type (cv2.THRESH_BINARY, cv2.THRESH_BINARY_INV,
cv2.THRESH_TRUNC, cv2.THRESH_TOZERO, cv2.THRESH_TOZERO_INV). More information
on the thresholding types can be found in the OpenCV documentation.
dst is the output array or image of the same size and type and the same number of channels as src.
retval is the threshold value that was actually used; it is computed automatically when thresholding types such as Otsu or Triangle are selected.
• Ex. 3.1 Read the images ‘grayShades.jpg’ and ‘grayFlowers.jpg’ as grayscale images. Experiment with
multiple types of thresholding by changing the parameter type and keeping the same threshold value
for each image separately. Display and compare the output images.
• Ex. 3.2 Read the image ‘adeverinta.jpg’ as grayscale and repeat the previous exercise. Comment on
the output.
The OpenCV function for adaptive thresholding is cv2.adaptiveThreshold, with parameters:
src, maxValue, dst, thresholdType – represent the same parameters as in a simple threshold
operation.
adaptiveMethod – method used to compute the threshold value. The 2 possible options are:
cv2.ADAPTIVE_THRESH_MEAN_C (the threshold is the mean of the blockSize x blockSize
neighborhood, minus constant C) or cv2.ADAPTIVE_THRESH_GAUSSIAN_C (the threshold is a
Gaussian-weighted sum of the neighborhood values, minus constant C).
blockSize – size of the pixel neighborhood used to compute the threshold (must be an odd number).
C – constant subtracted from the mean or weighted mean.
• Ex. 3.3 Read the image ‘adeverinta.jpg’ as grayscale and apply adaptive thresholding (both options).
Compare the results with the ones from Ex. 3.2 and justify the outputs. Hint: the image has different
lighting conditions in different areas, and applying a smoothing filter before adaptive thresholding will
reduce the noise.
Figure 1. Original image (left) and its bimodal histogram of the intensity channel (right)
threshOt, dst = cv2.threshold(src, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
• Ex. 3.4 Read the images ‘rose.jpg’ and ‘yellowFl.jpg’ as grayscale and apply Otsu thresholding.
Compare the results with the previous methods. Experiment also with gaussian smoothing before the
Otsu binarization.
3.2.1 Dilation
Dilation signifies ‘expansion’ or becoming larger. White regions are enlarged using a structuring
element moved along the boundary of the white area. An example is presented in Figure 2, where a white
square is dilated using as structuring element a white circle.
Figure 2. Dilation of a white square (left → right) with circle as structuring element
The structuring element can have various shapes, and the results after dilation will differ accordingly. The
OpenCV function that creates a structuring element is cv2.getStructuringElement, and the function
that implements dilation is cv2.dilate.
• Ex. 3.5 Read the image ‘euro.jpg’ and transform it into a binary image using a simple threshold
operation with 212 as threshold value. The coins should be mostly white (with some imperfections – see
Figure 3 – the mask on the left side) on a black background. Try to remove the imperfections by dilation,
using 2 structuring elements: cv2.MORPH_ELLIPSE of size 7x7 and then a smaller kernel of 3x3. Start
with 1 iteration for both kernels, then increase to 2 iterations. An example of the desired output mask is
presented in Figure 3, on the right side. The desired output should mark both coins with filled white circles,
and the two white circles must not be joined together. Comment on the results.
Figure 3. Left: original mask with imperfections. Right: correct mask obtained with morphological operations.
3.2.2 Erosion
The opposite of image dilation is erosion. Erosion means to gradually wear away or destroy (soil
or rock). The effect of erosion is to shrink a shape. In dilation, mass is added to white regions, while in
erosion mass is removed from the boundary of the white region, as if an eraser is moved along the boundary
of the object. An example of erosion is presented in Figure 5.
The OpenCV function that implements erosion is cv2.erode and it has the same set of parameters
as cv2.dilate.
Figure 5. Left: original binary image. Middle: image after erosion. Right: image after dilation.
Small black holes inside white regions can be filled by performing dilation followed by erosion.
This operation is called morphological closing and it is displayed in Figure 6.
Figure 6. Left: original binary mask with 3 small holes. Middle: image after dilation. Right: image after erosion.
In OpenCV, the opening and closing operations are implemented using the function
cv2.morphologyEx.
• Ex. 3.6 Read the image ‘Coins.png’, convert it to grayscale, and also split the color image into its 3
channels. Then decide which of the resulting 4 matrices (grayscale, blue, green, red) can be used to obtain
a binary mask for the coins. Use thresholding and morphological operations (dilation, erosion, opening,
closing) to obtain that binary mask, as illustrated in Figure 6.
The OpenCV function that implements this algorithm is cv2.Canny.
• Ex. 3.7 Write an application to find the edges using Canny detection and test it on the ‘rose.jpg’.
The threshold values will be varied using two trackbars. What is the effect of threshold values? Experiment
with multiple images. Hint: use the function cv2.createTrackbar() and create a callback
function which is called every time the trackbars are moved. A similar example in C++ can be found in
the OpenCV tutorials.
The OpenCV function that detects contours is cv2.findContours, with parameters:
src - input image (8-bit single-channel). Non-zero pixels are treated as 1's and zero pixels remain 0's,
so the image is treated as binary. You can use compare, inRange, threshold, adaptiveThreshold,
Canny, and others to create a binary image out of a grayscale or color one.
contours - Detected contours. Each contour is stored as a vector of points.
hierarchy - Optional output vector containing information about the image topology.
• Ex. 3.8 Starting from the binary mask obtained in Ex.3.6, use contour detection to count the number
of coins present in the image ‘Coins.png’. Use cv2.RETR_LIST as parameter mode, and
cv2.CHAIN_APPROX_SIMPLE as method. Then use the function cv2.drawContours to draw all the
contours detected previously.
Connected component labeling works on binary (or grayscale) images, and different measures of
connectivity are possible (generally 8-connectivity, which considers 8 neighbors for one pixel, but
4-connectivity is also possible). Considering a binary image, the connected component labeling operator
scans the image by moving along a row until it comes to a point p (where p denotes the pixel to be labeled
at any stage in the scanning process) whose value belongs to the foreground set V = {1}. When this is true,
it examines the four neighbors of p which have already been encountered in the scan (i.e. the neighbors
(i) to the left of p, (ii) above it, and (iii and iv) the two upper diagonal ones). Based on this information,
the labeling of p occurs as follows:
- if all 4 neighbors are 0, assign a new label to p, else
- if only 1 neighbor has V={1}, assign its label to p, else
- if more than one of the neighbors have V={1}, assign one of their labels to p and make a note of
the equivalences.
After completing the scan, the equivalent label pairs are sorted into equivalence classes and a unique
label is assigned to each class. As a final step, a second scan is made through the image, during which
each label is replaced by the label assigned to its equivalence class. For display, the labels might be
different gray levels or colors.
Connected component analysis is therefore an algorithm for labeling blobs in a binary image. It can
also be used to count the number of blobs. The binary mask can be provided by any of the previously
studied methods: thresholding, morphological operations, etc.
• Example:
import cv2

im = cv2.imread('cca.jpg', cv2.IMREAD_GRAYSCALE)