Documentation Image Processing Day 1
IMAGE PROCESSING
DAY 1
1. An image is represented as a 2D array (grey) or as a 3D array (coloured, with the
channels stored in BGR order).
2. Each element of the array is called a pixel. In an 8-bit image it takes a value from
0 (black) to 255 (white).
3. To declare an image, we use the Mat class:
a. Mat img(rows, cols, CV_8UC1, Scalar(number)); // for grey
b. Mat img(rows, cols, CV_8UC3, Scalar(number, number, number)); // for coloured
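To make the array picture concrete without needing OpenCV, here is a minimal sketch that
models the two declarations above with nested vectors. The names GreyImage, ColourImage,
makeGrey and makeColour are made up for this illustration and are not part of OpenCV.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for cv::Mat: one byte per grey pixel,
// three bytes per coloured pixel stored in B, G, R order.
using GreyImage = std::vector<std::vector<std::uint8_t>>;
using ColourImage = std::vector<std::vector<std::array<std::uint8_t, 3>>>;

// Plays the role of Mat img(rows, cols, CV_8UC1, Scalar(value)).
GreyImage makeGrey(int rows, int cols, std::uint8_t value) {
    assert(rows >= 0 && cols >= 0);
    return GreyImage(rows, std::vector<std::uint8_t>(cols, value));
}

// Plays the role of Mat img(rows, cols, CV_8UC3, Scalar(b, g, r)).
ColourImage makeColour(int rows, int cols,
                       std::uint8_t b, std::uint8_t g, std::uint8_t r) {
    assert(rows >= 0 && cols >= 0);
    const std::array<std::uint8_t, 3> pixel = {b, g, r};
    return ColourImage(rows, std::vector<std::array<std::uint8_t, 3>>(cols, pixel));
}
```

For example, makeGrey(2, 3, 255) is a 2x3 white image and makeColour(1, 1, 255, 0, 0) is a
single pure blue pixel.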
DAY 2
1. To read an image from the computer, we use the function imread().
Syntax: Mat img_name = imread("path", 0/1);
Here path is the location of the image file on the computer. The second argument is a
flag: 0 loads the image in grayscale and 1 loads it in colour.
2. To save an image, we use the function imwrite().
Syntax: imwrite("path", variable_of_the_image);
Here path is the location where the image will be saved. The second
argument is the Mat object that holds the image.
3. Converting a coloured image to grayscale:
a. Take the average of the 3 colours (R,G,B), i.e. (R+G+B)/3
b. (max(R,G,B)+min(R,G,B))/2
c. 0.21R+0.72G+0.07B
This weighted method is based on studies of how our eyes perceive the respective
colours: green contributes the most to perceived brightness and blue the least.
Code:
https://drive.google.com/file/d/1V3RqZSPOhFYrZ78zl52dbshtLTN9TPNO/view?usp=sharing
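The three formulas can be compared side by side on a single pixel. This is only a sketch:
the function names are made up here, and rounding the weighted sum to the nearest integer
is an assumption (the notes do not say how to round).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Method a: plain average of the three channels.
int greyAverage(int r, int g, int b) {
    assert(r >= 0 && r <= 255 && g >= 0 && g <= 255 && b >= 0 && b <= 255);
    return (r + g + b) / 3;
}

// Method b: midpoint between the brightest and the darkest channel.
int greyLightness(int r, int g, int b) {
    assert(r >= 0 && r <= 255 && g >= 0 && g <= 255 && b >= 0 && b <= 255);
    return (std::max({r, g, b}) + std::min({r, g, b})) / 2;
}

// Method c: weighted sum 0.21R + 0.72G + 0.07B, rounded to the nearest integer.
int greyLuminosity(int r, int g, int b) {
    assert(r >= 0 && r <= 255 && g >= 0 && g <= 255 && b >= 0 && b <= 255);
    return static_cast<int>(std::lround(0.21 * r + 0.72 * g + 0.07 * b));
}
```

A pure red pixel (255, 0, 0) becomes 85 with method a, 127 with method b and 54 with
method c, which shows how differently the three methods treat the same colour.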
4. To convert from grayscale to binary: We use a trackbar to choose a threshold value.
Pixels above the threshold become white and pixels below it become black.
To create a trackbar we use the createTrackbar() function.
Syntax: createTrackbar("name_of_trackbar", "window_name", &var, max_value_of_variable, callback);
Here var is the variable that holds the threshold value.
5. Callback function: the function that is called every time the trackbar is moved. In it
we say what the trackbar should do.
Syntax: void callback(int t, void* c)
Here t is the current trackbar position (the threshold value) and c is an optional
pointer to user data.
Code (Segmentation):
https://drive.google.com/file/d/1Ljjqa8YqnyUsOcX5oOo88SRdGnCYs8tM/view?usp=sharing
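The work done inside the callback is the thresholding itself. A minimal sketch of that
step, on a nested vector instead of a Mat, might look like this. The name toBinary is
made up, and sending a pixel exactly equal to the threshold to black is an assumption.

```cpp
#include <cassert>
#include <vector>

using GreyImage = std::vector<std::vector<int>>;

// Returns a binary copy of the image: pixels strictly above the threshold
// become 255 (white), all others become 0 (black).
GreyImage toBinary(const GreyImage& grey, int threshold) {
    assert(threshold >= 0 && threshold <= 255);
    GreyImage binary = grey;
    for (auto& row : binary) {
        for (int& pixel : row) {
            pixel = (pixel > threshold) ? 255 : 0;
        }
    }
    return binary;
}
```

In the trackbar version, the callback would run this with t as the threshold and then
show the result again, so the binary image updates as the slider moves.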
6. Resizing images:
2x2 <--- 4x4 ---> 8x8
(downscaling) (upscaling)
To upscale, we copy one pixel into multiple pixels.
To downscale, we take the average of a block of pixels and store it in one pixel.
Downscaling involves loss of data. To resize an image by a factor that is not a whole
number (for example 1.5), we first upscale and then downscale (for 1.5: upscale by 3,
then downscale by 2).
Codes:
1. Downscaling
https://drive.google.com/file/d/1-Lk9o8D74PLgI_eZy-U-yiI3vYE2Ejpi/view?usp=sharing
2. Upscaling
https://drive.google.com/file/d/1oG4iT_llPumSYfshK4zwRxLPdYVVugMp/view?usp=sharing
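As a rough sketch of both directions for a factor of 2 (the function names are made up,
and the downscale assumes the image has even dimensions):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using GreyImage = std::vector<std::vector<int>>;

// Upscale by 2: every source pixel is copied into a 2x2 block.
GreyImage upscale2(const GreyImage& src) {
    assert(!src.empty() && !src[0].empty());
    const std::size_t rows = src.size(), cols = src[0].size();
    GreyImage dst(rows * 2, std::vector<int>(cols * 2));
    for (std::size_t i = 0; i < rows * 2; ++i)
        for (std::size_t j = 0; j < cols * 2; ++j)
            dst[i][j] = src[i / 2][j / 2];
    return dst;
}

// Downscale by 2: every 2x2 block is averaged into one pixel.
GreyImage downscale2(const GreyImage& src) {
    assert(!src.empty() && !src[0].empty());
    assert(src.size() % 2 == 0 && src[0].size() % 2 == 0);
    const std::size_t rows = src.size() / 2, cols = src[0].size() / 2;
    GreyImage dst(rows, std::vector<int>(cols));
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            dst[i][j] = (src[2 * i][2 * j] + src[2 * i][2 * j + 1] +
                         src[2 * i + 1][2 * j] + src[2 * i + 1][2 * j + 1]) / 4;
    return dst;
}
```

Upscaling and then downscaling gives back the original image, but downscaling first and
then upscaling does not, which is the data loss mentioned above.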
7. Rotating an image:
We use the axes rotation formula:
| X |   | cosθ  -sinθ |   | x |
| Y | = | sinθ   cosθ | * | y |
that is, X = x cosθ - y sinθ and Y = x sinθ + y cosθ.
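The formula can be tried out on a single coordinate. This sketch rotates a point about
the origin; the name rotatePoint and the choice to pass the angle in degrees are
assumptions made for the illustration.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Rotates the point (x, y) about the origin by the given angle, using
// X = x cos(theta) - y sin(theta) and Y = x sin(theta) + y cos(theta).
std::pair<double, double> rotatePoint(double x, double y, double degrees) {
    assert(std::isfinite(degrees));
    const double pi = std::acos(-1.0);
    const double theta = degrees * pi / 180.0;
    return {x * std::cos(theta) - y * std::sin(theta),
            x * std::sin(theta) + y * std::cos(theta)};
}
```

For example, rotatePoint(1, 0, 90) lands on (0, 1) up to floating point error. To rotate
a whole image, the same formula would be applied to every pixel coordinate, usually
measured from the image centre rather than from the corner.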
DAY 3
Kernel: A kernel is a small matrix (usually 3x3) which is moved over an image to perform
an operation on each pixel and its neighbours.
Padding: Adding an extra layer of pixels around an image so that the kernel does not
overflow outside the image.
Types:
a. Zero Padding: Adding a layer of black pixels (value 0) around the image as a
frame.
[Figure: zero padding]
b. Reflection Padding: We reflect the second-last layer of each side with
respect to the last layer to add a layer of padding.
[Figure: reflection padding]
*If we do not want to do padding, we can use the isValid function instead, as illustrated below.
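Both padding types can be sketched for a single extra layer as follows. The function
names are made up, and the reflection follows the description above: the new outer layer
is a copy of the second-last layer, so the last layer itself is not repeated.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using GreyImage = std::vector<std::vector<int>>;

// Zero padding: surround the image with a one-pixel frame of black (0).
GreyImage padZero(const GreyImage& src) {
    assert(!src.empty() && !src[0].empty());
    const std::size_t rows = src.size(), cols = src[0].size();
    GreyImage dst(rows + 2, std::vector<int>(cols + 2, 0));
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            dst[i + 1][j + 1] = src[i][j];
    return dst;
}

// Reflection padding: index -1 maps to source index 1, and index n maps to n - 2.
GreyImage padReflect(const GreyImage& src) {
    assert(src.size() >= 2 && src[0].size() >= 2);
    const int rows = static_cast<int>(src.size());
    const int cols = static_cast<int>(src[0].size());
    auto mirror = [](int k, int n) {
        if (k < 0) return 1;       // before the first layer: use the second layer
        if (k >= n) return n - 2;  // past the last layer: use the second-last layer
        return k;
    };
    GreyImage dst(rows + 2, std::vector<int>(cols + 2));
    for (int i = -1; i <= rows; ++i)
        for (int j = -1; j <= cols; ++j)
            dst[i + 1][j + 1] = src[mirror(i, rows)][mirror(j, cols)];
    return dst;
}
```

After padding, a 3x3 kernel can be centred on every original pixel without any special
handling at the border.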
Blurs:
a. Mean Blur: We move a kernel over the image, compute the average of all the pixels
under the kernel, and store it in the centre pixel of the kernel.
Code (Mean Blur):
https://drive.google.com/file/d/1_egVXkIn_tqOWbKab19Cpv5BlyjcK-FA/view?usp=sharing
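A minimal sketch of a 3x3 mean blur without padding is shown here. The isValid below is
our own guess at what such a helper looks like (a simple bounds check); the linked code
may do it differently. Pixels outside the image are skipped and the average is taken
over the pixels that remain.

```cpp
#include <cassert>
#include <vector>

using GreyImage = std::vector<std::vector<int>>;

// Assumed helper: true when (r, c) lies inside the image.
bool isValid(const GreyImage& img, int r, int c) {
    return r >= 0 && c >= 0 &&
           r < static_cast<int>(img.size()) &&
           c < static_cast<int>(img[0].size());
}

// 3x3 mean blur: each output pixel is the average of the valid pixels under the kernel.
GreyImage meanBlur3(const GreyImage& src) {
    assert(!src.empty() && !src[0].empty());
    const int rows = static_cast<int>(src.size());
    const int cols = static_cast<int>(src[0].size());
    GreyImage dst = src;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            int sum = 0, count = 0;
            for (int dr = -1; dr <= 1; ++dr) {
                for (int dc = -1; dc <= 1; ++dc) {
                    if (isValid(src, r + dr, c + dc)) {
                        sum += src[r + dr][c + dc];
                        ++count;
                    }
                }
            }
            dst[r][c] = sum / count;  // count is at least 1 (the centre pixel)
        }
    }
    return dst;
}
```

Note that the result is written to a separate image, so already blurred values are never
fed back into the averages of their neighbours.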
b. Median Blur: This blur takes the median of all the elements of the kernel except
the centre element and stores the value in the centre pixel of the kernel.
Code:
https://drive.google.com/file/d/1V4hW462ZveAHptm334SVEfZARh34vw6c/view?usp=sharing
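Here is a rough sketch of a 3x3 median blur. One deliberate difference from the
description above: this version keeps the centre pixel in the window, which is the more
common definition and gives an odd number of values (so the median is a single element)
away from the border. The name medianBlur3 is made up, and pixels outside the image are
simply skipped.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using GreyImage = std::vector<std::vector<int>>;

// 3x3 median blur that includes the centre pixel in the window (an assumption).
GreyImage medianBlur3(const GreyImage& src) {
    assert(!src.empty() && !src[0].empty());
    const int rows = static_cast<int>(src.size());
    const int cols = static_cast<int>(src[0].size());
    GreyImage dst = src;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            std::vector<int> window;
            for (int dr = -1; dr <= 1; ++dr) {
                for (int dc = -1; dc <= 1; ++dc) {
                    const int rr = r + dr, cc = c + dc;
                    if (rr < 0 || cc < 0 || rr >= rows || cc >= cols) continue;
                    window.push_back(src[rr][cc]);
                }
            }
            // Partially sort so that the middle element is in its sorted position.
            auto mid = window.begin() + window.size() / 2;
            std::nth_element(window.begin(), mid, window.end());
            dst[r][c] = *mid;
        }
    }
    return dst;
}
```

A single bright noise pixel surrounded by dark pixels is removed completely, which is why
median blur is a good fit for salt-and-pepper noise.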
c. Gaussian Blur: Like a Gaussian curve, this blur gives each cell of the kernel a weight
that depends on its distance from the centre pixel of the kernel: nearer cells get larger
weights. It adds the pixel values multiplied by those weights and stores the sum in the
centre pixel of the kernel.
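One possible way to build such weights is sketched below. The name gaussianKernel, the
parameters radius and sigma, and the normalisation so that the weights add up to 1 are
assumptions made for this example; the notes do not fix any of them.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Kernel = std::vector<std::vector<double>>;

// Builds a square kernel of side 2 * radius + 1 whose weights fall off with the
// distance from the centre as exp(-(dx^2 + dy^2) / (2 sigma^2)), scaled to sum to 1.
Kernel gaussianKernel(int radius, double sigma) {
    assert(radius >= 0 && sigma > 0.0);
    const int size = 2 * radius + 1;
    Kernel k(size, std::vector<double>(size));
    double total = 0.0;
    for (int y = -radius; y <= radius; ++y) {
        for (int x = -radius; x <= radius; ++x) {
            const double w = std::exp(-(x * x + y * y) / (2.0 * sigma * sigma));
            k[y + radius][x + radius] = w;
            total += w;
        }
    }
    for (auto& row : k)
        for (double& w : row)
            w /= total;  // keeps the overall brightness of the image unchanged
    return k;
}
```

With gaussianKernel(1, 1.0) the centre weight is the largest, the four edge neighbours
come next and the four corners get the smallest weight.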
Erosion: We check a kernel. If even one value in the kernel is black, the centre
pixel becomes black.
It is also defined as the minimum of the kernel.
Erosion increases the black content of the image.
Dilation: We check a kernel. If even one value in the kernel is white, the centre
pixel becomes white.
It is also defined as the maximum of the kernel.
Dilation increases the white content of the image.
Code:
https://drive.google.com/file/d/1vBi8nQf6sAw1WYtZhHLDqb0DC3mSVr-t/view?usp=sharing
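Using the minimum and maximum definitions, both operations can share one loop. This is
only a sketch with made-up names (morph3, erode3, dilate3); it uses a 3x3 kernel and
skips neighbours that fall outside the image.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using GreyImage = std::vector<std::vector<int>>;

// Replaces each pixel by the minimum (takeMin = true) or the maximum
// (takeMin = false) of its 3x3 neighbourhood.
GreyImage morph3(const GreyImage& src, bool takeMin) {
    assert(!src.empty() && !src[0].empty());
    const int rows = static_cast<int>(src.size());
    const int cols = static_cast<int>(src[0].size());
    GreyImage dst = src;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            int best = src[r][c];
            for (int dr = -1; dr <= 1; ++dr) {
                for (int dc = -1; dc <= 1; ++dc) {
                    const int rr = r + dr, cc = c + dc;
                    if (rr < 0 || cc < 0 || rr >= rows || cc >= cols) continue;
                    best = takeMin ? std::min(best, src[rr][cc])
                                   : std::max(best, src[rr][cc]);
                }
            }
            dst[r][c] = best;
        }
    }
    return dst;
}

GreyImage erode3(const GreyImage& src) { return morph3(src, true); }
GreyImage dilate3(const GreyImage& src) { return morph3(src, false); }
```

On a 5x5 black image with a 3x3 white square in the middle, erode3 leaves only the centre
pixel white, while dilate3 turns the whole image white.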
Edge Detection:
a. Using Erosion and Dilation: Perform dilation on an image and store the result in a
separate image (say img1). Perform erosion on the same image and store the result
in another image (say img2). Compute img1-img2 to get the final image.
This works because dilation enlarges the white parts and erosion shrinks
them, so the two results differ only near the boundaries. When we subtract, only
the places where the edges are remain white.
b. Using blur: Apply a mean/Gaussian blur to a binary image. The pixels near the edges
will become greyish, while flat regions stay unchanged. Take the difference between the
new and old images to get the edges.
c. Prewitt Filter:
We take 2 matrices to indicate changes in the x and y directions.
           | -1  0  1 |
Gx = 1/6 * | -1  0  1 |
           | -1  0  1 |

           | -1 -1 -1 |
Gy = 1/6 * |  0  0  0 |
           |  1  1  1 |
Code:
https://drive.google.com/file/d/1xDJd7NUtfY9YhD5Ch18nIjW1kE_S_4AW/view?usp=sharing
d. Sobel Filter:
It works exactly like the Prewitt Filter except that the matrices are changed: the middle
row or column gets twice the weight. It is considered more effective than the Prewitt
Filter.
           | -1  0  1 |
Gx = 1/8 * | -2  0  2 |
           | -1  0  1 |

           | -1 -2 -1 |
Gy = 1/8 * |  0  0  0 |
           |  1  2  1 |
Code:
https://drive.google.com/file/d/1HDORugdR6agARvUQIOYli_XbyVRc-0ew/view?usp=sharing
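The Sobel matrices above can be applied as in the following sketch. Two things here are
assumptions, since the notes do not spell them out: Gx and Gy are combined into one edge
strength with sqrt(Gx^2 + Gy^2), and the border pixels are left at 0 instead of being
padded. The function names are made up.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using GreyImage = std::vector<std::vector<int>>;

const int SOBEL_X[3][3] = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}};
const int SOBEL_Y[3][3] = {{-1, -2, -1}, {0, 0, 0}, {1, 2, 1}};

// Weighted sum of the 3x3 neighbourhood of (r, c). The caller keeps (r, c)
// at least one pixel away from the border.
int applyKernel(const GreyImage& img, int r, int c, const int kernel[3][3]) {
    int sum = 0;
    for (int dr = -1; dr <= 1; ++dr)
        for (int dc = -1; dc <= 1; ++dc)
            sum += kernel[dr + 1][dc + 1] * img[r + dr][c + dc];
    return sum;
}

// Edge strength sqrt(Gx^2 + Gy^2) with the 1/8 scaling; border pixels stay 0.
GreyImage sobelMagnitude(const GreyImage& img) {
    assert(img.size() >= 3 && img[0].size() >= 3);
    const int rows = static_cast<int>(img.size());
    const int cols = static_cast<int>(img[0].size());
    GreyImage out(rows, std::vector<int>(cols, 0));
    for (int r = 1; r < rows - 1; ++r) {
        for (int c = 1; c < cols - 1; ++c) {
            const double gx = applyKernel(img, r, c, SOBEL_X) / 8.0;
            const double gy = applyKernel(img, r, c, SOBEL_Y) / 8.0;
            out[r][c] = static_cast<int>(std::lround(std::sqrt(gx * gx + gy * gy)));
        }
    }
    return out;
}
```

Swapping in the Prewitt matrices and dividing by 6.0 instead of 8.0 gives the Prewitt
version of the same filter.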
DAY 4
Graphs:
A graph is a data structure in which nodes are connected to other nodes by edges.
Traversal of graph:
a. Depth First Search: We follow one branch as deep as possible before backtracking.
Problem statement: Given a binary image, find a path from (0,0) to
(img.rows-1,img.cols-1).
Code:
https://drive.google.com/open?id=11DFFou8TMt5tkb2qxNWw1dh3LZ8h19rs
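A simplified sketch of this problem with an explicit stack is given below. It makes
several assumptions that the problem statement leaves open: white (non-zero) pixels are
free and black pixels are walls, moves go only up, down, left and right, and the
function only reports whether a path exists instead of returning it. The name pathExists
is made up.

```cpp
#include <cassert>
#include <stack>
#include <utility>
#include <vector>

using BinaryImage = std::vector<std::vector<int>>;

// Depth first search from (0, 0) to (rows - 1, cols - 1) over non-zero pixels.
bool pathExists(const BinaryImage& img) {
    assert(!img.empty() && !img[0].empty());
    const int rows = static_cast<int>(img.size());
    const int cols = static_cast<int>(img[0].size());
    if (img[0][0] == 0 || img[rows - 1][cols - 1] == 0) return false;

    std::vector<std::vector<bool>> visited(rows, std::vector<bool>(cols, false));
    std::stack<std::pair<int, int>> todo;
    todo.push({0, 0});
    visited[0][0] = true;
    const int dr[4] = {1, -1, 0, 0};
    const int dc[4] = {0, 0, 1, -1};

    while (!todo.empty()) {
        const std::pair<int, int> cur = todo.top();  // most recently found pixel first
        todo.pop();
        if (cur.first == rows - 1 && cur.second == cols - 1) return true;
        for (int k = 0; k < 4; ++k) {
            const int nr = cur.first + dr[k], nc = cur.second + dc[k];
            if (nr < 0 || nc < 0 || nr >= rows || nc >= cols) continue;
            if (img[nr][nc] == 0 || visited[nr][nc]) continue;
            visited[nr][nc] = true;
            todo.push({nr, nc});
        }
    }
    return false;
}
```

Here each pixel is a node of the graph and two neighbouring free pixels are connected by
an edge. To recover the path itself, one could also store for each pixel the pixel it
was reached from and walk back from the goal.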
CONTOURS:
Closed edges in an image are called contours.
Contours are detected in an image using the findContours() function and drawn
using the drawContours() function.
*Read about them in the documentation*
HOUGH TRANSFORM:
Used to find accurate lines in an image after Canny. After Canny, we get edges, but
they are often not perfectly straight. To recover the straight lines, we use the Hough
transform.
In this, we take each white (edge) pixel (x, y) in the image and every θ value from 0° to
180°. For each θ, we find the value of r using the formula:
xcosθ + ysinθ = r
Hence, we find the equation of every line passing through that point. The point votes for
each of these lines. We form another plane of (r, θ)
where each point defines a line. This is called the Hough plane. If a line is voted for by a
point, its intensity in the (r, θ) plane is increased by a constant amount. After doing
this for all the edge points, we see that particular points in the Hough plane have
the highest intensity. These values of r and θ give us the correct lines.
CODE:
https://drive.google.com/file/d/1e7xfoRy7zv_aL36fF4MnGEpS3pjqIeTJ/view?usp=sharing
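The voting step can be sketched as follows. This is a simplified illustration, not the
linked code: the names HoughLine and strongestLine are made up, θ is sampled in whole
degrees from 0 to 179 (180° describes the same lines as 0°), r is rounded to the nearest
integer, and only the single strongest line is returned.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

struct HoughLine {
    int r;         // signed distance of the line from the origin
    int thetaDeg;  // angle in degrees
    int votes;     // number of edge points that voted for this line
};

// points holds the (x, y) coordinates of the edge pixels; maxR bounds |r|.
HoughLine strongestLine(const std::vector<std::pair<int, int>>& points, int maxR) {
    assert(maxR > 0);
    const double pi = std::acos(-1.0);
    // Hough plane: acc[theta][r + maxR] counts the votes for the line (r, theta).
    std::vector<std::vector<int>> acc(180, std::vector<int>(2 * maxR + 1, 0));
    for (const auto& p : points) {
        for (int t = 0; t < 180; ++t) {
            const double theta = t * pi / 180.0;
            const int r = static_cast<int>(
                std::lround(p.first * std::cos(theta) + p.second * std::sin(theta)));
            if (r >= -maxR && r <= maxR) ++acc[t][r + maxR];
        }
    }
    HoughLine best{0, 0, 0};
    for (int t = 0; t < 180; ++t)
        for (int idx = 0; idx <= 2 * maxR; ++idx)
            if (acc[t][idx] > best.votes) best = {idx - maxR, t, acc[t][idx]};
    return best;
}
```

Because r is rounded, neighbouring angles can collect the same number of votes for a
short line; this sketch simply keeps the first maximum it finds. A fuller version would
return every cell of the Hough plane above some vote threshold.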
HISTOGRAM:
It is a bar graph with the pixel values (0-255) on one axis and the frequency of each
pixel value on the other.
This means that it shows how many pixels of each value the image contains.
CODE:
https://drive.google.com/file/d/1im-dmJd9hS4xfNlnNVaQn3-A_M76lPXS/view?usp=sharing
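Counting the frequencies is a single pass over the image. A minimal sketch on a nested
vector (the name histogram is ours, and drawing the bars is left out):

```cpp
#include <array>
#include <cassert>
#include <vector>

using GreyImage = std::vector<std::vector<int>>;

// counts[v] is the number of pixels in the image whose value is v.
std::array<int, 256> histogram(const GreyImage& img) {
    std::array<int, 256> counts{};  // all 256 counters start at zero
    for (const auto& row : img) {
        for (int pixel : row) {
            assert(pixel >= 0 && pixel <= 255);
            ++counts[pixel];
        }
    }
    return counts;
}
```

The bar graph is then drawn with one bar per pixel value, with the height of bar v
proportional to counts[v].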
VIGNETTE FILTER:
https://drive.google.com/file/d/18PhYNhrICosp2N8Dy9HWPtcdSJqWFYlk/view?usp=sharing