ch3
Computer Vision is a field of artificial intelligence (AI) that uses sensing devices and deep learning
models to help systems understand and interpret the visual world.
In representing images digitally, each pixel is assigned a numerical value. For
monochrome images, such as black and white photographs, a pixel's value typically ranges
from 0 to 255. A value of 0 corresponds to black, while 255 represents white.
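This idea can be sketched with a tiny example (using NumPy, which is assumed to be installed; the pixel values are made up for illustration): a grayscale image is simply a grid of numbers between 0 and 255.

```python
import numpy as np

# A 3x3 grayscale "image": 0 = black, 255 = white, values in between are greys
tiny_image = np.array([
    [  0, 128, 255],
    [ 64, 192,  32],
    [255,   0, 100],
], dtype=np.uint8)

print(tiny_image.min(), tiny_image.max())  # prints: 0 255
```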
ACTIVITY 3.1 - Binary Art: Recreating Images with 0s and 1s
Fig. 3.7
Step 5: Copy the Pixel Values
Once the pixel values are extracted, select all the values from the tool and copy
them.
Fig. 3.9: Image formation as 0s and 1s, recreating the original grayscale image
In coloured images, each pixel is assigned a specific number based on the RGB
colour model, which stands for Red, Green, and Blue.
1 byte = 8 bits, so the total number of binary values that can be formed is 2⁸ = 256.
By combining different intensities of red, green, and blue, a wide range of colours
can be represented in an image. Each colour channel can have a value from 0 to 255,
resulting in over 16 million possible colours.
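The colour count can be verified with a short calculation (a sketch; nothing beyond basic arithmetic is assumed):

```python
# Each of the three channels (R, G, B) can take 256 values (0 to 255),
# so the number of possible colour combinations is 256 * 256 * 256.
values_per_channel = 256
total_colours = values_per_channel ** 3
print(total_colours)  # prints: 16777216, i.e. over 16 million colours
```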
3.3.2. Preprocessing:
Preprocessing in computer vision aims to enhance the quality of the acquired image.
Some of the common techniques are-
a. Noise Reduction: Removes unwanted elements like blurriness, random spots, or
distortions. This makes the image clearer and reduces distractions for algorithms.
Example: Removing grainy effects in low-light photos.
Fig. 3.10: Before: noisy image; After: noise-reduced image
b. Image Normalization: Standardizes pixel values across images for consistency.
Adjusts the pixel values of an image so they fall within a consistent range (e.g., 0 to 1
or -1 to 1).
Ensures all images in a dataset have a similar scale, helping the model learn better.
Example: Scaling down pixel values from 0–255 to 0–1.
Fig. 3.13
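A minimal sketch of this scaling (assuming NumPy; the pixel values are made up): normalizing from 0–255 to 0–1 is just a division.

```python
import numpy as np

# A small example image with pixel values in the usual 0-255 range
image = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Scale pixel values from the 0-255 range down to the 0-1 range
normalized = image.astype(np.float32) / 255.0

print(normalized.min(), normalized.max())  # prints: 0.0 1.0
```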
The main goal of preprocessing is to prepare images for computer vision tasks by:
Removing noise (disturbances).
Highlighting important features.
Ensuring consistency and uniformity across the dataset.
3.3.4. Detection/Segmentation:
Detection and segmentation are fundamental tasks in computer vision, focusing on
identifying objects or regions of interest within an image. These tasks play a pivotal role in
applications like autonomous driving, medical imaging, and object tracking. This crucial
stage is categorized into two primary tasks:
1. Single Object Tasks
2. Multiple Object Tasks
Single Object Tasks: Single object tasks focus on analysing or delineating individual objects
within an image, with two main objectives:
Fig. 3.15: Classification; Classification + Localization
i) Classification: This task involves determining the category or class to which a
single object belongs, providing insights into its identity or nature. The KNN (K-Nearest
Neighbour) algorithm may be used for supervised classification, while the K-means
clustering algorithm can be used for unsupervised classification.
ii) Classification + Localization: In addition to classifying objects, this task also
involves precisely localizing the object within the image by predicting bounding
boxes that tightly enclose it.
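A minimal sketch of supervised classification with KNN (using scikit-learn, which is assumed to be installed; the two features and the labels are hypothetical toy data, not a real image pipeline):

```python
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: each image has been reduced to two numeric features,
# with labels 0 = cat and 1 = dog
features = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
labels = [0, 0, 1, 1]

# Classify a new sample by looking at its 3 nearest neighbours
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(features, labels)

print(knn.predict([[0.15, 0.15]]))  # prints: [0] -- the nearest neighbours are cats
```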
Multiple Object Tasks: Multiple object tasks deal with scenarios where an image contains
multiple instances of objects or different object classes. These tasks aim to identify and
distinguish between various objects within the image, and they include:
i) Object Detection: Object detection focuses on identifying and locating multiple
objects of interest within the image. It involves analysing the entire image and
drawing bounding boxes around detected objects, along with assigning class
labels to these boxes. The main difference between classification and detection
is that classification considers the image as a whole and determines its class
whereas detection identifies the different objects in the image and classifies all of
them.
In detection, bounding boxes are drawn around multiple objects and these are
labelled according to their particular class. Object detection algorithms typically
use extracted features and learning algorithms to recognize instances of an object
category. Some of the algorithms used for object detection are: R-CNN (Region-
Based Convolutional Neural Network), R-FCN (Region-based Fully Convolutional
Network), YOLO (You Only Look Once) and SSD (Single Shot Detector).
Fig.3.17
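The output format of a detector can be sketched without any model at all (the boxes, labels, and scores below are made up for illustration): each detection pairs a bounding box with a class label and a confidence score.

```python
# A detector typically returns, for each object found, a bounding box
# (x1, y1, x2, y2), a class label, and a confidence score.
detections = [
    {"box": (50, 30, 200, 180), "label": "cat", "score": 0.92},
    {"box": (220, 60, 380, 240), "label": "dog", "score": 0.88},
]

for d in detections:
    x1, y1, x2, y2 = d["box"]
    print(f"{d['label']} ({d['score']:.0%}) in a {x2 - x1}x{y2 - y1} box")
```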
3.3.5. High-Level Processing: In the final stage of computer vision, high-level processing
plays a crucial role in interpreting and extracting meaningful information from the detected
objects or regions within digital images. This advanced processing enables computers to
achieve a deeper understanding of visual content and make informed decisions based on
the visual data. Tasks involved in high-level processing include recognizing objects,
understanding scenes, and analysing the context of the visual content. Through
sophisticated algorithms and machine learning techniques, computers can identify and
categorize objects, infer relationships between elements in a scene, and derive insights
from complex visual data. Ultimately, high-level processing empowers computer vision
systems to extract valuable insights and drive intelligent decision-making in various
applications, ranging from autonomous driving to medical diagnostics.
4. Duplicate and False Content: Computer vision introduces challenges related to the
proliferation of duplicate and false content. Malicious actors can exploit
vulnerabilities in image and video processing algorithms to create misleading or
fraudulent content. Data breaches pose a significant threat, leading to the
dissemination of duplicate images and videos, fostering misinformation and
reputational damage.
Fig.3.18
Fig.3.19
4.
5.
Fig.3.20
6. You will get a screen like this.
Fig.3.21
You have the option to choose between two methods: using your webcam to capture
images or uploading existing images. For the webcam option, you will need to position the
image in front of the camera and hold down the record button to capture the image.
Alternatively, with the upload option, you have the choice to upload images either from your
local computer or directly from Google Drive.
7. the computer.
Fig.3.22
8.
Fig.3.23
Fig.3.24
Once the model is trained, you can test it by showing an image in front of the
web camera. Alternatively, you can upload the image from your local computer or Google Drive.
Fig.3.25
10. Click on Export Model and then on Upload my model.
Fig.3.26
11. Once your model is uploaded, Teachable Machine will create a URL which we will use
in the JavaScript code. Copy the JavaScript code by clicking on Copy.
Fig.3.27
12. Open Notepad, paste the JavaScript code, and save the file as web.html.
13.Let us now deploy this model in a website.
14.Once you create a free account on Weebly, go to Edit website and create an appealing
website using the tools given.
Fig.3.28
15.Click on Embed Code and drag and place it on the webpage.
Fig.3.29
16. Paste the copied JavaScript code here as shown.
Fig.3.30
17.
Fig.3.31
18.Copy the URL and paste it into a new browser window to check the working of your
model.
Fig.3.32
19.Click on start and you can show pictures of kitten and puppy to check the predictions of
your model.
Fig.3.33
To use OpenCV in Python, you need to install the library. Use the following command in
your terminal or command prompt:
pip install opencv-python
3.7.2. Loading and Displaying an Image: Let us understand loading and displaying
an image using a scenario followed by a question.
Scenario- You are working on a computer vision project where you need to load and display an
image. You decide to use OpenCV for this purpose.
Question:
What are the necessary steps to load and display an image using OpenCV? Write a Python code
snippet to demonstrate this.
Solution:
Here's a simple Python script to load and display an image using OpenCV:
import cv2

# Load the image from disk
image = cv2.imread('example.jpg')  # Replace 'example.jpg' with the path to your image

# Display the image in a window
cv2.imshow('Original Image', image)
cv2.waitKey(0)           # Wait indefinitely for a key press
cv2.destroyAllWindows()  # Close all OpenCV windows
To resize an image to specific dimensions, pass the new width and height to cv2.resize():

import cv2

image = cv2.imread('example.jpg')  # Replace 'example.jpg' with the path to your image

# Target dimensions in pixels
new_width = 300
new_height = 300

# Resize the image to the new dimensions
resized_image = cv2.resize(image, (new_width, new_height))

cv2.imshow('Resized Image', resized_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
To convert a colour image to grayscale, use cv2.cvtColor() with the cv2.COLOR_BGR2GRAY flag (OpenCV loads colour images in BGR order):

import cv2

image = cv2.imread('example.jpg')  # Replace 'example.jpg' with the path to your image

# Convert from BGR colour to grayscale
grayscale_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

cv2.imshow('Grayscale Image', grayscale_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
EXERCISES
A. Multiple Choice Questions:
1. The field of AI that uses sensing devices and deep learning models to help systems understand and interpret the visual world is ________________.
a. Python b. Convolution c. Computer Vision d. Data Analysis
2. The task of taking an input image and assigning a class label that best describes
the image is ____________.
a. Image classification b. Image localization
c. Image Identification d. Image prioritization
3.Identify the incorrect option
(i) computer vision involves processing and analysing digital images and videos
to understand their content.
(ii) A digital image is a picture that is stored on a computer in the form of a
sequence of numbers that computers can understand.
(iii) RGB colour code is used only for images taken using cameras.
(iv) Image is converted into a set of pixels and less pixels will resemble the original
image.
a. ii b. iii c. iii & iv d. ii & iv
4.The process of capturing a digital image or video using a
digital camera, a scanner, or other imaging devices is related to ________.
a. Image Acquisition b. Preprocessing
c. Feature Extraction d. Detection
5. Which algorithm may be used for supervised learning in computer vision?
a. KNN b. K-means c. K-fold d. KEAM
6. A computer sees an image as a series of ___________
a. colours b. pixels c. objects d. all of the above
7. ____________ empowers computer vision systems to extract valuable insights and drive
intelligent decision-making in various applications, ranging from autonomous driving to
medical diagnostics.
a. Low level processing b. High insights
c. High-level processing d. None of the above
8. In Feature Extraction, which technique identifies abrupt changes in pixel intensity and
highlights object boundaries?
a. Edge detection b. Corner detection
c. Texture Analysis d. Boundary detection
9. Choose the incorrect statement related to preprocessing stage of computer vision
a. It enhances the quality of acquired image
b. Noise reduction and Image normalization is often employed with images
c. Techniques like histogram equalization can be applied to adjust the distribution
of pixel intensities
d. Edge detection and corner detection are ensured in images.
10. 1 byte = __________ bits
a. 10 b. 8 c. 2 d. 1