Project Report (2) RRRRRRRRRRR

A PROJECT REPORT
ON
“Image Analysis using CNN”
For the partial fulfillment for the award of the degree of
BACHELOR OF TECHNOLOGY
In
Applied Computational Science and Engineering

Submitted By
Rahul Kumar (2001921530041) Prateek Joshi (2001921530037)

Aman Kumar (2001921530008)
Under the Supervision of

Ms. Hina Gupta
G.L. BAJAJ INSTITUTE OF TECHNOLOGY &
MANAGEMENT, GREATER NOIDA
Affiliated to
DR. APJ ABDUL KALAM TECHNICAL UNIVERSITY,
LUCKNOW
2023 – 2024
Declaration
We hereby declare that the project work presented in this report entitled “Image Analysis using
CNN”, in partial fulfillment of the requirement for the award of the degree of Bachelor of Technology
in Computer Science & Engineering, submitted to A.P.J. Abdul Kalam Technical University,
Lucknow, is based on our own work carried out at the Department of Computer Science & Engineering,
G.L. Bajaj Institute of Technology & Management, Greater Noida. The work contained in the report is
original, and the project work reported here has not been submitted by us for the award of any
other degree or diploma.
Signature:
Name: Rahul Kumar
Roll No: 2001921530041
Signature:
Name: Aman Kumar
Roll No: 2001921530008
Signature:
Name: Prateek Joshi
Roll No: 2001921530037
Place: Greater Noida
5 Application of Project
5.1 Facial recognition
CNNs have been used to detect faces in photos. Given an image as input, the network outputs a set
of values that describe faces or facial features at various locations in the image. CNNs can
identify facial features such as the eyes, nose, and mouth accurately, and they reduce the effect of
distortions caused by viewing angle or shadows.
5.2 Medical imaging
In medical imaging, CNNs help improve accuracy in identifying tumours and other anomalies
in X-ray and MRI images. Having been trained on similar, previously labelled images, a CNN
model can analyze an image of a body part, such as the lungs, and indicate where a tumour might
be, or flag other anomalies such as broken bones in X-ray images. Medical images such as CT scans
and mammograms can likewise be used to help diagnose cancer. To determine whether features in an
image indicate malignancy or cell damage, whether from hereditary or environmental factors such
as smoking, the model compares the patient's image against patterns learned from database images
with comparable features.
5.3 Document analysis
Convolutional neural networks can also be used for document analysis. Besides handwriting
analysis, they have a significant impact on recognition tasks in general. To scan a person's writing
and compare it with an extensive database, a machine must process approximately a million
commands per minute. By identifying words and phrases associated with the subject of a given
document, CNNs can use both text and visuals to better comprehend what is written in it.
5.4 Autonomous driving
Convolutional neural networks (CNNs) model the spatial information in images. They are regarded
as universal non-linear function approximators, and they are well suited to extracting image
features, for example to detect obstacles and interpret street signs. Furthermore, as the depth of
the network grows, CNNs can detect a wider variety of patterns. For instance, the network's initial
layers record edges, while its deeper layers capture more complicated aspects such as an object's
shape (leaves on trees or tyres on a vehicle). As a result, CNNs are a primary algorithm in
self-driving cars.
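The claim that early layers respond to edges can be illustrated with a small sketch. It applies a fixed Sobel kernel to a toy image using the sliding-window operation a convolutional layer performs. The kernel, the toy image, and the helper name conv2d are assumptions chosen for illustration; in a trained CNN the kernel weights are learned from data rather than fixed by hand.

```python
def conv2d(image, kernel):
    """Valid (no padding, stride 1) 2D cross-correlation on lists of lists.

    This is the sliding-window operation a convolutional layer applies;
    a real layer adds a bias, a non-linearity, and many learned kernels.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


# Hand-written Sobel kernel that responds to vertical edges (illustrative;
# a trained network would learn its own kernels).
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

# Toy 4x6 image: dark left half (0), bright right half (1).
IMAGE = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

response = conv2d(IMAGE, SOBEL_X)
for row in response:
    print(row)  # each row prints [0, 4, 4, 0]
```

The response is non-zero only where the window straddles the dark-to-bright boundary, which is the sense in which such a filter "records an edge".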
5.5 Biometric authentication
CNNs have been used for biometric authentication by identifying specific physical traits of a
person's face. CNN models can be trained on images or videos of people to identify particular
facial traits such as the distance between the eyes, the shape of the nose, and the curvature of the
lips. CNN models have also been used to recognise emotional states such as happiness or sadness
from photos or videos of people's faces. CNNs can also assess whether a subject is blinking in a
photo and the general form of a face across multiple frames.
8 Limitations of Project
CNNs also have drawbacks that limit their performance and applicability. One of the main
disadvantages is that they require a large amount of labeled data to train effectively, which can be
costly and time-consuming to obtain and annotate. Moreover, they are prone to overfitting: they can
memorize the noise and details of the training data and fail to generalize to new and different data.
To prevent overfitting, regularization techniques such as dropout, batch normalization, and data
augmentation have to be applied, which can increase the complexity and computational cost of
training. Another disadvantage is that CNNs are often considered black boxes, meaning that they are
hard to interpret and explain. This poses challenges for debugging, validating, and trusting the
network's decisions, especially in sensitive and critical domains such as healthcare, security, and
law.
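Two of the regularization techniques named above can be sketched in a few lines. The following is a minimal illustration, not the project's implementation: it shows inverted dropout on a list of activations and a horizontal flip as one simple form of data augmentation. The function names, the dropout rate, and the list-based image representation are assumptions made for the example; deep learning frameworks provide their own versions of both.

```python
import random


def dropout(activations, rate, rng):
    """Inverted dropout for training time.

    Each activation is zeroed with probability `rate`; the survivors are
    scaled by 1 / (1 - rate) so the expected value stays unchanged.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def horizontal_flip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
dropped = dropout([1.0, 1.0, 1.0, 1.0], rate=0.5, rng=rng)
flipped = horizontal_flip([[1, 2, 3], [4, 5, 6]])
print(dropped)   # every entry is either 0.0 or 2.0
print(flipped)   # [[3, 2, 1], [6, 5, 4]]
```

Dropout is applied only during training. Augmentations such as flips enlarge the effective dataset without new annotation work, although for object detection the bounding-box labels would have to be flipped as well.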
6 Plan of Work
6.1. Problem Definition:

• Clearly define the problem and goals of the image recognition system.
6.2. Data Collection:
• Gather a diverse dataset of room images with annotated labels for chairs and tables.
6.3. Data Preprocessing:

• Resize images, normalize pixel values, and perform data augmentation.
6.4. Model Selection:
• Choose a suitable CNN architecture for object detection (e.g., YOLO, SSD, Faster R-CNN).
6.5. Model Development:
• Implement and modify the chosen model for your specific counting task.
6.6. Dataset Splitting:
• Divide the dataset into training, validation, and test sets.
6.7. Model Training:
• Train the model on the training set and fine-tune hyperparameters.
6.8. Evaluation:
• Evaluate the model on the test set using relevant metrics.
6.9. Deployment:
• Deploy the trained model for inference on new room images.
6.10. Monitoring:
• Monitor the model's performance in real-world scenarios and gather feedback.
6.11. Documentation:
• Document the model architecture, training process, and deployment steps.
6.12. Ethics and Privacy:
• Ensure compliance with ethical guidelines and address privacy concerns.
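Steps 6.3, 6.6, and 6.8 of the plan can be sketched as follows. This is a minimal illustration under stated assumptions: images are lists of 8-bit pixel rows, the split fractions and the seed are arbitrary, and mean absolute error of the per-image count is used as one plausible metric for a counting task (the report does not name a specific metric). All function names are hypothetical.

```python
import random


def normalize(image):
    """Scale 8-bit pixel values from [0, 255] to [0.0, 1.0] (step 6.3)."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def split_dataset(samples, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle and split samples into train, validation, and test lists (step 6.6)."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    n_test = round(len(items) * test_frac)
    n_val = round(len(items) * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test


def count_mae(predicted_counts, true_counts):
    """Mean absolute error between predicted and true object counts (step 6.8)."""
    if len(predicted_counts) != len(true_counts) or not true_counts:
        raise ValueError("count lists must be non-empty and of equal length")
    errors = [abs(p - t) for p, t in zip(predicted_counts, true_counts)]
    return sum(errors) / len(errors)


train, val, test = split_dataset(range(10), val_frac=0.2, test_frac=0.2)
print(len(train), len(val), len(test))      # 6 2 2
print(normalize([[0, 255]]))                # [[0.0, 1.0]]
print(count_mae([3, 5, 2], [3, 4, 4]))      # 1.0
```

Keeping the test set untouched until the final evaluation is what makes the reported metric an honest estimate of performance on new room images.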
7 Tools and Technology
Developing an image recognition system using CNNs for counting objects in a room involves
several tools and technologies. Below is a list of key tools and technologies commonly used
in this context:
1. Deep Learning Frameworks:

• TensorFlow: An open-source deep learning framework developed by Google.
TensorFlow provides tools and libraries for building and training neural
networks, making it popular for image recognition tasks.
• PyTorch: An open-source deep learning framework originally developed by Facebook
(now Meta). PyTorch is known for its dynamic computational graph and is widely used for
image recognition.
2. Computer Vision Libraries:
• OpenCV (Open Source Computer Vision Library): A library of programming
functions mainly aimed at real-time computer vision. OpenCV provides tools
for image processing, feature detection, and object tracking.
3. Image Annotation Tools:
• LabelImg: A graphical image annotation tool that helps in creating labeled
datasets for training computer vision models.
• RectLabel: A macOS app for bounding box annotation of images.
4. Pre-trained Models:
• YOLO (You Only Look Once): An object detection system that divides an
image into a grid and predicts bounding boxes and class probabilities for each
grid cell.
• Faster R-CNN (Region-based Convolutional Neural Network): A two-stage
object detection model that uses region proposals to improve accuracy.
• SSD (Single Shot Multibox Detector): A single-shot object detection model
that predicts bounding boxes and class scores for multiple object scales.
5. Data Augmentation Tools:
• Augmentor: A Python library for image augmentation, which can help increase
the diversity of your training dataset.
6. Cloud Platforms:
• Google Cloud Platform (GCP): Provides cloud-based services like Cloud
Vision API for pre-trained image recognition models, and AI Platform for
training custom models.
• Amazon Web Services (AWS): Offers services like Rekognition for pre-trained
models and SageMaker for building and training custom models.
7. Containerization and Orchestration:
• Docker: Containerization technology that allows you to package your
application and its dependencies into a container.
• Kubernetes: An open-source container orchestration platform for automating
the deployment, scaling, and management of containerized applications.
8. Version Control:
• Git: A distributed version control system that helps manage changes to your
codebase.
9. Documentation Tools:
• Jupyter Notebooks: An interactive computing environment that allows you to
create and share documents that contain live code, equations, visualizations,
and narrative text.
• Markdown: A lightweight markup language for creating formatted text using a
plain-text editor.
10. Integrated Development Environments (IDEs):
• JupyterLab, Spyder, or Visual Studio Code: Common IDEs for developing
and experimenting with deep learning models.
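The detectors listed above (YOLO, Faster R-CNN, SSD) all output bounding boxes with class scores, so counting objects in a room reduces to post-processing those detections. The sketch below shows one way this could be done: discard low-confidence boxes, suppress near-duplicate boxes of the same class with a greedy intersection-over-union (IoU) check, and count what remains. The detection format (label, confidence, box), the thresholds, and the function names are assumptions for illustration and are not the output format of any particular library.

```python
from collections import Counter


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_objects(detections, conf_threshold=0.5, iou_threshold=0.5):
    """Count objects per class after thresholding and greedy suppression.

    `detections` is a list of (label, confidence, box) tuples, a
    hypothetical format chosen for this sketch.
    """
    kept = []
    # Visit the most confident detections first.
    for label, conf, box in sorted(detections, key=lambda d: d[1], reverse=True):
        if conf < conf_threshold:
            continue
        # Skip a box that heavily overlaps an already kept box of the same class.
        if any(k_label == label and iou(box, k_box) > iou_threshold
               for k_label, _, k_box in kept):
            continue
        kept.append((label, conf, box))
    return dict(Counter(label for label, _, _ in kept))


detections = [
    ("chair", 0.90, (0, 0, 10, 10)),
    ("chair", 0.80, (1, 1, 11, 11)),    # near-duplicate of the first chair
    ("chair", 0.70, (20, 20, 30, 30)),
    ("table", 0.95, (40, 0, 80, 30)),
    ("table", 0.30, (0, 40, 30, 60)),   # below the confidence threshold
]
counts = count_objects(detections)
print(counts)  # 2 chairs and 1 table
```

Both thresholds trade missed objects against double counting, so they would normally be tuned on the validation set.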