Summer Training Report 1
(AIML359)
on
Bachelor of Technology
in
Artificial Intelligence and Machine Learning
Submitted by
Sahib Preet Singh
00713211621
DECLARATION
I hereby declare that all the work presented in this Summer Training Report-1, entitled
“Artificial Intelligence”, for the partial fulfillment of the requirements for the award of
the degree of Bachelor of Technology in Artificial Intelligence and Machine Learning at
Guru Tegh Bahadur Institute of Technology, affiliated to Guru Gobind Singh Indraprastha
University, Delhi, is an authentic record of my own work carried out at CodeClause as an
AI Intern from 1 August 2023 to 1 September 2023.
The work reported herein has not been submitted by me for the award of any other degree
at this or any other institute.
CERTIFICATE
ACKNOWLEDGEMENT
I would like to express my deep gratitude to Mr. P. S. Bedi, who has given me support
and suggestions. Without his help, I could not have presented this work to its present
standard. His feedback vastly improved the quality of this report and made the work an
enthralling experience. I am indeed proud and fortunate to be supervised by him.
I am also thankful to Dr. Savneet Kaur, H.O.D., IT Department, Guru Tegh Bahadur
Institute of Technology, New Delhi, for her constant encouragement, valuable
suggestions, moral support, and blessings.
I shall remain indebted to the faculty and staff members of Guru Tegh Bahadur Institute
of Technology, New Delhi. I also take this opportunity to thank all others who supported
the project or other aspects of my study at Guru Tegh Bahadur Institute of Technology.
ABSTRACT
LIST OF FIGURES
CONTENTS
Title Page
Declaration
Certificate
Acknowledgement
Abstract
List of Figures
Chapter 1: Introduction
1.1 Objective
1.2 Organization Overview – CodeClause
1.3 Internship Overview
1.4 Scope and Significance
Chapter 2: Software Requirements
2.1 Technology Stack Requirements
2.2 Development Environment Requirements
Chapter 3: Project – Road Lane Detection
3.1 Overview
3.2 Libraries Used
3.3 Procedure
Chapter 4: Project – Self-Driving Car Simulation
4.1 Overview
4.2 Libraries Used
4.3 Procedure
Chapter 5: Conclusion
Chapter 6: Appendices
6.1 Appendix A: Outputs (Images)
6.2 Appendix B: Source Code
Chapter 7: Future Scope
Chapter 8: References
1. Introduction
1.1 Objective
1.2 Organization Overview
The global perspective and responsibility that CodeClause embraces are evident in its
approach to deep connectedness between people, ideas, communities, and the
environment. The organization places a premium on treating each individual with respect,
fostering an open environment that encourages learning and growth while embracing
diversity.
1.3 Internship Overview
Over the course of one enriching month, I actively participated in projects at the forefront
of software development and emerging technologies. CodeClause, a visionary technology
company, provided a dynamic platform for hands-on learning alongside high-caliber
engineers. The internship centered on two pivotal projects: the refinement of a road lane
detection system using data augmentation techniques, and the development of a
self-driving car system incorporating Python, machine learning, and computer vision.
The first project not only honed my technical skills but also underscored the importance
of precision in identifying lane markings, a fundamental step in the autonomous vehicle
landscape. The second project propelled me into the realm of self-driving cars, requiring
the integration of Python, machine learning, and computer vision technologies. This
endeavor aimed not just at developing a system, but at fostering an intelligent and
adaptive vehicle capable of navigating diverse scenarios.
1.4 Scope and Significance
The scope of the road lane detection project is paramount in establishing a fundamental
building block for advanced autonomous vehicle systems. By implementing a refined
image processing pipeline, the project seeks to elevate the accuracy and reliability of lane
detection, ensuring a robust foundation for subsequent stages in the development of
self-driving car technologies. The scope extends beyond mere lane identification,
encompassing the intricate steps of grayscale conversion, edge detection, region of
interest masking, and Hough line transformation. This comprehensive approach not only
contributes to the immediate project goals but sets the stage for broader applications in
real-world scenarios, where precise lane detection is critical for safe and effective
autonomous navigation.
In tandem with the road lane detection project, the broader scope of developing a
self-driving car system using computer vision is monumental. The significance lies in the
convergence of these cutting-edge technologies to create an intelligent and adaptive
vehicle capable of navigating diverse and dynamic environments. This project's scope
encompasses algorithmic complexities, training data nuances, and the integration of
computer vision techniques. The self-driving car system's potential significance extends
beyond the immediate project objectives; it aligns with the global pursuit of safer, more
efficient transportation, showcasing the transformative impact that such technologies can
have on the future of mobility.
The significance of these projects within the broader context of CodeClause's mission
and the technological landscape cannot be overstated. They represent a tangible
contribution to the ongoing evolution of autonomous driving technologies, addressing
real-world challenges and setting new standards for innovation. The projects not only
push the boundaries of current capabilities but also align with the global trajectory
towards intelligent, connected, and autonomous vehicles — a paradigm shift with
far-reaching implications for the automotive industry, transportation infrastructure, and
societal norms. In essence, the scope and significance of these projects extend beyond
individual technical accomplishments, resonating with the broader narrative of
technological progress and its transformative impact on our daily lives.
2. Software Requirements
The software requirements section serves as the foundational blueprint for the successful
execution of any project, acting as the bridge between conceptual vision and tangible
implementation. From specifying programming languages and version compatibility to
delineating the intricacies of machine learning algorithms and computer vision
frameworks, the software requirements serve as a comprehensive guide.
As a roadmap for developers and stakeholders alike, the software requirements overview
underscores the meticulous planning essential for translating visionary concepts into
tangible, operational software solutions.
2.1 Technology Stack Requirements
The technology stack requirements encompass Python 3.9 as the programming language,
leveraging its rich library ecosystem for computer vision tasks. The stack integrates
computer vision techniques, data augmentation, and image processing to enhance road
lane detection accuracy. An understanding of RGB color values further refines the
system; together, these elements form a comprehensive and effective technology stack
for the project.
Fig 2.2 Python and Computer Vision
2.1.1 Python 3.9: It serves as the programming language backbone for the road lane
detection project, providing a versatile and expressive environment for algorithm
implementation. Its extensive library support, particularly in the field of computer vision
with libraries like OpenCV and scikit-image, facilitates the seamless integration of
sophisticated image processing techniques and machine learning algorithms.
2.1.2 Computer Vision Techniques: At the heart of the project lies a robust set of
computer vision techniques that empower the system to comprehend and interpret visual
data from road scenes. These techniques, ranging from traditional image processing
methods to advanced pattern recognition and machine learning algorithms, collectively
enable the system to detect and analyze lane markings with precision, contributing to the
overall effectiveness of the road lane detection system.
2.1.3 Data Augmentation: Data augmentation plays a pivotal role in fortifying the
model's resilience by artificially diversifying the training dataset. Through techniques
such as rotation, scaling, and flipping, the model becomes more adept at generalizing to
various real-world scenarios and adapting to different environmental conditions,
ultimately improving its performance and accuracy in lane detection tasks.
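As a brief illustration, the following sketch implements two such transformations with OpenCV and NumPy; the function names and value ranges here are illustrative assumptions rather than the project's exact code.

import cv2
import numpy as np

def random_scale(image, low=0.9, high=1.1):
    # Resize by a random factor to simulate changes in apparent distance.
    factor = np.random.uniform(low, high)
    h, w = image.shape[:2]
    return cv2.resize(image, (int(w * factor), int(h * factor)))

def horizontal_flip(image):
    # Mirror the image so the model sees left- and right-hand variants.
    return cv2.flip(image, 1)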
2.1.5 RGB Values Understanding: Understanding RGB values is integral to the
project's success as it facilitates the extraction of crucial color information from images.
In the context of road lane detection, the analysis of RGB values proves invaluable for
distinguishing lane markings from the surrounding environment. This understanding
significantly contributes to the system's capability to accurately detect and delineate lanes
on the road.
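For illustration, the sketch below isolates near-white and yellow pixels with simple RGB thresholds using OpenCV; the threshold values are assumptions chosen for demonstration, not tuned project parameters.

import cv2

def mask_lane_colors(image_bgr):
    # OpenCV loads images in BGR channel order; build one mask per color.
    white_mask = cv2.inRange(image_bgr, (200, 200, 200), (255, 255, 255))
    # Yellow has high red and green but low blue (bounds given in BGR order).
    yellow_mask = cv2.inRange(image_bgr, (0, 150, 150), (120, 255, 255))
    combined = cv2.bitwise_or(white_mask, yellow_mask)
    # Keep only the pixels that match either lane color.
    return cv2.bitwise_and(image_bgr, image_bgr, mask=combined)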
3. Road Lane Detection
The goal of this project is to detect road lanes in a video stream, the foundational step in
building a self-driving car. The image data was collected from Kaggle. The pipeline
processes images of roads to identify and highlight lane markings, through steps such as
grayscale conversion, edge detection, region of interest masking, and Hough line
transformation.
3.1 Overview
The project commences with the collection of image data from Kaggle, establishing a
diverse dataset essential for training and testing the lane detection algorithm. Leveraging
key libraries such as OpenCV, NumPy, and Matplotlib for image processing tasks, the
pipeline involves grayscale conversion, Gaussian blur application, Canny edge detection,
region of interest masking, and Hough line transformation.
The implementation unfolds in a systematic manner, wherein the project reads all files
from a designated folder, applies preprocessing steps like grayscale conversion and edge
detection, and defines a region of interest to focus on relevant details. The Hough line
transformation is then employed to identify and extrapolate lane markings, contributing
to accurate lane detection. The project concludes with a weighted image function that
blends the detected lanes with the original image, producing a visually intuitive
representation of the road environment.
File operations, facilitated by the 'os' library, enable efficient handling of multiple images
within the pipeline. The final output is displayed on-screen or saved to disk based on user
input, providing a tangible and interpretable result. This project not only showcases
proficiency in image processing techniques but also lays the groundwork for more
advanced applications in autonomous vehicle technology.
3.2 Libraries Used
3.2.1 OpenCV
OpenCV (Open Source Computer Vision Library) is a fundamental library for image
processing tasks in the project. Leveraging its extensive functionalities, OpenCV plays a
pivotal role in tasks such as edge detection, Gaussian blur application, Hough line
transformation, and image blending. Its versatility and efficiency make it an
indispensable tool for handling and manipulating image data.
3.2.2 NumPy
NumPy serves as a crucial numerical computing library, providing support for array
operations and manipulation. In the context of the road lane detection project, NumPy
facilitates efficient handling of image data, enabling numerical operations essential for
various image processing tasks. Its array-based approach enhances the speed and
performance of numerical computations, contributing to the overall efficiency of the
project.
3.2.3 Matplotlib
Matplotlib is employed for data visualization purposes within the project. With its
capabilities for creating plots, charts, and graphical representations, Matplotlib aids in
visually assessing the results of image processing steps. It is instrumental in displaying
images, plots, and the final output, providing a valuable tool for interpreting and
validating the effectiveness of the implemented lane detection algorithm.
3.2.4 OS
This library provides a platform-independent interface for interacting with the operating
system, enabling tasks such as directory listing and file operations. In the context of the
project, the 'os' module is employed to efficiently read files from a designated folder,
facilitating the seamless processing of multiple images. By leveraging the 'os' library, the
project ensures robust file handling, allowing for the systematic execution of image
processing steps on diverse datasets.
3.3 Procedure
3.3.1 Collection of Image Data
The images, collected from Kaggle, were chosen to capture variations in road
environments. The selected images, "swr.jpg" and "syl.jpg," exhibit
white and yellow border lines, respectively, both featuring white dashed lines on the road.
3.3.2 Pre-processing of the Image using Grayscale Method and Gaussian Blur
The grayscale images and Gaussian-blurred versions provide a clearer foundation for
edge detection and lane identification.
Grayscale Method
The Grayscale method involves converting a full-color image into a single-channel
image, where each pixel represents the intensity of light. This process effectively
removes color information, simplifying the image while preserving essential details
related to brightness and darkness.
Gaussian Blur
Gaussian Blur is employed to smooth out high-frequency noise and details in the image
by applying a weighted average to each pixel and its neighboring pixels. This step helps
in reducing noise and enhancing the robustness of the subsequent image processing tasks.
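A minimal sketch of these two pre-processing steps with OpenCV follows; the 5x5 kernel size is a common default and an assumption here, not necessarily the value used in the project.

import cv2

def preprocess(image_bgr):
    # Convert the full-color image to a single intensity channel.
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # Smooth high-frequency noise with a 5x5 Gaussian kernel.
    return cv2.GaussianBlur(gray, (5, 5), 0)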
The pipeline then relies on a set of helper functions that define the region of interest and
subsequently highlight the detected lane lines on the original images.
region_of_interest(img, vertices):
This function establishes a region of interest (ROI) mask on the input image, defining
specific vertices that encompass the relevant portion of the road. The mask is then
applied to the original image, effectively eliminating extraneous information from the
analysis.
slope_lines(img, lines):
This function identifies and computes the average slopes of the detected lines. By
categorizing lines as left or right lanes based on their slope, a composite representation of
the left and right lane lines is generated. The resulting lanes are then drawn onto the input
image.
get_vertices(image):
This function defines the vertices of the region of interest based on the dimensions of the
input image. These vertices determine the polygonal area within which the pipeline
focuses its lane detection efforts.
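A condensed sketch of these three helpers is given below, assuming a conventional OpenCV implementation of this pipeline; the trapezoid proportions and drawing parameters are illustrative assumptions.

import cv2
import numpy as np

def region_of_interest(img, vertices):
    # Keep only the pixels inside the polygon defined by `vertices`.
    mask = np.zeros_like(img)
    cv2.fillPoly(mask, vertices, 255)
    return cv2.bitwise_and(img, mask)

def get_vertices(image):
    # Trapezoid roughly covering the lane area; proportions are assumptions.
    rows, cols = image.shape[:2]
    return np.array([[(cols * 0.10, rows * 0.95), (cols * 0.40, rows * 0.60),
                      (cols * 0.60, rows * 0.60), (cols * 0.90, rows * 0.95)]],
                    dtype=np.int32)

def slope_lines(img, lines):
    # Split Hough segments into left (negative slope) and right (positive slope)
    # lanes, average each side, and draw one composite line per lane.
    left, right = [], []
    for x1, y1, x2, y2 in lines.reshape(-1, 4):
        if x2 == x1:
            continue  # skip vertical segments
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        (left if slope < 0 else right).append((slope, intercept))
    y_bottom, y_top = img.shape[0], int(img.shape[0] * 0.6)
    for side in (left, right):
        if side:
            m, c = np.mean(side, axis=0)
            cv2.line(img, (int((y_bottom - c) / m), y_bottom),
                     (int((y_top - c) / m), y_top), (0, 255, 0), 5)
    return img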
3.3.7 Testing
The testing of the road lane detection model involves iterating through a list of image
files within a specified directory. For each image, the pipeline function is applied to
detect and highlight road lanes, and the input and output images are displayed side by
side for visual assessment. The code uses Matplotlib to generate a subplot with the
original image on the left and the output image (with detected road lanes) on the right.
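The testing loop can be sketched as follows; the folder name "test_images" and the pipeline function name are assumptions used for illustration.

import os
import cv2
import matplotlib.pyplot as plt

image_dir = "test_images"  # assumed folder name
for name in os.listdir(image_dir):
    image = cv2.cvtColor(cv2.imread(os.path.join(image_dir, name)), cv2.COLOR_BGR2RGB)
    result = lane_detection_pipeline(image)  # hypothetical pipeline function
    # Show input and output side by side for visual assessment.
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
    ax1.imshow(image)
    ax1.set_title("Input")
    ax2.imshow(result)
    ax2.set_title("Detected Lanes")
    plt.show()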
4. Self Driving Car
The main objective of this project is to train a model that can drive a car on its own. The
training data is collected with the Udacity Simulator, which records camera images from
the car as it is driven. A Convolutional Neural Network is trained on these images to
predict the car's steering angle, and the trained model is then tested in the Udacity
open-source simulator.
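As a concrete illustration, the sketch below defines a small Keras network for steering-angle regression, loosely following the well-known NVIDIA architecture; the layer sizes and the 66x200 input shape are assumptions, not necessarily the project's exact model.

from tensorflow.keras import layers, models

def build_model():
    model = models.Sequential([
        # Normalize pixel values from [0, 255] to [-0.5, 0.5]; input shape assumed.
        layers.Input(shape=(66, 200, 3)),
        layers.Rescaling(1.0 / 255, offset=-0.5),
        layers.Conv2D(24, 5, strides=2, activation="relu"),
        layers.Conv2D(36, 5, strides=2, activation="relu"),
        layers.Conv2D(48, 5, strides=2, activation="relu"),
        layers.Conv2D(64, 3, activation="relu"),
        layers.Flatten(),
        layers.Dense(100, activation="relu"),
        layers.Dense(50, activation="relu"),
        layers.Dense(1),  # single output: the predicted steering angle
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model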
4.1 Overview
4.1.1 Features
Autonomous Driving
The project introduces a sophisticated self-driving car system capable of autonomously
navigating through diverse environments. The integration of machine learning and
computer vision techniques empowers the vehicle to make informed decisions, ensuring
safe and efficient autonomous driving.
Steering Wheel Angle
An integral aspect of the project involves utilizing the angle of the steering wheel as a
key input for controlling the direction of the vehicle. This crucial information is
fundamental in maintaining accurate and secure navigation, allowing the self-driving car
to adapt to changing road conditions.
Camera Data
The self-driving car relies on data captured by the center camera to analyze its
surroundings in real-time. The visual input from the camera serves as a primary source of
information, enabling the car to perceive and respond to the environment dynamically.
This data-driven approach enhances the system's ability to make precise driving
decisions.
Open-Source Simulator
To rigorously test and validate the self-driving car system, the project integrates with an
open-source simulator. This simulator provides a virtual environment in which the
vehicle can navigate on its own, replicating real-world scenarios. Testing in the
simulator ensures the robustness and adaptability of the developed self-driving
technology.
4.2 Libraries Used
4.2.1. TensorFlow:
TensorFlow is an open-source machine learning framework developed by Google. It
provides a comprehensive set of tools and libraries for building and training machine
learning models, including neural networks.
4.2.2 Keras:
Keras is a high-level neural networks API written in Python and integrated into
TensorFlow. It simplifies the process of building and training deep learning models,
offering a user-friendly interface for constructing complex neural networks.
4.2.3. NumPy:
NumPy is a fundamental library for numerical computing in Python. It provides support
for large, multi-dimensional arrays and matrices, along with a collection of mathematical
functions to operate on these arrays. NumPy is essential for efficient data manipulation
and processing.
4.2.4. Matplotlib.pyplot
Matplotlib is a plotting library for the Python programming language. Matplotlib.pyplot,
a submodule of Matplotlib, is commonly used for creating various types of visualizations,
including line plots, bar plots, and scatter plots, which are useful for analyzing and
presenting data.
4.2.5. Pandas
Pandas is a powerful data manipulation and analysis library for Python. It offers data
structures like DataFrames, making it easy to handle and manipulate structured data.
Pandas is commonly used for data preprocessing and exploration in machine learning
projects.
4.2.6. OpenCV
OpenCV, or Open Source Computer Vision Library, is a computer vision and image
processing library. It provides a wide range of tools and functions for tasks such as image
manipulation, feature extraction, and object detection, making it valuable for tasks
involving visual data.
4.2.7. argparse
Argparse is a Python module for parsing command-line arguments. It simplifies the
process of writing user-friendly command-line interfaces for your programs, allowing
users to customize the behavior of the script.
4.2.8. base64
The base64 module provides functions for encoding and decoding data in base64 format.
It is commonly used for tasks like encoding binary data for transmission over text-based
protocols.
4.2.9. Datetime
The datetime module in Python provides classes for working with dates and times. It is
useful for manipulating and formatting timestamps in various applications.
4.2.10. Socket.IO
It is a real-time web library that enables bidirectional communication between web
clients and servers. The socketio library in Python facilitates the integration of
WebSocket communication in applications.
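A minimal sketch of this pattern, modeled on the drive script conventionally used with the Udacity simulator, is shown below; the 'telemetry' and 'steer' event names follow that simulator's protocol, while the fixed throttle value and the pre-loaded model are assumptions.

import base64
from io import BytesIO

import eventlet
import numpy as np
import socketio
from PIL import Image

sio = socketio.Server()

@sio.on("telemetry")
def telemetry(sid, data):
    # Decode the base64-encoded center-camera frame sent by the simulator.
    image = np.asarray(Image.open(BytesIO(base64.b64decode(data["image"]))))
    angle = float(model.predict(image[None, ...])[0, 0])  # model assumed loaded
    # Send the steering command back to the simulator; throttle is fixed here.
    sio.emit("steer", data={"steering_angle": str(angle), "throttle": "0.2"})

app = socketio.WSGIApp(sio)
eventlet.wsgi.server(eventlet.listen(("", 4567)), app)  # simulator's default port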
4.3 Procedure
4.3.4 Applying Data Augmentation to Increase the Size of the Training Dataset
In this step, data augmentation techniques are applied to artificially increase the size of
the training dataset. Two specific transformations, namely Random Rotation and Flipped
Image, are implemented to introduce variations in the dataset.
Random Rotation
The random_rotation function takes an image path, its corresponding label, and a
maximum rotation angle as parameters. It randomly rotates the image within the specified
angle range to simulate variations in camera perspectives. The resulting rotated image
and label are then returned.
Flipped Image
The flipped_data function takes an image directory, image name, and label as input. It
reads the image, flips it horizontally (creating a mirror image), and negates the label. This
operation is performed to introduce diversity in the dataset by presenting images from the
opposite perspective.
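A sketch of these two functions, based on the behavior described above, is given below; the exact signatures and the 15-degree default angle are assumptions.

import os
import random
import cv2

def random_rotation(image_path, label, max_angle=15):
    # Rotate the image by a random angle to simulate camera-perspective variation.
    image = cv2.imread(image_path)
    angle = random.uniform(-max_angle, max_angle)
    h, w = image.shape[:2]
    matrix = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(image, matrix, (w, h)), label

def flipped_data(image_dir, image_name, label):
    # Mirror the frame horizontally and negate the steering label to match.
    image = cv2.imread(os.path.join(image_dir, image_name))
    return cv2.flip(image, 1), -label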
The training process (model.fit) is executed with the train_generator for a specified
number of epochs (20 in this case), and the validation data (valid_generator) is used for
model evaluation during training. The training history is stored in the history variable for
later analysis or visualization.
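A sketch of that training call, assuming the model and the two generators have already been built, looks as follows:

# Assumes `model`, `train_generator`, and `valid_generator` are already defined.
history = model.fit(
    train_generator,
    epochs=20,                        # number of passes over the training data
    validation_data=valid_generator,  # evaluated at the end of each epoch
)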
Mean Absolute Error (MAE) Plot
The first plot illustrates the training and validation MAE values across different epochs.
The x-axis represents the epochs, while the y-axis represents the MAE values. Two lines
are plotted, one for the training set and one for the validation set, allowing the
observation of model performance on both datasets over time.
Loss Plot
The second plot focuses on the training and validation loss values. Similar to the MAE
plot, it visualizes how the loss changes during the training process. The x-axis represents
epochs, and the y-axis represents the loss values.
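Both plots can be produced from the stored history object roughly as follows, assuming the model was compiled with 'mae' as a metric:

import matplotlib.pyplot as plt

# Mean Absolute Error plot: training vs. validation MAE per epoch.
plt.plot(history.history["mae"], label="Training MAE")
plt.plot(history.history["val_mae"], label="Validation MAE")
plt.xlabel("Epoch")
plt.ylabel("MAE")
plt.legend()
plt.show()

# Loss plot: training vs. validation loss per epoch.
plt.plot(history.history["loss"], label="Training Loss")
plt.plot(history.history["val_loss"], label="Validation Loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend()
plt.show()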
5. Conclusion
6. Appendices
6.1 Appendix A: Outputs (Images)
Fig 6.1.3 Blurred
Fig 6.1.5 Region Selection
Fig 6.1.7 Steering Angles
Fig 6.1.9 Randomly Flipped
Fig 6.1.10 Model Graphs
6.2 Appendix B: Source Code
6.2.2 Self Driving Car
7. Future Scope
The future scope of these projects involves the seamless integration of AI with
electronics and IoT for real-world deployment. Implementing sensor technologies, such
as LiDAR and radar, can enhance the model's perception capabilities, ensuring robust
navigation in diverse environments. Integration with IoT facilitates real-time
communication between vehicles, traffic infrastructure, and central control systems,
contributing to safer and more efficient transportation. Additionally, deploying the model
in edge devices for on-board processing enhances responsiveness and reduces reliance on
external servers. This holistic approach paves the way for the development of advanced
autonomous driving systems that can revolutionize the automotive industry and urban
mobility.
8. References
1. Python: https://www.python.org/
2. Kaggle: https://www.kaggle.com/code/soumya044/lane-line-detect
3. Lane Detection Dataset: https://www.kaggle.com/datasets/thomasfermi/lane-detection-for-carla-driving-simulator/code
4. Python Libraries: https://www.geeksforgeeks.org/libraries-in-python/
5. GeeksforGeeks: https://www.geeksforgeeks.org/videos/top-10-most-popular-python-libraries/
6. Udacity Simulator GitHub: https://github.com/udacity/self-driving-car-sim
7. Udacity Simulator: https://www.instructables.com/Self-Driving-Car-With-Udacity-Simulator/
8. Computer Vision GFG: https://www.geeksforgeeks.org/computer-vision/
9. Kaggle: https://www.kaggle.com/code/aslanahmedov/self-driving-car-behavioural-cloning
10. Keras Documentation: https://www.tensorflow.org/guide/keras
11. Deep Learning: https://www.geeksforgeeks.org/introduction-deep-learning/
12. CNN: https://www.geeksforgeeks.org/introduction-convolution-neural-network/
13. DL Documentation: https://deeplearn.org/
14. CNN Javatpoint: https://www.javatpoint.com/pytorch-convolutional-neural-network
15. Matplotlib: https://www.javatpoint.com/matplotlib
16. A Paradigm of Complexity: https://www.google.co.in/books/edition/CNN/G86zMVD3yNYC?hl=en&gbpv=1&dq=cnn+&pg=PA1&printsec=frontcover
17. Detection of Lanes Research Paper: https://www.sciencedirect.com/science/article/abs/pii/S003132032030426X
18. Hough Transformation: https://ieeexplore.ieee.org/abstract/document/1593662
19. Canny Edge Detection: https://ieeexplore.ieee.org/abstract/document/6885761/
20. OpenCV: http://roswiki.autolabor.com.cn/attachments/Events(2f)ICRA2010Tutorial/ICRA_2010_OpenCV_Tutorial.pdf
21. W3Schools: https://www.w3schools.com/python/pandas/default.asp
22. NumPy: https://numpy.org/