Himanshu Gupta Configuration Manual
Himanshu Gupta
Student ID: x18203302
School of Computing
National College of Ireland
I hereby certify that the information contained in this (my submission) is information
pertaining to research I conducted for this project. All information other than my own
contribution will be fully referenced and listed in the relevant bibliography section at the
rear of the project.
ALL internet material must be referenced in the bibliography section. Students are
required to use the Referencing Standard specified in the report template. To use another
author's written or electronic work is illegal (plagiarism) and may result in disciplinary
action.
Signature:
□ Attach a completed copy of this sheet to each project (including multiple copies).
□ Attach a Moodle submission receipt of the online project submission to each project (including multiple copies).
□ You must ensure that you retain a HARD COPY of the project, both for your own reference and in case a project is lost or mislaid. It is not sufficient to keep a copy on computer.
Assignments that are submitted to the Programme Coordinator office must be placed into the assignment box located outside the office.
Date:
Penalty Applied (if applicable):
Configuration Manual
Himanshu Gupta
x18203302
1 Introduction
This manual presents the system configuration required to run the submitted project. It
contains all the packages, libraries, and programming code written and used during the
implementation of the project: "Trash Image Classification System using Machine Learning
and Deep Learning Algorithms".
2 System Configurations
2.1 Hardware
The following hardware configuration was used:
• RAM: 8 GB
• System type: Macintosh, 64-bit
• Processor: Dual-Core Intel Core i5
• CPU: 1.8 GHz
• Storage: 1 TB HDD
• GPU: Intel HD Graphics 6000, 1536 MB
2.2 Software
• PyCharm: An IDE that is mainly used to run the Python code. It is available in
two editions, Professional and Community; for this project the Community edition
was downloaded from the JetBrains website1.
• Google Colaboratory: Also known as Colab, this is an online cloud service
that provides an environment to run Jupyter notebooks free of charge. The basic
packages for machine learning problems, such as TensorFlow, Keras, and pandas, are
already installed in the environment, and the user only needs to import them as
required. However, to run a specific version of a package, that version should be
mentioned in the notebook before calling its functions. Three modes are provided
to run the notebook: None, GPU, and TPU; the GPU setting was used to execute
the notebooks. Authenticated Google Drive access is necessary to access Colab.
Sometimes the GPU is available only for limited usage per day, and in that case
None can be selected in the settings, as shown in Figure 1. A minimal sketch of the
runtime setup is given after Figure 1.
1https://www.jetbrains.com/pycharm/
Figure 1: Settings: Google Colaboratory
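The following is a minimal sketch, assuming the project data is stored on Google Drive, of how the Drive mount and the GPU availability check can be performed in a Colab notebook; it is illustrative only and not the project's exact cells.

# Minimal Colab sketch (assumed): mount Google Drive for data access and
# check whether a GPU runtime is currently available.
from google.colab import drive
import tensorflow as tf

drive.mount('/content/drive')                  # prompts for authenticated access
print(tf.config.list_physical_devices('GPU'))  # empty list when 'None' is selected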
3 Project Development
The main steps in this research development are data pre-processing (data downloading,
data analysis, new data structure creation, removing unwanted columns), conversion
from images to NumPy arrays, creating dummy variables, data splitting, data array reshaping
for each model, and data normalisation at several stages. Several pieces of code have been written
for successfully performing and evaluating all the experiments, such as: creating the baseline
Sequential Keras, ResNet-50, and VGG-19 neural networks, adding several layers with different
weights and defining training parameters, and selecting hyper-parameters for the XGBoost
model. Code was also written for running all the models at different k-folds for cross-validation,
changing the sample size and number of epochs, creating the classification matrix, computing
training and testing accuracy, and plotting evaluation graphs.
2http://tacodataset.org/
3https://github.com/pedropro/TACO/blob/master/data/annotations.json
4https://github.com/pedropro/TACO/blob/master/download.py
Figure 2: Annotation file path
Figure 4: First part of the script, which creates a data frame with the selected five categories
and removes duplicate values from the data
In continuation of the above script, after creating the initial data frame, augmented data
is generated through the script:
• The free, open-source Python imaging library PIL is imported.
• Image cropping, rotation, Gaussian blur, and horizontal flip functions are applied to
obtain these four types of images.
• Images belonging to the 'Drink Can' and 'Plastic Straw' categories were far fewer than
those of the other categories, and therefore another set of vertically flipped images was
generated for these two categories.
The code snippet is shown in Figure 7. The generated data is further divided into train and
test datasets using scikit-learn, an open-source machine learning library.
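As a rough illustration of these augmentation steps (not the exact script shown in Figure 7), a minimal sketch using PIL might look as follows; the file naming, crop margins, and rotation angle are assumptions.

# Rough sketch (assumed) of the PIL-based augmentation: crop, rotation,
# Gaussian blur, horizontal flip, plus an extra vertical flip for the two
# under-represented categories ('Drink Can' and 'Plastic Straw').
import os
from PIL import Image, ImageFilter, ImageOps

def augment_image(src_path, out_dir, extra_vertical_flip=False):
    img = Image.open(src_path).convert("RGB")
    name = os.path.splitext(os.path.basename(src_path))[0]
    w, h = img.size
    variants = {
        "crop": img.crop((w // 10, h // 10, w - w // 10, h - h // 10)),
        "rot": img.rotate(30, expand=True),
        "blur": img.filter(ImageFilter.GaussianBlur(radius=2)),
        "hflip": ImageOps.mirror(img),
    }
    if extra_vertical_flip:
        variants["vflip"] = ImageOps.flip(img)
    os.makedirs(out_dir, exist_ok=True)
    for suffix, variant in variants.items():
        variant.save(os.path.join(out_dir, f"{name}_{suffix}.jpg"))

The resulting images are then split into train and test sets, for example with scikit-learn's train_test_split(images, labels, test_size=0.2, random_state=42).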
Figure 5: Initial data frame saved into a CSV file, which is used by the various base models
The final data has been saved in CSV file format in the main directory folder, and the data is
fetched from there while implementing the models.
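A minimal sketch of this save-and-fetch step with pandas is shown below; the file name and column names are assumptions, not the actual ones used in the project.

# Hypothetical example of saving the prepared data frame and fetching it back.
import pandas as pd

final_df = pd.DataFrame({"file_path": ["img_001.jpg"], "category": ["Drink Can"]})
final_df.to_csv("final_data.csv", index=False)   # save once after preparation
data = pd.read_csv("final_data.csv")             # fetch again at model training time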
Figure 9: Required Libraries
Figure 10: Keras Sequential model baseline structure built from scratch
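For orientation, a small illustrative Sequential baseline is sketched below; the layer sizes and the 64x64x3 input shape are assumptions and do not reproduce the exact architecture shown in Figure 10.

# Illustrative Keras Sequential baseline for five trash categories (assumed
# layer sizes, not the exact model from Figure 10).
from tensorflow.keras import Sequential, layers

baseline = Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(64, 64, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(5, activation="softmax"),   # five trash categories
])
baseline.compile(optimizer="adam",
                 loss="categorical_crossentropy",
                 metrics=["accuracy"])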
Figure 11: Keras EarlyStopping callback imported and used to stop the training when it is
not improving, to save computational and time cost and to reduce overfitting of the model
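A minimal sketch of such a callback is given below; the monitored metric and patience value are assumptions rather than the settings used in Figure 11.

# EarlyStopping callback sketch (assumed settings): stop training when the
# validation loss stops improving and keep the best weights.
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor="val_loss", patience=5,
                           restore_best_weights=True)
# Passed to training via: model.fit(..., callbacks=[early_stop])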
4.2 Experiments with ResNet-50 Model
The code for the ResNet-50 model initialisation is shown in Figure 15. The ResNet weights,
which were trained on ImageNet, are downloaded during the model call. The first layer of
the model is not trainable because it has already been trained on ImageNet. The summary of
the model is shown in Figure 16. The ResNet-50 model is imported from the Keras applications
module, which loads the ResNet-50 object during the initialisation call.
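A hedged sketch of this transfer-learning setup is shown below; the input shape and the layers added on top of the frozen base are assumptions, not the exact code of Figure 15.

# ResNet-50 transfer-learning sketch (assumed head layers): load ImageNet
# weights, freeze the pre-trained base, and add a small classifier on top.
from tensorflow.keras import Sequential, layers
from tensorflow.keras.applications import ResNet50

base = ResNet50(weights="imagenet", include_top=False,
                input_shape=(224, 224, 3), pooling="avg")
base.trainable = False            # pre-trained layers are not retrained

model = Sequential([
    base,
    layers.Dense(128, activation="relu"),
    layers.Dense(5, activation="softmax"),   # five trash categories
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])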
After plotting all the evaluation plots, the confusion matrix is created using the scikit-learn
metrics confusion-matrix plot. The code for plotting the confusion matrix is
adapted from5.
5https://analyticsindiamag.com/transfer-learning-for-multi-class-image-classification-using-deep-convolutional-neural-network/
Figure 17: Confusion Matrix Code
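As a rough indication of what the code in Figure 17 produces, a minimal scikit-learn confusion-matrix sketch is given below; the label arrays are placeholders.

# Confusion-matrix sketch with scikit-learn (placeholder labels, not the
# project's predictions).
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

y_true = np.array([0, 1, 2, 3, 4, 1])   # placeholder true labels
y_pred = np.array([0, 1, 2, 2, 4, 1])   # placeholder predicted labels

cm = confusion_matrix(y_true, y_pred)
ConfusionMatrixDisplay(cm).plot(cmap="Blues")
plt.show()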
Figure 21 shows the code for converting the input train and test arrays into features
according to the format required by VGG-19.
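A hedged sketch of this conversion is given below; the array shape, the pooling setting, and the use of predict for feature extraction are assumptions rather than the exact code of Figure 21.

# VGG-19 feature-conversion sketch (assumed shapes): preprocess the input
# arrays into the format VGG-19 expects and extract features with the
# pre-trained base.
import numpy as np
from tensorflow.keras.applications.vgg19 import VGG19, preprocess_input

X_train = np.random.rand(10, 224, 224, 3) * 255     # placeholder image array
X_train_prep = preprocess_input(X_train.copy())     # VGG-19 input format

vgg_base = VGG19(weights="imagenet", include_top=False, pooling="avg")
train_features = vgg_base.predict(X_train_prep)     # extracted feature vectors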
Figure 20: Trainable and Non-trainable parameters of VGG-19 Model
Figure 21: Converting to VGG-19 input features
Figure 23: Data conversion to CSV before fetching for model training; the DMatrix
optimisation method is applied to the input dataset
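A minimal sketch of the DMatrix step is given below; the feature and label arrays and the training parameters are placeholders, not the values used in Figure 23.

# XGBoost DMatrix sketch (placeholder data): wrap the feature matrix and
# labels in XGBoost's optimised internal data structure and train a booster.
import numpy as np
import xgboost as xgb

X = np.random.rand(100, 512)             # placeholder feature matrix
y = np.random.randint(0, 5, size=100)    # placeholder labels (5 classes)

dtrain = xgb.DMatrix(X, label=y)
params = {"objective": "multi:softprob", "num_class": 5, "max_depth": 6}
booster = xgb.train(params, dtrain, num_boost_round=50)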
Figure 25: XGBoost model evaluation on unseen data for obtaining the classification
report
Figure 26: Log-loss calculation using the XGBoost cross-validation method
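A hedged sketch of such a cross-validated log-loss computation is shown below; the data, the number of folds, and the boosting rounds are placeholders rather than the settings used in Figure 26.

# XGBoost cross-validation sketch (placeholder data): compute the multi-class
# log loss (mlogloss) with 5-fold cross-validation.
import numpy as np
import xgboost as xgb

X = np.random.rand(100, 512)
y = np.random.randint(0, 5, size=100)
dtrain = xgb.DMatrix(X, label=y)

params = {"objective": "multi:softprob", "num_class": 5}
cv_results = xgb.cv(params, dtrain, num_boost_round=50,
                    nfold=5, metrics="mlogloss", seed=42)
print(cv_results["test-mlogloss-mean"].iloc[-1])   # final-round mean log loss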