

DSpace Institution

DSpace Repository http://dspace.org


Computer Science thesis

2021-06

MANGO DISEASE DETECTION USING MACHINE LEARNING TECHNIQUE

SHWAGA, ABAY

http://ir.bdu.edu.et/handle/123456789/12657
Downloaded from DSpace Repository, DSpace Institution's institutional repository
BAHIR DAR UNIVERSITY
BAHIR DAR INSTITUTE OF TECHNOLOGY
SCHOOL OF RESEARCH AND POSTGRADUATE STUDIES
FACULTY OF COMPUTING

MANGO DISEASE DETECTION USING MACHINE LEARNING TECHNIQUE

BY

SHWAGA ABAY

JUNE, 2021
BAHIR DAR, ETHIOPIA
Mango Disease Detection Using Machine Learning Technique

By
Shwaga Abay Ashebr

A Thesis Submitted to the School of Research and Graduate Studies of Bahir Dar
Institute of Technology, BDU, in Partial Fulfilment of the Requirements for the Degree
of Master of Science in Software Engineering in the Faculty of Computing

Advisor: Seffi Gebeyehu (Asst. Professor)

June, 2021
Bahir Dar, Ethiopia

©2021
SHWAGA ABAY ASHEBR
Mango Disease Detection Using Machine Learning Technique
ALL RIGHTS RESERVED

ACKNOWLEDGEMENT
First and foremost, praise and thanks to the Almighty God for His blessing, support, and
protection throughout my research work. I would also like to thank the Ever-Virgin,
St. Mary, our Lord's mother. She has been my strength whenever I grew tired along the
hard road, and she helped me complete this study.

I would like to express my gratitude to my advisor, Mr. Seffi Gebeyehu (Asst. Professor),
for his valuable guidance throughout this research. It was a great honor to work and study
under his supervision. I submit my heartiest gratitude to Mr. Abreham Debasu (Asst.
Professor) for his encouragement and constructive comments. Without his guidance and
diligent help, this thesis would not have been possible.

I’m deeply indebted to my beloved Husband Mr. Girma Negash, my beautiful-hearted


mother Mrs. Alganesh Tensu, and my caring father Mr. Abay Ashebr for being my
inspiration. You’re the reason why I keep pushing. My special appreciation to my friends
for their relevant assistance and sincere help in completing this thesis.

My sincere thanks also go to the staff of the Weramit fruit and vegetable center, Bahir Dar,
who helped me during data collection and labelling. Special thanks go to Mr. Sisay and
Mr. Adam for their kind support in sharing mango leaf pictures and providing constructive
ideas about mango leaf disease and insect pest identification.

Last but not least, I also want to deliver my special thanks to all the people who along the
way believed in me.

ABSTRACT
Mango (Mangifera indica) is a fruit crop of great significance that grows in different
agro-ecologies around the world. Mangoes are good sources of vitamins and minerals.
However, its productivity is nowadays very limited, since the crop is attacked by different
diseases and pests. Thus, to increase mango fruit quality and productivity, it is crucial
and feasible to detect diseases and insect pests at an early stage. In this study, we
designed and developed a mango leaf disease identification mechanism using machine
learning (ML) techniques. Healthy and diseased mango leaf images were captured
manually from main production areas in the Amhara Region, namely the Weramit fruit
and vegetable research and training sub-center and Bahir Dar city. As implementation
tools, Python in an Anaconda Spyder working environment and Google Colaboratory
were used. To enhance the dataset, different pre-processing techniques (image resizing,
histogram adjustment, noise removal, and image augmentation) were applied using the
OpenCV library. To enhance classification performance and achieve the objective of this
study, different segmentation techniques (k-means, Mask R-CNN, and a combination of
the two) were used. After the pre-processing and segmentation steps, relevant features of
the mango leaf images were extracted using a CNN. Classification models were then built
on the extracted features using CNN, SVM, and CNN-SVM classifiers. For these
classification models, three activation functions (Tanh, ReLU, and Leaky ReLU) were
applied to achieve better classification accuracy. The experiments showed that, using
segmented images and the Leaky ReLU activation function, these classifiers achieved
significant classification performance, with accuracies of 97.62% (CNN), 98.01% (SVM),
and 99.78% (CNN-SVM), respectively.

Keywords: Machine learning, Mask R-CNN, SVM, disease detection

Table of Contents
DECLARATION ...................................................................................................................
ACKNOWLEDGEMENT ...................................................................................................................... v
ABSTRACT ........................................................................................................................................ vi
LIST OF FIGURES ............................................................................................................................... x
LIST OF TABLES .................................................................................................................................xi
ABBREVIATIONS ..............................................................................................................................xii
CHAPTER ONE .................................................................................................................................. 1
1. INTRODUCTION ........................................................................................................................ 1
1.1. Background of the Study.................................................................................................. 1
1.2. Statement of the Problem ............................................................................................... 3
1.3. Objectives of the Study .................................................................................................... 5
1.3.1. General Objective .................................................................................................... 5
1.3.2. Specific Objective ..................................................................................................... 5
1.4. Methodologies ................................................................................................................. 5
1.4.1. Literature Review..................................................................................................... 5
1.4.2. Data Collection and Datasets Preparation ................................................................ 5
1.4.3. Data Pre-processing and Segmentation.................................................................... 5
1.4.4. Feature Extraction and Modelling............................................................................ 6
1.4.5. Performance Evaluation Technique ......................................................................... 6
1.4.6. Implementation Tool ................................................................................................ 6
1.5. Scope and limitation of the Study .................................................................................... 7
1.6. Significance of the Study .................................................................................................. 7
1.6.1. Empirical Contribution ............................................................................................ 7
1.6.2. Methodological/Scientific Contribution .................................................................. 7
1.7. Organization of the Study ................................................................................................ 8
CHAPTER TWO ................................................................................................................................. 9
2. LITERATURE REVIEW ................................................................................................................ 9
2.1. Introduction ..................................................................................................................... 9
2.2. Major mango diseases and insect pests ........................................................................ 10
2.2.1. White mango scale insect (Aulacaspis tubercularis Newstead (Hemiptera:
Diaspididae)) .......................................................................................................................... 10

2.2.2. Powdery Mildew (Oidium mangiferae) ................................................................. 11
2.2.3. Anthracnose (Colletotrichum gloeosporioides) ..................................................... 12
2.3. Plant disease and pest detection using image processing and deep learning .............. 13
2.3.1. Convolutional Neural Networks ............................................................................ 14
2.3.2. Support vector machine ......................................................................................... 19
2.4. Image pre-processing..................................................................................................... 22
2.4.1. Image Resizing....................................................................................................... 22
2.4.2. Histogram Equalization.......................................................................................... 24
2.4.3. Image noise types and de-noising techniques ........................................................ 27
2.4.4. Image Augmentation .............................................................................................. 30
2.5. Segmentation ................................................................................................................. 30
2.6. Feature Extraction.......................................................................................................... 36
2.7. Related Work ................................................................................................................. 37
2.8. Summary ........................................................................................................................ 38
CHAPTER THREE ............................................................................................................................. 40
3. METHODOLOGY ..................................................................................................................... 40
3.1. Introduction ................................................................................................................... 40
3.2. Architectural Framework of the proposed model ......................................................... 41
3.3. Image Acquisition........................................................................................................... 42
3.4. Image Pre-processing..................................................................................................... 43
3.4.1. Image Resizing....................................................................................................... 43
3.4.2. Histogram Equalization.......................................................................................... 44
3.4.3. Image de-noising .................................................................................... 46
3.4.4. Data Augmentation ................................................................................................ 47
3.5. Image Segmentation ...................................................................................................... 48
3.5.1. K-means Segmentation .......................................................................................... 48
3.5.2. Mask R-CNN ......................................................................................................... 49
3.6. Classification .................................................................................................................. 50
3.6.1. Building the Model ................................................................................................ 50
3.6.2. Compiling the Model ............................................................................................. 51
3.6.3. Training and Testing the Model ............................................................................. 51
CHAPTER FOUR .............................................................................................................................. 53

4. EXPERIMENT, RESULT AND DISCUSSION ................................................................ 53
4.1. Introduction ................................................................................................................... 53
4.2. Experimental Setup ........................................................................................................ 53
4.2.1. Dataset ........................................................................................................................... 53
4.2.2. Implementation ............................................................................................................. 54
4.3. Result and Discussion..................................................................................................... 54
CHAPTER FIVE ................................................................................................................................ 58
5. CONCLUSION AND RECOMMENDATION ............................................................................... 58
5.1. Conclusion ...................................................................................................................... 58
5.2. Recommendation........................................................................................................... 59
References ..................................................................................................................................... 60

LIST OF FIGURES
Figure 2. 1 Sample mango images ...................................................................... 9
Figure 2. 2: white mango scale ......................................................................................... 11
Figure 2. 3 Mango powdery mildew ................................................................................. 12
Figure 2. 4 Mango anthracnose ......................................................................................... 13
Figure 2. 5 An overview of convolutional neural network (CNN) architecture and the
training process. ................................................................................................................ 15
Figure 2. 6 Convoluting a 5x5x1 image with a 3x3x1 kernel to get a 3x3x1 convolved
feature. .............................................................................................................................. 16
Figure 2. 7 Relu operation ................................................................................................ 17
Figure 2. 8 max-pooling and average-pooling operations ................................................ 18
Figure 2. 9 SVM hyperplanes ........................................................................................... 19
Figure 2. 10 SVM optimal hyperplane ............................................................................. 20
Figure 2. 11 Image resizing operation using nearest neighbor interpolation ..... 23
Figure 2. 12 Image resizing operation using bilinear interpolation ................... 23
Figure 2. 13 Histogram and its Equalized Histogram for H.E. ......................................... 25
Figure 2. 14 BI-histogram Equalization Method .............................................................. 26
Figure 2. 15 Image segmentation types ............................................................................ 31
Figure 2. 16 Mask R-CNN ................................................................................................ 34

Figure 3. 1: Architectural framework of the proposed model .......................................... 41


Figure 3. 2: Results of resized mango leaf images. .......................................................... 44
Figure 3. 3: Typical results of histogram enhanced mango leaf image ............................ 45
Figure 3. 4: Typical results of the median filtered mango leaf image .............................. 46
Figure 3. 5: Original mango leaf image ............................................................................ 47
Figure 3. 6: Typical results of augmented mango leaf images ......................................... 48
Figure 3. 7: Result of k-means segmentation on mango leaf image .................. 49

Figure 4. 1: Training and validation accuracy of CNN-SVM on raw mango leaf images 55
Figure 4. 2: Training and validation loss of CNN-SVM on raw mango leaf images ........ 55
Figure 4. 3: validation accuracy of CNN-SVM on preprocessed mango leaf images ...... 56
Figure 4. 4 validation loss of CNN-SVM classifier on preprocessed dataset ................... 56

LIST OF TABLES
Table 2. 1: Summary of related works .............................................................................. 39
Table 4. 1: Dataset description ............................................................................ 53
Table 4. 2: CNN classification analysis with raw mango leaf images.............................. 54
Table 4. 3: Classification analysis of CNN model with different Activation functions on a
preprocessed image dataset ............................................................................................... 57
Table 4. 4: Classification analysis of SVM + RBF model with different Activation
functions on a preprocessed image dataset ....................................................................... 57
Table 4. 5: Classification analysis of the combined (CNN + SVM) model with different
Activation functions on a preprocessed image dataset ..................................................... 57

ABBREVIATIONS
ANN Artificial Neural Network

BPAHE Brightness Preserving Adaptive Histogram Equalization


CNN Convolutional Neural Network
CV Computer Vision
Mask R-CNN Masked Region-Based Convolutional Neural Network
ML Machine Learning
MRKT Modified Rotation Kernel Transformation
RBF Radial Basis Function
SGD Stochastic Gradient Descent
SNNP Southern Nations Nationalities and People
SVM Support Vector Machine

CHAPTER ONE
1. INTRODUCTION
1.1. Background of the Study
According to (Tewodros Bezu, Kebede Woldetsadik , & Tamado Tana, 2014), fruit crops
play an important role in the national food security of people. They are generally
delicious and highly nutritious, rich mainly in vitamins and minerals that can balance
cereal-based diets. Fruits supply raw materials for local industries and could be sources of
foreign currency. Moreover, the development of the fruit industry will create employment
opportunities, particularly for farming communities. In general, Ethiopia has great
potential and encouraging policy to expand fruit production for fresh market and
processing both for domestic and export markets. Besides, fruit crops are friendly to
nature, sustain the environment, provide shade, and can easily be incorporated into any
agroforestry programs.

In the study conducted by (Jurgen, 2003), it is stated that mango, because of its attractive
appearance and the very pleasant taste of selected cultivars, is claimed to be the most
important fruit of the tropics and has been touted as the 'king of all fruits'. This fruit contains
almost all the known vitamins and many essential minerals. Its protein content is
generally a little higher than that of other fruits, except for the avocado. Mangoes are also a
fairly good source of thiamine and niacin and contain some calcium and iron.

According to (FAO, 2019), Ethiopia is endowed with diverse agro-ecologies that are
favourable for growing a variety of fruit and vegetable crops. The agro-ecological conditions
and low labor cost, as well as proximity to export market destinations in the neighboring
countries, the Middle East, and Europe, give the country a comparative advantage.
More than 5.8 million smallholder farmers and farmers' cooperatives produce banana,
mango, potato, and tomato in Ethiopia. Banana and mango are grown as major fruit
crops and represent a higher share of national earnings from the horticulture
sector. Mango is grown widely in several parts of the region. The production area of fruit
crops in Ethiopia is estimated at 104,421.80 ha (CSA, 2017/18). Mango ranks third in
area of production (15,373.04 ha) next to banana (59,298.19 ha) and avocado
(18,021.13 ha). In production, it ranks second (1,049,807.79 quintals) next to banana

(4,936,022.34 quintals). In 2017, the fresh mango production level was estimated at
1049.82 tons (CSA, 2017/18). Deke and Wonjeta Kebeles of Bahir Dar Zuria wereda in the
Amhara Region produce about 8,443,400 tons of the Alphonso mango variety every year.
However, mango production has been severely affected in recent years by insect pests
such as the white mango scale and by mango diseases (FAO, 2019).
Moreover, in Ethiopia the major reasons for low production are damage by local
and invasive pests and diseases such as powdery mildew and anthracnose, which lead to
huge production losses. The white mango scale is one of the major invasive scale insects
and causes serious damage to mango plants. It injures mangoes by feeding on the plant
sap through leaves, branches, and fruits, causing defoliation, drying up of young twigs,
and poor blossoming, thereby affecting the commercial value of the fruits and their export
potential. This is especially true for late cultivars, where the pest causes conspicuous pink
blemishes around its feeding sites (Anjulo, 2019).

Along with the technologies invented in the past few decades, image processing and
machine learning have gained important applications in agriculture. Therefore,
implementing those technologies in this area will be of paramount importance for
identifying and controlling the prevalence of mango diseases and insect pests.

1.2. Statement of the Problem
Agriculture production is something that the economy relies on heavily. This is one of the
reasons why disease identification in plants plays an important role in agriculture, since
plant diseases are very common. If proper care is not taken, plants suffer serious damage
and the quality, quantity, and productivity of the produce are affected (Dessalegn, Assefa,
Derso, & Tefera, 2014).
However, mango production has been declining sharply in recent years because of different
diseases and insect pests. According to (Tewodros Bezu, Kebede Woldetsadik, &
Tamado Tana, 2014), the major mango production constraints are water shortage or
erratic rainfall followed by insect pest problems. Powdery mildew and anthracnose are
also the major diseases that lead to the decline of mango fruit production. Lack of
knowledge and recommended production practices like nutrition, pruning, pest
management, and post-harvest losses are also noted as major problems of the cultivators.
Another study conducted by (Dessalegn, Assefa, Derso, & Tefera, 2014) shows that the
adoption of improved mango production practices by farmers largely depends on the
availability of knowledgeable extension workers in the area.

It is valuable to detect plant disease through an automated technique, as it reduces the
large monitoring burden on large crop farms and allows farmers to identify and notice
disease symptoms themselves at a very early stage, i.e. when they first appear on plant
leaves, without the need for an expert (Grant, 2020).

Although many studies have addressed the problems stated above, there is still a gap in
designing and developing enhanced imaging and ML techniques to detect mango diseases
and insect pests, especially in Ethiopia.

In the findings of (Ullagaddi & Raju, 2017), modified rotation kernel transformation
(MRKT) based directional feature extraction was applied to identify black spots on
mango fruit and leaves, and an ANN was used for recognition. Although they applied
MRKT and ANN to identify black spots, the MRKT feature vector is suitable for only one
class. In the proposed system, by contrast, a CNN was used to extract relevant features,
instead of handcrafted features, for more than one class. Moreover, a hybrid of SVM and
CNN techniques was considered for classification to reduce the computational cost and
enhance the identification performance of the model.

In the study conducted by (Arivazhagan & Ligi, 2018), a convolutional neural network
was applied to identify mango leaf diseases. They used a dataset of 1200 images of
diseased and healthy mango leaves, of which 600 were for training and 600 for testing. In
another study, (Srdjan Sladojevic, Marko Arsenovic, Andras Anderla, Dubravko Culibrk,
& Darko Stefanovic, 2016) tested a CNN on 13 different types of plant diseases. In both
studies, a CNN was used for both feature extraction and classification of plant diseases
and achieved good results; a key reason for the good results is that they used augmented
images to maximize the size of their datasets. However, a CNN takes more training time
for classification as the dataset grows. To overcome this problem, we considered different
pre-processing and segmentation techniques to minimize the time the CNN takes to
extract features from raw images. Besides, in this study, hybrids of different machine
learning techniques with their respective learning and kernel functions are used to
enhance the identification performance of the model. To this end, the following research
questions are answered by this thesis work:

✓ Which image segmentation technique is preferable for reducing the computational
cost of the proposed model?
✓ How can the performance of the proposed model be enhanced?
✓ How can an enhanced machine learning model be designed?
✓ What is the performance of the proposed model?

1.3. Objectives of the Study

1.3.1. General Objective


The general objective of this study is to design and develop a Machine Learning
technique to detect mango diseases.

1.3.2. Specific Objective


The specific objectives of the study:
✓ To identify a preferable image segmentation technique that reduces the
computational cost of the proposed model
✓ To identify a preferable learning function that enhances the performance of the
proposed model
✓ To design and develop an enhanced mango disease and insect pest detection
model
✓ To evaluate the performance of the proposed model.
1.4. Methodologies
To achieve the aforementioned objectives, the following methods, tools, and techniques
were used throughout the research work, as described in the sections below.

1.4.1. Literature Review


To achieve the above-stated objectives, different works of literature on contemporary
developments in image processing and machine learning technologies related to plant
disease and insect pest identification were reviewed.

1.4.2. Data Collection and Datasets Preparation


In this study, mango leaf images were used as the dataset. The datasets were collected
from the Amhara Region's main mango production areas, namely the Weramit fruit and
vegetable research and training sub-center and Bahir Dar city. Those areas were selected
purposively based on the severity of mango insect pests and diseases and the convenience
of data collection.

1.4.3. Data Pre-processing and Segmentation


After the data was collected, pre-processing and segmentation were the succeeding tasks.
The aim of pre-processing is to remove noise and simplify the subsequent steps. Since the
acquired mango leaf data were messy and came from different sources, they needed to be
standardized and cleaned up. Hence, different pre-processing steps (histogram adjustment,
noise removal, image resizing, and data augmentation) were conducted to reduce the
computational complexity and increase the performance of the proposed model.

In the segmentation step, two segmentation techniques, k-means and Mask R-CNN, were
applied to the pre-processed mango leaf images to extract the relevant leaf lesions as
regions of interest and simplify the feature extraction process.

1.4.4. Feature Extraction and Modelling


After the pre-processing and segmentation steps, the researchers focused on two main
tasks. In the first task, a CNN was used for both feature extraction and disease
detection/classification. In the second task, features were extracted using the CNN, and
two other classification models, SVM and a hybrid CNN-SVM, were trained to identify
the diseases and insect pests based on the extracted features.

1.4.5. Performance Evaluation Technique


In this study, a confusion matrix was used to evaluate the final generalization
performance (accuracy) of the trained classification models.
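Such a confusion-matrix evaluation can be sketched with scikit-learn. The four class labels (healthy, white mango scale, powdery mildew, anthracnose) follow the scope of the study, while the label values themselves are toy data:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

# Toy labels for four classes: 0 = healthy, 1 = white mango scale,
# 2 = powdery mildew, 3 = anthracnose (values are illustrative only)
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3, 3, 2])
y_pred = np.array([0, 0, 1, 2, 2, 2, 3, 3, 1, 2])

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)
print(cm)

# Accuracy is the trace of the matrix over the total count
print(accuracy_score(y_true, y_pred))  # → 0.8
```

Per-class precision and recall can be read off the same matrix (column sums and row sums respectively), which is how the generalization performance of each trained classifier would be reported.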

1.4.6. Implementation Tool


Python on the Windows platform was used as the implementation tool for this study.
Python is a great choice for machine learning and computer vision, as it is a very
flexible, platform-independent, high-level programming language with an interactive
environment used by millions of engineers and scientists worldwide. As ML requires
continuous and batch data processing, Python's built-in libraries let us access, handle,
and transform data conveniently and effectively. Its low entry barrier also allows one to
quickly pick up Python and start using it for ML without spending too much effort on
learning the language (Luashchuk, 2019).

1.5. Scope and limitation of the Study
Currently, plant disease detection and identification is a wide research area. The main
focus of this study was designing and developing an enhanced imaging and machine
learning technique to identify mango leaf diseases and insect pests. Only mango leaf
images were used as a dataset, collected from mango production areas of the Amhara
Region, in particular Bahir Dar city and the Weramit fruit and vegetable research and
training sub-center. Only white mango scale (an insect pest), anthracnose, and powdery
mildew were considered for identification. These diseases and insect pests were selected
purposively based on the severity of the mango production losses they cause and the
availability of their image data.

1.6. Significance of the Study


This study has significance from various directions:

1.6.1. Empirical Contribution


The first and most significant contribution of this study is that it will provide great
support for farmers and mango cultivators to identify mango insect pests and diseases
easily at a very early stage, without the need for agricultural experts.

It also helps agricultural consulting experts by reducing the time and labor they spend
visiting mango production farms to support mango fruit cultivators in identifying insect
pests and diseases and recommending the appropriate treatments.

This study will also make a valuable contribution to increasing mango fruit production
and boosting the country's economy.

1.6.2. Methodological/Scientific Contribution


The main scientific contribution of this research work is described as follows:
1. We have proposed a CNN model for mango leaf disease detection that has a small
number of parameters, learns faster, and reduces the computational
complexity of classification compared with other state-of-the-art CNN
architectures.
2. A CNN-SVM combination algorithm was designed for enhanced mango leaf
disease detection.

3. We have proposed a combination of K-Means and Mask R-CNN for segmenting
mango leaf lesions. It segments the region of interest effectively, which
increases the performance of our model and reduces training time.
4. We improved mango leaf disease detection performance by studying the effect of
different activation functions and by adding different CNN components, such as
convolution, pooling, batch normalization, and dropout layers, to our proposed
model.
1.7. Organization of the Study
The rest of the thesis is organized as follows:

Chapter 2 presents the literature review, which evaluates prior findings on mango leaf
disease and image classification, including image processing viewpoints, hand-crafted
feature extraction, and machine learning. There is also a run-down of comparable studies
that feed features extracted by a CNN to machine learning classifiers such as SVM,
which was found to be a convincing basis for conducting this research.

Chapter 3 presents the methodology for the proposed work performed in this study. This
includes the approach taken to conduct the research, the software libraries and tools used
in the architecture of the proposed model, and the methodology flow. The training
procedure is discussed here together with the proposed classification model.

Chapter 4 covers the results and discussion. It describes the experimental setup used to
train the classification model and offers a breakdown of the experimental results, the
model evaluation, and a discussion of the results in light of the research questions. In
addition, a comparative analysis (including performance evaluation) of each
classification model proposed in this thesis is included.

Chapter 5 concludes the work accomplished in this thesis and includes recommendations
for future work.

CHAPTER TWO
2. LITERATURE REVIEW
2.1. Introduction
Mango (Mangifera indica) is one of the most delicious and important fruit crops
cultivated in Ethiopian agriculture. It is exported to many countries in the form of raw or
ripe fruits and also as processed consumables such as ripe mango slices, juice, and raw
mango pickle. Mango is rich in vitamins A and C; it also has medicinal value in
traditional medicine, and mango leaves are often used during rituals since they have
antibacterial activity against gram-positive bacteria. In recent times the market value of
Ethiopian mango has declined due to the uncontrolled use of pesticides; hence it is the
right time for researchers to come up with ideas for early identification of diseases and
insect pests, in order to control the use of dangerous pesticides that threaten human
health. Figure 2.1 below shows sample images of pure and healthy mango fruits.

Figure 2.1 Sample mango images


Identification of diseases or deficiencies is usually carried out by farmers through
frequent monitoring of the plant leaves, flowers, fruits, or stems. For small-scale farmers,
early identification of disease is quite feasible, and insects can be controlled with organic
pesticides or a minimal amount of chemical pesticides. For large-scale farmers, frequent
monitoring and early identification of disease are not possible, and this results in a severe
outbreak of the disease and pest growth, which cannot be controlled by
organic means. In this situation, farmers are forced to use poisonous chemicals to
eradicate the disease to retain the crop yield. This problem can be solved by automating
the monitoring process through the use of advanced image processing and deep learning
techniques (Sethupathy & S, 2016).

2.2. Major mango diseases and insect pests


Mango plants can be affected by various fungal diseases, such as anthracnose and
powdery mildew, and insect pests such as the white mango scale. These fungal diseases
and insect pests usually show symptoms on the young leaves.

2.2.1. White mango scale insect (Aulacaspis tubercularis Newstead (Hemiptera:
Diaspididae))
The occurrence of the White Mango Scale (WMS) in Ethiopia was first recorded in the
Oromiya region, East Wallaga zone, on the Green Focus Ethiopia private Ltd. farm at
Loko in Guto Gida district in 2010. Until recently, it remained confined to western
Ethiopia, where old local mango trees were found. Leaf samples infested by the pest were
brought to Melkassa Agricultural Research Center (MARC) for diagnosis in June 2014.
At that time, farmers were uprooting mango trees from their farms because no
management strategies were available (Ayalew, Fekadu, & Sisay, 2015).
This insect pest is an important biotic factor that damages the fruit, resulting in serious
economic losses and reduced fruit quality. The white mango scale, as shown in Figure
2.2, injures mangoes by feeding on the plant sap through leaves, branches, and fruits,
causing defoliation, drying up of young twigs, and poor blossoming, thereby affecting the
commercial value of fruits. This is especially true for late cultivars, where it causes
conspicuous pink blemishes around the feeding sites of the scales. Heavily infested
premature fruits drop, and mature fruits remain small and lack juice. Although the white
mango scale causes only cosmetic damage to the fruit, it can result in serious economic
losses since the fruits are rendered unfit for export. The insect can spread from infested to
pest-free areas via infected mango seedlings, and since it is very small it can also be
carried easily by the wind. Unless management strategies are developed in the
meanwhile, mango farming in the region is expected to collapse (TF, 2018).
In recent years it has become the most important limiting factor for mango production in
northwestern Ethiopia. Most smallholder growers are not aware of this invasive pest.

Figure 2.2: White mango scale (insect pest) symptoms on mango leaf
2.2.2. Powdery Mildew (Oidium mangiferae)
It is called amedaye in Amharic, hammokshtay in Tigrigna, and dakuya in Oromifa.
Powdery mildew is another fungus that afflicts leaves, flowers, and young fruit. Infected
areas become covered with whitish powdery mildew. Figure 2.3 shows these symptoms
of whitish powdery mildew on the leaf surface of a mango plant. As leaves mature,
lesions along the midribs or underside of the foliage become dark brown and greasy
looking. In severe cases, the infection will destroy flowering panicles resulting in a lack
of fruit set and defoliation of the tree.

It is one of the most serious diseases of mango, affecting almost all varieties. The
characteristic symptom of the disease is the white superficial powdery fungal growth on
leaves, stalks of panicles, flowers, and young fruits. The affected flowers and fruits drop
prematurely, reducing the crop load considerably, or may even prevent fruit set. Rains or
mists accompanied by cooler nights during flowering are congenial for the spread of the
disease (Ermias Teshome & Kassahun Sadessa, 2020).

Figure 2.3 Powdery mildew symptoms on mango leaf
2.2.3. Anthracnose (Colletotrichum gloeosporioides)
Mangoes are most seriously affected by anthracnose, a fungal disease. Its symptoms
manifest as black, sunken, irregularly shaped lesions, resulting in blossom blight, leaf
blotch, fruit staining, and eventual rot. Rainy weather and heavy dew promote the disease
(Grant, 2020).
Anthracnose is of major economic importance, causing damage that can lead to the
production of unmarketable fruits. It is the major fungal disease limiting fruit production
in all mango-growing countries, especially where there is high humidity during the
cropping seasons. The disease is of widespread occurrence, causing serious losses to
young shoots (shown in Figure 2.4 (a)), flowers, and fruits (shown in Figure 2.4 (b))
under favorable climatic conditions of high humidity, frequent rains, and temperatures of
24–32 °C. Anthracnose also affects fruits during storage (WA, SS, & MK, 2016).

Figure 2.4 Anthracnose symptoms (a) on mango leaf and (b) on mango fruit

2.3. Plant disease and pest detection using image processing and
deep learning
Throughout the history of agricultural development, plant diseases and pests have always
been one of the main obstacles hindering the development of the agricultural economy.
Plant disease identification through the human eye is based on the visible symptoms on
the leaves. Many studies also confirmed that relying on pure naked-eye observation of
experts to detect and classify such diseases can be prohibitively expensive, especially in
developing countries.
A plant disease identification system based on digital image processing technology is
fast, accurate, and real-time, and can help farmers take effective preventive measures in
time. Thus, providing fast, automatic, cheap, and
accurate image-processing-based solutions for that task can be of great realistic
significance. As an important technical means in the field of image recognition, image
processing, and deep learning have broad application prospects.
Image processing combined with a convolutional neural network model can identify the
disease at an early stage, so that the disease and pests can be prevented from spreading to
other parts of the tree (K, Sivakami, & M.Janani, 2019).

2.3.1. Convolutional Neural Networks
Convolutional neural networks are an unusual combination of biology and math, with a
little computer science sprinkled in, but these networks have been some of the most
influential innovations in the field of computer vision. 2012 was the first year that neural nets grew
to prominence as Alex Krizhevsky used them to win that year’s ImageNet competition
(basically, the annual Olympics of computer vision), dropping the classification error
record from 26% to 15%, an astounding improvement at the time. Ever since then, a host
of companies have been using deep learning at the core of their services. Facebook uses
neural nets for their automatic tagging algorithms, Google for their photo search, Amazon
for their product recommendations, Pinterest for their home feed personalization, and
Instagram for their search infrastructure. However, the classic, and arguably most
popular, use case of these networks is for image processing. Within image processing,
let’s take a look at how to use these CNNs for image classification.
The convolutional neural network, also called ConvNet, is a specialized type of artificial
neural network that roughly mimics the human vision system. It was first introduced in
the 1980s by Yann LeCun, a postdoctoral computer science researcher. LeCun had built
on the work done by Kunihiko Fukushima, a Japanese scientist who, a few years earlier,
had invented the neocognitron, a very basic image recognition neural network (Dickson,
2020).
CNN-based applications became prevalent after the exemplary performance of AlexNet
on the ImageNet dataset in 2012. After AlexNet, VGGNet was introduced by the Visual
Geometry Group at Oxford University; it was the first runner-up in the ILSVRC 2014
classification task, while the winner was GoogLeNet.
In recent years, CNN has become pivotal to many computer vision and deep learning
applications.

2.3.1.1. Basic Building Blocks of CNN Architecture
The CNN architecture includes several building blocks, such as convolution layers,
pooling layers, and fully connected layers. As described in Figure 2.5, a typical
architecture consists of repetitions of a stack of several convolution layers and a pooling
layer, followed by one or more fully connected layers. The step where input data are
transformed into output through these layers is called forward propagation. A model’s
performance under particular kernels and weights is calculated with a loss function
through forward propagation on a training dataset, and the learnable parameters, i.e.,
kernels and weights, are updated according to the loss value through backpropagation
with a gradient descent optimization algorithm (Yamashita, Nishio, Do, & Togashi, 2018).

Figure 2.5 An overview of convolutional neural network (CNN) architecture and the
training process (Singh et al., 2019).
Convolution layer: A convolution layer is a fundamental component of the CNN
architecture that performs feature extraction, which typically consists of a combination of
linear and nonlinear operations, (i.e., convolution operation and activation function). The
objective of the convolution operation is to extract the high-level features such as edges,
from the input image.

Convolution is a specialized type of linear operation used for feature extraction that
involves the multiplication of a set of weights with the input, much like a traditional
neural network. Given that the technique was designed for two-dimensional input, the
multiplication is performed between an array of input data called a tensor (i.e. the green
color in Fig 2.6) and a two-dimensional array of weights, called a filter or a kernel (i.e.
the yellow color in Fig 2.6). The kernel is passed over the image, viewing a few elements
or pixels at a time (for example, 3X3 or 5X5).

An element-wise product between each element of the kernel and the input tensor is
calculated at each location of the tensor and summed to obtain the output value in the
corresponding position of the output tensor, called a feature map (i.e. the pink colour in
Fig 2.6). This procedure is repeated by applying multiple kernels to form an arbitrary
number of feature maps, which represent different characteristics of the input tensors;
different kernels can, thus, be considered as different feature extractors.

Two key hyperparameters that define the convolution operation are the size and the
number of kernels. The distance between two successive kernel positions is called the
stride, which also defines the convolution operation. The common choice of stride is 1.

Figure 2.6 Convolving a 5x5x1 image with a 3x3x1 kernel to get a 3x3x1 convolved
feature.
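The element-wise multiply-and-sum just described can be sketched in NumPy. This is a minimal illustration, not the implementation used in this study: the function name `conv2d` and the sample 5x5 input are ours, and, like most deep learning libraries, it computes cross-correlation (the kernel is not flipped).

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Valid 2-D convolution (cross-correlation), stride 1 by default, no padding."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh = (ih - kh) // stride + 1
    ow = (iw - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)  # element-wise product, then sum
    return out

# A 5x5 input convolved with a 3x3 kernel yields a 3x3 feature map (stride 1),
# matching the description of Figure 2.6.
image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3))
feature_map = conv2d(image, kernel)
print(feature_map.shape)  # (3, 3)
```

Applying several different kernels to the same input would produce several such feature maps, one per kernel.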
Nonlinear activation function: The outputs of a linear operation (i.e. convolution) is then
passed through a nonlinear activation function. Although smooth nonlinear functions,
such as sigmoid or hyperbolic tangent (Tanh) function, were used previously, the most
common nonlinear activation function used presently is the rectified linear unit (ReLU),
which simply computes the function f(x) = max(0, x), as shown in Figure 2.7. The main
objective of ReLU is to introduce non-linearity into the ConvNet.

Figure 2.7 ReLU operation

Other state-of-the-art activation functions include Leaky ReLU and the Swish activation
function. These activation functions address the limitations of the ordinary ReLU
activation.
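These activation functions are simple enough to state directly in NumPy (an illustrative sketch only; deep learning frameworks provide their own implementations):

```python
import numpy as np

def relu(x):
    """f(x) = max(0, x): zeroes out negative inputs."""
    return np.maximum(0, x)

def leaky_relu(x, alpha=0.01):
    """Keeps a small slope (alpha) for negative inputs instead of zeroing them."""
    return np.where(x > 0, x, alpha * x)

def swish(x):
    """Swish: x * sigmoid(x), a smooth alternative to ReLU."""
    return x / (1.0 + np.exp(-x))

print(relu(np.array([-2.0, 3.0])))  # [0. 3.]
```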

Pooling layer: Similar to the convolution layer, the pooling layer is responsible for
reducing the spatial size of the convolved feature. This decreases the computational
power required to process the data through dimensionality reduction. Furthermore, it is
useful for extracting dominant features that are rotationally and positionally invariant,
thus allowing the model to be trained effectively with a reduced number of parameters
and less computation in the network. The pooling layer operates on each feature map
independently.
As shown in Figure 2.8 below, there are two types of Pooling: Max Pooling and Average
Pooling. Max Pooling returns the maximum value from the portion of the image covered
by the Kernel. On the other hand, Average Pooling returns the average of all the
values from the portion of the image covered by the Kernel.
Max Pooling also performs as a Noise Suppressant. It discards the noisy activations
altogether and also performs de-noising along with dimensionality reduction. On the
other hand, Average Pooling simply performs dimensionality reduction as a noise
suppressing mechanism. Hence, we can say that Max Pooling performs a lot better than
Average Pooling (Saha, 2018).

Figure 2.8 Max-pooling and average-pooling operations
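Both pooling modes can be sketched in NumPy as follows. This is an illustrative sketch with a made-up 4x4 feature map and non-overlapping 2x2 windows, not the code used in this study:

```python
import numpy as np

def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Max or average pooling over windows of the given size and stride."""
    h, w = feature_map.shape
    oh, ow = (h - size) // stride + 1, (w - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = feature_map[i * stride:i * stride + size,
                                 j * stride:j * stride + size]
            out[i, j] = window.max() if mode == "max" else window.mean()
    return out

fm = np.array([[1., 3., 2., 1.],
               [4., 6., 5., 0.],
               [1., 2., 9., 8.],
               [0., 3., 7., 4.]])
print(pool2d(fm, mode="max"))      # [[6. 5.] [3. 9.]]
print(pool2d(fm, mode="average"))  # [[3.5 2. ] [1.5 7. ]]
```

Note how max pooling keeps only the strongest activation in each window, while average pooling blends all four values.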

The convolution layer and the pooling layer together form the i-th layer of a
convolutional neural network. Depending on the complexity of the images, the number of
such layers may be increased to capture low-level details even further, but at the cost of
more computational power.

After going through the above process, the model can successfully understand the
features. The next phase is to flatten the final output and feed it to a regular neural
network for classification purposes.

Fully connected layer: A fully connected layer is the last layer of a CNN; it flattens the
features identified in the previous layers into a vector and predicts the probabilities that
the image belongs to each of several possible labels. The output feature maps of the final
convolution or pooling layer are typically flattened, i.e., transformed into a one-
dimensional (1D) array of numbers (or vector), and connected to one or more fully
connected layers, also known as dense layers, in which every input is connected to every
output by a learnable weight. Once the features extracted by the convolution layers and
down-sampled by the pooling layers are created, they are mapped by a subset of fully
connected layers to the final outputs of the network, such as the probabilities for each
class in classification tasks. The final fully connected layer typically has the same number
of output nodes as the number of classes. Each fully connected layer is followed by a
nonlinear function, such as ReLU.

Last layer activation function: The activation function applied to the last fully connected
layer is usually different from the others; an appropriate activation function needs to be
selected for each task. The activation function applied to the multiclass classification task
is the Softmax function, which normalizes the real-valued outputs of the last fully
connected layer into target class probabilities, where each value ranges between 0 and 1
and all values sum to 1.
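The Softmax normalization can be stated directly as a short NumPy sketch (illustrative only; the three class scores below are hypothetical raw outputs of a last fully connected layer):

```python
import numpy as np

def softmax(logits):
    """Normalizes real-valued scores into probabilities in (0, 1) summing to 1."""
    exp = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    return exp / exp.sum()

# Hypothetical raw scores for three classes, e.g.
# (anthracnose, powdery mildew, white mango scale):
probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs.sum())  # 1.0
```

The predicted class is simply the index of the largest probability.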


2.3.2. Support vector machine
A Support Vector Machine (SVM) is one of the most popular supervised machine
learning algorithms; it can be employed for both classification and regression problems,
though it is most commonly used for classification. The goal of the SVM algorithm is to
find the best line or decision boundary (i.e., the hyperplane) in N-dimensional space (N
being the number of features) that distinctly classifies the data points. In other words,
given labelled training data (supervised learning), the algorithm outputs an optimal
hyperplane that categorizes new examples. Thus, it performs classification by
constructing an N-dimensional hyperplane that optimally separates the data into two
categories.
SVM chooses the extreme data points (vectors) that help in creating the hyperplane.
These extreme cases are called support vectors, and hence the algorithm is termed a
Support Vector Machine. Consider Figure 2.9 below, in which two different categories
are classified using a decision boundary or hyperplane:

Figure 2.9 SVM hyperplanes

2.3.2.1. Hyperplane and Support Vectors in the SVM algorithm:
Hyperplane: There can be multiple lines/decision boundaries to segregate the classes in
n-dimensional space, but we need to find out the best decision boundary that helps to
classify the data points. This best boundary is known as the hyperplane of SVM.
The dimensions of the hyperplane depend on the features present in the dataset, which
means if there are 2 features (as shown in Figure 2.10 below), then the hyperplane will be
a straight line. And if there are 3 features, then the hyperplane will be a 2-dimension
plane. SVM always creates a hyperplane that has a maximum margin, which means the
maximum distance between the data points or supports vectors.
Support Vectors: The data points or vectors that are closest to the hyperplane and that
affect its position are termed support vectors. Since these vectors support the hyperplane,
they are called support vectors. The distance between the vectors and the hyperplane is
called the margin, and the goal of SVM is to maximize this margin. The hyperplane with
the maximum margin is called the optimal hyperplane.

Figure 2.10 SVM optimal hyperplane


2.3.2.2. Types of SVM
SVM can be of two types:
Linear SVM: Linear SVM is used for linearly separable data. If a dataset can be
classified into two classes using a single straight line, the data is termed linearly
separable, and the classifier used is called a linear SVM classifier.
Non-linear SVM: Non-linear SVM is used for non-linearly separable data. If a dataset
cannot be classified using a straight line, the data is termed non-linear, and the classifier
used is called a non-linear SVM classifier. SVM uses kernel functions, which transform
non-linear spaces into linear spaces to make non-separable data separable. Thus, it
transforms the data into another dimension so that the data can be linearly divided or
classified by a plane.

2.3.2.3. SVM Kernel Functions


Different SVM algorithms use different types of kernel functions, for example linear,
polynomial, radial basis function (RBF), and sigmoid. The most used kernel function is
the RBF because it has a localized and finite response along the entire x-axis.

Linear Kernel: The Linear kernel is the simplest kernel function that can be used as a
normal dot product for any two given observations. The product between two vectors is
the sum of the multiplication of each pair of input values.

k(x_i, x_j) = x_i · x_j

Polynomial kernel: A polynomial kernel is a more generalized form of the linear kernel.
The polynomial kernel can distinguish curved or nonlinear input space. It is well suited
for problems where all the training data is normalized. Equation is:
k(x_i, x_j) = (x_i · x_j + 1)^d, where d is the degree of the polynomial; d = 1 is similar to
the linear transformation. The degree needs to be manually specified in the learning
algorithm.

Gaussian radial basis function (RBF): The Radial basis function kernel is a popular
kernel function commonly used in support vector machine classification. RBF can map
an input space in infinite-dimensional space. It is a general-purpose kernel; used when
there is no prior knowledge about the data. The most popular RBF kernel Equation is:
k(x_i, x_j) = exp(−γ ‖x_i − x_j‖²)

where gamma (γ) is a parameter that ranges from 0 to 1. A higher value of gamma fits the
training dataset more closely, which can cause over-fitting; γ = 0.1 is considered a good
default value. The value of gamma needs to be manually specified in the learning
algorithm.
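The three kernel functions above can be written down directly in NumPy (an illustrative sketch; the sample vectors are made up, and real SVM libraries evaluate these kernels internally):

```python
import numpy as np

def linear_kernel(xi, xj):
    """k(xi, xj) = xi . xj"""
    return np.dot(xi, xj)

def polynomial_kernel(xi, xj, d=2):
    """k(xi, xj) = (xi . xj + 1)^d"""
    return (np.dot(xi, xj) + 1) ** d

def rbf_kernel(xi, xj, gamma=0.1):
    """k(xi, xj) = exp(-gamma * ||xi - xj||^2)"""
    diff = xi - xj
    return np.exp(-gamma * np.dot(diff, diff))

a, b = np.array([1.0, 2.0]), np.array([3.0, 0.0])
print(linear_kernel(a, b))      # 3.0
print(polynomial_kernel(a, b))  # 16.0
print(rbf_kernel(a, a))         # 1.0 (identical points give the maximum similarity)
```

Setting d = 1 in the polynomial kernel recovers (up to the constant offset) the linear kernel, as stated above.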

2.4. Image pre-processing
Image pre-processing is a fundamental step in image processing and computer vision. It
comprises operations on an image that produce an enhanced image or extract useful
information from it. It includes primitive operations such as resizing, contrast
enhancement, noise reduction, and image smoothing and sharpening, as well as advanced
operations such as image segmentation (Adatrao & Mittat, 2016).
Image processing involves the manipulation of images to extract information to
emphasize or de-emphasize certain aspects of the information, contained in the image or
perform image analysis to extract hidden information. The processing of an image
comprises an improvement in its appearance and effective representation of the input
image suitable for the required application. Pre-processing improves the quality of the
data by reducing artifacts. There are various types of techniques for image pre-
processing, and they are described below.

2.4.1. Image Resizing


Resizing of an image is performed by the process of interpolation. Interpolation
re-samples the image to determine values between the defined pixels; thus, the resized
image contains more or fewer pixels than the original image. If the resolution of the
image is increased, the intensity values of the additional pixels are obtained through
interpolation.
Nearest Neighbor Interpolation: The nearest-neighbor method is the simplest approach
and needs the least processing time of all interpolation algorithms, since only one pixel is
considered. In this technique, the value of the nearest sample point in the input image is
assigned to each interpolated output pixel, as shown in Figure 2.11; in effect, every pixel
is replaced by several pixels of the same color. The resulting image is larger than the
original and preserves all the original image detail, but has (possibly undesirable)
jaggedness. Although this method is very efficient, the quality of the image is poor, i.e.,
the resulting image may contain jagged edges.

Figure 2.11 Image resizing operation using nearest-neighbor interpolation
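For integer upscaling factors, nearest-neighbor interpolation amounts to repeating each pixel as a block; a minimal NumPy sketch (the 2x2 image and the factor are made up for illustration, and image libraries handle non-integer factors that this sketch does not):

```python
import numpy as np

def nearest_neighbor_resize(image, scale):
    """Upscale by an integer factor: each pixel becomes a scale x scale block."""
    return np.repeat(np.repeat(image, scale, axis=0), scale, axis=1)

img = np.array([[10, 20],
                [30, 40]])
big = nearest_neighbor_resize(img, 2)
print(big.shape)  # (4, 4)
print(big)
```

Every output pixel simply copies its nearest input pixel, which is why the result shows the blocky "jaggedness" described above.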
Bilinear Interpolation: Unlike nearest-neighbor interpolation, bilinear interpolation uses
the 4 nearest pixel values, located diagonally around a given pixel, to find its appropriate
color intensity value. It considers the closest 2x2 neighborhood of known pixel values
surrounding the unknown pixel and takes a weighted average of these four pixels to
arrive at the final interpolated value. Bilinear interpolation (see Figure 2.12 below) is
usually used for enlarging or zooming images and is a common default resizing method.
This algorithm reduces some of the visual distortion caused by resizing an image.
However, bilinear interpolation can produce interpolation artifacts such as blurring and
edge halos, and it takes more time and is more complex than the nearest-neighbor
technique.

Figure 2.12 Image resizing operation using bilinear interpolation
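The 2x2 weighted average described above can be sketched in NumPy for a single query point (illustration only; the sample image is made up, and real resizers add coordinate-mapping details this sketch omits):

```python
import numpy as np

def bilinear_sample(image, y, x):
    """Interpolate the intensity at fractional (y, x) from its 2x2 neighborhood."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1 = min(y0 + 1, image.shape[0] - 1)
    x1 = min(x0 + 1, image.shape[1] - 1)
    dy, dx = y - y0, x - x0
    # weighted average of the four surrounding pixels; weights sum to 1
    return ((1 - dy) * (1 - dx) * image[y0, x0]
            + (1 - dy) * dx * image[y0, x1]
            + dy * (1 - dx) * image[y1, x0]
            + dy * dx * image[y1, x1])

img = np.array([[0.0, 10.0],
                [20.0, 30.0]])
print(bilinear_sample(img, 0.5, 0.5))  # 15.0 (equal-weight average of all four)
```

Pixels closer to the query point receive larger weights, which is what produces the smooth (but slightly blurred) result described above.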


Bicubic Interpolation: Bicubic interpolation goes one step beyond bilinear interpolation
by considering the nearest 4x4 neighborhood of known pixels, for a total of 16 pixels. For
an unknown pixel P in the enlarged image, the color value of P is determined by its 16
neighboring pixels according to their distance from P. Since these neighboring pixels are
at different distances from P, closer pixels are given greater weight in the calculation.
Bicubic interpolation produces notably sharper images than the previous two techniques
and is therefore an ideal combination of processing time and output quality. Because of
this, it is a standard in much in-camera image interpolation.

2.4.2. Histogram Equalization


A histogram is a graphical representation of the intensity distribution of an image
(intensity refers to the amount of light, or the numerical value of a pixel). In simple
terms, it describes the color distribution of an image by plotting the frequency of each
pixel value, displaying the number of occurrences of each intensity value in the image.
There are 256 possible intensity levels (0-255) for an 8-bit grayscale image, so the
histogram graphically displays 256 numbers showing the distribution of pixels among
those grayscale values. The horizontal axis of the histogram represents the tonal
variations, while the vertical axis represents the number of pixels with that particular
tone. The left side of the horizontal axis represents black and dark areas, the middle
represents medium grey, and the right-hand side represents light and pure white areas
(Bagade & Shandilya, 2011) and (Sonam & Dahiya, 2015).

Histogram equalization is an image processing technique used to improve contrast by
reassigning the pixel intensity values of the input image so that the output image has a
uniform intensity distribution. It accomplishes this by effectively spreading out the most
frequent intensity values, i.e., stretching the intensity range of the image. This method
usually increases the global contrast of images whose usable data is represented by close
contrast values, allowing areas of lower local contrast to gain higher contrast. In image
classification, histograms can be used as a feature vector under the assumption that
similar images have similar color distributions. The technique can be applied to an entire
image or only a section of it (Sudhakar, 2017).
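The classic cumulative-distribution (CDF) mapping can be sketched in NumPy for an 8-bit grayscale image. This is illustrative only; library implementations such as OpenCV's equalizeHist handle edge cases this sketch omits (e.g. a perfectly uniform image, which would divide by zero here), and the tiny 2x4 image is made up:

```python
import numpy as np

def histogram_equalize(gray):
    """Equalize an 8-bit grayscale image via its cumulative distribution (CDF)."""
    hist = np.bincount(gray.ravel(), minlength=256)  # per-intensity pixel counts
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    # classic mapping: stretch the CDF to span the full 0-255 range
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
    return lut[gray]  # apply the lookup table to every pixel

# low-contrast image: all values bunched into 100..103
img = np.array([[100, 100, 101, 101],
                [102, 102, 103, 103]], dtype=np.uint8)
eq = histogram_equalize(img)
print(eq.min(), eq.max())  # 0 255
```

The four crowded intensity levels are spread out to 0, 85, 170, and 255, i.e., across the full dynamic range.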

A color histogram of an image represents the number of pixels in each type of color
component. Histogram equalization cannot be applied separately to the Red, Green, and
Blue components of the image as it leads to dramatic changes in the image’s color
balance. However, if the image is first converted to another color space, like HSL/HSV
color space, then the algorithm can be applied to the luminance or value channel without
resulting in changes to the hue and saturation of the image (Sudhakar, 2017). In general,
Histogram Equalization can be divided into several types:
Classical Histogram Equalization (CHE): CHE is the principal image processing
technique, particularly when images at the grey level are considered. It is a global
operation, in which Equalization is applied to the whole image. The purpose of this
technique is to uniformly spread the given number of grey levels over a range, thereby
improving its contrast. CHE attempts to generate an output image with a flattened
histogram, which implies a uniform distribution. An image occupies a dynamic range of
grey-level values, conventionally denoted 0 to L-1. The downside of this technique is
that it does not take into account the image's average brightness; due to the expansion of
the grey levels over the full grey-level range, CHE may result in over-enhancement and
saturation artifacts.

Figure 2.13 Histogram and its equalized histogram for histogram equalization (HE)


Adaptive histogram equalization (AHE): Adaptive histogram equalization is used to
boost image contrast. It differs from ordinary histogram equalization, which uses a single
histogram for the entire image, in that the adaptive method computes many histograms,
each corresponding to a different section of the image, and uses them to redistribute the
lightness values of the image. It is appropriate for adjusting local contrast and bringing
out fine details. The disadvantage of this technique is that it can produce considerable
noise, which limits its use for homogeneous images, because it tends to over-intensify
noise in homogeneous regions of an image. It also fails to preserve the brightness of the
input image. Contrast limited adaptive histogram equalization (CLAHE) is an advanced
form of adaptive histogram equalization that limits the contrast amplification to reduce
this noise.

Brightness Preserving Bi-Histogram Equalization (BBHE): BBHE was introduced to
overcome the drawback of CHE. The histogram of the original image is separated into
two sub-histograms based on the mean of the original image's histogram. The
sub-histograms are then equalized independently using refined histogram equalization,
which produces a flatter histogram. Its purpose is to provide a method suitable for
real-time applications, but it shares CHE's downside of introducing unwanted signals:
it cannot preserve brightness to a high enough degree to avoid annoying artifacts.

Figure 2. 14 Bi-Histogram Equalization Method


Recursive Mean Separate Histogram Equalization (RMSHE): Another technique that
provides improved and scalable brightness preservation for grayscale and color images
is Recursive Mean-Separate Histogram Equalization (RMSHE). While in BBHE the
separation is performed only once, RMSHE performs the separation recursively based on
the respective means. It has been shown mathematically that the mean brightness of the
output image converges to the mean brightness of the input image as the number of
recursive mean separations increases. This technique therefore offers the desirable
property of adjusting the brightness level depending on the image requirement. The
recursive nature of RMSHE also allows scalable brightness preservation, which is very
useful in consumer electronics.

2.4.3. Image noise types and de-noising techniques
Noise is any unwanted information produced in the image during the image acquisition
process that results in pixel values that do not reflect the true intensities of the real scene.
The types of noise that contaminate the image can be classified into two groups including
linear noise and nonlinear noise. Gaussian noise is the well-known type of linear noise
whereas speckle noise, salt-and-pepper noise, and uniform impulse noise are nonlinear
noise. Gaussian noise can be characterized by its distribution with mean and variance
values. In the nonlinear group, salt-and-pepper noise corrupts the image's pixels
randomly and sparsely, flipping some pixels to bright or dark extremes. Uniform impulse
noise replaces a portion of the image's pixel values with random values.
Speckle noise is characterized by signal-dependent noise where the noise corrupts the
image in the form of multiplicative noise (Langampol, Srisomboon, Patanavijit, & Lee,
2019).

2.4.3.1. Image noise types


Gaussian Noise: The Gaussian noise in an image is introduced during the acquisition of
digital images. It is evenly distributed over the signal. This means that each pixel in the
noisy image is the sum of the true pixel value and a random Gaussian distributed noise
value. Gaussian noise is independent at each pixel and independent of the signal
intensity. This noise can be modelled by adding random values to an image (Singh,
Singh, & Parveen, 2015) and (Dhruv, Mittal, & Modi, 2017). As the name indicates, this
type of noise is analytical noise whose probability density function is equal to Gaussian
distribution, which has a bell-shaped probability distribution function given by eq. (1).

F(g) = (1 / √(2πσ²)) · e^(−(g − μ)² / (2σ²))        (1)

where F(g) is the Gaussian noise distribution in an image, g represents the grey level
(a Gaussian random variable), and μ and σ are the mean and standard deviation,
respectively.
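Following eq. (1), Gaussian noise can be simulated by adding values drawn from a normal distribution with mean μ and standard deviation σ to every pixel. The helper below is an illustrative NumPy sketch of ours, not code from the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(img, mu=0.0, sigma=10.0):
    """Each noisy pixel = true pixel value + a sample from N(mu, sigma^2),
    drawn independently of the pixel's intensity."""
    noise = rng.normal(mu, sigma, img.shape)
    return np.clip(img.astype(float) + noise, 0, 255).astype(np.uint8)

clean = np.full((32, 32), 128, dtype=np.uint8)   # flat grey test image
noisy = add_gaussian_noise(clean, sigma=10.0)
```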

Salt and Pepper Noise: Salt and pepper noise is a fixed-valued impulse noise added to
an image (here an 8-bit image) by setting random pixels to bright values (255 for salt
noise) and dark values (0 for pepper noise) all over the image. This model is also
known as data-drop noise because, statistically, it drops the original data values. It
can be caused by dead pixels, analog-to-digital converter errors, and transmitted bit
errors (B & K, 2016).

Impulse noise pattern:

P(x) = { P1 if x = A;  P2 if x = B;  0 otherwise }

where P1 and P2 are the probabilities of the two impulse values, P(x) is the
distribution of salt-and-pepper noise in the image, and A and B are the dark (pepper)
and bright (salt) grey levels.

Poisson Noise / quantum (photon) noise or shot noise: This noise arises from the
statistical nature of electromagnetic waves such as X-rays, visible light, and gamma
rays. X-ray and gamma-ray sources emit a number of photons per unit time, and in
medical X-ray and gamma-ray imaging systems these rays are injected into the patient's
body from the source. Because the photon count fluctuates randomly, the gathered image
exhibits spatial and temporal randomness. This type of noise has a probability density
function following a Poisson distribution.

Speckle Noise: A fundamental problem in optical and digital holography is the presence
of speckle noise in the image reconstruction process. Speckle is a granular noise that
inherently exists in an image and degrades its quality, and it commonly corrupts
medical images. Speckle noise can be generated by multiplying random values with the
pixels of an image, and its presence hinders image interpretation (Dhruv, Mittal, &
Modi, 2017) and (S, V.P, & Rangaswamy, 2016). Speckle noise can be modeled with the
pattern:

𝑔(𝑥, 𝑦) = 𝑓(𝑥, 𝑦) ∗ 𝑛(𝑥, 𝑦) + 𝑛1(𝑥, 𝑦)

where g(x, y) is the observed image, n(x, y) and n1(x, y) are the multiplicative and
additive components of the speckle noise, and x, y denote the axial and lateral indices
of the image sample. Considering only the multiplicative component and ignoring the
additive component, the equation reduces to:

𝑔(𝑥, 𝑦) = 𝑓(𝑥, 𝑦) ∗ 𝑛(𝑥, 𝑦)

2.4.3.2. Image de-noising Techniques


Image de-noising is one of the important steps in image processing: it removes noise or
unwanted fluctuations from the image while preserving the image's details, so that the
data can be analyzed in further processing. Image de-noising helps in noise reduction,
interpolation, and resampling. An image is filtered through various techniques that
depend on the behavior and the type of the image. There are two families of
noise-removal methods, linear and non-linear (Kaur, Kumar, & Kainth, 2016). In linear
filtering, the value of an output pixel is a linear combination of the values of the
pixels in the input pixel's neighborhood, whereas a nonlinear filter's output is not a
linear function of its input. Linear methods are fast compared to non-linear methods,
but they are less able to preserve the details of the image.

Gaussian filter: Gaussian filter is a linear image de-noising technique that is usually used
to blur the image or to reduce noise. Gaussian filter smoothes the whole image
irrespective of its edges or details.

Median filter: Median filtering is a nonlinear method used to remove noise from images
while preserving edges. It is particularly effective at removing ‘salt and pepper’ type
noise. The median filter works by moving through the image pixel by pixel, replacing
each value with the median value of neighboring pixels. The pattern of neighbors is
called the "window", which slides, pixel by pixel over the entire image. The median is
calculated by first sorting all the pixel values from the window into numerical order, and
then replacing the pixel being considered with the middle (median) pixel value. Median
filter proves to preserve the edges and lines of an image in the best possible way thereby
removing the outliers. It can be stated as:

𝑦[ 𝑚, 𝑛 ] = 𝑚𝑒𝑑𝑖𝑎𝑛{ 𝑥 [ 𝑖, 𝑗 ], (𝑖, 𝑗)𝜖𝑤}

where w is a neighborhood centered on location [m, n] in the image.
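The sliding-window median computation above, together with the salt-and-pepper model from section 2.4.3.1, can be sketched in NumPy as follows. The helper names and the flat toy image are ours, for illustration only:

```python
import numpy as np

rng = np.random.default_rng(42)

def add_salt_pepper(img, p=0.05):
    """Corrupt roughly a fraction p of the pixels: half to 0 (pepper),
    half to 255 (salt)."""
    out = img.copy()
    u = rng.random(img.shape)
    out[u < p / 2] = 0            # pepper
    out[u > 1 - p / 2] = 255      # salt
    return out

def median_filter3(img):
    """3x3 median filter: slide a window over the image and replace each
    pixel with the median of its neighborhood (edges are replicated)."""
    padded = np.pad(img, 1, mode='edge')
    h, w = img.shape
    windows = np.stack([padded[i:i + h, j:j + w]
                        for i in range(3) for j in range(3)])
    return np.median(windows, axis=0).astype(img.dtype)

clean = np.full((64, 64), 120, dtype=np.uint8)
noisy = add_salt_pepper(clean, p=0.05)
restored = median_filter3(noisy)      # isolated impulses replaced by 120
```

Because the impulses are sparse, the median of each 3x3 window is almost always an uncorrupted value, which is why this filter removes salt-and-pepper noise while keeping edges sharp.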

2.4.4. Image Augmentation


Data augmentation is a common pre-processing technique in many machine learning
tasks, such as image classification, to virtually enlarge the training dataset size and avoid
overfitting. It is an automatic way to boost the number of different images that will be
used to train the deep learning algorithms. To train a CNN model from scratch
successfully, the dataset needs to be huge. The common techniques to generate new
images involve flipping horizontally or vertically, rotating by some degrees, scaling
outward or inward, cropping randomly, translating, and adding Gaussian noise to prevent
overfitting and enhance the learning capability. CNNs generally perform better with
more data, as more data prevents overfitting.
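A few of the listed transformations (flips and 90-degree rotations) can be generated with plain NumPy. The helper below is an illustrative sketch of ours, not the augmentation pipeline used in the thesis:

```python
import numpy as np

def augment(img):
    """Return simple augmented variants of one image: the original,
    horizontal and vertical flips, and three 90-degree rotations."""
    return [
        img,
        np.fliplr(img),    # flip horizontally
        np.flipud(img),    # flip vertically
        np.rot90(img, 1),  # rotate 90 degrees
        np.rot90(img, 2),  # rotate 180 degrees
        np.rot90(img, 3),  # rotate 270 degrees
    ]

img = np.arange(12, dtype=np.uint8).reshape(3, 4)  # toy 3x4 "image"
variants = augment(img)                            # 6 images from 1
```

Applied to every training image, even this small set of transforms multiplies the dataset size sixfold at no labeling cost.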

2.5. Segmentation
Image segmentation is a process of dividing the image into homogenous, self-consistent
regions corresponding to different objects in the image. It separates the image into
meaningful regions. An image can be segmented using the basic properties of features of
the image like intensity, edge, or texture.
Segmentation is a common procedure for feature extraction in images and volumes.
Segmenting an image means grouping its pixels according to their value similarity. The
simplified-color image can then be used to render important features independently from
one another.
The main aim of segmentation is to extract the ROI (Region of Interest) for image
analysis. Segmentation plays an important role in image processing since the separation
of a large image into several parts makes further processing simpler. The result of image
segmentation is a set of regions that collectively cover the entire image, where the
pixels within a region are similar with respect to some characteristic or computed
property, such as color, intensity, or texture.

Plant leaf image segmentation plays an important role in plant disease detection through
leaf symptoms. It is to partition the image into essential regions for the appropriate
locations (Kaur A. , 2014).
There are different image segmentation techniques like threshold-based, edge-based,
region-based, cluster-based, and neural network-based. One of the most used clustering
algorithms is k-means clustering (Sethupathy & S, 2016).
Image segmentation techniques can be classified into the following types: region-based
segmentation, edge detection, thresholding, clustering, fuzzy logic, and neural
network-based segmentation.

Figure 2. 15 Image segmentation types

Threshold Segmentation: Thresholding is the simplest and most commonly used image
segmentation technique, used to discriminate the foreground from the background. The
method converts the (greyscale) image into a binary image based on a calculated
threshold value. The binary image contains all the necessary data regarding the
location and shape of the objects, and the conversion is useful because it reduces the
complexity of the data.
Thresholding methods include Global and Local thresholding, which has methods like
Otsu global thresholding and Adaptive local thresholding (Janwale, 2017). The global
threshold method divides the image into two regions of the target and the background by
a single threshold. The local threshold method needs to select multiple segmentation
thresholds and divides the image into multiple target regions and backgrounds by
multiple thresholds.
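As an illustration of the Otsu global thresholding mentioned above, the sketch below (helper and toy image are ours) picks the threshold that maximizes the between-class variance of the grey-level histogram:

```python
import numpy as np

def otsu_threshold(img):
    """Return the grey level t that best separates the histogram into
    background (levels <= t) and foreground (levels > t) by maximizing
    the between-class variance w_b * w_f * (m_b - m_f)^2."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    total = hist.sum()
    sum_all = np.dot(np.arange(256), hist)
    best_t, best_var = 0, -1.0
    w_b = 0.0    # background pixel count so far
    sum_b = 0.0  # background grey-level sum so far
    for t in range(256):
        w_b += hist[t]
        if w_b == 0:
            continue
        w_f = total - w_b
        if w_f == 0:
            break
        sum_b += t * hist[t]
        m_b = sum_b / w_b                  # background mean
        m_f = (sum_all - sum_b) / w_f      # foreground mean
        var_between = w_b * w_f * (m_b - m_f) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Bimodal toy image: dark background (40) with a bright region (200)
img = np.full((20, 20), 40, dtype=np.uint8)
img[5:15, 5:15] = 200
t = otsu_threshold(img)    # separates the two intensity modes
binary = img > t           # segmented foreground
```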

The advantage of the threshold method is that it does not require prior information
about the image, and it is fast and simple to implement. In particular, when the target
and the background have high contrast, a good segmentation result can be obtained. The
disadvantage of this technique is that it is difficult to obtain accurate results when
there is no significant grayscale difference or when the grayscale values in the image
overlap heavily. Since it considers only the grey-level information of the image and
ignores its spatial information, it is sensitive to noise and greyscale unevenness, so
it is often combined with other methods (Janwale, 2017).
Region-based / similarity-based segmentation: The regional growth method is a typical
serial region segmentation algorithm; its basic idea is to group pixels with similar
properties into a region. Pixels of the same type are identified and grouped into the
same type of region. The method first selects a seed pixel and then merges similar
pixels around the seed pixel into the region where the seed pixel is located. The main
types of region-based segmentation are region growing, region splitting, and region
merging.
The advantage of regional growth segmentation is that it usually separates the connected
regions with the same characteristics and provides good boundary information and
segmentation results. It is more immune to noise, useful when it is easy to define
similarity criteria, and works well for images having good contrast between regions. It is
also simple and requires only a few seed points to complete. The growth criteria in the
growing process can be freely specified and it can pick multiple criteria at the same time.
The disadvantage of this method is its high computational cost, and noise and greyscale
unevenness can lead to voids and over-division. It also often handles shadows in the
image poorly.
Edge detection segmentation: This technique is intended for locating boundaries of leaf
within the image based on the rapid change of intensity value in an image because a
single intensity value does not provide good information about edges. It segments the
image by identifying the difference between intensities at the border. In edge-based
segmentation methods, first of all, the edges are detected and then are connected to form
the object boundaries to segment the required regions (Kaur & Kaur, 2014). The basic

two edge-based segmentation methods are grey-histogram and gradient-based methods. To
detect the edges, one of the basic edge detection operators such as the Sobel, Canny,
or Robert's operator can be used; the result of these methods is a binary image. Sobel,
Canny, Laplacian, and fuzzy logic are some of the techniques used for edge detection.
Edge detection significantly reduces the amount of data and filters out useless
information while preserving the important structural properties of an image, and it is
effective for images with good contrast between objects. Conversely, it performs poorly
when edges are wrongly detected or too numerous (Manjula.KA, 2015).
Clustering-based segmentation: Clustering methods attempt to group patterns that are
similar in some sense by dividing the population (data points) into several groups,
such that data points in the same group are more similar to one another than to those
in other groups. These groups are known as clusters. The image is first converted into
a histogram and then clustering is performed on it.
One of the popular clustering algorithms is K-means clustering, which is an unsupervised
algorithm and is used to segment the interest area from the background. The algorithm is
used when there is unlabelled data (i.e., data without defined categories or groups). The
goal is to partition the given image into K-clusters or find certain groups based on some
kind of similarity in the image pixels with the number of groups represented by K by
minimizing the sum of squared distances between all points and the cluster center. It
clusters the same pixels to segment the image and helps to improve high performance and
efficiency (Jayapriya & Hemalatha, 2019). This algorithm only needs to know how many
clusters an image should have; with this information, it can automatically find the
best clusters. K-means clustering is computationally fast for small values of k; it
also eliminates noisy spots and yields more homogeneous regions (Manjula.KA, 2015).

f = ∑_{j=1}^{k} ∑_{i=1}^{n} ‖ x_i^{(j)} − c_j ‖²

where f is the objective function, k is the number of clusters, n is the number of
cases, x_i^{(j)} is case i assigned to cluster j, and c_j is the centroid of cluster j.
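Lloyd's algorithm minimizes this objective by alternating between assigning each pixel to its nearest centroid and recomputing each centroid as the mean of its cluster. A pure-NumPy sketch on pixel intensities follows; the helper and toy image are ours, and for simplicity the centroids are initialized from intensity percentiles rather than at random:

```python
import numpy as np

def kmeans_segment(img, k=2, iters=10):
    """Cluster pixel intensities with k-means and return a label map
    plus the final cluster centers."""
    x = img.reshape(-1, 1).astype(float)
    # Deterministic initialization: spread centroids over the intensity range
    centers = np.percentile(x, np.linspace(0, 100, k)).reshape(k, 1)
    for _ in range(iters):
        d = np.abs(x - centers.T)        # (n_pixels, k) distances
        labels = d.argmin(axis=1)        # assign to nearest centroid
        for j in range(k):               # recompute each centroid
            if np.any(labels == j):
                centers[j] = x[labels == j].mean()
    return labels.reshape(img.shape), centers.ravel()

# Dark background (30) with a bright "lesion" patch (220)
img = np.full((10, 10), 30, dtype=np.uint8)
img[2:8, 2:8] = 220
labels, centers = kmeans_segment(img, k=2)   # patch forms its own cluster
```

On real leaf images one would typically cluster on color (e.g. three channels per pixel) rather than a single intensity, but the assign/update loop is the same.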

Watershed: The watershed-based methods use the concept of topological interpretation.
The watershed methods consider the gradient of an image as a topographic surface. The
pixels with higher gradients are represented as continuous boundaries. The main
advantage of watershed segmentation is that it produces more stable results and the
detected boundaries are continuous. However, calculating the gradients is
computationally expensive.
Neural network-based segmentation: The artificial neural network-based segmentation
techniques work by simulating the learning strategies of the human brain for decision
making. It is used to separate the required image from the background. A neural network
is made of a large number of connected nodes and each connection has a particular
weight. This method has two basic steps: extracting features and segmentation by a
neural network. The advantage of this segmentation is that it is simple and there is no
need to write complex programs. However, it takes much time for training (Kaur & Kaur, 2014).
One of the most popular NN-based image segmentation techniques is Mask R-CNN
based segmentation.
Mask R-CNN segmentation: Mask R-CNN is an instance segmentation technique that
locates each pixel of every object in the image instead of the bounding boxes. It has two
stages: region proposals and then classifying the proposals and generating bounding
boxes and masks. It does so by using an additional fully convolutional network on top of
a CNN-based feature map with input as a feature map and gives a matrix with 1 on all
locations where the pixel belongs to the object and 0 elsewhere as the output. The
following figure shows the framework of Mask R-CNN for instance segmentation.

Figure 2. 16 Mask R-CNN framework for instance segmentation

It consists of a backbone network which is a standard CNN. The early layer of the
network detects low-level features, and later layers detect higher-level features. The
image is converted from 1024x1024px x 3 (RGB) to a feature map of shape 32x32x2048.
The Feature Pyramid Network (FPN) is an extension of the backbone network that can
better represent objects at multiple scales. It consists of two pyramids, where the
second pyramid receives the high-level features from the first pyramid and passes them
down to the lower layers. This allows every level to have access to both lower- and
higher-level features.
Mask R-CNN also uses a Region Proposal Network (RPN), which scans the FPN top to bottom
and proposes regions that may contain objects. It uses anchors, a set of boxes with
predefined locations and scales relative to the input images. Individual anchors are
assigned to the ground-truth classes and bounding boxes. The RPN generates two outputs
for each anchor: the anchor class and the bounding-box refinement. The anchor class is
either a foreground class or a background class.
Another module that is different in Mask R-CNN is the ROI Pooling. The authors of
Mask R-CNN concluded that the regions of the feature map selected by RoIPool were
slightly misaligned from the regions of the original image. Since image segmentation
requires specificity at the pixel level of the image, this leads to inaccuracies. This
problem was solved by using RoIAlign in which the feature map is sampled at different
points and then a bilinear interpolation is applied to get a precise idea of what would be at
pixel 2.93 (which was earlier considered as pixel 2 by the RoIPool).
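The bilinear interpolation that RoIAlign applies at fractional positions like 2.93 is simply a weighted average of the four surrounding grid points. A small illustrative sketch (the function name and toy feature map are ours):

```python
import numpy as np

def bilinear_sample(fmap, y, x):
    """Sample a 2-D feature map at fractional coordinates (y, x) by
    bilinear interpolation, instead of rounding to the nearest cell
    as RoIPool effectively does."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1 = min(y0 + 1, fmap.shape[0] - 1)
    x1 = min(x0 + 1, fmap.shape[1] - 1)
    dy, dx = y - y0, x - x0
    # Interpolate along x on the top and bottom rows, then along y
    top = (1 - dx) * fmap[y0, x0] + dx * fmap[y0, x1]
    bottom = (1 - dx) * fmap[y1, x0] + dx * fmap[y1, x1]
    return (1 - dy) * top + dy * bottom

fmap = np.array([[0.0, 10.0],
                 [20.0, 30.0]])
v = bilinear_sample(fmap, 0.5, 0.5)   # average of the four corners: 15.0
```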
Then a convolutional network is used which takes the regions selected by the ROI
classifier and generates masks for them. The generated masks are of low resolution
(28x28 pixels). During training, the masks are scaled down to 28x28 to compute the loss,
and during inferencing, the predicted masks are scaled up to the size of the ROI bounding
box. This gives us the final masks for every object.

2.6. Feature Extraction
Feature Extraction is one of the significant techniques in image processing. An image
feature is a distinguishing primitive characteristic or attribute of an image. One of the key
factors of image analysis is the extraction of sufficient information that leads to a
compact description of an examined image. Thus, feature extraction techniques are
applied to get the feature that will be useful in classifying and recognizing the images.
The aim is to reduce the data to a set of features that captures the valuable
information present in an image. Because the data present in an image are complex and
high-dimensional, extracting informative features from an image is a necessary step for
object recognition and segmentation. Besides lowering the computational cost, feature
extraction is also a means of controlling the so-called curse of dimensionality
(Kunaver & asic, 2005).

Feature extraction methods are classified into low-level and high-level feature
extraction. Low-level feature extraction is based on finding points, lines, edges, and
the like, while high-level methods use the low-level features to provide more
significant information for further image analysis. High-level feature extraction
methods mostly use Artificial Neural Networks (ANNs) to extract features across
multiple layers (Asogwa et al., 2007). CNNs learn data characteristics through
convolution operations, which are better suited to extracting useful information from
images.

2.7. Related Work
Many studies have been conducted recently to identify plant diseases and pests using
various algorithms and techniques, including the identification and classification of
mango diseases. The following are some of the studies we reviewed.

In (Arivazhagan & Ligi, 2018), a convolutional neural network was trained to identify
five common mango leaf diseases. A dataset of 1200 images of diseased and healthy mango
leaves was used, of which 600 were for training and 600 for testing, and data
augmentation was applied to enlarge the dataset with artificial data. Even though they
achieved a good result, CNN takes much more training time, especially when using
augmented images. Thus, to minimize the computational cost of CNN, applying
segmentation techniques and feeding the segmented image to the CNN is preferable, and
combining it with machine learning techniques increases the performance of the
identification model.

In the study of (Srdjan Sladojevic, Marko Arsenovic, Andras Anderla, Dubravko Culibrk,
& Darko Stefanovic, 2016), CNN was tested on 13 different types of plant diseases.
Although they used CNN for both feature extraction and classification of plant diseases
and achieved better results, the main gap (and a reason for the better result) is that
they used augmented images, and CNN takes more training time for identification. To
overcome this problem, we consider different segmentation techniques to select the
region of interest in the mango leaf image and input it to the CNN.

Another study, conducted by (S.Veling, Kalelkar, Ajgaonkar, Mestry, & N.Gawade, 2019),
identified four different types of mango diseases (i.e., Anthracnose, Powdery Mildew,
Red Rust, and Black Banded) and provides preventive mechanisms to the farmers via a
user interface. They used the Fast and Robust Fuzzy C-means
algorithm as their segmentation technique and Gray Level Co-occurrence Matrix
(GLCM) algorithm for feature extraction followed by SVM for classification of mango
diseases. However, GLCM features are limited to the texture and energy statistics of
the pixels. So, instead of using GLCM, CNN performs well for feature extraction as well
as for classification (Raju, 2019), because CNN replaces the feature engineering step
by automatically extracting important features from raw data or images. Thus, if we use
CNN for feature extraction, we do not need to worry about which features to select.

2.8. Summary
In this chapter, we reviewed different studies on automated identification of mango leaf
diseases using image-processing, machine learning, and deep learning techniques which
are related to our study. We tried to show and explain the performance and the limitations
of each study and how we are going to fill these gaps. To the best of our knowledge,
most existing works identify mango leaf diseases using CNN without pre-processing the
images or combining it with other machine learning algorithms. Thus, in our study, we
designed a CNN model combined with a multiclass SVM to detect and classify mango leaf
diseases and insect pests, filling the gaps in the related works. We also applied
different pre-processing and segmentation techniques and activation functions to boost
the classification performance of the proposed model.

Table 2. 1: Summary of related works

Author: Abien Fred M. Agarap
Title: An Architecture Combining Convolutional Neural Network (CNN) and Support Vector Machine (SVM) for Image Classification
Year: 2019 | Method: CNN-Softmax; CNN-SVM | Result: 91.86%; 90.72%
Gap: Images were taken under a controlled environment, and there is no pre-processing or segmentation that can boost classification performance.

Author: Syed Inthiyaz, et al.
Title: Leaf Disease Detection Using ResNet50
Year: 2020 | Method: ResNet50 | Result: 97.42%
Gap: A limited set of 1500 leaf images was used; a CNN with a limited number of images leads to overfitting.

Author: Md. Rasel Mia, Sujit Roy, Subrata Kumar Das, Md. Atikur Rahman
Title: Mango leaf disease recognition using neural network and support vector machine
Year: 2020 | Method: SVM, with GLCM for feature extraction | Result: 80%
Gap: SVM without a kernel cannot perform well for linearly non-separable data and multi-class classification.

Author: Sampada Gulavnai, Rajashri Patil
Title: Deep Learning for Image-Based Mango Leaf Disease Detection
Year: 2019 | Method: Transfer learning | Result: 91%
Gap: No pre-processing was applied.

Author: Dr. P Rama Koteswara Rao, Dr. K Swathi
Title: Mango Plant Disease Detection Using Modified Multi Support Vector Machine Algorithm
Year: 2020 | Method: MSVM | Result: 97%
Gap: No clear detail about the number of classes and kernels used; deep learning could also achieve better results.
CHAPTER THREE
3. METHODOLOGY
3.1. Introduction
This section describes the methods and processes used in our proposed model step by
step. We proposed an architecture that will process the Mango leaves using the CNN and
SVM models. We have trained the model using the images which are publicly available
real mango leaf images. The steps involved in the disease and insect pest detection are
Digital image acquisition, Image pre-processing (histogram equalization, noise removal,
image resizing, and image transformation), image segmentation, Feature extraction, and
classification.
The first phase is the image acquisition phase. In this step, the images of the various
leaves that are to be classified were taken using a digital camera. In the second phase
image pre-processing was completed. In the third phase, segmentation using K-Means
clustering was performed to discover the actual segments of the leaf in the image. Later
on, feature extraction for the infected part of the leaf was performed using CNN. Finally,
classification was done using three different classification algorithms (CNN + Softmax,
SVM + kernel, and CNN-SVM) to compare the best classification performer algorithm.

3.2. Architectural Framework of the proposed model
[Flowchart: acquired images feed parallel training and testing pipelines. Each pipeline
applies image pre-processing (resizing, histogram adjustment, de-noising, and, for
training, augmentation), image segmentation, and a final resizing; CNN feature
extraction then feeds three classifiers (CNN, SVM + kernel, and CNN+SVM), which produce
the output.]
Figure 3. 1: Architectural framework of the proposed model

3.3. Image Acquisition
Because many cutting-edge classification models need a massive amount of labelled data
to attain a better result, sufficient image acquisition is the very first and essential step that
requires capturing an image using a digital camera. However, there is no open-source
mango leaf image (i.e. healthy and diseased) data set repository. The healthy and
unhealthy mango leaf pictures were captured manually from Weramit fruit and vegetable
research and training sub-center, and Bahir Dar city. Based on the severity of the disease
we select the collection of the top three mango leaf diseases such that white mango scale,
mango anthracnose, and mango powdery mildew. More than 1200 leaf images were
captured, 300 images for mango Anthracnose, 300 images for mango Powdery Mildew,
300 images for mango White Scale, and 300 images for Healthy mango leaf, and then
image augmentation is used to increase the number of the training dataset. The dataset
containing 4000 pictures of mango leaves has been utilized for training and testing the
proposed Deep CNN model. The images were captured in early January and February
2020 with a mobile phone and a digital camera. The images were stored in JPEG format
with a resolution of 5312 x 2988.
However, the dataset used for this research was not particularly clean: the raw mango
leaf images were affected by interference and background noise, which makes the mango
leaf features complex. The intended classification model, CNN, cannot easily extract
useful features from such images directly. For the diagnosis of the mango leaf images,
we therefore performed data pre-processing to prepare the data for training the model
and to rectify the mentioned issues. Image resizing, histogram
equalization, noise removal, augmentation, and segmentation were performed on the raw
mango leaf images respectively. Then feature extraction using CNN and classification
using CNN, SVM, and CNN-SVM on raw and pre-processed data were performed
respectively. The dataset was split into 80% for training and 20% for validation/testing.
We used this ratio because it is the most commonly used ratio in neural network
applications.
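The 80/20 hold-out described above amounts to shuffling the dataset and slicing it. A minimal NumPy sketch with a toy stand-in for the four-class dataset (all names and sizes here are illustrative, ours):

```python
import numpy as np

rng = np.random.default_rng(7)

def train_test_split(samples, labels, test_ratio=0.2):
    """Shuffle the dataset and hold out test_ratio of it for testing."""
    idx = rng.permutation(len(samples))
    n_test = int(round(len(samples) * test_ratio))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return (samples[train_idx], labels[train_idx],
            samples[test_idx], labels[test_idx])

X = np.arange(4000).reshape(1000, 4)   # 1000 toy "images" of 4 features each
y = np.repeat(np.arange(4), 250)       # 4 balanced classes, as in the dataset
X_tr, y_tr, X_te, y_te = train_test_split(X, y, test_ratio=0.2)  # 800 / 200
```

Shuffling before slicing matters: it keeps every class represented in both splits rather than holding out whole classes.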

3.4. Image Pre-processing

3.4.1. Image Resizing


Because the sizes of the acquired mango leaf images for this study were different, it
was necessary to normalize the images first. The images were compressed to a certain
extent so that the subsequent classification steps could be carried out more quickly
while meeting basic clarity requirements. In this study, the mango leaf images were
resized twice. First, the original acquired images were compressed to 470x470 using
cv2.INTER_CUBIC interpolation, computed automatically by a Python script using the
OpenCV framework. We used bicubic interpolation because it produces remarkably sharper
images with a good balance of processing time and output quality compared to other
interpolation techniques. These images then passed through the subsequent
pre-processing steps of histogram equalization, image de-noising, augmentation, and
segmentation. Finally, the pre-processed images used to train the model were resized to
260x260 with the same cv2.INTER_CUBIC interpolation. In Figure 3.2 below, the image on
the left is the original and the image on the right is the resized result produced by
the OpenCV bicubic interpolation algorithm.
Input: sample image i
Output: an array of resized images
1. Start
2. image ← original image i
3. image ← resize(image, dimension, interpolation=cv2.INTER_CUBIC)
4. image ← image_to_array(image)
5. End

Figure 3. 2: Results of resized mango leaf images.
3.4.2. Histogram Equalization
Image enhancement is employed to accentuate or sharpen the image properties or features
of an image such as boundaries, edges, or contrast to make a display more recognizable.
In this study, we have used brightness-preserving adaptive histogram equalization
instead of basic histogram equalization to obtain better results.
Brightness-preserving adaptive histogram equalization works on small regions, unlike
basic histogram equalization, which works on the entire image. For disease detection in
mango leaves we require only the affected part, so an adaptive histogram is more
suitable in this case. For a color image, the histogram gives the number of times each
particular color occurs in the image. The histogram equalization is performed by first
converting the filtered image into the LAB color space. Histogram equalization is then
applied only to the luminance (L) component, leaving the A and B components unaltered.
The equalized luminance component and the unaltered A and B components are then
converted back to RGB format, so that the diseased part is enhanced for further
analysis.
The following Figure 3.3 shows an adaptive histogram equalized image (the image on the
right side).

Figure 3.3: Typical results of a histogram-equalized mango leaf image

3.4.3. Image de-noising
Due to the reflection of the mango leaves under sunlight, the instability of
the camera during shooting, and the influence of the natural environment, some
noise appears in the images. The image may also be interfered with by random
signals during the transmission process. It was thus necessary to enhance and
denoise the mango leaf images to recover the original information in the image.
There are several types of de-noising techniques for removing unwanted
fluctuations in an image; from these, we used the median filter. The median
filter is a non-linear filter that is most commonly used as a simple way to
reduce noise in an image. Its claim to fame over other noise-reduction
techniques is that it removes noise while keeping edges relatively sharp.

Figure 3.4: Typical results of the median-filtered mango leaf image

3.4.4. Data Augmentation
It is known that training deep neural networks on limited training data causes
the over-fitting problem. Data augmentation is a way to reduce over-fitting and
increase the amount of training data: it creates new images by transforming the
ones in the training dataset. Various augmentation settings were applied,
including random image flipping, padding, translation, zooming, cropping, noise
injection, rotation, and scaling. In the end, a training dataset containing
4,000 images was created using data augmentation. Both the original and the
created images were then fed to the model during training to address
over-fitting, and we observed that data augmentation increases the performance
of the model.

Figure 3.5: Original mango leaf image

Figure 3.6: Typical results of augmented mango leaf images
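A few of the transformations listed above can be sketched directly in NumPy. This is a simplified stand-in for the augmentation pipeline, not the exact settings used in the study.

```python
import numpy as np

def augment(image: np.ndarray) -> list:
    """Generate simple augmented variants: flips, a 90-degree rotation, noise."""
    noisy = np.clip(
        image.astype(np.int16) + np.random.randint(-10, 10, image.shape),
        0, 255).astype(np.uint8)
    return [
        np.fliplr(image),  # horizontal flip
        np.flipud(image),  # vertical flip
        np.rot90(image),   # rotation (square image keeps its shape)
        noisy,             # noise injection
    ]

np.random.seed(0)
img = np.random.randint(0, 256, (260, 260, 3), dtype=np.uint8)
augmented = augment(img)
```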

3.5. Image Segmentation


In this analysis, to extract the relevant information from the damaged leaf
area, three different image segmentation techniques were applied to the mango
leaf image: K-means clustering, Mask R-CNN, and a hybrid of K-means clustering
and Mask R-CNN.

3.5.1. K-means Segmentation


The first segmentation technique applied in this study was the K-means
clustering image segmentation technique, an unsupervised learning algorithm
that classifies image patches into K groups based on their attributes/features,
where K is a positive integer. Grouping is done by minimizing the sum of
squared distances between the data points and the corresponding cluster
centroids, so that observations in the same cluster are similar in some sense.
K-means clustering is simple and computationally faster than other clustering
techniques, and it also works for a large number of variables. However, it
produces different clustering results for different numbers of clusters and
different initial centroid values, so it is necessary to initialize a proper
number of clusters K and proper initial centroids. K in this problem denotes
the number of colored mango leaf lesions, which results in a perfect separation
of the infected parts, leaving the non-infected image pixels in other clusters.
The value of K is chosen at run time based on the diseased leaf image.
Pseudo code
Input: sample image i
Output: segmented images
1. Start
2. image ← original image i
3. image ← image_to_array(image)  # convert it into a 2-dimensional array
4. kmeans ← KMeans(n_clusters, random_state).fit(image)  # fit the k-means
   algorithm on this reshaped array and obtain the clusters
5. clustered_img ← image.reshape(image.shape[0], image.shape[1],
   image.shape[2])  # bring the clusters back to their original shape
6. save segmented image
7. End

Figure 3.7: Result of K-means segmentation on a mango leaf image: (a) original, (b) segmented
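The pseudocode above can be made concrete with scikit-learn's KMeans. The synthetic image and the choice K = 3 are illustrative assumptions, since the study selects K at run time from the diseased leaf image.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_segment(image: np.ndarray, k: int = 3) -> np.ndarray:
    """Cluster pixel colors into k groups and return the per-pixel label map."""
    pixels = image.reshape(-1, 3).astype(np.float64)  # flatten to (N, 3)
    km = KMeans(n_clusters=k, random_state=0, n_init=10).fit(pixels)
    return km.labels_.reshape(image.shape[:2])        # back to image shape

np.random.seed(0)
img = np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8)
labels = kmeans_segment(img, k=3)   # pixels with the same label form a segment
```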
3.5.2. Mask R-CNN
The second segmentation technique used in this study was Mask R-CNN-based
segmentation, an extension of the Faster R-CNN algorithm that adds a
convolutional network for pixel-level semantic segmentation of the regions
identified by Faster R-CNN. Because mango leaf lesions are physically smaller
than those of many other crops, applying traditional segmentation techniques
leads to errors in feature extraction. Segmentation is therefore mandatory,
both to minimize the computational time of the CNN and to isolate
representative features for identification.

Finally, we applied the combined K-means and Mask R-CNN segmentation approach
to enhance the identification accuracy.

3.6. Classification
The common structure of a CNN for image classification has two main parts: first, a long
chain of convolutional layers for feature extraction, and second a few (or even one) layers
of the fully connected neural network for classification.

For the feature extraction step, a CNN with a sequential model of multiple
layers is used. It consists of seven layers of Conv2D, ReLU, MaxPooling2D, and
a fully connected layer. To classify mango leaf diseases, the extracted
features are then fed to three classification models: CNN + Softmax, SVM +
kernel, and CNN-SVM. Image classification is applied to the raw data, the
pre-processed data, and the pre-processed segmented data with each of the three
models. The steps in the classification process are as follows:

3.6.1. Building the Model


Case I: CNN + Softmax
Features are extracted from the preprocessed dataset to reduce its dimension
and to allow other classification models, such as SVM, to be used on it. The
convolutional neural network architecture with the sequential model is
implemented with multiple layers, such as convolutional, activation, and
max-pooling layers, to extract important features from the mango leaf image.
Keras, with TensorFlow as the backend, is used to construct this model layer by
layer; the add() function adds each layer. The Conv2D layer takes input images
as 2-dimensional matrices, and a dense layer produces the output. The kernel
(filter matrix) size was 3 x 3. Tanh, ReLU, Leaky ReLU, and Swish were used as
activation functions for this model. The size of the input images is
(260, 260); that is, the height and width of the images should be 260.

Pooling layers reduce the number of parameters when the images are too large.
Here, MaxPooling2D with a stride of 2 is applied to the feature map to
down-sample the convolved image. We used MaxPooling2D because, in practice, it
has been found to perform better than average pooling for image classification;
it produces a down-sampled (pooled) feature map highlighting the most relevant
feature in each patch. A flattening layer then converts the three-dimensional
feature maps into a single dimension, followed by two fully connected dense
layers, the last of which uses a Softmax activation function to output the
highest-likelihood classification.
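A model of this shape can be sketched in Keras as follows. The filter counts per layer are assumptions; the 3x3 kernels, 2x2 max pooling with stride 2, the (260, 260) input size, and the final Softmax dense layer follow the description above, and the compile parameters match those given in Section 3.6.2.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(260, 260, 3)),
    layers.Conv2D(16, (3, 3), activation="relu"),      # 3x3 kernel, ReLU
    layers.MaxPooling2D(pool_size=(2, 2), strides=2),  # down-sample by 2
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2), strides=2),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2), strides=2),
    layers.Flatten(),                                  # 3-D maps -> 1-D vector
    layers.Dense(128, activation="relu"),
    layers.Dense(4, activation="softmax"),             # healthy + 3 diseases
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

probs = model.predict(np.zeros((1, 260, 260, 3)), verbose=0)  # one dummy image
```

The softmax output is a probability distribution over the four classes, so each row of `probs` sums to one.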

Case II: SVM + Kernel


In the second case, to use the SVM classifier on top of the CNN features, we
removed the last dense layer containing the Softmax activation function, and
the extracted features were saved to a CNNfeatures1.csv file. This CSV file was
then input to the SVM classifier.
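One way to realize this, sketched with the Keras functional API on deliberately small shapes (the layer sizes here are illustrative assumptions): define a second model that stops at the penultimate dense layer, then write its outputs to CNNfeatures1.csv.

```python
import numpy as np
import pandas as pd
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(64, 64, 3))
x = layers.Conv2D(8, (3, 3), activation="relu")(inputs)
x = layers.MaxPooling2D(2)(x)
x = layers.Flatten()(x)
feat = layers.Dense(16, activation="relu")(x)       # penultimate layer = features
out = layers.Dense(4, activation="softmax")(feat)   # softmax head to be dropped

classifier = keras.Model(inputs, out)   # full CNN classifier
extractor = keras.Model(inputs, feat)   # same weights, without the softmax layer

images = np.zeros((5, 64, 64, 3))                   # dummy batch of 5 images
features = extractor.predict(images, verbose=0)     # shape (5, 16)
pd.DataFrame(features).to_csv("CNNfeatures1.csv", index=False)
```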

3.6.2. Compiling the Model


Case I: CNN + Softmax

The parameters used to compile the CNN model are the optimizer, the loss, and
the metrics. 'Adam' was used as the optimizer; it adjusts the learning rate
successfully throughout training. 'Categorical cross-entropy' was used as the
loss function. The 'accuracy' metric reports the accuracy score on the
validation set during training, making the results easier to interpret.

Case II: SVM + Kernel

The support vector machine (SVM) is a linear binary classifier. Its goal is to
find a hyperplane that separates the training data correctly into two
half-spaces while maximizing the margin between the two classes. Although the
SVM is a linear classifier that can only handle linearly separable datasets
directly, a kernel trick can be applied to make it work in the non-linearly
separable case. We therefore applied the RBF kernel, the most commonly used
kernel besides the linear one.

To read the CNN-extracted features from the CSV file, we used the read_csv()
method of the pandas library, and then divided the data into attributes and
labels using the drop() method. The data were split into training and
validation sets using an 80:20 ratio with the train_test_split method of
scikit-learn.
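This workflow can be sketched with scikit-learn. Synthetic features stand in for the CNN-extracted CSV, the 80:20 split follows the text, and the particular C and gamma values are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

np.random.seed(0)
X = np.random.rand(200, 16)        # stand-in for CNN-extracted features
y = np.random.randint(0, 4, 200)   # four class labels

# 80:20 split into training and validation sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

clf = SVC(kernel="rbf", C=1.0, gamma="scale")  # RBF kernel over CNN features
clf.fit(X_train, y_train)
preds = clf.predict(X_test)
```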

3.6.3. Training and Testing the Model


Case I: CNN + Softmax

The fit() function was used to train the CNN model. Since it is the most widely
used split in CNN applications, we used an 80/20 training/test ratio. The
time() function was used to measure the time needed for training. The batch
size and the number of epochs, which control how the model loops through the
data, were set in the fit function. After training had been completed, the
testing process was run to check the effectiveness of the trained CNN model.

Case II: SVM + Kernel


The hyper-parameters for the SVM include the type of kernel and the
regularization constant C. However, since we used the RBF kernel in our
experiment, there was an additional parameter, gamma (γ), selecting which
radial basis function to use. We chose this kernel function because it can map
the input space into an infinite-dimensional space, so it performs better on
linearly non-separable data than the other kernels.

The parameter C controls regularization. C is the penalty parameter
representing the misclassification (error) term; it tells the SVM optimization
how much error is tolerable, and thereby controls the trade-off between the
decision boundary and the misclassification term. A smaller value of C allows a
larger-margin hyperplane at the cost of more misclassifications, while a larger
value of C forces a smaller-margin hyperplane that classifies more training
points correctly. We tried to find the optimal values of C and gamma at which
the best performance is obtained. The SVM is a binary classifier, but the
one-vs-all or one-vs-one approach can be used to make it a multi-class
classifier. The fit method of the SVC class is then called to train the
algorithm on the training data, and the predict() method of the SVC class is
used to make predictions on the test set.
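The search for good C and gamma values can be sketched with scikit-learn's GridSearchCV; the grid values and the synthetic data are illustrative assumptions, not those used in this study. Note that SVC handles the multi-class case internally via one-vs-one.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic multi-class data standing in for the CNN features.
X, y = make_classification(n_samples=120, n_features=8, n_classes=3,
                           n_informative=4, random_state=0)

# Cross-validated grid search over the two RBF-SVM hyper-parameters.
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=3)
search.fit(X, y)
best = search.best_params_   # e.g. the (C, gamma) pair with best CV accuracy
```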

CHAPTER FOUR
4. EXPERIMENT, RESULT AND DISCUSSION
4.1. Introduction
In this chapter, the experimental evaluation of the proposed model for mango
disease and insect pest detection using enhanced imaging and machine learning
techniques is described in detail. The experimental evaluation confirms the
realization of the proposed model or architecture. The dataset used and the
implementation of the proposed model are described thoroughly. Test results at
the training and validation phases are compared. The effect of different
segmentation techniques and activation functions is evaluated and compared with
models that disregard these techniques. In addition, the CNN test results are
presented and compared with the SVM and combined CNN-SVM classification models.

4.2. Experimental Setup


4.2.1. Dataset
For this research, mango leaf images labelled by plant disease experts were
captured using a digital camera. In the image acquisition phase, both normal
and diseased pictures were taken; we captured images of mango leaves with three
types of abnormalities (white mango scale, anthracnose, and powdery mildew).
To boost the quantity of the dataset and the quality of the classification
model, image augmentation was applied to the original images. All images are in
JPEG format with a resolution of 5312 x 2988 pixels. The total dataset input to
the CNN was 4,000 images, 1,000 images per class.
Table 4.1: Dataset description

No.  Grade of mango leaf disease  Source             Resolution   Image format  Quantity
1    Healthy                      Manually captured  5312 x 2988  JPG           1000
2    White mango scale            Manually captured  5312 x 2988  JPG           1000
3    Anthracnose                  Manually captured  5312 x 2988  JPG           1000
4    Powdery mildew               Manually captured  5312 x 2988  JPG           1000

4.2.2. Implementation
In this section, the performance of the proposed models was evaluated using the
data described in the previous section. The models were developed using Keras
(with TensorFlow as the backend, under Python 3.7) in Anaconda Spyder. The
computing environment for these experiments was Google Colab and a laptop
computer with an Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz-2.90GHz and 8.00 GB
of RAM, using the scikit-learn, TensorFlow, Keras, Pillow, and OpenCV libraries
of the Python programming language. The model was trained for 40 epochs with a
batch size of 32 and an initial learning rate of 0.001 (1e-3). The data were
partitioned into training and testing datasets such that 80 percent of the data
was allotted for training the model and 20 percent for testing. Allocating 80%
of the dataset for training is close to optimal for reasonably sized datasets
(greater than 100 images).
4.3. Result and Discussion
Two tests, using the raw (without pre-processing) and the pre-processed
augmented mango leaf images, were carried out to verify the performance of the
three learned models. To evaluate the proposed model, the test results of the
training and validation phases were examined. As we recorded the accuracy and
loss of the model per epoch in the tables and plots below, the gap between
training and validation accuracy indicates the amount of over-fitting. In the
first phase, different healthy and infected mango leaf images, selected from
the dataset folder, were input for classification. In the end, the expected
classified outcome was found.

Case I: Validation accuracy of classification models with raw mango leaf images

Table 4.2: CNN classification analysis with raw mango leaf images

Activation function  CNN (%)  SVM + RBF (%)  Combined (CNN + SVM) (%)
Tanh                 92.75    95.2           96.12
ReLU                 92       94.91          95.75
Leaky ReLU           95.23    95.53          96.32

In the figures below, Figure 4.1 shows the validation accuracy of CNN-SVM on
mango leaf disease and insect pest identification using the raw image dataset,
while Figure 4.2 shows the validation loss. At 40 steps, the models were able
to finish training in 1 hour and 58 minutes. The CNN-Softmax model had an
average validation accuracy of 95.23%; the SVM-RBF model had an average
validation accuracy of 95.52% and an average validation loss of 0.236794931;
and the CNN-SVM model had an average validation accuracy of 96.32% and an
average validation loss of 0.268976859.

Figure 4.1: Training and validation accuracy of CNN-SVM on raw mango leaf images

Figure 4.2: Training and validation loss of CNN-SVM on raw mango leaf images
Case II: Validation accuracy of classification models with preprocessed mango leaf
images

Figure 4.3: Validation accuracy of CNN-SVM on preprocessed mango leaf images

Figure 4.4: Validation loss of the CNN-SVM classifier on the preprocessed dataset


Figure 4.3 shows the validation accuracy of CNN-SVM on mango leaf disease and
insect pest identification using the pre-processed segmented dataset, while
Figure 4.4 shows the validation loss. At 40 steps, the models were able to
finish training in 1 hour and 24 minutes. The CNN-Softmax model had an average
validation accuracy of 96.58%; the SVM-RBF model had an average validation
accuracy of 97.37% and an average validation loss of 0.13594931; and the
CNN-SVM model had an average validation accuracy of 99.78% and an average
validation loss of 0.028976859.

Table 4.3: Classification analysis of the CNN model with different activation functions on a preprocessed image dataset

CNN (%)     k-means  Mask R-CNN  Combined (k-means + Mask R-CNN)
Tanh        93.025   96.012      95.523
ReLU        94.89    94.99       96.01
Leaky ReLU  95.7     96.01       96.58

Table 4.4: Classification analysis of the SVM + RBF model with different activation functions on a preprocessed image dataset

SVM + RBF (%)  k-means  Mask R-CNN  Combined (k-means + Mask R-CNN)
Tanh           93.915   95.53       96.13
ReLU           94.6     96.02       96.9
Leaky ReLU     96.62    97.01       97.37

Table 4.5: Classification analysis of the combined (CNN + SVM) model with different activation functions on a preprocessed image dataset

Combined (CNN + SVM) (%)  k-means  Mask R-CNN  Combined (k-means + Mask R-CNN)
Tanh                      96.025   96.51       98.23
ReLU                      96.89    96.9        98.9
Leaky ReLU                97.62    98.01       99.78

After 40 training steps, the models were tested on the test cases of each
dataset. The tables above show the validation accuracies of CNN-Softmax,
SVM-RBF, and CNN-SVM on image classification using the raw and preprocessed
segmented datasets. CNN-SVM with preprocessed images achieved an accuracy of
99.78% using the Leaky ReLU activation function and the combined segmentation
technique. On the raw dataset as well, it was CNN-SVM that had better
classification accuracy than CNN-Softmax and SVM-RBF. The results clearly show
that the CNN-SVM classifier model with Leaky ReLU activation works successfully
and can detect all diseased and healthy images with the highest validation
accuracy.

CHAPTER FIVE
5. CONCLUSION AND RECOMMENDATION
5.1. Conclusion
Mango (Mangifera indica) is one of Ethiopia's most delicious and valuable
cultivated fruit crops. It is shipped to many countries as raw or mature fruit,
and also as processed consumables such as mature mango slices, juice, and raw
mango pickle. Mango is high in vitamins A and C and has high medicinal
qualities in herbal medicine; mango leaves are also used during rituals, as
they are active against gram bacteria with antibiotic activity. Recently,
however, the market value of Ethiopian mango has declined due to unregulated
pesticide use. Image processing and deep learning have a significant role in
the early identification of diseases and insect pests, which helps control the
use of hazardous pesticides. In this study, we proposed an enhanced mango
disease and insect pest detection system using a hybrid of image processing and
deep learning. We collected our leaf image dataset using a digital camera from
the main mango production areas of the Amhara Region, such as the Weramit fruit
and vegetable research and training sub-center in Bahir Dar city. We then
pre-processed the images to obtain an enhanced image dataset. Next, we applied
the K-means and Mask R-CNN segmentation techniques to the pre-processed images
to extract the region of interest. To increase model performance, we also
compared four activation functions: Tanh, ReLU, Leaky ReLU, and Swish. A CNN
was then used for feature extraction and classification, and two additional
models, SVM and combined CNN-SVM, were compared. Based on the accuracy achieved
by these models, we can conclude that the hybrid CNN-SVM model with the Leaky
ReLU activation function has the best performance for the identification of
mango leaf diseases and insect pests, with a validation accuracy of 99.78%,
over the CNN and SVM models.

5.2. Recommendation
The mango leaf disease and insect pest identification models proposed in this
research work can be used for the detection of other plant diseases. We
conclude that the proposed model meets the current requirements, although some
problems need extra work, and this research can be further improved or extended
to identify related plant diseases. We therefore see two key points that remain
a challenge and, of course, limit this research work. First, three segmentation
techniques were applied in this study to improve classification efficiency;
future work considering other deep learning methods for segmentation may
achieve better results in detecting plant diseases. Second, the best model
could be built for mobile devices to monitor mango leaf diseases and insect
pests from real-time mango leaf photos, which would lead to increased
high-quality mango production.
