
Predicting Disease Percentage Optimally in Leaf Variants using Machine Learning Algorithms


V. Suresh Kumar (1st), K. Tarunika (2nd), K. Maharaja (3rd), G. Meenakshi (4th), S. Prakash (5th)
Department of Information Technology
Vel Tech Multi Tech Dr. Rangarajan Dr. Sakunthala Engineering College
Chennai, India
V. Suresh Kumar: ORCID 0000-0002-1708-1498
K. Tarunika: email tarunikakumar09@gmail.com
K. Maharaja: email maharaja7251@gmail.com
G. Meenakshi: email meenakshi1706g@gmail.com
S. Prakash: email prakash.sm0811@gmail.com

Abstract - Global agriculture is seriously threatened by plant diseases, which affect both food security and economic stability. Real-time detection enables a quick response that prevents the spread of disease and protects agricultural production. Achieving high-accuracy, low-latency detection is important but can be challenging, particularly for computationally intensive models. Plant leaf image datasets vary in size, yet existing studies typically carry out their analysis with a single algorithm, which may not be suitable for all datasets. This work develops an automated method for the early detection of diseases affecting plant leaves. Four distinct machine learning algorithms are used: Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Convolutional Neural Network (CNN), and Decision Tree. As MATLAB is a capable tool for numerical computing, we used it to obtain results with the highest precision possible. The system allows several analyses to be run, so that the best technique can be chosen according to the requirements and the kind of dataset. The proposed system reports the findings of all four algorithms, analyzes diseases of different plant leaves, and predicts the percentage of diseased leaves. As a result, we can select the algorithm best suited to identifying diseases in plant leaves at an early stage. By analyzing crop images, it is possible to identify even the smallest signs of disease, allowing timely intervention to halt the disease's course. If these interventions can be lessened, sustainable agriculture may develop in harmony with the environment.

Keywords - Machine Learning, Support Vector Machine, Decision Tree, Convolutional Neural Networks, K Nearest Neighbor, Agriculture, Leaf Disease Analysis

I. INTRODUCTION
Agriculture is the foundation of our society, supporting people across the globe by providing food and a means of subsistence. This essential industry faces a significant challenge in the form of plant diseases, which can devastate crops and lead to food shortages and economic losses. Traditional methods of disease detection and management often fall short in terms of precision and effectiveness. In this era of data-driven decision-making, where computational power and innovation flourish, we delve into how ML algorithms, which can provide precision, speed, and scalability, revolutionize the way we identify, mitigate, and prevent plant diseases. By harnessing the capabilities of these intelligent systems, we aim not only to safeguard global food supplies but also to improve agricultural sustainability, minimize the waste of resources, and provide the foundation for a more robust and profitable future for agriculture. The strength of machine learning (ML) algorithms, combined with the advent of cutting-edge technologies in recent years, has provided a viable remedy. This research effort, "Determining Optimal Machine Learning Algorithm for Predicting Disease Percentage in Leaf Variants", explores how the fusion of AI and agriculture can revolutionize disease detection, prevention, and ultimately the sustenance of the global food supply. Through this paper, we endeavor to shed light on a transformative approach that holds the potential to enhance crop yields, reduce resource wastage, and foster a more resilient and sustainable agricultural future.

II. RELATED WORKS
Pranesh Kulkarni et al. provide a thorough assessment of the numerous studies carried out in the area of plant disease assessment using both deep learning and handcrafted feature-based algorithms [1]. The work first addresses the difficulties in identifying plant diseases with methods based on handcrafted features. The difficulties with handcrafted-feature approaches are then resolved by applying deep learning-based techniques. The work marks a shift in the identification of plant diseases from manually built feature-based models to deep learning-based models. It finds that although deep learning-based techniques achieve exceptional precision on a particular dataset, evaluating the system over several datasets or under field image conditions may result in a considerable decline in the model's performance.

Madiwalar, Shririoop and Medha propose a machine vision technique for identifying diseases in plants using colour images of mango leaves [2]. During the testing phase, the classifier is fed with a feature
vector comprising textural and color attributes of the input images, which was created using the YCbCr-converted image. For the extraction of texture and colour features, GLCM, a color-based approach, and the Gabor filter were employed. The outcomes of the Minimum Distance Classifier and the Support Vector Machine (SVM) were compared, and the techniques were analyzed to determine the specific outcomes for each feature extraction technique.

Bharate et al. argue that increased plant production and efficient growth are essential for the Indian economy and for farmers' profit margins [3]. To achieve this, farmers require subject matter specialists for manual plant monitoring. Domain specialists are costly, since farmers must pay fees that include travel expenses, and they are not accessible in all areas. This calls for an effective, intelligent farming method that promotes greater development and yield with fewer labor-intensive processes. The study reviews image processing techniques that have been developed by different researchers to identify disease in plants.

Another study examines the variables that make it difficult for small-scale farmers to get timely, correct, and appropriate farming knowledge and assistance, especially when it comes to managing plant diseases through in-depth leaf diagnostics [4]. To tackle this problem, the study suggests creating a mobile solution that makes use of image interpretation methods, convolutional neural networks, and the rapidly advancing field of machine learning, particularly supervised learning. The proposed prototype is founded on the Design Science Research Methodology, which offers an organized method for developing a solution tailored to the unique requirements of farmers [5]. By combining state-of-the-art technology, the prototype intends to provide farmers with an accessible platform for rapidly and accurately identifying plant diseases, enabling them to take preventative action to protect their crops and livelihoods.

Three iterations of usability testing are to be carried out to evaluate the effectiveness and usability of the suggested prototype [6]. These tests determine whether the prototype satisfies the farmers' requirements for timeliness, relevance, and accuracy. Through iterative testing and refining, the study aims to make sure that the solution closely matches the realistic requirements and limitations experienced by small-scale farmers in actual agricultural settings [7]. The ultimate objective is to produce a mobile application that not only offers thorough advice on controlling plant diseases but also fits easily into farmers' current work processes, improving their capacity to make sound decisions and optimize agricultural practices for increased sustainability and productivity.

III. METHODOLOGY

A. Data Collection
For this project we used a public dataset for plant leaf disease detection with a total of 41676 images. It is segregated into 5 types of plants, each divided into healthy and diseased subfolders [19]. The diseased images are further classified by their specific disease names, as shown in Fig. 1.

Fig. 1. Different types of leaves

Fig. 2. Dataset Collection

B. Image Preprocessing
Images are loaded from directories. Before they are ready for a Convolutional Neural Network (CNN), the preprocessing steps include resizing images to a fixed dimension (256x256) using a custom function, computing histograms for colour features, and using Local Binary Patterns (LBP) for texture feature extraction after converting images to grayscale. Normalizing pixel values is a standard method in CNNs for better convergence during training, although it is not stated explicitly in these implementations. In general, preprocessing includes loading images, calculating features (such as color histograms or LBP features), and, where needed, scaling images to uniform input dimensions. Normalization is a standard procedure, particularly in deep learning applications.
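
As a rough illustration of the two hand-crafted features named in this subsection, the following Python sketch computes a concatenated colour histogram with 256 bins per channel and a basic 8-neighbour LBP histogram. The paper's implementation is in MATLAB and is not reproduced here; the function names `color_histogram`, `lbp_code` and `lbp_histogram`, the nested-list image representation, the neighbour ordering and the `>=` comparison are all assumptions made for this example.

```python
# Illustrative sketch only; not the authors' MATLAB code.
# Assumed image layout: rgb[row][col] = (r, g, b) and gray[row][col] = int, values 0..255.

BINS = 256  # one bin per 8-bit intensity value

def color_histogram(rgb):
    """Return the red, green and blue histograms concatenated into one vector."""
    hist = [0] * (3 * BINS)
    for row in rgb:
        for r, g, b in row:
            hist[r] += 1             # red channel occupies bins 0..255
            hist[BINS + g] += 1      # green channel occupies bins 256..511
            hist[2 * BINS + b] += 1  # blue channel occupies bins 512..767
    return hist

# Clockwise 8-neighbourhood starting at the top-left neighbour (an assumed ordering).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(gray, row, col):
    """Compare one pixel with its 8 neighbours and pack the results into one byte."""
    center = gray[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if gray[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(gray):
    """Histogram of LBP codes over all interior pixels (border pixels are skipped)."""
    hist = [0] * 256
    for row in range(1, len(gray) - 1):
        for col in range(1, len(gray[0]) - 1):
            hist[lbp_code(gray, row, col)] += 1
    return hist

# A hypothetical 2x2 image: two red pixels, one green, one blue.
tiny_rgb = [[(255, 0, 0), (255, 0, 0)], [(0, 255, 0), (0, 0, 255)]]
features = color_histogram(tiny_rgb)
print(len(features))  # 768
```

Either histogram can serve directly as a fixed-length feature vector for a classical classifier, which is why neither requires the images to share a size.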
C. Image Segmentation
Explicit image segmentation, the deliberate process of dividing an image into several regions to simplify its representation, is not performed. Rather, feature extraction and classification are the main areas of interest for plant disease identification. Each implementation handles image processing as follows. For every RGB channel, color histograms are calculated to show the distribution of pixel intensities. Similarly, texture information is recovered from grayscale images using Local Binary Patterns (LBP), which compare each pixel's intensity to that of its neighbors. Images are supplied to the CNN architecture at a fixed size of 256 × 256 so that its convolutional layers can learn hierarchical features and implicitly identify patterns and structures in the images.

Fig. 3. Segmented Images of an Apple Leaf

D. Feature Extraction
The implementations use different feature extraction methods for processing and analyzing visual data to detect plant diseases. To characterize color properties efficiently, color histograms describe images by the distribution of pixel intensities across the RGB channels. As a texture descriptor, Local Binary Patterns (LBP) identify local patterns in grayscale images and offer information on textural changes that are important for identifying diseases. Furthermore, a Convolutional Neural Network (CNN) architecture is used, which allows automatic feature extraction during training via convolutional and pooling layers. Each technique, whether histogram-based, texture-oriented, or deep learning-driven, offers unique benefits and advances the thorough examination of plant images for the purpose of identifying diseases.

Fig. 4. Extracted Features from Diseased Guava Leaves

E. Model Training
To prepare the dataset for model training, images are loaded from pre-designated directories, and either pertinent features such as color histograms or Local Binary Patterns (LBP) are extracted, or Convolutional Neural Networks (CNNs) are employed to learn hierarchical representations of the images. The dataset is divided into training and validation sets, and the model iteratively optimises its parameters using optimisation techniques such as stochastic gradient descent (SGD) or its variants. Training aims to minimise a specified loss function, which measures the deviation between predicted and ground truth labels. Training proceeds in epochs, in which the whole dataset is run through the model repeatedly until convergence or until a predetermined stopping criterion is satisfied.

F. Model Evaluation
Metrics including accuracy, precision, recall, and F1-score are commonly employed in model evaluation, which measures the trained model's performance on unseen data. To calculate these metrics, the trained model is fed the validation or test dataset during the evaluation phase. The trained model then generates predictions, which are evaluated against ground truth labels. Further information about the model's performance across various classes or thresholds can be obtained from confusion matrices and receiver operating characteristic (ROC) curves. Model evaluation is essential for assessing the model's capacity for generalization and pinpointing possible areas for improvement.

Accuracy is the ratio of the number of correctly predicted labels to the total number of samples in the test set. Mathematically, the accuracy is computed using the following formula:

\[ \text{Accuracy} = \frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}} \]

This is accomplished in MATLAB by calculating the mean of the logical array that results from comparing the predicted and true labels element by element.

G. Model Fine-Tuning
Model fine-tuning is a post-training process that seeks to enhance the model's performance by modifying hyperparameters, improving the architecture, or adding more regularization methods. Experimenting with various configurations, such as alternative learning rates, batch sizes, or dropout rates, is a common step in fine-tuning models in order to improve their behavior and resolve overfitting or underfitting concerns.
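
The evaluation metrics of Section F can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB code: the function names `accuracy` and `precision_recall_f1`, the choice of label 1 as the positive (diseased) class, and the five example labels are assumptions made for the example. The accuracy function mirrors the description in Section F: it is the mean of the element-wise comparison between predicted and true labels.

```python
# Illustrative sketch only; labels are assumed to be 0 (healthy) and 1 (diseased).

def accuracy(predicted, actual):
    """Mean of the element-wise comparison between predicted and true labels."""
    matches = [p == a for p, a in zip(predicted, actual)]
    return sum(matches) / len(matches)

def precision_recall_f1(predicted, actual, positive=1):
    """Precision, recall and F1-score for the class given by `positive`."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == positive and a == positive)
    fp = sum(1 for p, a in pairs if p == positive and a != positive)
    fn = sum(1 for p, a in pairs if p != positive and a == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical labels for five test images.
predicted = [1, 0, 1, 1, 0]
actual = [1, 0, 0, 1, 1]

print(accuracy(predicted, actual))  # 0.6
precision, recall, f1 = precision_recall_f1(predicted, actual)
```

Tracking these metrics on a validation set while varying one hyperparameter at a time is one simple way to judge whether a fine-tuning change helps.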
Fig. 5. Proposed Methodology

IV. ALGORITHMS

A. Support Vector Machine (SVM)
The support vector machine (SVM) is a strong machine learning technique for regression, outlier detection, and both linear and nonlinear classification. SVMs can handle high-dimensional data and nonlinear relationships, and they are versatile and effective in a large number of applications. They work particularly well when the task is to find the maximum-margin separating hyperplane between the classes of the target variable.

The system uses color histograms taken from leaf images to train a Support Vector Machine (SVM) model that classifies leaves into healthy and unhealthy categories. First, the code uses the imageDatastore function to load images from two directories, the Healthy and Diseased folders, creating a structured repository of image data for analysis. The images are labeled 0 for healthy leaves and 1 for diseased leaves. The next stage is feature extraction, which is the computation of color histograms for every image. To do this, each image is first divided into its red, green, and blue channels, and a histogram with 256 bins is created for each channel. These histograms capture the color distribution inside each image and so describe its underlying color composition. The feature vectors used by the SVM classifier are created by concatenating the three histograms. The SVM model, trained with the fitcsvm function and a linear kernel, learns from these feature vectors to differentiate between healthy and diseased leaves. After the model has been trained, the algorithm predicts the labels of mixed test images from a specified folder: color histograms are generated from these images in the same way, and the SVM model uses those features to predict labels. Lastly, the code provides a quantitative evaluation of disease prevalence by calculating the proportion of diseased leaves in the mixed test images. This workflow demonstrates how machine learning techniques may be integrated to detect plant diseases, and it highlights how important feature extraction and model training are to automating agricultural processes and improving crop health monitoring.

Fig. 6. Colour Histogram Comparison for SVM

The fitcsvm function, which is part of MATLAB's Statistics and Machine Learning Toolbox, is based on the principles of Support Vector Machines (SVMs).

The goal of an SVM is to find the hyperplane that best divides the classes in the feature space. The training dataset is $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$, where $x_i$ is a feature vector and $y_i$ is the corresponding class label (either +1 or -1).

The SVM's decision function can be written as:

\[ f(x) = \operatorname{sign}(w \cdot x + b) \]

where,
• $x$ is the input feature vector,
• $w$ is the weight vector perpendicular to the hyperplane,
• $b$ is the bias term,
• $\cdot$ denotes the dot product,
• $\operatorname{sign}(\cdot)$ is the sign function.

The optimization task for an SVM is to find the $w$ and $b$ that maximize the margin between the classes while minimizing the classification error. This can be formulated as a constrained optimization problem:

\[ \min_{w,\, b,\, \xi} \; \frac{1}{2} \lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i \]

subject to the constraints:

\[ y_i (w \cdot x_i + b) \ge 1 - \xi_i, \quad i = 1, 2, \dots, n \]

\[ \xi_i \ge 0, \quad i = 1, 2, \dots, n \]

Here,
• The regularization parameter $C$ controls the trade-off between maximizing the margin and minimizing the classification error.
• $\xi_i$ are slack variables that permit some degree of misclassification: $\xi_i = 0$ indicates a point classified correctly on or outside the margin, $\xi_i > 0$ indicates a margin violation, and $\xi_i > 1$ indicates a misclassified point.

B. Convolutional Neural Network (CNN)
Convolutional Neural Networks (CNNs) are a type of deep learning algorithm that excels at tasks involving visual identification and interpretation. A CNN consists of several layers: Pooling layers are used to down-sample the feature maps and retain the most crucial data. The output of the pooling layers is then transmitted via one or more fully
connected layers, which are employed to predict or analyse the image. Convolutional layers, which make up most of a CNN, apply filters to the source image to extract elements such as textures and shapes.

The system follows a step-by-step procedure for training a convolutional neural network (CNN) to distinguish healthy from unhealthy leaves in images. The code first specifies the folder locations for images of healthy, diseased, and mixed leaves. Then, using the imageDatastore function and a custom function called resizeImage, it loads images from the Healthy and Diseased directories and resizes them to a fixed size of 256x256 pixels. To enable more efficient data handling and processing, the data from both folders are merged into a single datastore.

Next, it defines the CNN's architecture as a series of convolutional, batch normalisation, ReLU activation, max pooling, and fully connected layers, followed by SoftMax and classification layers. A deep learning model composed of these layers is able to learn intricate structures and characteristics from the input images.

It then uses the trainingOptions function to set up the training settings, including the maximum number of epochs, the initial learning rate, and the validation data used to track the model's performance during training. The CNN model is trained with the stochastic gradient descent with momentum (SGDM) optimizer.

The CNN model is then trained using the trainNetwork function, which takes the supplied images, the defined layers, and the specified training options. The trained CNN model, net, is obtained once training is finished.

After training, the code loads images from the Mixed folder and resizes them using the imageDatastore and resizeImage functions. Using the classify function, the trained CNN model (net) then predicts the label (healthy or diseased) of each mixed image. The code counts the number of diseased leaves based on the predicted labels and determines the proportion of diseased leaves among all the images in the Mixed folder. In the mixed leaf dataset, this proportion is used as a quantitative indicator of disease prevalence.

Fig. 7. Comparison of Epochs

C. K-Nearest Neighbor (KNN)
The K-Nearest Neighbor (KNN) algorithm is a reliable and easy-to-use machine learning technique for classification and regression problems. KNN predicts the label or value of a new data point from its K nearest neighbors in the training dataset, based on the idea of similarity. Because it is non-parametric, that is, it makes no assumptions about the distribution of the data, it is broadly applicable to real-world situations (as opposed to other methods, like GMM, which presume that the supplied data have a Gaussian distribution) [13]. We are given a prior data set of attribute vectors (the training data), which allows new data points to be classified into groups.

The system categorises healthy and diseased leaves using the K-Nearest Neighbours (KNN) method based on colour histograms extracted from leaf images. The code first specifies paths to three folders, labelled Healthy, Diseased, and Mixed, which hold images of healthy leaves, diseased leaves, and a combination of the two, respectively. It then uses the dir function to retrieve file information from the Healthy and Diseased folders and the imread function to read the images into memory. Colour histograms are calculated for both healthy and diseased images, each showing the distribution of pixel intensities in the red, green, and blue channels. These histograms are then concatenated to create the feature vectors that form the KNN classifier's input data. Images are then given labels: 0 for healthy images and 1 for diseased images. The KNN classifier is trained using an array made up of all of these labels. The classifier is trained using the fitcknn function with five as the number of nearest neighbours (k). Once the model has been trained, images from the Mixed folder are loaded, and each image's colour histogram is produced using the same procedure. The trained KNN classifier then predicts labels for the mixed images based on their colour histogram features. The predicted labels for diseased leaves are summed and divided by the total number of images in the Mixed folder to give the percentage of diseased leaves among the mixed images. This percentage indicates the prevalence of diseased leaves in the mixed leaf dataset.

Fig. 8. HOG Comparison for K-Nearest Neighbor (KNN)

A K-Nearest Neighbors (KNN) classifier is trained using the fitcknn function. KNN is an easy-to-understand method that is employed in classification tasks. The mathematical formulation of the KNN algorithm is explained below.

The KNN algorithm operates as follows, given a training dataset $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$, where $x_i$ is a feature vector and $y_i$ is the matching class label:

Training:
• During the training phase, the KNN algorithm simply stores all the training data.

Prediction:
• The algorithm determines the distances between a new, unlabeled data point $x_{\text{test}}$ and each of the training data points.
• After calculating the distances, the algorithm chooses the k nearest neighbors (the closest data points) to $x_{\text{test}}$.
• It then assigns a class label to $x_{\text{test}}$ based on the majority class among its k nearest neighbors.
• If k = 1, the new data point is assigned the class of its single nearest neighbor.

The mathematical formulation of KNN, especially for the prediction phase, involves calculating distances between data points. One common distance measure is the Euclidean distance:

\[ d(x_{\text{test}}, x_{\text{train}}) = \sqrt{\sum_{i=1}^{n} \left( x_{\text{test}}[i] - x_{\text{train}}[i] \right)^{2}} \]

where $n$ is the number of features, and $x_{\text{test}}[i]$ and $x_{\text{train}}[i]$ are the $i$th components of the feature vectors of the test and training data points, respectively. Note that $n$ here denotes the number of features, not the number of training samples as above.

D. Decision Tree
The Decision Tree algorithm is one of the most widely used supervised learning methods for regression and classification. It builds a tree structure that resembles a flowchart, with an attribute test at each internal node, a test result on each branch, and a class label at each leaf node (terminal node). The training data is repeatedly divided into subsets based on the values of the attributes until a stopping criterion is met, such as the maximum depth of the tree or the minimum number of samples needed to split a node. During training, the Decision Tree algorithm determines which feature is best for dividing the data by measuring the degree of impurity or unpredictability in the subsets using a metric such as entropy or Gini impurity. The goal is to identify the attribute that maximizes the impurity reduction, or the information gain, following the split.

Fig. 9. LBP Comparison for Decision Tree

The mathematical formulas used for building a decision tree classifier, with their descriptions, are given below.

Entropy and Information Gain:

Entropy ($H(S)$):
• Entropy measures the impurity or disorder in a dataset.
• Formula:

\[ H(S) = -p_0 \log_2(p_0) - p_1 \log_2(p_1) \]

where $p_0$ and $p_1$ are the proportions of class 0 and class 1 instances in the dataset $S$.

Information Gain ($IG(S, A)$):
• Information gain is the reduction in entropy obtained by splitting the dataset on a specific attribute.
• Formula:

\[ IG(S, A) = H(S) - \sum_{\vartheta \in \text{values}(A)} \frac{|S_\vartheta|}{|S|} H(S_\vartheta) \]

where $S_\vartheta$ represents the subset of $S$ for which feature $A$ has value $\vartheta$.

Gini Impurity ($G(S)$):
• The Gini impurity measures the likelihood that a randomly selected element of the dataset would be classified incorrectly if it were labeled at random according to the class distribution.
• Formula:

\[ G(S) = 1 - \sum_{i=1}^{C} p_i^{2} \]

where $p_i$ is the probability of selecting a sample of class $i$ at random from dataset $S$, and $C$ is the number of classes.

The system uses the Decision Tree method to identify healthy and unhealthy leaves from Local Binary Patterns (LBP) features extracted from leaf images. The code first specifies paths to three folders, labelled Healthy, Diseased, and Mixed, which hold images of healthy leaves, diseased leaves, and a combination of the two, respectively. Next, it uses the imageDatastore function to read images from the Healthy and Diseased folders and arranges them into a datastore for effective data management. The rgb2gray function is used to convert RGB images to grayscale before LBP features are extracted from the grayscale images. The textural information in the images is captured by the Local Binary Patterns (LBP) features, which offer important insights into the
underlying structures and patterns. The images are then given labels: 1 for healthy images and 0 for diseased images. The extracted features and their labels are combined into a single dataset on which the Decision Tree classifier is trained. The feature-label pairs are used to train the decision tree model using the fitctree function. After training, the same process is used to extract LBP features from the images in the Mixed folder. The trained decision tree model is then applied to predict labels for the mixed images based on their LBP features. The predicted labels for diseased leaves are summed and divided by the total number of images in the Mixed folder to give the percentage of diseased leaves among the mixed images. In the mixed leaf dataset, this proportion is used as a quantitative indicator of disease prevalence.

V. ANALYSIS OF PLANT LEAVES

A. Potato Leaves
We used a total of 7128 images from the dataset, of which 2280 are images of healthy potato leaves and 4848 are images of diseased leaves. The diseased leaves are further classified into specific diseases: Early Blight with 2424 images and Late Blight with 2424 images.

Using our system, we carried out the analysis with four different algorithms and obtained the results below.

[Bar chart: POTATO LEAVES DISEASES; diseases EARLY_BLIGHT, LATE_BLIGHT; algorithms SVM, KNN, DT, CNN]
Fig. 10. Algorithm Comparison for Potato Leaves

\[ \text{Percentage of Diseased Leaves} = \frac{\text{No. of Diseased Leaves}}{\text{Total Number of Leaves}} \times 100 \]

Here the total number of leaves for a disease is the number of leaves with that disease plus the number of healthy leaves.

\[ \text{Percentage of Early Blight} = \frac{2424}{4704} \times 100 = 51.53\% \]

\[ \text{Percentage of Late Blight} = \frac{2424}{4704} \times 100 = 51.53\% \]

The calculation shows that, for the Potato Leaves dataset, Early Blight and Late Blight each account for 51.53%, which matches the SVM and CNN results. We therefore conclude that the SVM and CNN algorithms can be used to predict the disease percentage for potato leaves.

B. Apple Leaves
We used a total of 9714 images from the dataset, of which 2510 are images of healthy apple leaves and 7204 are images of diseased leaves. The diseased leaves are further classified into specific diseases: Apple Scab with 2520 images, Black Rot with 2484 images, and Apple Cedar Rust with 2200 images.

Using our system, we carried out the analysis with four different algorithms and obtained the results below.

[Bar chart: APPLE LEAVES DISEASES; diseases CEDAR RUST, APPLE SCAB, BLACK ROT; algorithms SVM, KNN, DT, CNN]
Fig. 12. Algorithm Comparison for Apple Leaves

\[ \text{Percentage of Diseased Leaves} = \frac{\text{No. of Diseased Leaves}}{\text{Total Number of Leaves}} \times 100 \]

\[ \text{Percentage of Scab Leaves} = \frac{2520}{5030} \times 100 = 50.10\% \]

\[ \text{Percentage of Black Rot Leaves} = \frac{2484}{4994} \times 100 = 49.74\% \]
2200 Percentage of Diseased Leaves =
Percentage of Cedar Rust Leaves = = 46.71% No of Diseased Leaves
4710
Total Number of Leaves
By calculation we have proved, according to the dataset of
Apple Leaves, Scab Leaves constitute about 50.10%, Black 500
Rot of about 49.74% and Cedar Rust of about 46.71%. So Percentage of Anthracnose Leaves = = 50.00%
1000
we conclude that SVM algorithm can be used for Scab and
CNN for Cedar Rust leaves as well as for Black Rot disease
leaves to predict the disease percentage.
500
Percentage of Bacterial Canker Leaves = =
1000
50.00%
C. Mango Leaves
We have used a total of 3000 images from the dataset out of 500
which 500 belongs to healthy images of mango leaves and Percentage of Die Back Leaves = = 50.00%
2500 images belongs to diseased leaves. In diseased leaves
1000
we have further classified the images into specific diseases
such as Anthracnose which comprises of 500 images, 500
Percentage of Gall Midge Leaves = = 50.00%
Bacterial Canker of 500 images, Die Back of 500 images, 1000
Gall Midge of 500 images and 500 images of Powdery
Mildew. 500
By implementing our system, we have done the analysis Percentage of Powdery Mildew Leaves = =
1000
by using four different algorithms and derived subsequent
50.00%
result below.
Fig. 13. Algorithm Comparison for Mango Leaves (chart "MANGO LEAVES DISEASES": Anthracnose, Bacterial Canker, Die Back, Gall Midge and Powdery Mildew compared across SVM, KNN, DT and CNN; vertical axis from 0.00 to 60.00)

The calculation shows that, in the Mango Leaves dataset, diseased leaves constitute about 50% for every type of disease. We therefore conclude that the CNN algorithm can be used for Anthracnose, Die Back, Gall Midge and Powdery Mildew, SVM for Powdery Mildew, and the Decision Tree algorithm for Bacterial Canker, to predict the disease percentage.

D. Grapes Leaves
We used a total of 9027 images from the dataset, of which 2115 are healthy grape leaves and 6912 are diseased leaves. The diseased leaves are further classified into specific diseases: Black Rot (2360 images), Black Measles (2400 images) and Leaf Blight (2152 images).
Using our system, we analyzed the dataset with four different algorithms and obtained the results below.
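Before turning to the results, the grape image counts quoted above can be cross-checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and pairing each disease class with the healthy images to form a per-class total is our reading of how the totals in this subsection are built.

```python
healthy = 2115
diseased_by_class = {
    "Black Rot": 2360,
    "Black Measles": 2400,
    "Leaf Blight": 2152,
}

# The class counts should add up to the stated number of diseased images,
# and together with the healthy images to the stated dataset size.
diseased_total = sum(diseased_by_class.values())
dataset_total = healthy + diseased_total

# Assumed per-class total: images of that disease plus the healthy images.
per_class_total = {name: n + healthy for name, n in diseased_by_class.items()}

print(diseased_total)
print(dataset_total)
print(per_class_total)
```

The sums reproduce the 6912 diseased images and the 9027 images overall, and the per-class totals come out as 4475, 4515 and 4267.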

Fig. 14. Algorithm Comparison for Grapes Leaves (chart "GRAPES LEAVES DISEASES": SVM, KNN, DT and CNN compared for Black Rot, Black Measles and Leaf Blight; vertical axis from 0 to 60)

$$\text{Percentage of Diseased Leaves} = \frac{\text{No. of Diseased Leaves}}{\text{Total Number of Leaves}}$$

$$\text{Percentage of Black Rot Leaves} = \frac{2360}{4475} = 52.74\%$$

$$\text{Percentage of Black Measles Leaves} = \frac{2400}{4515} = 53.16\%$$

$$\text{Percentage of Leaf Blight Leaves} = \frac{2152}{4267} = 50.43\%$$

Each total is the number of images of that disease plus the 2115 healthy images (for example, 2360 + 2115 = 4475).

The calculation shows that, in the Grapes Leaves dataset, Black Rot leaves account for about 52.74%, Black Measles about 53.16% and Leaf Blight about 50.43%. We therefore conclude that SVM and CNN can be used for all three diseases to predict the disease percentage.

E. Tomato Leaves
We used a total of 11807 images from the dataset, of which 10661 show diseased tomato leaves and 1146 show healthy tomato leaves. The diseased leaves are further classified into specific diseases: Bacterial Spot (1532 images), Early Blight (720 images), Late Blight (1376 images), Leaf Mold (686 images), Spider Mites (1208 images), Target Spot (1012 images), Mosaic Virus (269 images) and Yellow Leaf Curl (3858 images).
Using our system, we analyzed the dataset with four different algorithms and obtained the results below.

Fig. 15. Algorithm Comparison for Tomato Leaves (chart "TOMATO LEAVES DISEASES": Bacterial Spot, Early Blight, Late Blight, Leaf Mold, Spider Mites, Target Spot, Mosaic Virus and Yellow Leaf Curl compared across SVM, KNN, DT and CNN; vertical axis from 0.00 to 90.00)

$$\text{Percentage of Diseased Leaves} = \frac{\text{No. of Diseased Leaves}}{\text{Total Number of Leaves}}$$

$$\text{Percentage of Bacterial Spot Leaves} = \frac{1532}{2678} = 57.21\%$$

$$\text{Percentage of Early Blight Leaves} = \frac{720}{1866} = 38.59\%$$

$$\text{Percentage of Late Blight Leaves} = \frac{1376}{2522} = 54.56\%$$

$$\text{Percentage of Leaf Mold Leaves} = \frac{686}{1832} = 37.45\%$$

$$\text{Percentage of Spider Mites Leaves} = \frac{1208}{2354} = 51.32\%$$

$$\text{Percentage of Target Spot Leaves} = \frac{1012}{2158} = 46.90\%$$

$$\text{Percentage of Mosaic Virus Leaves} = \frac{269}{1415} = 19.01\%$$

$$\text{Percentage of Yellow Leaf Curl Leaves} = \frac{3858}{5004} = 77.10\%$$

The calculation shows that, in the Tomato Leaves dataset, Bacterial Spot leaves account for about 57.21%, Early Blight about 38.59%, Late Blight about 54.56%, Leaf Mold about 37.45%, Spider Mites about 51.32%, Target Spot about 46.90%, Mosaic Virus about 19.01% and Yellow Leaf Curl about 77.10%. We therefore conclude that SVM can be used for Mosaic Virus, Leaf Mold, Bacterial Spot, Early Blight and Target Spot, and Decision Tree for Late Blight, whereas CNN can also be used for Bacterial Spot, Leaf Mold, Spider Mites, Mosaic Virus and Yellow Leaf Curl to predict the disease percentage.

VI. CONCLUSION
The prompt detection of plant diseases remains a significant obstacle in agriculture, with major effects on crop output and food security. With the help of machine learning (ML)-based systems, farmers and researchers can now identify and contain disease outbreaks more accurately. By examining the application of ML models to plant disease diagnostics and focusing on the careful inspection of leaf conditions, this work adds to the growing body of research in this area. Despite significant progress, the lack of a dependable and reasonably priced commercial instrument for disease diagnosis emphasizes the continuing need for innovation in this field. To close this gap, our research presents a methodical investigation of machine learning-based techniques, opening the door to improved disease management and surveillance.

A paradigm shift in crop management is anticipated with the incorporation of ML algorithms into current agricultural techniques, which offer more proactive and data-driven methods.

Ensuring the quality of agricultural products and controlling crop infections require early diagnosis of plant diseases. The application of ML models has dramatically changed how disease is understood in agriculture and has created promising new opportunities to monitor and prevent it. This study examines several methods for identifying and categorizing plant diseases, including a study of both healthy and diseased leaves.

We have demonstrated the disease percentages within various datasets. In the Potato Leaves dataset, Early Blight and Late Blight diseases collectively account for 51.53%, aligning precisely with the SVM and CNN algorithms. Consequently, we deduce that both SVM and CNN effectively predict disease percentages for potato leaves. Analysis of the Apple Leaves dataset reveals that Scab leaves constitute 50.10%, Black Rot 49.74% and Cedar Rust 46.71%. As a result, we conclude that SVM is suitable for predicting Scab, while CNN is optimal for Cedar Rust and Black Rot. In the Mango Leaves dataset, diseased leaves constitute approximately 50% across all types of disease. Thus, CNN can effectively predict Anthracnose, Die Back, Gall Midge and Powdery Mildew, while SVM is recommended for Powdery Mildew and Decision Tree for Bacterial Canker. In the Grapes Leaves dataset, Black Rot accounts for roughly 52.74%, Black Measles for around 53.16% and Leaf Blight for about 50.43%. Considering these findings, we conclude that SVM and CNN can effectively predict the percentages for all three diseases. Lastly, analysis of the Tomato Leaves dataset reveals that Bacterial Spot accounts for 57.21%, Early Blight 38.59%, Late Blight 54.56%, Leaf Mold 37.45%, Spider Mites 51.32%, Target Spot 46.90%, Mosaic Virus 19.01% and Yellow Leaf Curl 77.10%. Consequently, SVM is deemed suitable for predicting Mosaic Virus, Target Spot, Bacterial Spot, Leaf Mold and Early Blight, Decision Tree for Late Blight, and CNN for Bacterial Spot, Leaf Mold, Spider Mites, Mosaic Virus and Yellow Leaf Curl.
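The recommendations above pair each disease with the algorithm whose output aligns with the dataset percentage. One way to make that comparison explicit is sketched below in Python, assuming the choice is made by the smallest absolute difference from the dataset percentage. The reference values are two of the tomato percentages reported in this paper; the per-algorithm predictions are hypothetical placeholders (the values behind the comparison charts are not reproduced here), and `closest_algorithms` and its `tolerance` parameter are names introduced only for illustration.

```python
def closest_algorithms(reference, predictions, tolerance=0.0):
    """Return the algorithms whose prediction is nearest to the reference.

    Algorithms within `tolerance` percentage points of the best
    absolute error are returned together, sorted by name.
    """
    errors = {algo: abs(pred - reference) for algo, pred in predictions.items()}
    best = min(errors.values())
    return sorted(algo for algo, err in errors.items() if err - best <= tolerance)


# Dataset percentages reported for two tomato diseases.
reference = {"Late Blight": 54.56, "Yellow Leaf Curl": 77.10}

# HYPOTHETICAL predicted percentages, for illustration only.
predicted = {
    "Late Blight": {"SVM": 52.90, "KNN": 51.75, "DT": 54.50, "CNN": 53.80},
    "Yellow Leaf Curl": {"SVM": 75.20, "KNN": 74.00, "DT": 76.10, "CNN": 77.05},
}

recommendation = {
    disease: closest_algorithms(reference[disease], predicted[disease])
    for disease in reference
}
print(recommendation)
```

A non-zero `tolerance` lets several algorithms be recommended for the same disease when their predictions are nearly equally close, which is how more than one algorithm can end up listed for a single disease.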
