Report
Titled
Presented by:
Hammami Darine
Sourour Rahaf
General introduction
Chapter I: State of the Art
Introduction
1.1 Previous Work
1.2 Functional Requirements
1.3 Technical Requirements
1.4 Method for Using CNN
1.5 Development Environment
1.6 Programming Language
Chapter II:
Introduction
2.1 Data Source
2.2 Data Preprocessing
2.3 Methods Tested
2.4 Realization
2.5 Comparison of Models
Proposed Solutions for Disease
Conclusion
Perspectives
List of figures
General introduction
Agriculture is a vital sector that directly impacts food security and economic stability.
However, the increasing prevalence of plant diseases poses a significant threat to crop yields
and quality, leading to financial losses for farmers and challenges in ensuring global food
supplies. Early and accurate detection of plant diseases is therefore crucial to mitigate these
risks and enable timely interventions.
Traditional methods of disease detection, such as manual inspection by experts, are often
time-consuming, expensive, and prone to errors due to human limitations. The advent of
artificial intelligence, particularly machine learning and computer vision, has revolutionized
this field by offering automated, efficient, and reliable solutions for plant disease detection.
These technologies allow for the rapid analysis of plant images, identifying symptoms of
diseases with high precision.
This project aims to develop a machine learning-based system for detecting plant diseases
using a labeled dataset of plant leaf images. By leveraging Convolutional Neural Networks
(CNNs) alongside other machine learning methods like K-Nearest Neighbors (KNN) and
Random Forest (RF), we assess the effectiveness of these models in accurately classifying
healthy and diseased leaves. Through a systematic approach involving data preprocessing,
model training, and performance evaluation, we identify the most suitable method for
practical agricultural applications.
In the first chapter, we will explore the state of the art in plant disease detection,
including an overview of previous work, functional and technical requirements, and the tools
and methods used for this project. In the second chapter, we will discuss the dataset
collection and preprocessing steps, followed by the realization and comparison of different
machine learning models to determine the most effective approach for disease detection.
Chapter I: State of the Art
Introduction
Agriculture plays a pivotal role in ensuring food security and driving economic growth.
However, plant diseases significantly threaten crop productivity and quality, leading to
substantial economic losses and challenges in meeting global food demands. Accurate and
timely detection of plant diseases is crucial to mitigate these impacts and implement effective
solutions. Traditional methods of identifying plant diseases, such as manual inspection, are
labor-intensive, costly, and prone to errors, necessitating innovative approaches to address
these limitations.
In the first chapter, we will delve into the state of the art in plant disease detection,
reviewing previous research, identifying requirements, and describing the tools and methods
employed in this study. The second chapter will cover dataset collection and preprocessing,
the development and evaluation of machine learning models, and proposed solutions for
mitigating plant diseases such as rust and powdery mildew.
1.1 Previous Work
The field of plant disease detection has seen remarkable progress with the advent of
machine learning, particularly deep learning models. Among these, CNNs have emerged as
the most practical approach due to their ability to extract intricate patterns from image data. A
notable study by Shoibam Amritraj et al., titled "An Automated and Fine-Tuned Image
Detection and Classification System for Plant Leaf Diseases," demonstrated how fine-tuned
CNN models achieve a balance between accuracy and real-time performance. These models
were tailored to detect specific diseases using custom datasets, making them highly suitable
for agricultural applications.
Another significant contribution by Sangeetha et al., in their work titled "A Novel Smart
Approach to Plant Health - Automated Detection and Diagnosis of Leaf Diseases," employed
a combination of CNN, ResNet, and YOLOv7-X architectures. Their system effectively
analyzed diverse datasets to identify diseases such as rust and powdery mildew. This hybrid
approach achieved an accuracy of 85%, showcasing the potential of deep learning for
precision agriculture.
These requirements describe what the system must accomplish to meet its objectives:
Deep Learning Model: Use of a CNN as the base model for object detection.
Balanced and Representative Dataset:
The model was trained using a loss function based on cross-entropy for classification tasks,
with the Adam optimizer to adjust the network weights. A validation set was used to evaluate
the model’s performance during training.
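To make the loss mentioned above concrete, the following is a minimal sketch of categorical cross-entropy computed on a softmax output, written in plain Python. The class names and score values are hypothetical and chosen only for illustration; the report's actual training code is not reproduced here, and the Adam update step is left out.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_index, eps=1e-12):
    """Categorical cross-entropy for one sample: -log(probability of the true class)."""
    return -math.log(max(probs[true_index], eps))

# Hypothetical three-class example (names and scores are illustrative assumptions).
classes = ["healthy", "powdery", "rust"]
logits = [2.0, 0.5, 0.1]
probs = softmax(logits)

loss_if_healthy = cross_entropy(probs, true_index=0)  # the network favours this class
loss_if_rust = cross_entropy(probs, true_index=2)     # the network gave this class little weight
```

The loss is small when the predicted probability of the true class is high and grows as that probability shrinks, which is the signal an optimizer such as Adam uses to adjust the weights.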
Filters in CNNs:
Definition: A filter (or kernel) is a small matrix (e.g., 3×3 or 5×5) used to perform
convolution operations on the input (image or feature map).
Function:
o Extract specific features (edges, textures, patterns) from the input image.
o Each filter detects a particular type of pattern in the data.
Learned Parameters: The filter values are learned during training through
backpropagation.
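The filter operation described above can be sketched in a few lines of plain Python. This is an illustrative example only: the 5×5 image and the hand-written vertical-edge filter (Prewitt-style) are assumptions, whereas in a trained CNN the filter values would be learned.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products.

    This is the cross-correlation form used by CNN convolutional layers.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output

# Toy 5x5 grayscale image: dark on the left (0), bright on the right (1).
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-written vertical-edge filter; a CNN would learn such values during training.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)
# feature_map == [[3, 3, 0], [3, 3, 0], [3, 3, 0]]
```

The filter responds strongly where the dark-to-bright edge falls inside its window and gives zero over the uniform bright region.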
Layers in CNNs:
Layers structure the CNN architecture and utilize filters to transform the data at each stage.
a. Convolutional Layer:
b. Activation Layer:
Role: Introduces non-linearity into the network (e.g., with a ReLU function).
Connection to Filters: Activations are applied after convolution to emphasize
extracted features.
c. Pooling Layer:
Role: Reduces the size of the feature maps generated by the filters (e.g., through
MaxPooling or AveragePooling).
Effect: Helps retain important features while reducing computational complexity.
o If a layer has N filters, its output will have a depth (or number of channels) equal
to N.
o The output of one layer becomes the input for the next layer.
o Deeper layers capture more complex features based on the simpler features
extracted by earlier layers.
o Filters are adjusted during training to capture the most relevant features for the task
(e.g., classification, detection).
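As a small illustration of the activation and pooling stages described above, the sketch below applies ReLU and then 2×2 max pooling to a toy feature map in plain Python. The input values are made up for the example.

```python
def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling with window and stride equal to `size`.

    Trailing rows or columns that do not fill a whole window are dropped.
    """
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled

# Toy 4x4 feature map, as might come out of one convolution filter.
feature_map = [
    [1, -2, 3, 0],
    [-1, 5, -3, 2],
    [0, -4, 2, -1],
    [6, -2, -5, 4],
]

activated = relu(feature_map)
pooled = max_pool(activated, size=2)
# pooled == [[5, 3], [6, 4]]
```

Each filter in a layer yields one such map, so a layer with N filters produces N pooled maps stacked along the channel dimension.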
Advantages of CNN:
Automatic Feature Extraction: CNNs automatically learn and extract features from raw
images, eliminating manual feature engineering.
Efficiency in Image Processing: CNNs process large images in smaller regions,
improving speed and efficiency.
End-to-End Training: CNNs learn directly from input images to output labels,
improving accuracy.
Scalability: CNNs scale well with large datasets, ideal for tasks like plant disease
detection.
Jupyter:
Project Jupyter is a project to develop open-source software, open standards, and services
for interactive computing across multiple programming languages. (Project Jupyter -
Wikipedia)
Figure 2: Python logo
Libraries:
TensorFlow:
TensorFlow is a software library for machine learning and artificial intelligence. It can be
used across a range of tasks, but is used mainly for training and inference of neural networks.
(TensorFlow - Wikipedia)
Keras:
Keras is an open-source library that provides a Python interface for artificial neural
networks. Keras began as independent software, was then integrated into the TensorFlow
library, and later added support for further backends. (Keras - Wikipedia)
OpenCV:
OpenCV is a library of programming functions mainly for real-time computer vision,
including image processing and support for deep learning models. (OpenCV - Wikipedia)
Chapter II:
Introduction
The success of any machine learning project relies heavily on the quality and diversity of
the dataset used. For this plant disease detection project, the dataset serves as the foundation
for training, validating, and testing the models. By carefully selecting and processing the data,
we ensure that the machine learning algorithms can effectively learn patterns and make
accurate predictions.
The dataset, sourced from Kaggle, comprises 1,570 labeled images of plant leaves. These
images represent a range of healthy and diseased conditions across various plant species.
Proper organization of the data into training, validation, and test sets enables a systematic
approach to model development and performance evaluation.
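For a dataset that is not already divided, a three-way split could be sketched as follows. The 80/10/10 proportions, the fixed seed, and the placeholder file names are assumptions made for this example; they are not the proportions used in this project.

```python
import random

def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle the items and split them into training, validation, and test lists.

    The fractions are illustrative defaults; whatever remains after the
    training and validation portions goes to the test set.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Placeholder file names standing in for the labeled leaf images.
image_paths = [f"leaf_{i:04d}.jpg" for i in range(100)]
train, val, test = split_dataset(image_paths)
```

Keeping the three lists disjoint matters: the validation set guides training decisions, and the test set is only used for the final accuracy figures.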
In this chapter, we discuss the origins of the dataset, the preprocessing steps undertaken to
prepare it for analysis, and the methods used to extract meaningful insights. Additionally, we
explore the importance of preprocessing techniques, such as normalization and data
augmentation, which play a crucial role in improving model accuracy and generalization.
Finally, we detail the implementation and testing of different machine learning methods,
culminating in a comparative analysis of their effectiveness.
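The two preprocessing techniques named above, normalization and data augmentation, can be illustrated with a minimal plain-Python sketch. The tiny 2×3 "image" and the choice of a horizontal flip as the augmentation are assumptions for the example; the augmentations actually applied in this project are not specified here.

```python
def normalize(image):
    """Scale 8-bit pixel values from the range [0, 255] to [0.0, 1.0]."""
    return [[pixel / 255.0 for pixel in row] for row in image]

def horizontal_flip(image):
    """Mirror the image left to right; the class label stays the same."""
    return [list(reversed(row)) for row in image]

# Tiny grayscale image used only for illustration.
image = [
    [0, 128, 255],
    [64, 32, 16],
]

normalized = normalize(image)
augmented = horizontal_flip(normalized)
```

Normalization keeps input values in a small, consistent range, while a flipped copy gives the model an additional, equally valid view of the same leaf.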
For this project, we used a dataset from Kaggle, a data science platform. The dataset
contains 1572 images of plant leaves, labeled as either healthy or diseased. These images are
divided into three sets:
Data preprocessing is a crucial step to ensure the model learns effectively. The
preprocessing steps included:
Resizing Images: All images were resized to a uniform size of 128x128 pixels to optimize
computation time.
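The resizing step can be sketched with nearest-neighbor sampling in plain Python. This is only an illustration of the idea under the assumption of a single-channel image stored as nested lists; in practice a library routine such as OpenCV's `cv2.resize` would normally be used.

```python
def resize_nearest(image, new_h, new_w):
    """Resize a 2D list of pixels by picking the nearest source pixel for each target pixel."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[(i * old_h) // new_h][(j * old_w) // new_w] for j in range(new_w)]
        for i in range(new_h)
    ]

# Synthetic 200x300 grayscale image standing in for a leaf photo.
source = [[(r * 10 + c) % 256 for c in range(300)] for r in range(200)]

# Bring it to the uniform 128x128 size used as model input.
resized = resize_nearest(source, 128, 128)
```

A fixed input size is required because the fully connected layers of the network expect feature maps of constant dimensions.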
CNN (Convolutional Neural Network): The CNN model performed the best, achieving an
accuracy of 91% on the test set.
KNN (K-Nearest Neighbors): The KNN model, while effective for simpler datasets,
achieved only 57% accuracy.
Random Forest (RF): The Random Forest model performed moderately well, achieving
79% accuracy.
2.4 Realization
The implementation involved training three different models: CNN, KNN, and Random
Forest.
CNN: The CNN model, consisting of convolutional, pooling, and fully connected layers,
achieved 91% accuracy on the test set.
KNN: The KNN model achieved 57% accuracy, underperforming on the dataset due to
its inability to capture complex patterns.
Random Forest: The RF model achieved 79% accuracy, showing better performance
than KNN but falling short of CNN.
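To show why KNN struggles with complex image patterns, it helps to see how little the method does: it only compares distances between feature vectors. The sketch below is a minimal plain-Python version; the two-dimensional feature vectors and the value k=3 are hypothetical, whereas a real run would use flattened pixel values or extracted features.

```python
import math
from collections import Counter

def knn_predict(train_vectors, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training vectors."""
    distances = [
        (math.dist(vector, query), label)
        for vector, label in zip(train_vectors, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors (illustrative values only).
train_vectors = [
    [0.10, 0.20], [0.20, 0.10], [0.15, 0.15],  # healthy samples
    [0.90, 0.80], [0.80, 0.90], [0.85, 0.85],  # diseased samples
]
train_labels = ["healthy"] * 3 + ["diseased"] * 3

prediction = knn_predict(train_vectors, train_labels, [0.2, 0.2], k=3)
```

Because the decision rests on raw distances, two leaves with the same disease but different lighting or position can end up far apart, which is consistent with the low accuracy reported for KNN.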
Environment Setup
Capture 1: Screenshot of the Python environment (e.g., Jupyter Notebook) with necessary
libraries installed.
Data Loading
Figure 4: Capture 2 (code snippet and screenshot showing dataset loading and splitting)
Model Development
Model Evaluation
Figure 6: Capture 4 (confusion matrix for CNN predictions)
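A confusion matrix of this kind, and the accuracy derived from it, can be computed as in the following plain-Python sketch. The label names and the six example predictions are invented for illustration and are not the project's results.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count predictions: rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, predicted in zip(y_true, y_pred):
        matrix[index[truth]][index[predicted]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Illustrative labels and predictions (not the project's data).
labels = ["healthy", "diseased"]
y_true = ["healthy", "healthy", "diseased", "diseased", "diseased", "healthy"]
y_pred = ["healthy", "diseased", "diseased", "diseased", "healthy", "healthy"]

matrix = confusion_matrix(y_true, y_pred, labels)
acc = accuracy(matrix)
# matrix == [[2, 1], [1, 2]]
```

Off-diagonal cells show which classes are confused with each other, information that a single accuracy figure hides.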
2.5 Comparison of Models
The results of the three models were compared using the following metrics:
Figure 8: Results of the Random Forest model
Model type: CNN; KNN; Random Forest
Accuracy: CNN 91%; KNN 57%; Random Forest 79%
Real-time performance: CNN high; KNN low; Random Forest medium
Hardware dependency: Moderate
Accessibility: CNN medium (requires training); KNN high; Random Forest high
Not practical
Moderately practical
Advantages: CNN: high accuracy, effective feature extraction; KNN: simple, easy to implement; Random Forest: interpretable, moderate computational needs
Disadvantages: CNN: computationally expensive; KNN: poor accuracy on large, complex datasets; Random Forest: less effective with high-dimensional data
Proposed Solutions for Disease
Rust Disease
Rust is a fungal infection that causes reddish-brown spots on plant leaves, often
leading to leaf drop and stunted growth. Here are some solutions to manage rust:
1. Remove Infected Leaves: Cut off and dispose of infected leaves to prevent the spread
of the disease.
2. Space Plants: Provide adequate spacing between plants to enhance airflow and reduce
humidity around the leaves.
3. Avoid Overhead Watering: Water the base of the plants instead of overhead to
reduce moisture on leaves, which promotes fungal growth.
4. Control Watering: Avoid overwatering the plants. Keep the soil slightly moist
but not soggy.
5. Use Milk Solution: Mix one part whole milk with nine parts water and spray on the
affected leaves to control fungal growth.
6. Baking Soda Spray: Mix one tablespoon of baking soda, one tablespoon of liquid
soap, and one liter of water. Spray this mixture on leaves to fight the rust infection.
7. Apply Fungicide: Use a rust-specific fungicide for more severe cases to stop further
damage.
Powdery Mildew
1. Remove Affected Leaves: Cut off and destroy infected leaves to reduce the spread of
spores.
2. Improve Air Circulation: Ensure proper spacing between plants to prevent humidity
buildup and improve airflow.
3. Use Fungicides: Apply sulfur-based fungicides or neem oil to the affected areas.
4. Maintain Proper Watering: Avoid direct water contact with leaves. Water at the
base to reduce the chance of infection.
Conclusion
The detection and diagnosis of plant diseases are crucial for ensuring agricultural
productivity and food security. This project demonstrated the potential of machine learning,
particularly Convolutional Neural Networks (CNNs), to provide efficient and accurate
solutions for identifying plant diseases. By leveraging a high-quality dataset from Kaggle and
employing preprocessing techniques such as normalization and data augmentation, we were
able to enhance the performance of our models.
The comparative analysis of the CNN, KNN, and Random Forest models revealed that
CNN significantly outperformed the others, achieving an accuracy of 91%. This success
highlights the superiority of deep learning for handling complex image-based tasks due to its
ability to automatically extract intricate features and learn spatial hierarchies.
While the results are promising, there are opportunities for further improvement.
Expanding the dataset, exploring advanced deep learning architectures, and integrating the
model into mobile or IoT platforms could enhance the system's scalability and accessibility
for real-world applications.
Perspectives
While the current solution offers significant potential, there are several avenues to further
enhance its applicability and impact:
Developing a fully integrated system where IoT sensors and cameras continuously
monitor crops and provide real-time disease detection and analysis.
Combining disease detection with other environmental factors such as weather and soil
data can improve decision-making for farmers by predicting disease outbreaks and
recommending preventative actions.
User-Friendly Applications
Creating mobile or web-based applications with a simple interface where farmers can
upload leaf images and receive instant disease diagnoses and treatment suggestions.