Contemporary Machine Learning Applications in Agriculture
DOI: 10.1002/cpe.6940
RESEARCH ARTICLE
Atif Mahmood1 Amod Kumar Tiwari1 Sanjay Kumar Singh2 Sandeep S. Udmale3
KEYWORDS
agriculture, classification, deep learning, machine learning, recognition
1 INTRODUCTION
Agriculture plays a vital role in the world’s economy. The steady growth of population and the shrinking of farmland are now a matter of concern
for the food supply of individuals across the globe. To prevent a future food crisis, people must either expand
agrarian land or improve crop production through precision farming. However, the advancement of farm appliances and the emergence
of computational techniques, viz. machine learning, have brought a new ray of hope. Machine learning has risen with the advancement of big data and has
become pervasive in every field due to its robustness in applications.1,2 In recent decades, machine learning has attracted researchers’
attention for maximizing food production and minimizing the cost and effort of farming.3 Researchers have applied machine learning approaches
to various aspects of agriculture, such as crop growth estimation, weed recognition, crop disease identification, crop or food classification, food
quality recognition, and so forth, to improve agricultural productivity. A large number of machine learning-based review articles are
available in the scientific databases, but the majority of them are limited to individual subcategories of agriculture. Recent review work relating
Concurrency Computat Pract Exper. 2022;e6940. wileyonlinelibrary.com/journal/cpe © 2022 John Wiley & Sons, Ltd. 1 of 22
https://doi.org/10.1002/cpe.6940
to some of the trending agricultural sectors, such as disease detection, is analyzed in Reference 4. Similarly, crop yield prediction works
are analyzed in Reference 2, livestock farming works in Reference 5, crop quality evaluation works in Reference 6, and
weed detection works in Reference 7. The only article that covers as many as nine agricultural subcategories is
Reference 8. Also, there are some subcategories where little survey work has been done, particularly soil observation, water monitoring,
and species breeding. In light of the present status of available review works relating to the agricultural sector, we have attempted to
survey as many subcategories of the agricultural sector as possible in one place. The following research questions (RQs) have been
identified to perform the systematic review.
RQ3. What other tools or techniques are employed along with ML models/algorithms?
RQ5. Was the functional performance of the employed ML models/algorithms compared with that of other ML models/algorithms?
This paper aims to present a systematic survey of contemporary applications of ML in agricultural frameworks and to discover
where the field is heading in the near future, by analyzing researchers’ recent contributions to agriculture, the utilization of multiple ML models/algorithms,
and the applicability of freely available agricultural datasets. The findings demonstrate the different ML techniques and concepts used to benefit precision
agriculture and explore future expectations and prospects of ML techniques for the advancement of the agricultural system. A number of
abbreviations are used in the paper; those related to ML algorithms/models and to statistical/general context are specified in Table 1 and Table 2,
respectively.
The papers investigated were sorted into four broad categories: (a) field condition management, which includes the subcategories soil observation
and water monitoring; (b) crop management, which includes the subcategories yield prediction, disease recognition, weed spotting, and crop quality
evaluation; (c) livestock management, which includes the subcategories livestock welfare and livestock production; and (d) species management, which includes the
subcategories species breeding and species recognition.9,10 Figure 1 shows the categories and subcategories of ML in agricultural frameworks.
The outcomes of each analyzed work are presented by highlighting the applied ML strategies, database details, problem
description, results, and the limitations or future scope.
The significance of our review can be summarized as follows:
1. Primarily, this paper endeavors to identify the applicability of ML to diverse agriculture-related problems and to sort them into relevant categories
and subcategories.
2. Most surveys so far are devoted to a specific or limited category where ML is being applied. This gives a narrow
scope for understanding the research on agriculture and food systems that utilizes ML. Accordingly, our survey addresses multiple
possible research areas in one place for a better perception of the scope and sub-areas of agriculture and food systems where ML can
be applied.
3. As per our investigation, although some researchers have addressed multiple subcategories together, their work is limited to common
major categories only. As far as we know, our review paper highlights the maximum number of possible sub-areas and their current status in
agriculture where ML is being applied.
4. This review traces the trajectory of ML algorithms/models in multiple subcategories of agriculture. This shows the growing applicability and adaptability
of ML in diverse aspects of the agriculture system.
5. The year-wise contributions of researchers to each category/subcategory of agriculture are presented, which indicates the pressing categories/subcategories
of agriculture based on researchers’ attention at present and in upcoming years. It also identifies the categories/subcategories
where research lags or receives less attention.
6. Freely available agricultural datasets corresponding to the varied categories under study are identified and presented for ease of access
by academics, practitioners, and researchers, both for their research and for further upgrading of the datasets.
The remainder of the work is organized as follows. Section 2 discusses the methodology adopted for searching for relevant papers
in three scientific databases. Section 3 presents an overview of ML, its types, and its models. Section 4 describes in detail
the analyzed work, organized by categories and subcategories of ML application in agriculture. Sections 5 and 6 discuss
and conclude the presented work. Finally, a summary of the investigated literature belonging to each agricultural subcategory is
presented in Tables A1–A10 in the Appendix.
FIGURE 1 Categories and subcategories of ML in agricultural frameworks. Field condition management: soil observation, water monitoring. Crop management: yield prediction, disease recognition, weed spotting.
2 METHODOLOGY
A systematic review is a time-consuming process that mainly requires two steps: the first is the collection and filtering of relevant articles,
and the second is a detailed analysis of the selected articles. In the bibliographic investigation of our study, we searched for journal and conference
papers in three popular scientific databases (i.e., ScienceDirect, IEEE Xplore, and Google Scholar) to collect and filter relevant work. These
three are leading academic databases in the area of computer science and engineering, where one can search not just for journal
articles but also for conference papers, book chapters, and case studies. To find the articles, an appropriate keyword-based advanced search
relating to each subcategory was performed. Articles were collected based on keyword similarity and synonyms in the abstract section of the
retrieved articles belonging to distinct subcategories. The search string descriptions for each of the databases under the different categories are given
in Table 3.
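The search strings in Table 3 follow a regular pattern: synonyms are joined by OR within a group, and the groups are joined by AND. A minimal sketch of how such a string could be assembled is given below. The function name `build_search_string` and the list-of-lists input format are illustrative assumptions, not part of the reviewed methodology; the keywords are those listed in Table 3 for the soil observation sub-category.

```python
def build_search_string(keyword_groups):
    """Join synonyms with OR inside each group, then join the groups with AND."""
    clauses = []
    for group in keyword_groups:
        terms = " OR ".join(f"({term})" for term in group)
        clauses.append(f"({terms})")
    return " AND ".join(clauses)


# Keyword groups for the soil observation sub-category (taken from Table 3).
soil_groups = [
    ["Machine Learning", "Deep Learning"],
    ["soil observation", "soil monitoring", "soil temperature", "soil moisture"],
]
soil_query = build_search_string(soil_groups)
print(soil_query)
```

Database-specific qualifiers (e.g., where the words occur and the date range) would still have to be set in each database's advanced search form.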
As it is impractical to manually include all the research articles collected for the domain under study, filtering needs to be performed. Out of
a total of 190 collected articles, we retained 81 studies after filtering for novelty and relevance and discarding survey articles and duplicate
articles found in another database. Figure 2 depicts the percentage of the 81 analyzed studies collected from each database. After
the collection and filtering step, a detailed analysis of all 81 studies was performed based on the research questions stated in the introduction
section.
3 ML OVERVIEW
ML is an integral part of artificial intelligence that employs statistical and probabilistic methods to enable machines to improve with
experience. Any ML system consists of two components: data and algorithm. Data is what we supply to the system in the form of input, while an
algorithm is a method that works on the data and performs the analysis.
3.1 ML types
ML algorithms, in general, are classified into four main categories depending upon the learning type: supervised, unsupervised, semi-supervised,
and reinforcement learning.91,92 In supervised learning, labeled data (i.e., data with inputs and corresponding outputs) are given to the algorithm,
which generates a function that learns to map each input to the pertinent output. Problems that fall under supervised learning are typically
classified into regression and classification. Examples of algorithms under supervised learning are LRo, LRi, OLSR, MARS, KNN, NB, DT,
TABLE 3 (Continued)
Database | Sub-category | Search string description
Google Scholar | Soil observation | all: ((Machine Learning) OR (Deep Learning)) AND ((soil observation) OR (soil monitoring) OR (soil temperature) OR (soil moisture)); words occur: anywhere; dated between: 2010–2020
Google Scholar | Water monitoring | all: ((Machine Learning) OR (Deep Learning)) AND ((Water monitoring) OR (Water observation) OR (evapotranspiration) OR (dew point)); words occur: anywhere; dated between: 2010–2020
Google Scholar | Yield prediction | all: ((Machine learning) OR (Deep learning)) AND ((Yield prediction) OR (Yield estimation)) AND ((crops) OR (fruits)); words occur: anywhere; dated between: 2010–2020
Google Scholar | Disease recognition | all: ((Machine learning) OR (Deep Learning)) AND ((disease recognition) OR (disease identification) OR (disease classification)) AND ((crops) OR (fruits)); words occur: anywhere; dated between: 2010–2020
Google Scholar | Weed spotting | all: ((Machine Learning) OR (Deep Learning)) AND ((weed spotting) OR (weed recognition) OR (weed identification)); words occur: anywhere; dated between: 2010–2020
Google Scholar | Crop quality evaluation | all: ((Machine Learning) OR (Deep Learning)) AND ((quality prediction) OR (quality estimation) OR (quality classification)) AND ((crops) OR (fruits)); words occur: anywhere; dated between: 2010–2020
Google Scholar | Livestock welfare | all: ((Machine Learning) OR (Deep Learning)) AND ((Livestock) OR (animals) OR (birds)) AND ((health) OR (disease)); words occur: anywhere; dated between: 2010–2020
Google Scholar | Livestock production | all: ((Machine Learning) OR (Deep Learning)) AND ((Livestock) OR (animals) OR (birds)) AND ((weight estimation) OR (growth prediction) OR (production prediction)); words occur: anywhere; dated between: 2010–2020
Google Scholar | Species breeding | all: ((Machine Learning) OR (Deep Learning)) AND ((genome) OR (phenotype)) AND ((crops) OR (plants)); words occur: anywhere; dated between: 2010–2020
Google Scholar | Species recognition | all: ((Machine Learning) OR (Deep Learning)) AND ((species identification) OR (species recognition) OR (species classification)) AND ((crops) OR (plants)); words occur: anywhere; dated between: 2010–2020
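FIGURE 2 Percentage of the 81 analyzed studies collected from each database (chart labels: Science Direct, Google Scholar, IEEE Xplore; chart values: 25%, 21%, 54%).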
RF, SVM, ANN, and so forth. In unsupervised learning, unlabeled data (i.e., data without corresponding output labels) are given to the algorithm,
which tries to extract features on its own to predict the desired output. Problems under unsupervised learning are typically classified
into clustering and association. Examples of algorithms under unsupervised learning are the Apriori algorithm, k-means, k-medians,
hierarchical clustering, PCA, and so forth. Semi-supervised learning combines the capabilities of both unsupervised and supervised learning:
a large amount of unlabeled data and a limited amount of labeled data are fed to the algorithm. Initially, unsupervised learning (i.e., clustering) is performed over the unlabeled
data; the clustered unlabeled data are then labeled by applying supervised learning (i.e., classification). The end goal of semi-supervised learning is
the same as that of supervised learning. Reinforcement learning is the last type of learning; it is a computational approach to understanding
and automating goal-oriented learning and to making a good sequence of decisions.93 Examples of reinforcement learning algorithms are the Markov
decision process, Q-learning, deep-Q-network (DQN), state-action-reward-state-action (SARSA), deep-deterministic policy gradient (DDPG),
and so forth.
The terms “model” and “algorithm” are easily confused and are frequently used interchangeably. According to Reference 94, a “model” is a specific
representation of the ML algorithm/method run on data, and an “algorithm” is a procedure that runs on data to create an ML model.
More simply, ML “algorithms” are implemented code that runs on data, while ML “models” are the output of an algorithm and
consist of both algorithms and data. Although numerous ML algorithms are available, in this paper we present only those reported
in the reviewed works. Around 50 ML algorithms were utilized for solving issues in precision agriculture across the different subcategories. All the utilized
algorithms were grouped into nine different ML models, depicted in Figure 3. We also assigned to the models some algorithms beyond
those in the review, to give readers a better understanding and representation of the model-algorithm relation. Brief introductions to the models are
given next.
3.2.1 Regression
Regression is a supervised learning model associated with the statistical process of mapping dependent (output) variables to independent
(input) variables. Regression deals with predicting continuous values only. Some of the salient algorithms include LRi, MLR, LRo, stepwise regression
(SR), OLSR, MARS, MRA, locally estimated scatterplot smoothing (LESM), and cubist.95
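As an illustration of mapping an input variable to a continuous output, the following minimal sketch fits a one-variable ordinary least squares line in pure Python. The rainfall and yield values are hypothetical and chosen only so that the arithmetic is easy to follow; the function name is likewise an assumption made for this example.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) minimizing the squared error of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums of the closed-form OLS solution.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: seasonal rainfall (mm) and crop yield (t/ha).
rainfall = [100.0, 200.0, 300.0, 400.0]
crop_yield = [2.0, 3.0, 4.0, 5.0]
slope, intercept = fit_simple_linear_regression(rainfall, crop_yield)
predicted_yield = slope * 250.0 + intercept  # continuous prediction for unseen input
print(slope, intercept, predicted_yield)
```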
3.2.2 Clustering
Clustering is an unsupervised learning model used for grouping data of similar types together in the form of clusters. Some
commonly used clustering algorithms are k-means, k-medians, hierarchical clustering, GMM, and expectation–maximization techniques.96
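The grouping idea can be sketched with a one-dimensional k-means (Lloyd's algorithm) in pure Python. The soil moisture readings and the initial centroids below are hypothetical, and the fixed iteration count is a simplification assumed for brevity rather than a recommended stopping rule.

```python
def k_means_1d(values, centroids, iterations=10):
    """Cluster 1-D values around the given initial centroids (Lloyd's algorithm)."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            for v in values
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, a in zip(values, assignments) if a == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return centroids, assignments


# Hypothetical soil moisture readings (%) forming two visible groups.
moisture = [10.0, 12.0, 11.0, 30.0, 32.0, 31.0]
centers, labels = k_means_1d(moisture, centroids=[10.0, 30.0])
print(centers, labels)
```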
SVM algorithms/models belong to the supervised learning category and are effectively applied to two-class classification problems. However,
SVM can solve regression and clustering problems too. The capabilities of traditional SVM classifiers are enhanced through multiple kernel
functions (i.e., linear kernel, sigmoid kernel, non-linear kernel, RBF, Gaussian kernel, polynomial kernel, etc.).97 Some frequently used SVM
algorithms are SVR, linear SVM, quadratic SVM, Gaussian SVM, SVC, multiclass SVM, least-square SVM, and successive projection algorithm
SVM (SPA-SVM).98
3.2.4 Bayesian
The BM is a probability-based supervised learning model capable of solving both classification and regression problems. NB, Gaussian naïve
Bayes (GNB), Bayesian belief network (BBN), Bayesian network, multinomial naïve Bayes (MNB), averaged one-dependence estimators (AODE), and
the mixture of Gaussians are prominent algorithms of the BM.96
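A minimal Gaussian naïve Bayes classifier can be sketched as follows. It estimates a prior and per-feature mean and variance for each class, then picks the class with the highest log-posterior. The two leaf features and the class labels are hypothetical, and the small constant added to the variance is an assumption made only to avoid division by zero.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels):
    """Estimate the class prior and per-feature (mean, variance) for each class."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    for y, rows in grouped.items():
        prior = len(rows) / len(samples)
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[y] = (prior, stats)
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the largest log-posterior under feature independence."""
    def log_posterior(y):
        prior, stats = model[y]
        total = math.log(prior)
        for value, (mean, var) in zip(x, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)


# Hypothetical leaf measurements: (feature 1, feature 2) per sample.
samples = [(5.0, 1.0), (5.2, 1.1), (4.8, 0.9), (7.0, 3.0), (7.2, 3.1), (6.8, 2.9)]
labels = ["healthy", "healthy", "healthy", "diseased", "diseased", "diseased"]
nb_model = fit_gaussian_nb(samples, labels)
print(predict_gaussian_nb(nb_model, (5.1, 1.0)), predict_gaussian_nb(nb_model, (7.1, 3.0)))
```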
DT is a supervised learning model that resolves both classification and regression problems. DT has a hierarchical structure and follows the
divide-and-conquer technique to solve problems. The commonly applied DT algorithms are iterative dichotomiser 3 (ID3), CART, MT, C4.5,
chi-squared automatic interaction detection (CHAID), and conditional DT.92
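The divide-and-conquer step of a decision tree can be illustrated by choosing a single split. The sketch below uses entropy-based information gain, the splitting criterion used by ID3, applied here to a numeric threshold for simplicity (an assumption; classical ID3 works on categorical attributes). The temperature values and class labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting at `value <= threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted


def best_threshold(values, labels):
    """Try every distinct value except the largest and keep the best split."""
    candidates = sorted(set(values))[:-1]
    return max(candidates, key=lambda t: information_gain(values, labels, t))


# Hypothetical data: temperature (deg C) and a yield class.
temperature = [18, 20, 22, 30, 32, 34]
yield_class = ["low", "low", "low", "high", "high", "high"]
split = best_threshold(temperature, yield_class)
print(split, information_gain(temperature, yield_class, split))
```

A full tree repeats this selection recursively on each resulting subset until the subsets are pure or a stopping rule applies.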
Ensembles are a powerful and more advanced type of supervised learning model. An ensemble combines the capabilities of multiple algorithms to
improve the prediction outcomes. Although the ensemble approach helps to generate a better prediction model compared to a single predictive
model, an ensemble of multiple algorithms is hard to interpret. Some well-known EM algorithms are RF, GBM, AdaBoost, BG (bagging),
boosting, stacked generalization (blending), random subspace, and gradient boosting.99
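One simple way to combine several base predictors is majority voting, sketched below. The three threshold rules stand in for trained base classifiers and are purely hypothetical, as are the feature names; methods such as RF or boosting build and weight their base learners in more elaborate ways.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among the base-model predictions."""
    return Counter(predictions).most_common(1)[0][0]


# Three hypothetical base classifiers, each a simple threshold rule.
def rule_greenness(sample):
    return "weed" if sample["green"] > 0.6 else "crop"


def rule_height(sample):
    return "weed" if sample["height"] < 0.3 else "crop"


def rule_width(sample):
    return "weed" if sample["width"] < 0.4 else "crop"


def ensemble_predict(sample):
    """Collect one vote per base classifier and return the majority label."""
    votes = [rule(sample) for rule in (rule_greenness, rule_height, rule_width)]
    return majority_vote(votes)


plant = {"green": 0.7, "height": 0.2, "width": 0.5}
print(ensemble_predict(plant))
```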
Instance-based learning, sometimes called memory-based or lazy learning, learns and builds the hypothesis from training data
according to some similarity measure. Instead of generalizing over all instances, it uses only specific instances to perform regression
or classification tasks. The instance-based model requires a large memory space to store data, which leads to a complex structure.
Some popular instance-based algorithms are KNN, learning vector quantization (LVQ), KNN-regression, and locally weighted
learning (LWL).99,100
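The similarity-based, store-everything character of instance-based learning is visible in a minimal KNN classifier: there is no training step, and each prediction scans the stored instances. The sketch assumes Euclidean distance as the similarity measure; the two-feature samples and species labels are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest stored instances."""
    distances = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    nearest_labels = [y for _, y in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical leaf shape features for two plant species.
train_x = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["species_a", "species_a", "species_a", "species_b", "species_b", "species_b"]
print(knn_predict(train_x, train_y, (1.1, 1.0)))
```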
ANN is an information processing model inspired by the human brain. Like humans, ANN learns from examples and performs real-time
tasks such as image processing, pattern recognition, classification of objects, natural language processing, and so forth. ANN is a supervised
type of learning model that usually solves classification and regression problems. The ANN architecture basically consists of three kinds of layers:
input, output, and one or more hidden layers. Data enter the network through the input layer, learning is performed in the hidden
layers, and prediction results are obtained at the output layer. A vast number of ANN algorithms are reported in the literature, such
as perceptron, SOM, back-propagation (BP), MLP, Hopfield network (HN), ANFIS, RBF-network, SKN, X-Y fusion, RBF, ELM, ENN, resilient
back-propagation (RBP), counter propagation, generalized regression neural network, and self-adaptive evolutionary extreme learning machines
(SAEELM).98
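The simplest of the listed algorithms, the perceptron, shows how a network learns from examples. The sketch below trains a single neuron with a step activation using the classical perceptron update rule on the logical AND function; the learning rate, epoch count, and toy dataset are assumptions chosen so that training converges quickly. It has no hidden layer, so it is only the smallest building block of the architecture described above.

```python
def train_perceptron(samples, targets, epochs=10, lr=1.0):
    """Learn weights and bias with the perceptron rule: w += lr * (target - output) * x."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            activation = sum(w * v for w, v in zip(weights, x)) + bias
            output = 1 if activation > 0 else 0
            error = t - output
            weights = [w + lr * error * v for w, v in zip(weights, x)]
            bias += lr * error
    return weights, bias


def perceptron_predict(weights, bias, x):
    """Step activation applied to the weighted sum of the inputs."""
    return 1 if sum(w * v for w, v in zip(weights, x)) + bias > 0 else 0


# Linearly separable toy problem: logical AND.
and_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_targets = [0, 0, 0, 1]
weights, bias = train_perceptron(and_inputs, and_targets)
predictions = [perceptron_predict(weights, bias, x) for x in and_inputs]
print(predictions)
```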
DL, sometimes referred to as the deep neural network (DNN), is a subset of ML that resembles ANN. A DL architecture consists of multiple layers (i.e.,
more than a simple ANN) to perform complex processing of data, can extract features on its own, and can be utilized to solve
supervised, unsupervised, and semi-supervised learning problems. Nowadays, DL architectures/models are frequently used by researchers and
artificial intelligence experts due to their wide range of application scenarios. CNN, RNN, LSTM, deep belief networks (DBNs), and deep stacking
networks (DSNs) are some popular DL architectures.101
Dimensionality refers to the number of features associated with the input data, and DR is a technique for reducing the features of a dataset. DR algorithms can
be employed for all types of learning problems (i.e., supervised, unsupervised, and semi-supervised) and are applied before regression, classification,
or clustering models to reduce the dimensionality.99 Commonly used DR algorithms are LDA, quadratic discriminant analysis (QDA), PCA, partial
least squares regression (PLSR), and principal component regression (PCR).
4 RELATED WORK
There are many different areas of agriculture where ML is being applied successfully. All the identified works related to ML applications in agriculture
are classified systematically into four broad categories, viz. field condition management, crop management, species management, and livestock
management. ML applications in field condition management are classified into two classes: soil observation and water monitoring. ML applications in
crop management are further classified into four classes: yield prediction, disease recognition, weed spotting, and crop quality evaluation. ML applications
in livestock management are further classified into two classes: livestock production and livestock welfare. Finally, ML applications in species
management are classified into two classes: species breeding and species recognition.
The first subcategory of the review concerns ML in soil observation. Soil chemical parameters (i.e., pH, Ca, Mg, K, H, P, etc.) and physical parameters
(i.e., temperature, moisture, drying, etc.) are factors that directly affect crop yield and quality. Manual analysis and computation of soil
parameters is a daunting task. Therefore, it is essential to apply modern computational techniques like ML to accurately predict or detect the
soil parameters and so improve soil observation for a particular soil type. A total of five studies were considered in the survey; the first
three are related to soil property prediction, while the fourth and fifth deal with the prediction of temperature and moisture, respectively.
The first study related to soil observation is the work of Khanal et al.,11 where the authors demonstrated that five in-field soil properties (i.e., cation
exchange capacity (CEC), potassium (K), magnesium (Mg), pH, and soil organic matter (SOM)) can be predicted from terrain data and high-resolution
multispectral image data utilizing ML algorithms. The experiment includes data such as vegetation indices, calculated from multispectral images,
and soil data taken over 1-acre intervals at a depth of 18 cm from bare fields. It was observed that no single model
consistently outperforms the others in forecasting the soil properties. The second study12 demonstrates the prediction of three soil nutrients, available phosphorus (AP),
available potassium (AK), and total nitrogen (TN), based on remote sensing images using ANN. Hyperspectral images taken for the Zengcheng area in
China were used as auxiliary variables for the ML algorithms. The findings show that BPNN with ordinary kriging (OK) achieves the best accuracy
in comparison to other models. In Reference 13, the prediction of three soil properties, namely moisture content (MC), total nitrogen
(TN), and organic carbon (OC), was performed with Vis–NIR spectroscopy using regression and SVM models. The outcomes show that
least-squares SVM achieves the best prediction accuracy for MC and OC, while Cubist does so for TN. In the fourth study,14 an innovative method
was suggested by the authors to estimate the daily soil temperatures at six depths (i.e., 100, 50, 30, 20, 10, and 5 cm) for two dissimilar
climate areas of Iran (i.e., Kerman and Bandar Abbas). The self-adaptive evolutionary extreme learning machine (SaE-ELM) model was presented
for the accurate estimation of soil temperatures. In the last article15 of the soil observation subcategory, the authors deploy three ML
models to determine the moisture of winter wheat soil in Baoji city, China. Prediction of soil moisture at two different depth layers (0–20 cm
and 20–40 cm) was based on 15 predictors from three factors: weather, terrain, and soil. It was observed that weather is the most effective factor
for predicting moisture in winter wheat soil, followed by terrain and soil properties. Table A1 summarizes all five studies
discussed above.
Water monitoring plays a critical role in the agronomical, climatological, and hydrological balance, and is a matter of concern for crop
areas where rain is scarce or rare. The water monitoring problem could be tackled by adopting an ML-powered irrigation system.
The most developed ML-based applications in use today relate to the assessment of monthly, weekly, and daily ET and the forecasting of
day-to-day dew point temperature to determine the water requirement of crops. Six articles in total related to water monitoring
were found in the survey, based on the estimation of ET and daily dew point. A data-driven model (i.e., SVM) for predicting ET and the crop coefficient
(Kc) for two crops (watermelon and pepper) under different irrigation conditions was presented in Reference 16. A large dataset covering 10 seasons
(6 years), based on lysimeter measurements under plastic mulch conditions, was used in the study. The results obtained were quite encouraging
and suggested exploring the development of the SVM model for other open-field crops. The next article17 is also based on the estimation of ET;
its authors deploy an ELM model that uses weekly weather data to estimate weekly ET accurately for two arid regions
(i.e., Jodhpur and Pali) of India. The study shows that the deployed ELM model is a practical choice for estimating ET with limited data. Another
study18 presents the monthly estimation of mean reference ET based on climatic data for 44 stations using empirical equations and soft
computing approaches. First, the empirical equations were calibrated through the standard method; then the monthly average ET was predicted using
MARS, SVM-polynomial, SVM-RBF, and gene expression programming (GEP). In another study,19 the authors developed two ML models for the estimation
of ET in rain-fed maize croplands under plastic-film-mulching (MFR) and non-mulching (CK) conditions. Meteorological and crop data were
used for model training under MFR and CK in two ways: first, models (SVM1 and GANN1) were trained with a
combination of meteorological and crop data; second, models (SVM2 and GANN2) were trained with meteorological data only. Here, the
GANN model is an ANN model optimized by GA. The performance of both models indicates that GANN is marginally better than SVM for the
assessment of ET. The next study on the estimation of ET is presented in Reference 20, where ET estimation based on two ANN
models utilizing only temperature data is demonstrated. To estimate daily ET at six meteorological stations in the Sichuan basin of China, two
scenarios were considered: in scenario 1, training and testing of the model are performed with local station data, while in scenario 2, training
of the model is performed using pooled data collected from all six stations. The last study dedicated to the water monitoring category
predicts day-to-day dew point temperature through an ELM model utilizing daily average climatic data from two Iranian weather stations.21
The potential of the deployed ELM model was compared with that of ANN and SVM models, and the results firmly favor the ELM as an efficient model
for predicting day-to-day dew point temperature. Table A2 summarizes all the literature discussed above associated with the water monitoring
subcategory.
Accurate yield prediction contributes effectively to the monetary development and food sufficiency of an agro-based nation. Yield prediction
involves foreseeing the harvest of crops from accessible, verifiable information such as climate parameters, soil parameters, pesticides, and
historic crop yield. As yield prediction is a daunting task due to its dependence on many interrelated factors, farmers rely on crop rotation,
pesticides, and fertilizers in the applicable weather conditions when predicting their crop yield. ML is a pressing tool nowadays for achieving
pragmatic and effective solutions to yield-related issues such as matching crop supply with consumer demand and proper crop management
to increase crop productivity. In one study,22 the NN technique is applied to predict corn crop yield relying on the strengths of four vegetation
indices. All four vegetation indices (i.e., GVI, NDVI, SAVI, and PVI) during the years 1998, 1999, and 2001 were investigated for corn yield
using quadrant aerial data captured from the Oakes irrigation region of North Dakota, USA. The PVI technique was found to be superior to the other
vegetation index techniques for corn yield prediction. The authors of Reference 23 suggested a hybrid MLR-ANN framework to forecast
paddy crop yield for the state of Tamil Nadu in India. In the suggested model, the MLR coefficients and intercept were used to initialize the bias
and weights of the FFBN input layer instead of a random selection of weights and bias. The accuracy of the hybrid model is compared with MLR,
SVR, ANN, KNN, and RF algorithms in terms of RMSE, AMA, and R values, and the prediction accuracy of the suggested model surpasses that of the other
models. Another study on yield prediction is proposed in Reference 24, where stepwise regression analysis and ENN were utilized. The proposed
work consists of multiple network architectures, each having different hidden layers and neurons, which were trained and tested
for five runs on collected data about meteorological, environmental, and economic factors. The results show that ENN gives the
top accuracy with a low error rate, followed by BPN and MRA. Reference 25 predicts the yield of six crops for 10 different areas in Bangladesh using
two supervised learning techniques, viz. DTL-ID3 and KNNR. The dataset for the research contains three attributes, namely temperature, rainfall, and
yield rate, for the period 2004 to 2015. Data for the years 2004 to 2013 were used for training, and the remaining data for the years 2014 and
2015 were used for accuracy analysis. When the accuracy of both algorithms was compared, it was found that in most cases the error rate
of DTL-ID3 is lower than that of the KNNR algorithm. The authors of Reference 26 introduce a system composed of RNN to estimate the yield of soybean and
maize crops for Brazil and the USA. The system utilizes satellite-derived precipitation data, soil properties data, air temperature data, and historical
observed yields of both crops in the USA and Brazil, obtained from various agencies of each country, as labels. The motivation of the work was to forecast the yield with
a lower data requirement compared to other existing systems and to forecast the yield prior to the beginning of the crop season. In the work of Reference
27, three ML models were established to compare the estimated biomass of managed grassland for two sites in Ireland. All three models
use a vegetation index and two spectral bands (RED and NIR) with in-situ measurements as input features for training and testing.
The biomass estimation capability of the different models shows that the performance of ANFIS is better than that of FFBPN and MLR. In Reference
28, the authors attempted to combine the key properties of soil with crop development, referred to as NDVI, to train three SOM-based models
for predicting wheat yield in the tillable area. Advanced sensing techniques were applied to acquire the soil and crop parameters, and it was observed
that, among all the models, the SKN model predicted yield with higher accuracy. In Reference 29, an MV system was employed
to count and distinguish the non-harvestable and harvestable fruits on a coffee branch. The MV framework, consisting of an image acquisition
operation and an image manipulation algorithm, gives determination coefficients higher than 0.93 at the early stage of coffee fruit development.
Also, NB, KNN, and SVM classifiers were applied for classifying the harvestable (semi-ripe, ripe, and overripe) fruits and non-harvestable (unripe)
fruits. In Reference 11, the performance of multiple ML methods for predicting soil characteristics and corn harvest was compared in terms of RMSE
and R2. The study integrates remotely sensed data and collected field statistics on five soil characteristics (i.e., magnesium, potassium, cation
exchange capacity, soil organic matter, and pH) from seven fields near London, Ohio. Based on model performance for crop yield, RF surpasses the
other five models with R2 = 0.53 and RMSE = 0.97. Finally, the authors of Reference 30 deployed a new end-to-end framework based on DL fusion
to predict summer and winter rice yields for 81 territories in China with a combination of eight types of meteorological and area data. The
model consists of three stages. In the early stage, pre-processing of the original area data and meteorological data takes place. In the middle stage, a BPNN
and, independently, an RNN were used to learn deep spatial features and temporal features. Finally, another BPNN combines the two types of deep
features and learns the relationship between the features and rice yield to make predictions. A summary of the papers discussed above is given
in Table A3.
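Several of the yield prediction studies above report model quality as RMSE and R². As a minimal sketch of how these two metrics are computed, assuming the usual definitions (the yield values below are invented for illustration and are not taken from any cited study):

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted yields, for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 6.0, 7.0]

print("RMSE:", rmse(observed, predicted))
print("R2:", r_squared(observed, predicted))
```

A lower RMSE and an R² closer to 1 indicate a better fit, which is the sense in which the models above are ranked.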
Crop disease is a significant threat to global food production. Crop diseases are generally caused by bacteria, fungi, viruses, and other pathogens,
which directly affect crop productivity and human health. Traditional crop protection requires regular monitoring of crops by
farmers, which is time-consuming. Another option is pesticides, which are costly, toxic to humans, animals, and birds, and contaminate
soil and groundwater. Thus, it is vital to adopt modern ML techniques that help to identify a crop disease and its severity so that preventive or
corrective action can be taken against its spread or growth. In the literature,31 healthy tomato plants and six different
diseases are classified over 13,262 image samples using VGG16 and AlexNet classifiers. Performance is evaluated by modifying three hyper-parameters,
viz. mini-batch size, weight learning rate, and bias learning rate. Another paper32 presents a method for the recognition of Microbotryum silybum disease in
Silybum marianum plants, utilizing field spectroscopy and three hierarchical self-organizing models (i.e., CP-ANN, SKN, and XY-F). Artificially
inoculated and healthy Silybum marianum plants were used to obtain leaf spectra in the experimental field. The overall accuracy achieved in
detecting the Microbotryum silybum disease with the XY-F model was superior to that of the other two models. The authors of Reference 33 developed
an approach to detect and distinguish infected salad leaves (i.e., spinach) from healthy leaves utilizing an ML classifier based on a hyperspectral sensing
technique. A small spectral dataset was prepared at a salad farm located in Australia using a compact ASD FieldSpec4 spectroradiometer that
measures wavelengths in the range of 350–2500 nm. The ML classifier correctly classifies the infected leaves with an accuracy of 84%. Again in Reference
34, SOM-based SKN, CP-ANN, and XY-F supervised learning models are applied to hyperspectral imaging data to classify yellow rust
disease, nitrogen stress, and healthy crop in order to assess the health condition of winter wheat. A method for detecting the harmful thrips parasite
in strawberry plants is developed in Reference 35. Different parasites of strawberries are identified through an image processing technique, and then
an SVM classifier is applied to classify them and detect the target (thrips) parasite. MSE, RMSE, and MAE error parameters were used to evaluate
the classification. In Reference 36, the author proposed an ANN algorithm to identify six different diseases of the jujube tree. For the analysis,
four texture features, nine color features, and eleven morphological features were extracted and further reduced to twelve features
collectively. The ANN is a two-layer network with one hidden layer, using tan-sigmoid/log-sigmoid activation functions; its input
and output layers have twelve and six neurons, respectively. As reported in the study,37 the detection and classification of healthy samples and two
common banana diseases, that is, black Sigatoka and black speckle, are automated using LeNet-CNN, and the effectiveness of the deployed system is
measured through accuracy, recall, precision, and F1 score. The work of Gómez-Sanchis et al.38 discusses the early recognition
of two types of fungi (of the Penicillium genus) in citrus fruits based on a hyperspectral imaging system and ML. The experimental work comprises physical
devices, data processing, and ML algorithms. Geometric correction and band selection were performed in the data-processing phase. A large number
of band features were extracted and further reduced to 10 relevant features using the Minimum Redundancy Maximal Relevance (MRMR) method. The
ANN method outperformed the CART method by achieving an overall accuracy of 98.30%. Reference 39 introduces the largest dataset of plant
leaf images of 12 different species and proposes a novel method (PDNet architecture) for detecting plant diseases with an accuracy of 93.67%. In
Reference 40, an automated system for identifying diseases in potato plants is discussed, utilizing image segmentation followed by a multiclass SVM.
Initially, 300 image samples of potato leaves are acquired from the “plant village dataset,” the region of interest (the diseased area) is segmented
using a greenness removal mask, and statistical texture features are then extracted through GLCM. Finally, a multiclass SVM is employed to classify
healthy, early blight, and late blight samples. In the article,41 four pomegranate diseases (i.e., fruit spot, fruit rot,
bacterial blight, and leaf spot) are diagnosed over 500 image samples. The samples are segmented through K-means
clustering, texture features are extracted using the GLCM method, and the disease is detected through an ANN. The authors of Reference 42 present a
system to recognize and categorize the two most widespread tea leaf diseases in Bangladesh. The method comprises an image processing system
consisting of image acquisition, pre-processing, and image processing steps; features are then selected and extracted, and finally an SVM classifier is
used for the classification task. In the work of Ferentinos et al.,43 several CNN models were used to identify plant-disease combinations across
58 different classes for 25 plant species, including healthy plants, where the VGG architecture achieved the highest accuracy, 99.53%. The authors of
Reference 44 suggested an approach to discriminate infected and healthy rice seedlings using an MV system. The aim of the investigation was to define and
quantify the color and morphological traits of rice seedlings at a duration of 3 weeks and to use these traits to develop an ML classifier to differentiate
infected and healthy seedlings. In the study,45 the authors proposed a novel method based on image enhancement and CNN for maize leaf disease
recognition. Initially, features of the infected maize leaves are enhanced through a novel WT-DIR algorithm; then DMS-Robust AlexNet is
constructed for identification and classification. DMS-Robust AlexNet comprises 5 convolution layers, together with a multiscale convolution
module and 3 fully-connected layers. The AdaBound optimizer and PReLU activation function were selected for the constructed CNN
architecture. Tomato plant disease classification is addressed again in the work of Brahimi et al.,46 where a deep learning (CNN) model was constructed to
classify nine different tomato plant diseases, focusing on leaves only. Disease regions in the leaves are also identified utilizing the occlusion
technique. The proposed approach uses two DL models, viz. AlexNet and GoogleNet, and to check the effectiveness of the proposed model two types
of comparisons are made. The first comparison is between “deep models without feature extraction” and “shallow models using handcrafted features
(SVM & RF)”, where the accuracies of the shallow and deep models are reported as 95.47% and 99.18%, respectively. The second comparison is
between “deep models with fine-tuning” and “deep models without fine-tuning”, where fine-tuning improves the accuracy of AlexNet from 97.35%
to 98.66% and that of GoogleNet from 97.71% to 99.18%. The authors of Reference 47 discuss a novel approach to automatically
detect and categorize plant diseases into 13 categories. The developed model uses CaffeNet, a deep CNN architecture containing 8 learning layers
(5 convolutional and 3 fully connected), which can recognize leaves and discriminate between 13 types of infected leaves and healthy leaves.
A new database of 30,880 leaf images was also created through augmentation of the 4483 original images collected through internet
search. The overall accuracy of the trained model with and without fine-tuning was found to be 96.30% and 95.80%, respectively. Finally, in the literature,48
the authors applied the AdaBoost algorithm, which automatically selects a minimal set of the most useful features out of a total of 399 candidate features,
to distinguish blemished from non-blemished potatoes. A minimalist classifier using just 10 important features achieves
accuracies of 89.5% and 89.6% for red and white potatoes, respectively. The experimental data were captured using a Sony DSLR-A350K camera.
The datasets for the white and red potatoes consist of 102 and 22 images, respectively, including single-blemish,
two-blemish, three-blemish, and more-than-three-blemish types. A summary of each identified work under the disease recognition subcategory
is given in Table A4.
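Two of the disease recognition pipelines above extract texture features through a GLCM before classification. The following is a minimal sketch of that step, assuming a tiny quantized gray-level patch and a single horizontal pixel offset; the patch values, the offset, and the two features shown are illustrative choices, not the settings of the cited studies.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for the pixel offset (dy, dx)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                # Count how often gray level i is followed by gray level j.
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]

def contrast(p):
    """Weights co-occurrences by the squared gray-level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))

def homogeneity(p):
    """Inverse difference moment: large when neighbouring pixels are similar."""
    n = len(p)
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(n) for j in range(n))

# Hypothetical 4x4 patch quantized to 4 gray levels (0..3).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

p = glcm(patch, levels=4)
print("contrast:", contrast(p))
print("homogeneity:", homogeneity(p))
```

In a full pipeline, such statistics would be computed for each segmented region and passed as a feature vector to the SVM or ANN classifier.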
After crop diseases, weeds are the most significant threat to crop production. Weeds are undesirable plants that grow alongside the
crop and compete with it for resources such as sunlight, water, soil nutrients, and so forth, causing losses to the main crops. The
most serious issue in combating weeds is that they are difficult to recognize and separate from crops. ML techniques can improve the
identification and separation of weeds without causing adverse impacts on crops and the environment. In Reference 49, an automatic weed detection
learning model was proposed, deploying a CNN with training samples gathered from spinach and bean fields. The model detects crop rows and
inter-row weeds, and these inter-row weeds are then used as training samples for detecting weeds and crops in the images. The
accuracy of the obtained results was compared with that of supervised training-sample labeling, with differences of 6% and 1.5% for the bean and spinach
fields, respectively. Reference 50 introduces a model based on texture information for discriminating weeds from pasture. The model consists
of a vision system that can be mounted on any field vehicle for capturing images of weed and pasture, and multiple ML techniques for
classification. The results show that the CNN achieved an accuracy of 96.88% and outperformed the traditional ML algorithms. The weed
identification model in Reference 51 combines K-means feature learning with a CNN to identify three major weed types (i.e., Digitaria,
bindweed, and Cephalanoplos) in soybean seedlings. The model replaces the random initial weights of the CNN with unsupervised K-means feature
learning as pre-training, followed by fine-tuning of the multilayer CNN parameters, to perform weed identification. The results show
that the accuracy of the CNN with K-means pre-training is 1.82% higher than with random weight initialization. The authors
of Reference 52 proposed an automated system consisting of a segmentation process and an SVM decision process to reduce the quantity of herbicide
sprayed against weeds in cereal crops, with minimal computational power and memory. The segmentation phase consists of several steps,
such as image acquisition, image binarization, crop line detection, cell partitioning, and feature extraction, while the decision process decides
whether or not a cell is to be sprayed, based on the prepared database. In Reference 53, the authors developed software to detect weeds in images
of soybean crops and discriminate between broadleaf and grass weeds. More than 15,000 samples, composed of
weed, soybean, and soil images, were collected for training and testing. The performance of CaffeNet (CNN) was compared with AdaBoost, RF, and SVM,
and it was found that CaffeNet achieved a higher accuracy, above 98%. In the study,54 the feasibility of deploying different DCNN models
for detecting different weed species in bermudagrass was demonstrated. GoogleNet and VGGNet were trained for detecting weeds such
as Hedyotis corymbosa, Hydrocotyle spp., and Richardia scabra in bermudagrass, where VGGNet consistently outperformed GoogleNet.
DetectNet, GoogleNet, and VGGNet were also trained for detecting Poa annua in bermudagrass, where DetectNet outperformed the
other two models. In Reference 55, a system was proposed that discriminates broadleaf, soybean, and soil through BPNN and SVM
classifiers separately, so that herbicides are sprayed correctly. The system uses histograms of grayscale images, based on the excess red
index (a color index), as feature vectors to train both classifiers. The overall accuracies of BPNN and SVM are found to be 96.60% and 95.07%, respectively.
Reference 56 demonstrates an active learning system to recognize maize crop and 10 different weed species according to spectral reflectance
differences. Hyperspectral imaging devices were used to extract the spectral features, and crops and weeds were detected through
one-class classifiers (i.e., only target class information was known to the classifier). The crop recognition accuracy of both the MOG-based and SOM-based
one-class classifiers was 100%, while their recognition accuracy for weed species varied
between 31%–98% and 53%–94%, respectively. Another article57 investigates the use of SVM techniques to classify chilli crops and
weeds from images. Classification is based on 14 features extracted from each sample image of weed and chilli. The SVM classifier achieved
a high accuracy of 97% over a small dataset consisting of five types of weed images as well as chilli images. In the final study,58 the identification
and mapping of a weed (i.e., Silybum marianum) and other vegetation types utilizing three SOM-based classifiers were reported. A texture layer and
three spectral bands (red, green, and near-infrared) were used as the input variables to the classifiers. Although the accuracy reported for all the
classifiers was high, CP-ANN performed slightly better than the SKN and XY-F classifiers. Table A5 gives a brief overview of the papers discussed
above.
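The one-class setting mentioned above, where only target-class information is available during training, can be illustrated with a deliberately simplified sketch. It assumes a per-feature Gaussian model with a fixed threshold, which is a stand-in for, not a reproduction of, the MOG- and SOM-based one-class classifiers of the cited study; the reflectance values and the threshold are hypothetical.

```python
import math

def fit_one_class(target_samples):
    """Estimate per-feature mean and standard deviation from target-class samples only."""
    n = len(target_samples)
    dims = len(target_samples[0])
    means = [sum(s[d] for s in target_samples) / n for d in range(dims)]
    stds = [
        math.sqrt(sum((s[d] - means[d]) ** 2 for s in target_samples) / n)
        for d in range(dims)
    ]
    return means, stds

def is_target(sample, model, max_z=3.0):
    """Accept a sample only if every feature lies within max_z standard deviations."""
    means, stds = model
    return all(abs(x - m) <= max_z * s for x, m, s in zip(sample, means, stds))

# Hypothetical reflectance values in two spectral bands for the target class (crop).
crop_spectra = [[0.10, 0.50], [0.12, 0.54], [0.08, 0.46], [0.10, 0.50]]
model = fit_one_class(crop_spectra)

print(is_target([0.11, 0.52], model))  # close to the crop spectra
print(is_target([0.30, 0.35], model))  # far from the crop spectra
```

Anything rejected by the crop model is treated as non-crop, so no weed samples are needed to train it.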
The precise recognition and grading of crop quality traits can increase product prices and reduce waste. In comparison with human experts, ML
can utilize seemingly unrelated data and interconnections to reveal new characteristics that play a role in crop quality detection and grading. In
the first study,59 a hyperspectral imaging system is combined with the successive projection algorithm (SPA) and SVM to discriminate pear fruits
into deciduous-calyx fruits (DCF), which are rich in vitamin C and soluble sugars and have a good shape, and persistent-calyx fruits (PCF), which are of
low quality. The SPA was used to choose the effective wavelengths for pears, and another algorithm was
used to recognize the calyx part of pears. The optimal wavelengths and a shape factor (i.e., circularity) were then used to discriminate the pear
types with accuracies of 97.3% and 95% in the calibration and prediction sets, respectively. In another study,60 the authors built a computer
vision-based contactless quality evaluation system for table grapes. The system combines image segmentation and the RF technique to
classify two varieties of table grapes (i.e., Italia and Victoria) into five quality level (QL) classes. Cross-validated classification with the
RF classifier on automatically chosen features was performed for both varieties, and the classification accuracy for the Italia
cultivar was found to exceed that for the Victoria cultivar. In the work of Reference 61, a classification system for cape gooseberry fruits according to
their ripeness stage was developed, combining four ML algorithms (ANN, SVM, DT, and KNN) with three color spaces (HSV, RGB, and L*a*b*).
The SVM model based on all three color spaces gives the maximum accuracy of 92.57%. In Reference 62, the authors deployed SVM and LDA
algorithms in association with PCA to evaluate the ripeness classes of tomatoes. First, PCA was used to extract the color feature vectors
and HSV histograms, and then SVMs and LDA were used to classify the tomatoes into five distinct ripeness classes. In the experimental results,
one-against-one SVM with LKF achieved a high accuracy of 90.80%. The authors of Reference 63 compared six ML models for assessing
coconut sugar quality according to RGB values. The quality was determined by classifying the samples into three classes:
superior, good, and rejected. Overall, higher accuracy was found for the SGD model and lower execution time for the SVM model. In
the study,64 an automated system was proposed for grading cashew nuts according to their texture, color, size, and shape features. Five ML
classifiers were deployed to compare classification performance. It was observed that contrast enhancement and filtering of the samples boost the
performance of the different classifiers. The best result, 96.8% accuracy, was obtained with the BPNN classifier. Further, in the work of
Zhang et al.,65 a hyperspectral imaging device and classifiers were utilized to identify and categorize different foreign matter within cotton
lint. Fourteen sorts of foreign matter were collected and sandwiched between cotton lint webs, and the foreign matter was then classified at the
pixel and spectral levels using LDA and SVM classifiers. The authors of Reference 66 proposed a non-destructive approach for
the classification of healthy and damaged fruits using the PNN classifier. Experimental work was performed for two cases over the 65
collected image datasets consisting of healthy and damaged fruits. In case-1 and case-2, 350 and 403 features, respectively, comprising
intensity and geometric features from the HSV and RGB color spaces, were extracted. Before applying the PNN, the features were further reduced to 9 and
5 for case-1 and case-2, respectively, using the sequential forward selection Fisher algorithm. In recent research,67 LIBS in conjunction with ML
was applied to classify olive oil specimens by geographical origin and level of acidity. In the proposed system, spectroscopic
data of the plasma emission were collected when a powerful laser beam strikes the olive oil sample. PCA was employed to reduce the dimensionality of
the spectroscopic data, and three ML classifiers were employed to classify the samples based on acidity. The classification
accuracies of the models lie between 90% and 99.2%, with LDA giving the best accuracy of 99.2%. The final work in the quality evaluation
subcategory is reported in Reference 68, which proposes classification models to predict two geographical origins of rice according to
its chemical components. A plasma mass spectrometer was used to determine the elements in the rice samples, and a total of 21
components were found for each sample. The results show that some of the components (i.e., cadmium, rubidium, magnesium, and potassium) are capable
of discriminating the samples from the two fields with good accuracy. Table A6 summarizes the articles discussed above under the crop quality evaluation
subcategory.
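The pear study above uses circularity as a shape factor. As a minimal sketch, assuming the common definition 4πA/P² (which equals 1 for a perfect circle and decreases for less compact outlines) and a fruit outline given as an ordered list of contour points; the cited work may define or measure circularity differently.

```python
import math

def polygon_area(points):
    """Shoelace formula; vertices must be listed in order around the outline."""
    n = len(points)
    s = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def polygon_perimeter(points):
    """Sum of the edge lengths of the closed outline."""
    n = len(points)
    return sum(
        math.hypot(points[(i + 1) % n][0] - points[i][0],
                   points[(i + 1) % n][1] - points[i][1])
        for i in range(n)
    )

def circularity(points):
    """Assumed shape factor: 4 * pi * area / perimeter**2."""
    return 4 * math.pi * polygon_area(points) / polygon_perimeter(points) ** 2

# Hypothetical outlines: a square and a near-circle sampled at 360 points.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
near_circle = [(math.cos(math.radians(a)), math.sin(math.radians(a))) for a in range(360)]

print("square:", circularity(square))
print("near-circle:", circularity(near_circle))
```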
The term livestock welfare relates to the growth, weight gain, and good health of livestock, which depend on behavioral patterns
such as rumination, walking, standing, feeding, drinking, food intake types, facial expression, and so forth. Classifying and predicting
these behavioral patterns using ML techniques can determine stress, disease, growth, and weight gain in livestock with high accuracy.
A total of six research articles relevant to the livestock welfare subcategory were identified. First, in the study,69 the authors
proposed a method for calving prediction in farm cattle based on ML by analyzing cattle behavioral data obtained from two technologies: HR
Tag and IceCube. Both devices were fitted to each animal to monitor activities such as rumination, neck activity, standing time, lying
time, and so forth. The collected data are then used as variables in ML algorithms to predict calving. Another study70 aims to classify the
rumination behavior of graminivorous animals to understand fodder selection, as fodder is related to the nutrition, health, and growth of
animals. To measure mandibular stress while chewing different fodder types, “Fiber Bragg Grating” sensors were attached to the jaw of a
2-month-old calf. Data acquired using in vivo optical extensometry were then segmented to obtain a data sample for each type of chewing
movement. The DT-C4.5 algorithm showed promising performance in classifying the movements into five classes, including dietary supplement, hay, ryegrass,
and rumination. Reference 71 explores the possible use of ML algorithms to diagnose the risk of lameness in dairy herds using periodically
collected farm-based records. The results show that DT-based algorithms perform better than multivariate logistic regression (i.e., GLM), whereas
the ensemble models (RF, GBM, and XGB) showed marginal improvements over CART. In the work of Okinda et al.,72 an MV-based framework was
developed to recognize and predict Newcastle disease virus in broiler chickens by assessing the shape descriptors and walk speed of the
chickens. Feature variables such as circle variance, elongation, complexity, convexity, eccentricity, and walk speed were captured using a depth camera for
280 broiler chickens. Performance comparison of the developed models indicates that RBF-SVM outperforms the other models. The authors of Reference
73 demonstrate a working system for classifying cow behavior patterns using a sensor-based collar system and ensemble classifiers. The classifiers
learn cow behavior from collar sensor data and ground truth and classify it into five behavior classes: grazing, searching, resting, ruminating,
and scratching/urinating. Ensemble classifiers, along with conventional classifiers such as Binary Tree, LDA, NB, and KNN, were applied for
classification, and ensemble bagging with tree learners achieved the highest average classification accuracy of 96%. In the final study,74 a
sheep face image dataset and classification models for sheep faces based on facial pain expression were proposed. Seven CNN
architectures were used along with data augmentation, L2 regularization, and fine-tuning to classify the sheep facial database into two classes (i.e., normal
sheep class and abnormal sheep class), and the accuracy achieved by the different models ranged from 93.10% to 100%. Table A7 summarizes each
study discussed above.
Livestock production deals with the production of animal/bird products such as wool, milk, eggs, and meat by monitoring and identifying breeds
and quality traits. ML provides fast and accurate predictions of livestock parameters for early estimation of product quantity and quality, which
improves the farmer’s economic competitiveness and helps to meet the food demands related to the livestock production system. A total of
seven research studies were identified in this subcategory. In the first study,75 yearling wool data, health traits, climate data, and pasture data related
to sheep were used by ML models to predict the quality traits and wool growth of adult sheep. The data records used in the study were
gathered over 8 years to predict greasy fleece weight, clean fleece weight, staple length, fiber diameter, and staple strength of wool in adult sheep. The
prediction performance of the models was compared, and it was observed that the BG and MT models perform nearly the same and better than
the other models. In the second paper,76 a bovine weight estimation model is developed that exploits the past weight
trajectories of a herd over time. The SVM model enables future growth to be estimated accurately for herds with few weight records. The results of the SVM were compared
with the LRi model and found to be more accurate. The next two studies are based on the classification of animals according to their
breeds. In Reference 77, MV and DL were employed for the automatic classification of four sheep breeds. The research was performed with two
different types of dataset (i.e., full sheep images and cropped facial images), and after fine-tuning the last 6 VGG layers for 10 epochs, the maximum
accuracy was found to be above 95%. In Reference 78, a hierarchical model was proposed to classify 12 Spanish goat breeds using 9 morphological
traits and 3 aptitudes. The first hierarchy level of the model uses KNN/MLP to classify the goats by aptitude only, and in the
next hierarchy level the breeds are determined using three further KNN/MLP classifiers, one for each aptitude (meat, milk, and dual purpose). The
combinations KNN-KNN, KNN-MLP, MLP-MLP, and MLP-KNN were employed to examine the capabilities of the different hierarchical models. Further,
in the work of Acu,79 a computer vision system along with CNN and baseline classifiers was deployed to classify non-pollen and pollen-carrying
bees. A video capturing system was installed to observe the foraging bees at the entry and exit of hives, and high-quality images of non-pollen and
pollen-carrying bees were extracted. Baseline classifiers (KNN, NB, and SVM) and CNNs (ResNet50, VGG19, and VGG16) were applied to classify
the extracted image samples. The last two studies of this subcategory are based on recognition
systems that identify individual livestock in order to monitor farm animals. In Reference 80, a CNN-based imaging system was proposed to recognize
pigs from their faces. Data acquisition, pre-processing, and augmentation were applied to collect face images of 10 individual pigs, and
three different face recognition models were trained and tested on the collected data. Another paper81 discusses the identification of individual beef
cattle through a combined CNN (InceptionV3) and LSTM approach using cattle video data. The CNN extracts visual features from the video
samples, and the LSTM model is trained on the extracted features to recognize individual cattle. The proposed (CNN + LSTM) framework
outperforms the framework using only the CNN. Brief details about each of the identified works under the livestock production subcategory
are given in Table A8.
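The two-level scheme described for the goat study (first predict the aptitude, then predict the breed with an aptitude-specific classifier) can be sketched as follows. This is an illustration under simplifying assumptions: a 1-nearest-neighbour rule stands in for the KNN/MLP classifiers at both levels, and the measurements and breed names are hypothetical.

```python
import math

def nearest_label(query, examples):
    """1-nearest-neighbour: return the label of the closest (features, label) pair."""
    return min(examples, key=lambda ex: math.dist(query, ex[0]))[1]

def hierarchical_predict(query, level1, level2):
    """Level 1 predicts the aptitude; level 2 predicts the breed within that aptitude."""
    aptitude = nearest_label(query, level1)
    breed = nearest_label(query, level2[aptitude])
    return aptitude, breed

# Hypothetical training records: (two morphological measurements, aptitude, breed).
training = [
    ((60.0, 55.0), "milk", "breed_A"),
    ((62.0, 57.0), "milk", "breed_B"),
    ((80.0, 70.0), "meat", "breed_C"),
    ((84.0, 74.0), "meat", "breed_D"),
    ((70.0, 62.0), "dual purpose", "breed_E"),
]

# Level-1 examples carry aptitude labels; level-2 examples are grouped per aptitude.
level1 = [(features, aptitude) for features, aptitude, _ in training]
level2 = {}
for features, aptitude, breed in training:
    level2.setdefault(aptitude, []).append((features, breed))

print(hierarchical_predict((61.5, 56.5), level1, level2))
print(hierarchical_predict((81.0, 71.0), level1, level2))
```

Swapping the classifier used at either level yields the different combinations (for example KNN at level 1 and MLP at level 2) that the study compares.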
Crop species selection is a complicated procedure of scanning for specific genes that determine the effectiveness of water and nutrient use, adaptation to
environmental change, disease resistance, as well as nutrient content and a better taste. Artificial intelligence, specifically ML, processes many
years of farm data to examine crop performance in different climatic conditions and constructs a probabilistic model that predicts which genes
are likely to contribute a beneficial trait to a plant. The first study82 discusses the detection of the seed count per pod in soybean crops and the
classification of pods into three classes based on seed count (2, 3, or 4). A deep learning method was applied to the collected soybean image samples, and its
performance was compared with an SVM model that uses tailored feature extraction. In the next study,83 the authors compared two classifiers,
PNN and MLP, for selecting maize and wheat individuals belonging to phenotypic target classes. The classifiers assign the continuous traits to
three classes (i.e., upper, middle, and lower), based on two percentile pairs, 15%–85% and 30%–70%. The area under the precision-recall curve (AUCpr) was used to
assess the accuracy of both classifiers for the 15% and 30% upper/lower classes and the corresponding 70% and 40% middle classes for the wheat and maize datasets. The
AUC criterion shows that PNN outperforms MLP in selecting individuals of the target class for all 33 wheat and maize datasets. The final work84
reported under the current subcategory utilizes a pre-trained DL architecture to recognize and classify plants according to their phenological
stages. The outcome of the model was compared with a classical NB classifier trained on manually extracted textural features
combining HOG and GLCM features. The results demonstrate that the overall accuracy obtained for different plants ranges from 68.97%
to 82.41% and from 73.76% to 87.14% for the CNN and NB classifier, respectively. Table A9 gives an overview of the three studies reported
under the species breeding subcategory.
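The percentile-based split of a continuous trait into lower, middle, and upper classes can be sketched as below. The nearest-rank percentile rule and the trait values are assumptions made for illustration; the cited study may compute its cut-offs differently.

```python
def percentile_classes(values, lower_pct=15, upper_pct=85):
    """Label each value 'lower', 'middle', or 'upper' using two percentile cut-offs."""
    ordered = sorted(values)
    n = len(ordered)

    def cutoff(pct):
        # Nearest-rank percentile, using integer ceiling division to avoid float error.
        rank = max(1, -(-pct * n // 100))
        return ordered[rank - 1]

    low, high = cutoff(lower_pct), cutoff(upper_pct)
    labels = []
    for v in values:
        if v <= low:
            labels.append("lower")
        elif v > high:
            labels.append("upper")
        else:
            labels.append("middle")
    return labels

# Hypothetical trait values for 20 individuals.
trait = list(range(1, 21))

for lo_pct, hi_pct in [(15, 85), (30, 70)]:
    labels = percentile_classes(trait, lo_pct, hi_pct)
    print(lo_pct, hi_pct, {c: labels.count(c) for c in ("lower", "middle", "upper")})
```

With the 15%–85% cut-offs the middle class holds 70% of the individuals, and with 30%–70% it holds 40%, which is why the class proportions differ between the two settings.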
Species recognition corresponds to the recognition and assortment of plant or tree species, which is vital for biodiversity conservation and protec-
tion and contributes to our understanding of ecosystems. The traditional human approach for Species recognition is time-consuming and demands
an experienced person who is slightly available. To conquer these problems, multiple ML algorithms have been recommended to assist the auto-
mated recognition of plant species. A total of six research is being identified relevant to the species recognition subcategory, In the primary
study,85 the authors demonstrate that transfer learning along with deep features(DF) and fine-tuning (FT) improves the DL plant species classifi-
cation model’s accuracies. The performance of multiple transfer learning models such as FT-CNN, DF-CNN/SVM, DF-CNN/LDA, and CNN-RNN
models was investigated over the four publicly available datasets. The potential of the work was compared with the working DL models pre-
sented in the literature and it was observed that for each dataset at least one transfer learning model achieves better accuracy. In the second study,86 two computer vision and ANN-based methods are introduced and compared for classifying Iranian chickpea varieties. The first method applies a hybrid of ANN and Particle Swarm Optimization (PSO) for classification using texture and color features, and the second method applies a three-layer BPNN that uses RGB pixel values as input for classification. The next paper87 discusses the recognition and classification of three leguminous plants, namely soybean, red bean, and white bean, from leaf vein morphological images using CNN. The paper shows that vein morphology possesses properties that are well suited to distinguishing legumes. In that work, vein segmentation is first performed on the original color images of legume leaves, then the central patch is extracted, and finally a CNN with five layers is used to assign the leaves to three classes. In another study,88 automatic recognition of 15 medicinal plant species relying on texture and color features using five ML algorithms was demonstrated. A new dataset was also introduced, and the classification accuracies of the five algorithms show that RF and MLP with backpropagation achieve higher accuracy. The authors of Reference 89 present a novel vegetable dataset named “CropDeep,” gathered from greenhouses using cameras and equipment, which represents real-world data for examining the performance of several DL frameworks for classification and detection. In the final work90 of the current category, plant seedling classification for early recognition of plant species, to enable weeding at the initial growth stage, was discussed. A CNN classification model was applied to the dataset and its performance was compared against two traditional algorithms. Table A10 summarizes all six studies reported under the species recognition subcategory.
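The central-patch step described for the vein-morphology pipeline can be illustrated with a minimal sketch. It assumes the segmented leaf image is a 2D grid of pixel values; the function name `central_patch`, the toy image, and the patch size are hypothetical and are not taken from the cited paper.

```python
def central_patch(image, size):
    """Return the size x size patch centred in a 2D image given as a list of rows."""
    height = len(image)
    width = len(image[0]) if height else 0
    if size <= 0 or size > height or size > width:
        raise ValueError("patch size must be positive and fit inside the image")
    top = (height - size) // 2    # first row of the patch
    left = (width - size) // 2    # first column of the patch
    return [row[left:left + size] for row in image[top:top + size]]


# Toy 5 x 6 "segmented vein" image (hypothetical): 1 = vein pixel, 0 = background.
toy_image = [
    [0, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
]
patch = central_patch(toy_image, 3)
for row in patch:
    print(row)
```

In a real pipeline the patch would be cut from the segmented vein image and passed to the classifier; cropping a fixed central window gives every leaf the same input size regardless of the original image dimensions.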
5 DISCUSSION
In the present review paper, we presented a systematic analysis of ML applications in the agricultural framework and their upcoming prospects. To this end, a total of 81 research articles were analyzed, covering 10 subcategories of the agriculture system. The analyzed articles were obtained from three scientific databases: 44 articles (i.e., 54%) from “ScienceDirect,” 20 articles (i.e., 25%) from “Google Scholar,” and 17 articles (i.e., 21%) from “IEEE-Xplore.”
In the category-wise distribution of articles, the largest number (48 papers) relates to crop management, 13 papers relate to livestock management, 11 papers relate to field condition management, and the smallest number (9 papers) relates to species management. The category- and subcategory-wise distribution of articles is presented in Figure 4.
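The reported shares follow directly from the article counts. The sketch below recomputes them from the counts stated in this section; the helper name `percentage_shares` is hypothetical, and rounding to whole percentages is an assumption about how the figures were derived.

```python
def percentage_shares(counts):
    """Map each label to its share of the total, rounded to a whole percentage."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


# Article counts as stated in the text (81 articles in total).
database_counts = {"ScienceDirect": 44, "Google Scholar": 20, "IEEE-Xplore": 17}
category_counts = {
    "crop management": 48,
    "livestock management": 13,
    "field condition management": 11,
    "species management": 9,
}

database_shares = percentage_shares(database_counts)
category_shares = percentage_shares(category_counts)
print(database_shares)
print(category_shares)
```

Under that rounding assumption, the computed values agree with the percentages quoted for the three databases and the four categories.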
The analysis shows that applying ML in crop management was the most prominent and encouraging area for researchers in the past decade, especially the disease recognition subcategory, which accounts for 22% of the research contributions, followed by yield prediction, weed spotting, and quality evaluation of crops with 12% each. Livestock management, with a 16% research contribution, was the next most adopted and growing research area after crop management. It comprises the livestock welfare and livestock production subcategories, which share 7% and 9% of the research efforts, respectively. Field condition management, with a 14% research contribution, is the next most prevalent area after livestock management. It has two subcategories devoted to precision agriculture, water monitoring and soil observation, with 8% and 6% research contributions, respectively. Species management, with an 11% research contribution, was the least represented category. Within species management, species recognition dominates with 7% of the research efforts, compared with just 4% for species breeding.
The year-wise research contribution for each subcategory is depicted in Table 4, which describes the current status of the different subcategories. In the past decade, a total of 65 articles were published during 2016–2020, while only 16 articles were published during 2010–2015. This shows that in recent years ML applications in the agriculture system have attracted researchers' attention dramatically, especially for crop management, while less attention was given to the other categories. Although a lot of valuable work has been done on crop management, it still needs to be improved for better precision agriculture to meet food demands. Much more attention is required in the remaining categories, especially species breeding under the species management category, for the betterment of crop quality in terms of food nutrients and taste. Field condition management, livestock management, and species management are significant areas that deserve more emphasis for the betterment of precision agriculture and food supply as well as the enhancement of the country's economic growth.
FIGURE 4 Category and subcategory-wise distribution of articles. Crop management 59% (disease recognition 22%, yield prediction 12%, weed spotting 12%, quality evaluation 12%); livestock management 16% (livestock welfare 7%, livestock production 9%); field condition management 14% (water monitoring 8%, soil observation 6%); species management 11% (species breeding 4%, species recognition 7%).
TABLE 4 Year-wise research contribution for each subcategory
Year | Articles per subcategory (nonzero entries) | Total
2010 | 1, 1 | 2
2011 | 1 | 1
2012 | 1, 1 | 2
2013 | 1 | 1
2014 | 1, 2 | 3
2015 | 2, 1, 2, 2 | 7
2016 | 3, 1, 3, 2, 1, 2, 1, 1 | 14
2017 | 2, 2, 6, 3, 2, 1, 1 | 17
2018 | 2, 1, 2, 3, 3, 1, 3, 1, 1 | 17
2019 | 1, 1, 1, 2, 1, 2, 4 | 12
2020 | 1, 1, 1, 2 | 5
Colored entries of matching tone show the subcategories belonging to each agricultural category, and bold highlights all the category names and their respective subcategories.
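The split between the two halves of the decade can be checked against the yearly totals of Table 4. The following is a minimal sketch using only those totals; the helper name `articles_between` is hypothetical.

```python
# Yearly article totals taken from the last column of Table 4.
yearly_totals = {
    2010: 2, 2011: 1, 2012: 2, 2013: 1, 2014: 3, 2015: 7,
    2016: 14, 2017: 17, 2018: 17, 2019: 12, 2020: 5,
}


def articles_between(totals, first_year, last_year):
    """Sum the article counts for all years in [first_year, last_year]."""
    return sum(n for year, n in totals.items() if first_year <= year <= last_year)


early = articles_between(yearly_totals, 2010, 2015)
recent = articles_between(yearly_totals, 2016, 2020)
print(f"2010-2015: {early} articles, 2016-2020: {recent} articles")
print(f"share published in 2016-2020: {100 * recent / (early + recent):.0f}%")
```

The two sums reproduce the 16 and 65 articles quoted in the text and add up to the 81 reviewed articles.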
Regarding the utilization of ML algorithms/models in agriculture, the analysis found that more than 50 ML algorithms were applied by the authors of the reviewed literature, belonging to a total of nine different models. The trajectory of ML algorithms/models in the agriculture system reveals much about the applicability and adaptability of ML models across the different subcategories of agriculture. The traditional ML models, especially SVM, and the current ANN/DL are the most frequently adopted models in all the subcategories. Both were implemented most often in the literature devoted to every single subcategory, giving the best or satisfactory results. EL, RM, DT, and IBL are the next most adopted models after SVM and ANN/DL, applied in five to eight subcategories. Although these four models are well established and have produced good results in recent years, they are normally implemented to compare performance against SVM or ANN/DL models. The EL model is applied to every subcategory except water monitoring and species breeding. The RM is applied to soil observation, water monitoring, yield prediction, weed spotting, quality evaluation, livestock welfare, and livestock production. The DT model is applied to yield prediction, disease recognition, quality evaluation, livestock welfare, livestock production, and species recognition. The IBL model has been applied to yield prediction, weed spotting, quality evaluation, livestock production, and species recognition. Furthermore, the BM and DR models are applied to just three to four subcategories, and CM is the least applied model, limited to only two subcategories. BM is deployed for yield prediction, livestock production, and species breeding. Similarly, the DR model is deployed for disease recognition, quality evaluation, livestock production, and livestock welfare. The clustering model is used for the disease recognition and weed spotting subcategories. However, modern ML (i.e., DL) is an extensively practiced model and is gaining momentum nowadays in the agriculture sector. In recent years, several CNN and RNN architectures under the DL model have been competing with popular traditional ML models (i.e., SVM and ANN) and lead performance-wise in most of the subcategories, except a few such as soil observation, water monitoring, and quality evaluation. Figure 5 shows the nine models with their counts against each subcategory belonging to the four major categories.
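The model coverage described above can be tabulated as a small mapping from each model to the subcategories where it was applied. This is only a sketch built from the lists given in the text; the variable and function names are hypothetical, and SVM and ANN/DL are assumed to cover all ten subcategories, as the text states.

```python
# The ten subcategories analyzed in this review.
SUBCATEGORIES = [
    "soil observation", "water monitoring", "yield prediction",
    "disease recognition", "weed spotting", "quality evaluation",
    "livestock welfare", "livestock production",
    "species breeding", "species recognition",
]
ALL = set(SUBCATEGORIES)

# Model -> subcategories where the Discussion reports it being applied.
model_usage = {
    "SVM": set(ALL),
    "ANN/DL": set(ALL),
    "EL": ALL - {"water monitoring", "species breeding"},
    "RM": {"soil observation", "water monitoring", "yield prediction",
           "weed spotting", "quality evaluation", "livestock welfare",
           "livestock production"},
    "DT": {"yield prediction", "disease recognition", "quality evaluation",
           "livestock welfare", "livestock production", "species recognition"},
    "IBL": {"yield prediction", "weed spotting", "quality evaluation",
            "livestock production", "species recognition"},
    "BM": {"yield prediction", "livestock production", "species breeding"},
    "DR": {"disease recognition", "quality evaluation",
           "livestock production", "livestock welfare"},
    "CM": {"disease recognition", "weed spotting"},
}


def subcategory_counts(usage):
    """Return {model: number of subcategories}, most widely applied first."""
    ranked = sorted(usage.items(), key=lambda item: (-len(item[1]), item[0]))
    return {model: len(subs) for model, subs in ranked}


counts = subcategory_counts(model_usage)
for model, n in counts.items():
    print(f"{model}: {n} subcategories")
```

Counting the sets confirms the ranges quoted in the text: EL, RM, DT, and IBL span eight down to five subcategories, DR and BM four and three, and CM two.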
A total of seventeen (17) publicly available datasets were found in the analyzed papers dedicated to the different subcategories. The availability of each dataset was verified. Most of the articles use a dataset as a whole or just a portion of it for training and testing their proposed learning model. A maximum of four datasets were related to species recognition; three datasets each to yield prediction and species breeding; two datasets each to livestock welfare and weed spotting; and one each to disease recognition, soil observation, and livestock production. Table 5 lists the category-wise publicly available datasets.
TABLE 5 Publicly available datasets for the different subcategories of four major categories
Subcategory | Source | Description
Yield prediction | SoilGrids.org | Dataset of soil with a resolution of 250 m per pixel for clay, sand, silt contents, and fine earth and coarse fragments bulk density information.
Disease recognition | www.plantvillage.org | A plant-village dataset with 54,306 images of healthy and unhealthy plant leaves for 14 crop species.
Weed spotting | https://github.com/AlexOlsen/DeepWeeds | The deep weed dataset contains 17,509 labeled images of 8 weed species of Australia.
Livestock welfare | https://github.com/Mrborchers/Machine-learning-based-calving-prediction-from-activity-lying-and-rumination-behaviors | Lying and rumination behavior dataset for calving prediction in dairy cattle.
Livestock production | https://github.com/piperod/PollenDataset | Pollen and non-pollen bearing high-resolution bees image dataset.
Species breeding | http://image-net.org/ | Images of plants, flowers, trees, vegetables, and so forth.
Species recognition | (https://www.theimagingsource.com/products/industrial-cameras/gige-color/dfk23gm021/ | Dataset consists of 1019 Iranian chickpeas images of five varieties.
6 CONCLUSION
This review paper methodically examines ML-based agricultural studies to benefit precision agriculture and organizes them into four main categories, each divided into subcategories. All the examined research efforts were summarized in terms of problem description, ML algorithm/model, dataset, outcomes, and the future work or limitations associated with each article. Our review reports the multiple categories/subcategories of agriculture together in one place for better comprehension of the scope and the many areas of the agriculture system where ML can be applied. This paper can serve as an investigatory guide for many stakeholders, such as researchers, academicians, practitioners, engineers, manufacturers, policymakers, and of course farmers. Any researcher, practitioner, or academician may use this survey to understand the significance of ML in diverse agricultural sub-domains and apply it to the development and enhancement of the agricultural system. Engineers and manufacturers may use this survey to plan and turn the proposed working models into practical models that improve agribusiness enterprises and assist farmers. This survey may also help farmers and policymakers understand current and upcoming agricultural growth and take decisions for the betterment of the agri-system and the fulfillment of food demands.
Through the analysis of the literature, it was observed that CNN, ANN, and SVM are the prominent models being applied to agriculture and are producing promising results. However, no conclusion about the best model can be drawn in favor of any particular model, as it depends upon the type of problem, dataset size, and dataset type. Many recent studies readily employ DL models (i.e., DNN, CNN, and LSTM) in precision agriculture, and DL models are overtaking traditional ML models at a good pace. Despite all the opportunities and growth of ML/DL in the agricultural system, significant difficulties remain. Firstly, the technology will keep expanding into new application areas, and there will be further technical challenges to resolve. Secondly, it is essential to gather large datasets for the agricultural sector for better training and agricultural precision. Thirdly, with the rapid development of agricultural automation, the demand for experts will keep growing. Finally, the robustness of models/techniques in different complex situations will likewise face difficulties. We believe that in the future, machine and computer vision technology, along with big data, will be combined with smart technologies, viz. DL and ML, applied to every aspect of farming, and more broadly harnessed to address current farming issues and improve the economic, general, and robust performance of agricultural automation frameworks in a smarter way.
Future work aims to update the subcategories of agriculture where ML/DL are emerging and to add recent articles to the existing categories based on novelty. It also aims to compare the results obtained using DL models and ML models for the various categories of agriculture systems.
ACKNOWLEDGMENTS
This research received no specific grant funding from any funding agency, commercial, or not-for-profit sectors. The authors express their gratitude
in advance to the reviewers for their significant suggestions and comments.
CONFLICT OF INTEREST
The authors declare no potential conflict of interest.
AUTHOR CONTRIBUTIONS
Atif Mahmood conceived and designed the structure of the article and wrote the paper. Amod Kumar Tiwari and Sanjay Kumar Singh were involved in the planning and supervision of the article; Sandeep S. Udmale contributed suggestions and the final revision of the article. All authors have read and approved the manuscript.
ORCID
Atif Mahmood https://orcid.org/0000-0002-8220-1708
REFERENCES
1. Ratta P, Kaur A, Sharma S, Shabaz M, Dhiman G. Application of blockchain and internet of things in healthcare and medical sector: applications,
challenges, and future perspectives. J Food Qual. 2021;21(1):1-20. doi:10.1155/2021/7608296
2. Van Klompenburg T, Kassahun A, Catal C. Crop yield prediction using machine learning: a systematic literature review. Comput Electron Agric.
2020;177(January):1-18. doi:10.1016/j.compag.2020.105709
3. McQueen RJ, Garner SR, Nevill-Manning CG, Witten IH. Applying machine learning to agricultural data. Comput Electron Agric. 1995;12(4):275-293.
doi:10.1016/0168-1699(95)98601-9
4. Ahmad J, Sparsh W, Muzamil M, Ahmed S, Sharma S, Singh S. Machine learning and deep learning based computational techniques in automatic agricultural diseases detection: methodologies, applications, and challenges. Arch Comput Methods Eng. 2021;29:641-677. doi:10.1007/s11831-021-09588-5
5. Li G, Chen Z, Purswell J, Linhoss J. Practices and applications of convolutional neural network-based computer vision systems in animal farming: a review. Sensors. 2021;21(February):1-44. doi:10.3390/s21041492
6. Bhargava A, Bansal A. Fruits and vegetables quality evaluation using computer vision: a review. J King Saud Univ – Comput Inform Sci.
2018;33(3):243-257. doi:10.1016/j.jksuci.2018.06.002
7. Wang A, Zhang W, Wei X. A review on weed detection using ground-based machine vision and image processing techniques. Comput Electron Agric.
2019;158(February):226-240. doi:10.1016/j.compag.2019.02.005
8. Benos L, Tagarakis AC, Dolias G, Berruto R, Kateris D, Bochtis D. Machine learning in agriculture: a comprehensive updated review. Sensors.
2021;21(4):1-55. doi:10.3390/s21113758
9. Machine Learning in Agriculture: Applications and Techniques | by Sciforce | Sciforce | Medium. https://medium.com/sciforce/machine-learning-in-
agriculture-applications-and-techniques-6ab501f4d1b5. Accessed January 10, 2020.
10. Kamilaris A, Prenafeta-Boldú FX. Deep learning in agriculture: a survey. Comput Electron Agric. 2018;147(February):70-90. doi:10.1016/j.compag.2018.02.016
11. Khanal S, Fulton J, Klopfenstein A, Douridas N, Shearer S. Integration of high resolution remotely sensed data and machine learning techniques for
spatial prediction of soil properties and corn yield. Comput Electron Agric. 2018;153(January):213-225. doi:10.1016/j.compag.2018.07.016
12. Song Y, Zhao X, Su H, Li B, Hu Y, Cui X. Predicting spatial variations in soil nutrients with hyperspectral remote sensing at regional scale. Sensors. 2018;18:1-18. doi:10.3390/s18093086
13. Morellos A, Pantazi XE, Alexandridis T, et al. Special issue: proximal soil sensing machine learning based prediction of soil total nitrogen, organic carbon
and moisture content by using VIS-NIR spectroscopy. Biosyst Eng. 2016;152:1-13. doi:10.1016/j.biosystemseng.2016.04.018
14. Nahvi B, Habibi J, Mohammadi K, Shamshirband S. Using self-adaptive evolutionary algorithm to improve the performance of an extreme learning
machine for estimating soil temperature. Comput Electron Agric. 2016;124:150-160. doi:10.1016/j.compag.2016.03.025
15. Nie H, Yang L, Li X, Ren L, Xu J, Feng Y. Spatial prediction of soil moisture content in winter wheat based on machine learning model. 2018 26th Int. Conf.
Geoinformatics; WIT Press; 2016:1-6.
16. Shrestha NK, Shukla S. Support vector machine based modeling of evapotranspiration using hydro-climatic variables in a sub-tropical environment.
Agric for Meteorol. 2015;200:172-184. doi:10.1016/j.agrformet.2014.09.025
17. Patil AP, Deka PC. An extreme learning machine approach for modeling evapotranspiration using extrinsic inputs. Comput Electron Agric. 2019;121(March):385-392. doi:10.1016/j.compag.2016.01.016
18. Mehdizadeh S, Behmanesh J, Khalili K. Using MARS, SVM, GEP and empirical equations for estimation of monthly mean reference evapotranspiration.
Comput Electron Agric. 2017;139:103-114. doi:10.1016/j.compag.2017.05.002
19. Tang D, Feng Y, Gong D, Hao W, Cui N. Evaluation of artificial intelligence models for actual crop evapotranspiration modeling in mulched and
non-mulched maize croplands. Comput Electron Agric. 2018;152(September):375-384. doi:10.1016/j.compag.2018.07.029
20. Feng Y, Peng Y, Cui N, Gong D, Zhang K. Modeling reference evapotranspiration using extreme learning machine and generalized regression neural
network only with temperature data. Comput Electron Agric. 2017;136:71-78. doi:10.1016/j.compag.2017.01.027
21. Mohammadi K, Shamshirband S, Motamedi S, Petkovic D, Hashim R, Gocic M. Extreme learning machine based prediction of daily dew point tempera-
ture. Comput Electron Agric. 2015;117:214-225. doi:10.1016/j.compag.2015.08.008
22. Panda SS, Ames DP, Panigrahi S. Application of vegetation indices for agricultural crop yield prediction using neural network techniques. Remote Sens.
2010;2:673-696. doi:10.3390/rs2030673
23. Maya Gopal PS, Bhargavi R. A novel approach for efficient crop yield prediction. Comput Electron Agric. 2019;165(June):104968. doi:10.1016/j.compag.2019.104968
24. Kuo T, Chen C, Tsai P. Accuracy analysis mechanism for agriculture data using the ensemble neural network method. Sustainability.
2016;8(August):1-11. doi:10.3390/su8080735
25. Shakoor T, Rahman K, Rayta SN, Chakrabarty A. Agricultural production output prediction using supervised machine learning techniques. 2017 1st
International Conference on Next Generation Computing Applications (NextComp); IEEE; 2017:19-21.
26. Oliveira I, Cunha RLF, Silva B, Netto MAS. A scalable machine learning system for pre-season agriculture yield forecast. 2018 IEEE 14th Int. Conf.
e-Science; IEEE; 2018:423-430. doi:10.1109/eScience.2018.00131
27. Ali I, Cawkwell F, Dwyer E, Green S. Modeling managed grassland biomass estimation by using multitemporal remote sensing data—a machine learning
approach. IEEE J Sel Top Appl Earth Obs Remote Sens. 2016;10(7):1-16.
28. Pantazi XE, Moshou D, Alexandridis T, Whetton RL, Mouazen AM. Wheat yield prediction using machine learning and advanced sensing techniques.
Comput Electron Agric. 2016;121:57-65. doi:10.1016/j.compag.2015.11.018
29. Ramos PJ, Prieto FA, Montoya EC, Oliveros CE. Automatic fruit count on coffee branches using computer vision. Comput Electron Agric. 2017;137:9-22.
doi:10.1016/j.compag.2017.03.010
30. Chu Z, Yu J. An end-to-end model for rice yield prediction using deep learning fusion. Comput Electron Agric. 2020;174(January):105471. doi:10.1016/j.compag.2020.105471
31. Rangarajan AK, Purushothaman R. Tomato crop disease classification using pre-trained deep learning algorithm. Procedia Comput Sci.
2018;133:1040-1047. doi:10.1016/j.procs.2018.07.070
32. Pantazi XE, Tamouridou AA, Alexandridis TK, Lagopodi AL, Kontouris G, Moshou D. Detection of Silybum marianum infection with Microbotryum
silybum using VNIR field spectroscopy. Comput Electron Agric. 2017;137:130-137. doi:10.1016/j.compag.2017.03.017
33. Dutta R, Smith D, Shu Y, Liu Q, Doust P, Heidrich S. Salad leaf disease detection using machine learning based hyper spectral sensing. IEEE Sensors; IEEE;
2014:5-8. doi:10.1109/ICSENS.2014.6985047
34. Pantazi XE, Moshou D, Bochtis D. Detection of biotic and abiotic stresses in crops by using hierarchical self organizing classifiers. Precis Agric.
2017;18:383-393. doi:10.1007/s11119-017-9507-8
35. Ebrahimi MA, Khoshtaghaza MH, Minaei S, Jamshidi B. Vision-based pest detection based on SVM classification method. Comput Electron Agric. 2017;137(October):52-58. doi:10.1016/j.compag.2017.03.016
36. Zhang W, Teng G, Wang C. Identification of jujube trees diseases using neural network. Opt – Int J Light Electron Opt. 2013;124(11):1034-1037. doi:10.1016/j.ijleo.2013.01.014
37. Amara J, Bouaziz B, Algergawy A. A Deep Learning-based Approach for Banana Leaf Diseases Classification; BTW; 2017:79-88.
38. Gómez-sanchis J, Martín-guerrero JD, Soria-olivas E, Martínez-sober M, Magdalena-benedito R, Blasco J. Detecting rottenness caused by Penicillium
genus fungi in citrus fruits using machine learning techniques. Expert Syst Appl. 2012;39:780-785. doi:10.1016/j.eswa.2011.07.073
39. Arsenovic M, Karanovic M, Sladojevic S, Anderla A. Solving current limitations of deep learning based approaches for plant disease detection. Symmetry. 2019;11(7):1-21.
40. Islam M, Dinh A, Wahid K. Detection of potato diseases using image segmentation and multiclass support vector machine. 2017 IEEE 30th Canadian
Conference on Electrical and Computer Engineering (CCECE); IEEE; 2017:8-11.
41. Dhakate M, Ingole AB. Diagnosis of pomegranate plant diseases using neural network. 2015 Fifth National Conference on Computer Vision, Pattern
Recognition, Image Processing and Graphics (NCVPRIPG); IEEE; 2015. doi:10.1109/NCVPRIPG.2015.7490056
42. Hossain S, Mou RM, Hasan MM, Chakraborty S, Razzak MA. Recognition and detection of tea leaf’s diseases using support vector machine. 2018 IEEE
14th International Colloquium on Signal Processing & Its Applications (CSPA); IEEE; 2018:9-10.
43. Ferentinos KP. Deep learning models for plant disease detection and diagnosis. Comput Electron Agric. 2018;145(February):311-318. doi:10.1016/j.compag.2018.01.009
44. Chung C, Huang K, Chen S, Lai M, Chen Y, Kuo Y. Detecting Bakanae disease in rice seedlings by machine vision. Comput Electron Agric.
2016;121:404-411. doi:10.1016/j.compag.2016.01.008
45. Lv M, Zhou G, He M, Zhang W, Hu Y, Chen A. Maize leaf disease identification based on feature enhancement and DMS-robust Alexnet. IEEE Access.
2020;8:57952-57966.
46. Brahimi M, Boukhalfa K, Moussaoui A, Brahimi M. Deep learning for tomato diseases: classification and symptoms visualization. Appl Artif Intell.
2017;31(4):299-315. doi:10.1080/08839514.2017.1315516
47. Sladojevic S, Arsenovic M, Anderla A, Culibrk D, Stefanovic D. Deep neural networks based recognition of plant diseases by leaf image classification.
Comput Intell Neurosci. 2016;2016(June):1-11. doi:10.1155/2016/3289801
48. Barnes M, Duckett T, Cielniak G, Stroud G, Harper G. Visual detection of blemishes in potatoes using minimalist boosted classifiers. J Food Eng.
2010;98(3):339-346. doi:10.1016/j.jfoodeng.2010.01.010
49. Bah MD, Hafiane A, Canals R. Deep learning with unsupervised data labeling for weed detection in line crops in UAV images. Remote Sens.
2018;10(11):1-22. doi:10.3390/rs10111690
50. Zhang W, Hansen MF, Volonakis TN, et al. Broad-leaf weed detection in pasture. 2018 IEEE 3rd Int. Conf Image, Vis Comput; IEEE; 2018:101-105. doi:10.1109/ICIVC.2018.8492831
51. Tang J, Wang D, Zhang Z, He L, Xin J, Xu Y. Weed identification based on K-means feature learning combined with convolutional neural network. Comput Electron Agric. 2017;135:63-70. doi:10.1016/j.compag.2017.01.001
52. Tellaeche A, Pajares G, Burgos-artizzu XP, Ribeiro A. A computer vision approach for weeds identification through support vector machines. Appl Soft
Comput. 2011;11:908-915. doi:10.1016/j.asoc.2010.01.011
53. Ferreira S, Freitas DM, Gonçalves G, Pistori H, Theophilo M. Weed detection in soybean crops using ConvNets. Comput Electron Agric.
2017;143(October):314-324. doi:10.1016/j.compag.2017.10.027
54. Yu J, Sharpe SM, Schumann AW, Boyd NS. Deep learning for image-based weed detection in turfgrass. Eur J Agron. 2019;104(March):78-84. doi:10.1016/j.eja.2019.01.004
55. Abouzahir S, Sadik M, Sabir E. Enhanced approach for weeds species detection using machine vision. 2018 Int. Conf Electron Control Optim Comput Sci;
IEEE; 2018:1-6. doi:10.1109/ICECOCS.2018.8610505
56. Pantazi X, Moshou D, Bravo C. Active learning system for weed species recognition based on hyperspectral sensing. Biosyst Eng. 2016;146:1-10. doi:10.1016/j.biosystemseng.2016.01.014
57. Ahmed F, Al-mamun HA, Bari ASMH, Hossain E, Kwan P. Classification of crops and weeds from digital images: a support vector machine approach.
Crop Prot. 2012;40:98-104. doi:10.1016/j.cropro.2012.04.024
58. Pantazi XE, Tamouridou AA, Alexandridis TK, Lagopodi AL, Kashefi J, Moshou D. Evaluation of hierarchical self-organising maps for weed mapping using
UAS multispectral imagery. Comput Electron Agric. 2017;139:224-230. doi:10.1016/j.compag.2017.05.026
59. Hu H, Pan L, Sun K, et al. Differentiation of deciduous-calyx and persistent-calyx pears using hyperspectral reflectance imaging and multivariate
analysis. Comput Electron Agric. 2017;137:150-156. doi:10.1016/j.compag.2017.04.002
60. Pietro D, Cefola M, Attolico G, Pace B, Francesco A. Non-destructive and contactless quality evaluation of table grapes by a computer vision system.
Comput Electron Agric. 2019;156(December 2018):558-564. doi:10.1016/j.compag.2018.12.019
61. Bazán K, Avila-george H, Member S. Classification of cape gooseberry fruit according to its level of ripeness using machine learning techniques and
different color spaces. IEEE Access. 2019;7:27389-27400. doi:10.1109/ACCESS.2019.2898223
62. El-bendary N, El Hariri E, Hassanien AE, Badr A. Using machine learning techniques for evaluating tomato ripeness. Expert Syst Appl.
2014;42(4):1892-1905. doi:10.1016/j.eswa.2014.09.057
63. Alonzo LMB, Chioson FB, Co HS, Bugtai NT, Baldovino RG. A machine learning approach for coconut sugar quality assessment and prediction. 2018
IEEE 10th Int. Conf Humanoid, Nanotechnology, Inf Technol Control Environ Manag; IEEE; 2018:1-4. doi:10.1109/HNICEM.2018.8666315
64. Arun MO, Nath A, Shyna A. Automated cashew kernel grading using machine vision. 2016 International Conference on Next Generation Intelligent Systems (ICNGIS); IEEE; 2016.
65. Zhang M, Li C, Yang F. Classification of foreign matter embedded inside cotton lint using short wave infrared (SWIR) hyperspectral transmittance
imaging. Comput Electron Agric. 2017;139:75-90. doi:10.1016/j.compag.2017.05.005
66. Vani A, Vinod DS. Automatic quality evaluation of fruits using probabilistic neural network approach. International Conference on Contemporary Computing and Informatics (IC3I); IEEE; 2014:308-311. doi:10.1109/IC3I.2014.7019807
67. Gazeli O, Bellou E, Stefas D, Couris S. Laser-based classification of olive oils assisted by machine learning. Food Chem. 2020;302(August 2019):125329.
doi:10.1016/j.foodchem.2019.125329
68. Maione C, Lemos B, Dobal A, Barbosa F, Melgaço R. Classification of geographic origin of rice by data mining and inductively coupled plasma mass
spectrometry. Comput Electron Agric. 2016;121:101-107. doi:10.1016/j.compag.2015.11.009
69. Borchers MR, Chang YM, Proudfoot KL, Wadsworth BA, Stone AE, Bewley JM. Machine-learning-based calving prediction from activity, lying, and
ruminating behaviors in dairy cattle. J. Dairy Sci. 2017;100(7):5664-5674. doi:10.3168/jds.2016-11526
70. Pegorini V, Karam LZ, Pitta CSR, et al. In vivo pattern classification of ingestive behavior in ruminants using fbg sensors and machine learning. Sensors.
2015;15(11):28456-28471. doi:10.3390/s151128456
71. Warner D, Vasseur E, Lefebvre DM, Lacroix R. A machine learning based decision aid for lameness in dairy herds using farm-based records. Comput
Electron Agric. 2020;169(January):1-7.
72. Okinda C, Lu M, Liu L, et al. A machine vision system for early detection and prediction of sick birds: a broiler chicken model. Biosyst Eng.
2019;188:229-242. doi:10.1016/j.biosystemseng.2019.09.015
73. Dutta R, Smith D, Rawnsley R, et al. Dynamic cattle behavioural classification using supervised ensemble classifiers. Comput Electron Agric.
2015;111:18-28. doi:10.1016/j.compag.2014.12.002
74. Noor A, Ai MS, Zhao Y, Koubaa A. Automated sheep facial expression classification using deep transfer learning. Comput Electron Agric.
2020;175(April):105528. doi:10.1016/j.compag.2020.105528
75. Shahinfar S, Kahn L. Machine learning approaches for early prediction of adult wool growth and quality in Australian merino sheep. Comput Electron
Agric. 2018;148(February):72-81. doi:10.1016/j.compag.2018.03.001
76. Alonso J, Villa A, Bahamonde A. Improved estimation of bovine weight trajectories using support vector machine classification. Comput Electron Agric.
2015;110:36-41. doi:10.1016/j.compag.2014.10.001
77. Jwade SA, Guzzomi A, Mian A. On farm automatic sheep breed classification using deep learning. Comput Electron Agric. 2019;167(October):105055.
doi:10.1016/j.compag.2019.105055
78. Rodero E, González A, Dorado-moreno M, Luque M, Hervás C. Classification of goat genetic resources using morphological traits. Comparison of
machine learning techniques with linear discriminant analysis. Livest Sci. 2015;180:14-21. doi:10.1016/j.livsci.2015.06.028
79. Acu E. Recognition of pollen-bearing bees from video using convolutional neural network. 2018 IEEE Winter Conference on Applications of Computer Vision
(WACV); IEEE; 2018. doi:10.1109/WACV.2018.00041
80. Hansen MF, Smith ML, Smith LN, et al. Towards on-farm pig face recognition using convolutional neural networks. Comput Ind. 2018;98(August):145-152. doi:10.1016/j.compind.2018.02.016
81. Qiao Y, Su D, Clark C, Lomax S, Clark C. Individual cattle identification using a Deep Learning Based Framework. IFAC PapersOnLine.
2019;52(30):318-323. doi:10.1016/j.ifacol.2019.12.558
82. Uzal LC, Grinblat GL, Namías R, et al. Seed-per-pod estimation for plant breeding using deep learning. Comput Electron Agric. 2018;150(December
2017):196-204. doi:10.1016/j.compag.2018.04.024
83. González-camacho JM, Crossa J, Pérez-rodríguez P, Ornella L, Gianola D. Genome-enabled prediction using probabilistic neural network classifiers.
BMC Genomics. 2016;17:1-16. doi:10.1186/s12864-016-2553-1
84. Yalcin H. Plant phenology recognition using deep learning: deep-pheno. 2017 6th International Conference on Agro-Geoinformatics; IEEE; 2017.
85. Kaya A, Seydi A, Catal C, Yalin H, Temucin H. Analysis of transfer learning for deep neural network based plant classification models. Comput Electron
Agric. 2019;158(January):20-29. doi:10.1016/j.compag.2019.01.041
86. Techniques CV. Automatic classification of chickpea varieties using computer vision techniques. Agronomy. 2019;9(11):1-12.
87. Grinblat GL, Uzal LC, Larese MG, Granitto PM. Deep learning for plant identification using vein morphological patterns. Comput Electron Agric.
2016;127:418-424. doi:10.1016/j.compag.2016.07.003
88. Pacifico LDS, Britto LFS, Oliveira EG, Ludermir TB. Automatic classification of medicinal plant species based on color and texture features. 2019 8th
Brazilian Conf. Intell. Syst; IEEE; 2019:741-746. doi:10.1109/BRACIS.2019.00133
89. Zheng Y-Y, Kong J-L, Jin X-B, Wang X-Y, Su T-L, Zuo M. CropDeep: the crop vision dataset for deep-learning-based classification and detection in
precision agriculture. Sensors. 2019;19:1-21. doi:10.3390/s19051058
90. Nkemelu D, Omeiza D, Lubalo N. Deep convolutional neural network for plant seedlings classification. arXiv:1811.08404. 2018;18(November):1-5.
91. Abdi A. Three Types of Machine Learning Algorithms; ResearchGate; 2016. doi:10.13140/RG.2.2.26209.10088
92. Alpaydın E. Introduction to Machine Learning. 2nd ed. The MIT Press Cambridge; 2010.
93. Sutton RS, Barto AG, Bach F. Reinforcement Learning: An Introduction. 2nd ed. MIT Press; 2017.
94. Brownlee J. Master Machine Learning Algorithms. 5th ed.; Jason Brownlee; 2016.
95. Cho HA, Golberg MA. Introduction to Regression Analysis. Southampton, Boston: WIT Press; 2010.
96. Rehman TU, Mahmud MS, Chang YK, Jin J, Shin J. Current and future applications of statistical machine learning algorithms for agricultural machine
vision systems. Comput Electron Agric. 2019;156(March 2018):585-605. doi:10.1016/j.compag.2018.12.006
97. Sharma S, Ahmed S, Naseem M, Alnumay WS, Singh S, Cho GH. A survey on applications of artificial intelligence for pre-parametric project cost and soil
shear-strength estimation in construction and geotechnical engineering. Sensors. 2021;21:463. doi:10.3390/s21020463
98. Oladipupo T. Types of machine learning algorithms. New Advances in Machine Learning. IntechOpen; 2010;19-49. doi:10.5772/9385
99. A Tour of Machine Learning Algorithms. https://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/. Accessed November 24, 2019.
100. Sharma R, Kamble S, Gunasekaran A, Kumar V. A systematic literature review on machine learning applications for sustainable agriculture supply chain
performance. Comput Oper Res. 2020;119:1-17. doi:10.1016/j.cor.2020.104926
101. Jones MT. Deep Learning Architectures. IBM; 2017.
How to cite this article: Mahmood A, Tiwari AK, Singh SK, Udmale SS. Contemporary machine learning applications in agriculture: Quo Vadis? Concurrency Computat Pract Exper. 2022;e6940. doi:10.1002/cpe.6940
APPENDIX
In Tables A1–A10, a summary of examined literature work belonging to each agricultural subcategory is presented.
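The Outcomes columns of the appendix tables report error and fit measures such as RMSE, MAE, R2, and MAPE. As a reading aid, the following minimal Python sketch shows the textbook definitions of these measures. The function names are illustrative assumptions, not taken from any cited study, and individual studies may compute or scale the measures differently (for example, reporting R2 as a percentage).

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (observed values must be nonzero)."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n


# Toy values chosen only to illustrate the calls.
observed = [2.0, 4.0]
predicted = [1.0, 5.0]
print(rmse(observed, predicted), mae(observed, predicted),
      r2(observed, predicted), mape(observed, predicted))
```

Lower RMSE, MAE, and MAPE indicate a closer fit, whereas R2 approaches 1 for a model that explains most of the variance in the observations.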
TABLE A1 Soil observation
S. No Property Problem description Data used Models: Algorithms Outcomes Limitations/Future scope References
1. Property: SOM, CEC, Mg, K, and pH
   Problem description: Soil properties prediction based on remotely sensed imaged data and ML.
   Data used: 200 soil samples at depth of 18 cm and remotely sensed image data were captured from 7 bare fields of Madison County, Ohio, USA.
   Models: Algorithms: SVM, ANN: RPROP and EL: SGB
   Outcomes: Best accuracy achieved by models for different soil properties are: SVM = K (R2 = 0.21 and RMSE = 0.49); SVM = Mg (R2 = 0.22, RMSE = 4.57); ANN = SOM (R2 = 0.64, RMSE = 0.44); ANN = CEC (R2 = 0.67, RMSE = 2.35); SGB = pH (R2 = 0.15, RMSE = 0.62)
   Limitations/Future scope: Future work suggests testing the performance of other ML algorithms.
   Reference: 11
2. Property: AK, AP, and TN
   Problem description: Hyperspectral remote sensing-based forecast of soil nutrients with ML.
   Data used: A total of 1297 soil samples at depths of 0 to 20 cm were collected, and hyperspectral images consisting of 115 bands were procured from the Chinese satellite.
   Models: Algorithms: SVM, Regression: SLR, EL: RF, and ANN: BPNN
   Outcomes: Best accuracy achieved with BPNN combined with Ordinary Kriging (OK): AK: RMSE = 49.67; R2 = 70.55%. AP: RMSE = 29.62; R2 = 69.30%. TN: RMSE = 0.292; R2 = 68.51%
   Limitations/Future scope: It is significant to predict soil nutrients with environmental factors in the future.
   Reference: 12
3. Property: TN, MC, and OC
   Problem description: VIS–NIR spectroscopy-based forecast of soil properties (i.e., TN, MC, and OC) using ML.
   Data used: A dataset composed of 140 soil samples taken at depth of 0 to 20 cm of an arable field in Premslin, Germany; and for the spectral measurement, AgroSpec VIS–NIR spectroscopy was used.
   Models: Algorithms: Regression: Cubist and SVM: LS-SVM
   Outcomes: Simulation outcome for the Cubist method: RMSEP(TN) = 0.071%; RPD(TN) = 1.96%. Simulation outcome for LS-SVM method: RMSEP(MC) = 0.457%; RPD(MC) = 2.24%; RMSEP(OC) = 0.062%; RPD(OC) = 2.20%
   Limitations/Future scope: Using the same procedure, a further validation process is required for other types of soil.
   Reference: 13
4. Property: Temperature
   Problem description: Accurate estimation of daily soil temperature for six different depths in two different Iranian regions.
   Data used: The Iranian Meteorological Organization (IMO) acquired and provided the dataset from 1996 to 2005 and 1998 to 2004 for the Bandar Abbas region and Kerman region respectively.
   Models: Algorithms: ANN: SaE-ELM
   Outcomes: Performance for Bandar Abbas station: RMSE = 1.0958 °C to 1.9029 °C; R = 0.908 to 0.989. Performance for Kerman station: RMSE = 2.0017 °C to 2.9018 °C; R = 0.874 to 0.983
   Limitations/Future scope: Future studies will try to merge more service-oriented architecture with soft computing to achieve maximized results.
   Reference: 14
5. Property: Moisture
   Problem description: Winter wheat soil moisture prediction based on 3 ML models.
   Data used: Data for 15 predictors were considered for weather, terrain, and soil properties for the winter wheat-producing area in Baoji, China.
   Models: Algorithms: SVM, EL: RF and ANN: BPNN
   Outcomes: Prediction accuracies of models for 0 to 20 cm and 20 to 40 cm soil layers: SVM: 0–20 cm = 92.89%; SVM: 20–40 cm = 92.65%; RF: 0–20 cm = 87.63%; RF: 20–40 cm = 87.84%; BPNN: 0–20 cm = 80.57%
   Limitations/Future scope: To enhance the model’s accuracy, environmental and organism factors will be added and amalgamated with the remotely sensed data.
   Reference: 15
S. No Property Problem description Data used Models: Algorithms Outcomes Limitations/Future scope References
1. Property: Kc and ET
   Problem description: Prediction of crop coefficient (Kc) and ET under different irrigation and climate condition for the two crops.
   Data used: Kc and ET data for drip and sub-irrigated watermelon and pepper were collected for 10 seasons based on a lysimeter installed at the University of Florida. Meteorological data were collected from the UF/IFAS Florida Automated Weather Network.
   Models: Algorithms: SVM
   Outcomes: Model performance for both the crops: SVM (pepper): MAE = 0.026, MSE = 0.034 and R2 = 0.71; SVM (watermelon): MAE = 0.213, MSE = 0.116 and R2 = 0.82
   Limitations/Future scope: A similar model can be utilized for open field crops in the future.
   Reference: 16
2. Property: ET
   Problem description: Accurate estimation of weekly ET for two arid regions in India.
   Data used: Weekly weather data (i.e., max-min air temperature, max-min relative humidity, solar radiation, and wind speed) of 40 years (1970–2010) was obtained for both the weather stations.
   Models: Algorithms: ANN: ELM
   Outcomes: Model performance for both stations: RMSE (Jodhpur) = 0.43 mm d−1; RMSE (Pali) = 0.33 mm d−1
   Limitations/Future scope: Further studies to evaluate the model for sparse weather stations can be undertaken by using the ET value of one station along with temperature data of sparse weather stations.
   Reference: 17
3. Property: ET
   Problem description: Monthly estimation of mean reference ET for 44 stations in Iran.
   Data used: Climatic data (i.e., minimum, maximum, and mean temperature, wind speed, relative humidity, and solar radiation) for 44 stations from 1951 to 2010 was accumulated from the Islamic Republic of Iran Meteorological Organization (IRIMO).
   Models: Algorithms: Regression: MARS, SVM: SVM-Poly, SVM-RBF, and GEP
   Outcomes: Model performance based on meteorological parameters: MARS: RMSE (0.07); MAE (0.05) and R2 (0.999). SVM-Poly: RMSE (0.51); MAE (0.41) and R2 (0.948). SVM-RBF: RMSE (0.36); MAE (0.28) and R2 (0.978). GEP: RMSE (0.69); MAE (0.53) and R2 (0.899)
   Limitations/Future scope: N/A
   Reference: 18
4. Property: ET
   Problem description: Actual ET modeling in MFR and CK maize croplands with Artificial Intelligence approaches.
   Data used: Meteorological data (including air temperature (maximum, minimum and mean), relative humidity (maximum, minimum and mean), solar radiation, precipitation, and wind speed) were collected for three maize growing seasons in Shanxi Province, China. Also, plant heights, leaf length, and widths during 1 or 2-week intervals were measured for 5 randomly selected maize plants.
   Models: Algorithms: SVM and ANN: GANN
   Outcomes: Model performance (RMSE) for ET estimation under MFR: SVM1 = 0.236, SVM2 = 0.534, GANN1 = 0.215 and GANN2 = 0.421. Model performance for ET estimation under CK: SVM1 = 0.281, SVM2 = 0.536, GANN1 = 0.269 and GANN2 = 0.469
   Limitations/Future scope: N/A
   Reference: 19
5. Property: ET
   Problem description: To understand the applicability of ANN models for estimation of ET in Sichuan basin, China.
   Data used: Data for daily meteorological properties including air temperature (max. and min.), sunshine duration, mean relative humidity, and wind speed, for 6 meteorological station houses in Sichuan Basin, China during 1961–2014 were procured from the China Meteorological Administration.
   Models: Algorithms: ANN: ELM and GRNN
   Outcomes: Model performance in both scenarios: Scenario-1: ELM (RRMSE = 0.198, MAE = 0.267 and NS = 0.89). Scenario-2: GRNN (RRMSE = 0.198, MAE = 0.267 and NS = 0.89)
   Limitations/Future scope: N/A
   Reference: 20
6. Property: Daily dew point temperature
   Problem description: Daily dew point temperature prediction based on the ELM model for two Iranian meteorological stations: Bandar Abass and Tabass.
   Data used: The data sets consist of daily average water vapor pressure, average air temperature, atmospheric pressure, relative humidity, and global solar radiation; provided by the Iranian Meteorological Organization (IMO) for two stations during 1998–2004.
   Models: Algorithms: ANN: ELM
   Outcomes: ELM model performance for both the stations: Bandar Abass station: MARE = 0.5203 °C; RMSE = 0.6709 °C; and R = 0.9877. Tabass station: MARE = 0.3240 °C; RMSE = 0.5662 °C; and R = 0.9933
   Limitations/Future scope: N/A
   Reference: 21
TABLE A3 Yield prediction
S. No Crop name Problem description Data used Models: algorithms Outcomes Limitations/Future scope References
1. Crop name: Corn
   Problem description: Investigation of key vegetation indices to foresee corn yield using neural network technique.
   Data used: Quadrant aerial images of Oakes irrigation area in the USA were acquired. Total 287 and 75 grid plots of information from different quadrants were used for training and testing purposes respectively.
   Models: algorithms: ANN: BPNN
   Outcomes: Average corn yield prediction accuracies in the years 1998, 1999, and 2001 for all the models are: GVI = 24.26% to 94.85%; NDVI = 19.36% to 95.04%; SAVI = 19.24% to 95.04%; PVI = 83.50% to 96.04%; Transformed PVI = 90% to 98%
   Limitations/Future scope: Varying crop factors such as temperature, nitrogen value, ground elevation, water content, disease, and so forth, were not utilized in the image information used for the developed model.
   Reference: 22
2. Crop name: Paddy
   Problem description: Predicting crop yield through ML and statistical model-based hybrid approach, and its performance comparison with MLR, conventional ANN, and other ML algorithms.
   Data used: Past thirty years of agricultural production data and weather data for the paddy crop were gathered from the Statistical, Meteorological, and Agricultural Department. A total of 745 instances with 16 different features were monitored during the year.
   Models: algorithms: Regression: MLR and ANN: FFBN, SVM: SVR, IBL: KNN, EL: RF
   Outcomes: The RMSE, MAE and R value for different algorithms are: MLR = 0.098, 0.069, 0.89; ANN = 0.098, 0.064, 0.92; SVR = 0.099, 0.065, 0.92; KNN = 0.127, 0.089, 0.87; RF = 0.085, 0.055, 0.93; Hybrid MLR-ANN = 0.051, 0.041, 0.99
   Limitations/Future scope: N/A
   Reference: 23
3. Crop name: Tomato
   Problem description: Accuracy analysis of agricultural data based on multiple network models with varying hidden layers and neurons.
   Data used: A total of 9953 records were collected between 1997 and 2014 from the Agriculture and Food Agency of the Council of Agriculture in Taiwan.
   Models: algorithms: DL: ENN, ANN: BPNN and Regression: MRA
   Outcomes: The average error rates of all the models during five runs are: ENN = 3.65%; BPN = 6.03%; MRA = 12.40%
   Limitations/Future scope: N/A
   Reference: 24
4. Crop name: Boro rice, Aman rice, Aus rice, wheat, jute, and potato
   Problem description: Output prediction of 6 major crops of Bangladesh utilizing supervised ML techniques.
   Data used: Past 12 years of data were collected for 6 different crops of 10 regions from the annual publication on agricultural statistics and Bangladesh Agricultural Research Council.
   Models: algorithms: DT: ID3 and IBL: KNNR
   Outcomes: Average MSE rates for all the crops in the different regions using DTL-ID3 and KNNR are: Min. ER(DTL-ID3) = 3.77%; Max. ER(DTL-ID3) = 7.71%; Min. ER(KNNR) = 7.94%; Max. ER(KNNR) = 13.6%
   Limitations/Future scope: The research used a limited dataset; to acquire better precision for crop prediction, more data will be included and tested over other ML algorithms in the future.
   Reference: 25
5. Crop name: Soybean and maize
   Problem description: ML-based system for pre-season forecasting of crops through satellite-derived precipitation, temperature, and soil properties.
   Data used: Data for monthly precipitation, air temperature, and soil properties are obtained from the CHIRPS dataset, ECMWF, and SoilGrids.org respectively. Also, actual yield data for maize and soybean crops at the country level, i.e., USA and Brazil, were collected from the “United States Department of Agriculture” and “Brazil Statistics and Geography Bureau” respectively.
   Models: algorithms: DL: RNN
   Outcomes: The percentage error in yield prediction for Brazil (soybean), US (soybean), and US (maize) using MAPE metric are 10.7%, 9.8% and 11.31% respectively, and through RMSPE metric are 14.31%, 12.85% and 15.28% respectively.
   Limitations/Future scope: The proposed system can be employed to foresee the precision agriculture services utilizing conventional NDVI and identical datasets.
   Reference: 26
6. Crop name: Grass
   Problem description: Biomass estimation of agricultural pastures grassland in Ireland with the help of an ML approach utilizing multi-temporal remote sensing data.
   Data used: Remotely sensed data and field data were obtained for the two sites (i.e., Moorepark and Grange) of Ireland for 12 and 6 years respectively. Time-series data (46 images/year) were downloaded from NASA Land Process Distributed Active Archive Center (https://lpdaac.usgs.gov/lpdaac/get_data/glovis) and grassland weekly biomass values were analyzed for field data.
   Models: algorithms: Regression: MLR, ANN: FFBPN and ANFIS
   Outcomes: Evaluation results for different models: MLR (R2 Moorepark = 0.29, RMSE Moorepark = 25.08; R2 Grange = 0.38, RMSE Grange = 24.02). FFBPN (R2 Moorepark = 0.63, RMSE Moorepark = 18.05; R2 Grange = 0.59, RMSE Grange = 20.43). ANFIS (R2 Moorepark = 0.85, RMSE Moorepark = 11.07; R2 Grange = 0.76, RMSE Grange = 15.35)
   Limitations/Future scope: To achieve better accuracy ANFIS requires a large amount of data, and in the future it is required to understand the effect of vegetation index value and saturation of the satellite signal because these two factors might be a reason for the model data underestimating the actual biomass peak.
   Reference: 27
7. Crop name: Wheat
   Problem description: Applying advanced sensing techniques for accessing soil and crop properties to foresee wheat yield through three Self Organizing Map models.
   Data used: Soil parameters were captured through an online visible and NIR spectroscopy sensor. For crop parameters, UK-DMC-2 satellite data were used for calculating NDVI. Yield data were collected using a yield sensor equipped with a New Holland CX8070 harvester.
   Models: algorithms: ANN: SKN, XY-F, and CP-ANN
   Outcomes: The average overall prediction accuracy of wheat yield through three different algorithms are: SKN = 81.65%; CP-ANN = 78.3%; XY-F = 80.92%
   Limitations/Future scope: The inability of the developed system to represent continuous output relation is the only limitation, which can be overcome through smooth interpolation kernels.
   Reference: 28
8. Crop name: Coffee
   Problem description: Identification and classification of harvestable and non-harvestable fruits over coffee branch using an MV framework and ML algorithms.
   Data used: A database that incorporates 1018 coffee branches with 4300 images was acquired through the Samsung Galaxy S5 (SM-G900M) mobile device.
   Models: algorithms: SVM, BM: NB and IBL: KNN
   Outcomes: Overall success rate for classifying harvestable (semi-ripe, ripe and overripe) fruits and non-harvestable (unripe) fruits with different classifiers are: SVM = 88.02%; Bayes = 83.57%; KNN = 83.09%
   Limitations/Future scope: In the future, the fruit detection algorithm should be improved and a mobile app should be developed for detecting the quality of acquired images.
   Reference: 29
9. Crop name: Corn
   Problem description: Integration of ML techniques with distantly sensed data for predicting soil attributes and corn yield.
   Data used: 200 samples from seven uncovered fields were collected for soil data, yield data for corn were collected for only one field and multispectral images were captured through onboard aircraft digital camera (Leica ADS80).
   Models: algorithms: SVM: SVM with RBF and LKF, Regression: Cubist and EL: GBM, RF, and ANN.
   Outcomes: Model performance (i.e., RMSE and R2) for crop yield prediction: SVMR = 0.45 & 1.05; SVML = 0.33 & 1.16; Cubist = 0.52 & 0.98; GBM = 0.41 & 1.08; RF = 0.53 & 0.97; ANN = 0.37 & 1.12
   Limitations/Future scope: The performance of other ML methods (i.e., KNN, CNN, RNN, etc.) will be examined to improve the yield prediction in future work.
   Reference: 11
10. Rice A novel end to end model that The data sample contains 8 types of time ANN: BPNN and DL: Model performance (i.e., MAE and Future work will be to 30
integrates two BPNNs with series variables (metrological data) and RNN RMSE) for yield prediction when improve the model
an independent RNN to winter and summer rice produce data of the network layer reaches 6: performance by focusing
predict winter and summer 81 counties over the three years(i.e., Summer rice yield = 0.0044 & on other key features
winter rice produce of 81 2015, 2016, and 2017) 0.0057 which influence the yield
counties in the region of Winter rice yield = 0.0074 & and increasing the size of
models.
2. Crop name: Silybum marianum (milk thistle)
   Problem description: Identification of Microbotryum Silybum disease in Silybum marianum plants, through field spectroscopic analysis and hierarchical-SOM.
   Data used: Overall 207 spectra were acquired consisting of 103 and 104 spectra for healthy and infected plants respectively.
   Models: Algorithms: ANN: SKN, XY-F, and CP-ANN
   Outcomes: SKN & CP-ANN = 90% accuracy; XY-F = 95.16% accuracy
   Limitations/Future scope: The method can be utilized as a tool for applications such as fast evaluation of biological control or automated pest control practices.
   Reference: 32
3. Crop name: Salad leaf
   Problem description: Detection of infected salad leaves from the healthy leaves employing ML and hyperspectral sensing technique.
   Data used: 105 spectral datasets were acquired using a handy ASD-FieldSpec4 Spectroradiometer.
   Models: Algorithms: DR: LDA
   Outcomes: Accuracy = 84%; F1-Score = 79%; Specificity = 75%; Sensitivity = 87%
   Limitations/Future scope: The further experimental study will include gathering varieties of salad leaves and their correlated diseases to test the effectiveness of the developed model.
   Reference: 33
4. Crop name: Winter wheat
   Problem description: Classification of hyperspectral imaging data to determine whether the crop is healthy, nitrogen stressed, or yellow rust infected.
   Data used: A total of 12,120 spectra were randomly selected from six field plots using a hyperspectral imaging system.
   Models: Algorithms: ANN: SKN, XY-F, and CP-ANN
   Outcomes: Accuracy (%) achieved in identifying healthy, nitrogen stressed and yellow rust affected winter wheat are: SKN = 79.79, 100 & 95.22; CP-ANN = 96.36, 99.63 & 94.32; XYF = 97.27, 99.63 & 99.83
   Limitations/Future scope: N/A
   Reference: 34
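5. Crop name: Strawberry
   Problem description: Identification of different strawberry parasites and their classification to detect harmful thrips pests.
   Data used: 100 image samples
   Models: Algorithms: SVM
   Outcomes: Detection of thrips with an MPE of less than 2.5%
   Limitations/Future scope: N/A
   Reference: 35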
6. Crop name: Jujube tree
   Problem description: Digital image processing-based identification of six common jujube tree diseases using ANN.
   Data used: 50 image samples for each type of jujube tree disease were captured through a Canon camera.
   Models: Algorithms: ANN: MLP
   Outcomes: Identification accuracies of the MLP model for different diseases are: Jujube rust = 91%; Jujube anthracnose = 89%; Jujube white rot = 94%; Jujube fruit rust disease = 84%; Ascochyta spot of jujube = 73%; Jujube witches broom = 81%
   Limitations/Future scope: Future research will include more image samples of jujube leaf diseases.
   Reference: 36
7. Crop name: Banana leaf
   Problem description: Identification and classification of banana leaf diseases (viz. black Sigatoka and black speckle) and healthy leaf.
   Data used: 3700 image samples
   Models: Algorithms: DL: LeNet
   Outcomes: The accuracy of color images for different training and testing samples ranges from 92.88% to 98.61%. The accuracy of grayscale images for different training and testing samples ranges from 85.94% to 94.44%
   Limitations/Future scope: In the future, more plant diseases are to be tested using the developed model, and automatic stiffness estimation of the infected disease will also be targeted.
   Reference: 37
8. Crop name: Citrus fruits
   Problem description: Rottenness detection in citrus fruits using ML.
   Data used: A total of 120 samples consisting of 30 healthy samples, 30 thrips defect images, and 30 samples each inoculated with two types of spore defect were gathered.
   Models: Algorithms: ANN: MLP and DT: CART
   Outcomes: Overall accuracy: MLP = 98.30%; CART = 93.71%
   Limitations/Future scope: N/A
   Reference: 38
9. Crop name: Plant leaves
   Problem description: New dataset collection of plant images and classification of diseases.
   Data used: 79,265 plant leaf images (both healthy & diseased)
   Models: Algorithms: DL: PDNet
   Outcomes: Accuracy = 93.67%
   Limitations/Future scope: Future work will be the detection of disease phase and disease locations in the plants.
   Reference: 39
TABLE A4 (Continued)
S. No Crop Name Problem Description Data Used Models: Algorithms Outcomes Limitations/Future Scope References
10. Crop name: Potato
    Problem description: Classification of healthy and two significant diseases of potato (late blight and early blight) using image processing and ML approach.
    Data used: 300 potato leaves from the “plant village” dataset
    Models: Algorithms: SVM: Multiclass-SVM
    Outcomes: Accuracy = 95%
    Limitations/Future scope: Integrate a larger number of diseases of various species into the system and automatically evaluate the severity of diseases.
    Reference: 40
11. Crop name: Pomegranate
    Problem description: Diagnosis of four pomegranate plant diseases namely, bacterial blight, leaf spot, fruit rot, and fruit spot, using image processing and ANN techniques.
    Data used: 500 images of fruits and leaves
    Models: Algorithms: Clustering: K-mean and ANN: BPN
    Outcomes: ANN accuracy = 90%
    Limitations/Future scope: N/A
    Reference: 41
12. Crop name: Tea leaf
    Problem description: A developed image processing system for spotting two important diseases of tea leaf namely brown blight and algal disease.
    Data used: 300 tea leaves
    Models: Algorithms: SVM
    Outcomes: Developing a system using SVM with fewer features gives 93% accuracy compared with a neural network which gives 91% accuracy.
    Limitations/Future scope: To improve the segmentation process and use different classification methods to classify the diseases.
    Reference: 42
13. Crop name: Leaves of 25 different crops
    Problem description: Detection of plant disease using a Deep Learning algorithm for the plant images which comprises 25 plant species.
    Data used: 87,848 images of leaves taken from the open database.
    Models: Algorithms: DL: VGG
    Outcomes: Success rate = 99.53%
    Limitations/Future scope: In the future, the method would be tested over the expanded database by adding more plant species samples. Also, the testing dataset was part of the training dataset, which is an issue that needs to be resolved.
    Reference: 43
14. Crop name: Rice seedling
    Problem description: Recognition of bakanae-infected and healthy seedling of rice at the duration of 3 weeks utilizing MV.
    Data used: 700 rice seedlings.
    Models: Algorithms: SVM
    Outcomes: Accuracy of discriminating between infected and healthy seedling = 87.9%; Positive Predictive Value (PPV) = 91.8%; Negative Predictive Value (NPV) = 89.7%
    Limitations/Future scope: The method can be used for determining rice resistance, susceptibility in malady inspection, and resistance breeding programs.
    Reference: 44
15. Crop name: Maize
    Problem description: Recognition and discrimination of healthy and six different diseases of maize leaves.
    Data used: 12,227 images from different sources.
    Models: Algorithms: DL: AlexNet
    Outcomes: Accuracy = 98.62%
    Limitations/Future scope: More types of diseases and pests related to maize crops will be recognized in the future.
    Reference: 45
16. Crop name: Tomato
    Problem description: Symptom visualization and classification of nine different tomato leaf diseases using the deep learning method.
    Data used: 14,828 images of tomato leaves taken from the plant village database.
    Models: Algorithms: DL: AlexNet, GoogleNet, SVM and EL: RF
    Outcomes: AlexNet accuracy = 98.66%; GoogleNet accuracy = 99.18%; SVM & RF accuracy = 95.47%
    Limitations/Future scope: In the future, the size of the DL model and the computation need to be reduced so that the proposed approach can be made applicable for handheld systems like mobiles.
    Reference: 46
17. Crop name: Plant leaves
    Problem description: Recognition of 13 diverse plant diseases from healthy leaves and distinguishing them from their surroundings.
    Data used: 30,880 augmented images were formed from the 4483 original images collected through an internet search.
    Models: Algorithms: DL: CaffeNet
    Outcomes: Accuracy for the separate classes ranges between 91% and 98%, and overall accuracy of the model is 96.30%.
    Limitations/Future scope: The future aim will be to develop a fully trained model for smart mobile devices, which can display recognized diseases in plants based on leaf images captured through a mobile camera.
    Reference: 47
18. Crop name: Potato
    Problem description: Detection of Blemishes (i.e., black dot, silver scurf, common scab, powdery scab, and skin spot) in
    Data used: White potatoes = 102 images; Red potatoes = 22 images; Total = 124 images
    Models: Algorithms: EL: AdaBoost Algorithm
    Outcomes: Success rate for white potatoes = 89.6%; Success rate for red potatoes = 89.5%
    Limitations/Future scope: Future research is directed towards the discrimination of blemish types.
    Reference: 48
S. No Problem Description Data Used Models: Algorithms Outcomes Limitations/Future Scope References
1. Problem description: Weed spotting from UAV images in the crop lines through deep learning with unsupervised data labeling.
   Data used: Acquired weed and crop images for the spinach field are 3626 and 4303 respectively. Similarly, acquired weed and crop images for the bean field are 673 and 4861 respectively.
   Models: Algorithms: SVM, EL: RF and DL: ResNet18
   Outcomes: Results for spinach field data using all features for supervised labeling (SL) and unsupervised labeling (UL) in AUC%: SVM SL = 93.35%, SVM UL = 90.70%; RF SL = 96.99%, RF UL = 95.16%; ResNet18 SL = 95.70% and ResNet18 UL = 94.34%. Results for bean field data using all features for supervised labeling (SL) and unsupervised labeling (UL) in AUC%: SVM SL = 60.60%, SVM UL = 59.51%; RF SL = 70.16%, RF UL = 65.40%; ResNet18 SL = 94.84% and ResNet18 UL = 88.73%
   Limitations/Future scope: Future work will utilize a multispectral image which helps in plant discrimination and improves the background segmentation.
   Reference: 49
2. Problem description: Detection of broad-leaf weeds in pasture grass with a novel vision-based system, and comparing the accuracy of weed and non-weed detection through traditional ML algorithms and deep learning algorithms.
   Data used: A total of 6087 image samples were collected and labeled, consisting of 2007 weed samples and 4080 grass samples.
   Models: Algorithms: SVM: Linear, Quadratic and Medium Gaussian SVM, EL, IBL: KNN, RM: LR, and DL: CNN
   Outcomes: Weed detection accuracy of: SVM-L = 86%; SVM-Q = 89.4%; SVM-G = 87.7%; KNN = 84.3%; EL = 85.8%; LR = 87.5%; CNN = 96.88%
   Limitations/Future scope: Weed species identification methods will be developed in the future for controlling weeds more effectively.
   Reference: 50
3. Problem description: The weed identification model consists of K-mean feature learning combining CNN to detect three weed types associated with soybean seedlings.
   Data used: Overall 820 image samples were captured through the Canon EOS 70D camera, for the soybean (210) and three different weed types (610).
   Models: Algorithms: Clustering: K-Mean and DL: CNN
   Outcomes: The average accuracy of weed identification using CNN = 91.07%. The average accuracy of weed identification combining K-means pre-training with CNN after weight optimization = 92.89%. The average accuracy of weed identification combining K-means pre-training with CNN before weight optimization = 63.15%
   Limitations/Future scope: N/A
   Reference: 51
4. Problem description: Automated computer vision system consisting of segmentation and decision-making procedure to identify “Avena sterilis” weed in cereal crops to reduce herbicides quantity to be sprayed.
   Data used: Collection of 86 digital pictures acquired from an experimental farm of the Spanish Research Council (SRC), Madrid in 2005.
   Models: Algorithms: SVM
   Outcomes: For a total of 86 images and 3096 cells, “Correct Classification Percentage” and “Yule Coefficient” for test-3 and step-3 are 85% and 80% respectively
   Limitations/Future scope: Varying illumination variance is a significant issue that needs to be analyzed in the future to determine the firmness of the proposed approach.
   Reference: 52
5. Problem description: Applying CNN to recognize and distinguish the weeds (i.e., grass and broadleaf) in soybean crop field images for precision herbicides spraying.
   Data used: Image samples consisting of soil (3249), soybean (7376), grass (3520), and broadleaf (1191) were captured through a DJI Phantom 3 drone at Campo Grande, Brazil.
   Models: Algorithms: SVM, DL: CaffeNet, EL: Adaboost and RF
   Outcomes: The weighted average precision value of different classifiers are: SVM = 0.98; CaffeNet = 0.99; Adaboost = 0.98; RF = 0.96
   Limitations/Future scope: Future work will address the evaluation of datasets taken from different locations and the height of acquisition.
   Reference: 53
TABLE A5 (Continued)
S. No Problem Description Data Used Models: Algorithms Outcomes Limitations/Future Scope References
6. Problem description: Detection of different weeds to perform precision herbicides spraying in bermudagrass through three DCNN models.
   Data used: For multiple weed species detection, 18,000 images with weeds (positive) and 18,000 images without weeds (negative) were used for training purposes. While for single weed (Poa Annua) 6000 positive and 6030 negative images were used for the training purpose.
   Models: Algorithms: DL: GoogLeNet, VGGNet, and DetectNet
   Outcomes: Validation and testing performance (F1 score) for multi weed species detection: GoogLeNet ≤ 0.8272; VGGNet ≥ 0.9542. Validation and testing performance (F1 score) for single weed (Poa Annua) detection: GoogLeNet = 0.6667; VGGNet = 0.9641; DetectNet = 0.9978
   Limitations/Future scope: To evaluate weed control with the presented DCNN models, research will be conducted in the future.
   Reference: 54
7. Problem description: Discrimination of weed, crop, and soil based on color indices to perform on-field spraying of herbicides with high precision.
   Data used: Total 11,816 image datasets (i.e., broadleaf, soybean, and soil) used were procured from Reference 53.
   Models: Algorithms: SVM and ANN: BPNN
   Outcomes: Correct broadleaf (weed) detection using BPNN and SVM: SVM precision = 82%; SVM recall = 77%; BPNN precision = 72%; BPNN recall = 83%
   Limitations/Future scope: Performance can be affected due to outdoor illumination changes in color indices. Hence further research is required under varying illumination conditions and using different crops.
   Reference: 55
8. Problem description: Hyperspectral sensing-based active learning system to recognize crop and weed species.
   Data used: Hyperspectral imaging apparatus was used to capture the images of maize crops and 10 different weed species. 110 spectra were utilized for each of the one-class classifiers for the training purpose.
   Models: Algorithms: SVM, BM: MOG, ANN: SOM and DL: Autoencoder
   Outcomes: The recognition accuracy of crop and weed species corresponding to the different classifiers: SVM crop = 29.63%; MOG crop = 100%; SOM crop = 100%; Autoencoder crop = 59.26%; SVM weeds = 12.96%–68.52%; MOG weeds = 31.48%–98.15%; SOM weeds = 53.70%–94.44%; Autoencoder weeds = 12.96%–83.33%
   Limitations/Future scope: Further tuning of algorithms is required to improve accuracy and execution time as well.
   Reference: 56
9. Problem description: Classification of chilli crops and weeds.
   Data used: 224 image samples were captured using an OLYMPUS FE4000 digital camera
   Models: Algorithms: SVM
   Outcomes: Classification accuracy = 97%
   Limitations/Future scope: Future tasks will include a more robust image pre-processing method for noises introduced through the working environment.
   Reference: 57
10. Problem description: Effective recognition of Silybum marianum weeds and other vegetation types using hierarchical SOM classifiers.
    Data used: UAS with a multispectral camera (green-red-NIR) were used to acquire the images, wherein a total of 860 pixels were used for the validation purpose comprising 441 of Silybum marianum and 419 of other vegetation types.
    Models: Algorithms: ANN: CP-ANN, SKN and XY-F
    Outcomes: Obtained classification accuracy for Silybum marianum and other vegetation: CP-ANN S. marianum = 98.9%; SKN S. marianum = 98.6%; XY-F S. marianum = 98.6%; CP-ANN other vegetation = 97.6%; SKN other vegetation = 98.1%; XY-F other vegetation = 98.6%
    Limitations/Future scope: N/A
    Reference: 58
TABLE A6 Crop quality evaluation
S. No. Crop name Problem description Data used Models: Algorithms Outcomes Limitations/Future scope References
1. Crop name: Pears
   Problem description: Discrimination of persistent calyx pears and deciduous calyx pears based on spectral wavelengths and circularity.
   Data used: 240 images of pear fruits were acquired through the hyperspectral system.
   Models: Algorithms: SVM
   Outcomes: PCF and DCF classification performance: PCF calibration = 97.8%; PCF prediction = 96.7%; DCF calibration = 96.7%; DCF prediction = 93.3%
   Limitations/Future scope: Model performance could be further improved by increasing the sample size and real situations of pears.
   Reference: 59
2. Crop name: Table Grapes
   Problem description: Image processing and ML-based non-destructive contactless quality assessment of table grapes.
   Data used: Data set composed of 400 images of two varieties of grapes (i.e., Italia and Victoria)
   Models: Algorithms: EL: RF
   Outcomes: Performance of three different classification tasks with 5 different quality levels (QL1, QL2, QL3, QL4, and QL5) are: (QL5 vs. QL4 vs. QL3 vs. QL2 vs. QL1) Italia = 0.74; ({QL5, QL4} vs. QL3 vs. {QL2, QL1}) Italia = 0.94; ({QL5, QL4} vs. {QL3, QL2, QL1}) Italia = 1.0; (QL5 vs. QL4 vs. QL3 vs. QL2 vs. QL1) Victoria = 0.71; ({QL5, QL4} vs. QL3 vs. {QL2, QL1}) Victoria = 0.83; ({QL5, QL4} vs. {QL3, QL2, QL1}) Victoria = 0.92
   Limitations/Future scope: Future work will correspond to learning the different performance parameters on different cultivars.
   Reference: 60
3. Crop name: Cape Gooseberry
   Problem description: Cape gooseberry fruits classification according to their ripeness level utilizing color spaces and ML techniques.
   Data used: A total of 925 fruits images were collected and labeled into seven classes according to the ripeness level.
   Models: Algorithms: SVM, DT, ANN: RBF and IBL: KNN
   Outcomes: Average model accuracy for color spaces (RGB, HSV, and L*a*b*): SVM = 92.57%, DT = 90.32%, ANN = 88.3% and KNN = 89.77%
   Limitations/Future scope: Color spaces information would be combined with other strategies for the quality assessment of fruits in future work.
   Reference: 61
4. Crop name: Tomato
   Problem description: Determination of crop ripeness class using ML.
   Data used: A total of 250 image samples for five different ripeness classes were collected.
   Models: Algorithms: SVM: SVM-RBF, SVM-LKF; DR: LDA
   Outcomes: Accuracy achieved by models are: SVM-LKF = 90.80%; SVM-RBF = 84.80%; SVM-Polynomial = 79.20%; LDA = 84%
   Limitations/Future scope: A small dataset is the limitation of this research and future research may include the classification of objects other than crops.
   Reference: 62
5. Crop name: Coconut
   Problem description: Assessment of coconut sugar quality through RGB values utilizing various ML algorithms.
   Data used: Overall 350 images of coconut were acquired, which were composed of superior quality (48), good quality (110), and rejected samples (192).
   Models: Algorithms: SVM, DT, ANN: MLP, SGD, IBL: KNN and EL: RF
   Outcomes: Accuracy (%) and execution time (sec) of various models: SVM accuracy = 96.66, execution time = 21.09; DT accuracy = 97.05, execution time = 27.27; ANN accuracy = 97.43, execution time = 28.52; SGD accuracy = 98.38, execution time = 26.65; KNN accuracy = 96.48, execution time = 19.35; RF accuracy = 96.47, execution time = 27.21
   Limitations/Future scope: The effectiveness of the SGD model for the increased data samples will be evaluated in future work.
   Reference: 63
6. Cashew Nuts External features-based Images of 5 quality cashew Regression, MCC, EL: RF, Classification Accuracy of different models: In the future technology will 64
automated grading of grades out of 26 were ANN: MLP and BPNN Regression = 90.59% be proposed to identify the
cashew nuts deploying acquired. 100 samples of MCC = 87.65% damaged cashews to
five ML classifiers. each grade were taken RF = 88.83% remove them from the
resulting in a total of 500 MLP = 89.41% grading process.
image samples. BPNN = 96%
TABLE A6 (Continued)
S. No. Crop name Problem description Data used Models:Algorithms Outcomes Limitations/Future scope References
7. Cotton Identification and A total of 450 spectral SVM and DR: LDA Foreign matter detection accuracy was 100% in The main limitation of the 65
classification of samples for the cotton lint cotton lint. Average classification accuracy for proposed work was the
different foreign and foreign matters were full spectra and chosen wavelength are 87.98% presence of foreign matter
matters within cotton gathered. and 95.5%, respectively. particles in the cotton
lint using a fleece which was larger
hyperspectral imaging than the actual count.
system. Online identification and
classification system will
be developed in the future
using the present
methodology.
8. Apple Fruits quality evaluation 45 image samples of ANN: PNN Model performance with 10-fold cross-validation: Further performance would 66
using PNN approach. damaged fruits and 20 PNN with 5 features = 86.52% be improved by enhancing
image samples of healthy PNN with 9 features = 88.33% the dataset size.
fruits were considered for
the experiment.
9. Olive oil Classification of olive oils For the study, eight free DR: LDA, SVM: SVC and Accuracy of different models: N/A 67
in accordance with acidity samples of olive oils EL: RF LDA = (99.2 ± 1.5)%
geographical genesis of different degrees were SVC = (95.0 ± 3.7)%
and degree of acidity considered from different RF = (90.2 ± 4.0)%
with the help of LIBS areas within Greece.
and ML algorithms.
10. Rice Classification of crop Dataset consists of just SVM, EL: RF and ANN: Accuracy of different models: N/A 68
samples based on their 21 samples of white rice of MLP SVM = 93.66%
chemical components Oryza sativa variety, out of RF = 93.83%
for the two producer which 12 samples taken MLP = 90%
states in Brazil. from Goias State and 19
from the Rio Grande do Sul
state.
TABLE A7 Livestock welfare
S. No. Livestock’s name Problem description Data used Models: algorithms Outcomes Limitations/Future scope References
1. Dairy Cattle Calving prediction in Data was collected EL: RF, DR: LDA, and Calving prediction utilizing combined data of HR Tag In future work focus will be 69
dairy cattle, for 53 dairy cattle ANN and IceQube for 14 days before calving: on analyzing calving
analyzing cattle of both RFSensitivity = 25%, RFSpecificity = 89% patterns in a short time
activities, and primiparous and LDASensitivity = 75%, LDASpecificity = 93.4% duration to help the
rumination multiparous types. ANNSensitivity = 100%, ANNSpecificity = 86.8% farmers in taking early
behavior. management decisions.
2. Grazing animals Classification of The training dataset DT: C4.5 Average success rate = 94% N/A 70
rumination pattern contains a total of
in grazing animals 1000 chewing
with FBG sensor instances for 5
and ML. different classes
(i.e., 200 for each
class).
3. Dairy herds Detection of Routine herd data Regression: GLM, DT: Dairy herds lameness level detection: Additional lameness-related 71
lameness in dairy and lameness CART, GLMSensitivity = 44%, GLMSpecificity = 25% information should be
livestock prevalence data for EL: GBM, XGB, and CARTSensitivity = 56%, CARTSpecificity = 89% gathered to improve the
according to 299 dairy herds RF GBMSensitivity = 58%, GBMSpecificity = 83% models’ performance and
farm-based records were compiled. XGBSensitivity = 56%, XGBSpecificity = 81% the lameness status of
and ML approach. RFSensitivity = 54%, RFSpecificity = 87% individual cows needs to be
investigated.
4. Chicken MV-based tracking 34,280 images of ANN, Regression: Health status classification accuracy of different It is required to check the 72
system for early broiler chicken Logit log models based on both shape descriptor and walk proposed system for
recognition and were captured for (Likelihood), SVM: speed of chicken: different chicken breeds
prediction of training (23996) Linear, Quadratic, ANN = 96.9% and diseases.
disease in broiler and testing (10284) Cubic, and RBF Logit = 80.8%
chicken. purposes. Linear-SVM = 86%
Quadratic-SVM = 96.5%
Cubic-SVM = 97.1%
RBF-SVM = 97.8%
5. Cow Classification of cow Experimental data EL: Bagging, Average classification accuracy for the models: The best trained behavioral 73
behavior patterns were compiled for AdaBoost, and Bagging with “tree” learner = 96% classifier model will be
using sensor-based 24 Random Subspace Bagging with “LDA” learner = 92% deployed to the sensor
collar system and Holstein-Friesian AdaBoost with “tree” learner = 89% collar system in future
ensemble cow AdaBoost with “LDA” learner = 91% work.
classifiers. Subspace with “KNN” learner = 90%
6. Sheep Classification of Sheep face dataset DL: VGG16, Testing error rate(%) of different models: Multiclass classification of 74
sheep pain facial contains 2350 face ResNet50, VGG16 = 0.0% sheep based on the pain
expression using samples of two DenseNet201, ResNet50 = 1.43% rating scale will be done in
CNN. classes (i.e., normal AlexNet, DenseNet201 = 2.57% the future.
sheep faces and GoogleNet, AlexNet = 3.71%
abnormal sheep DarkNet and GoogleNet = 3.71%
faces) Inceptionv3 DarkNet = 4.285%
Inceptionv3 = 6.29%
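Several entries in Table A7 report sensitivity and specificity rather than overall accuracy. As a reminder of how these two measures are derived from binary predictions, a minimal Python sketch follows. The function name and the example labels are hypothetical illustrations and are not taken from any of the cited studies.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Labels are assumed to be 1 for the positive class (e.g., "lame")
    and 0 for the negative class (e.g., "not lame").
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives

    # Sensitivity: share of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of actual negatives that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up example: 4 positive and 6 negative cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity = {sens:.2%}, specificity = {spec:.2%}")
```

A model can score high on one measure and low on the other, which is why the two values are reported side by side in the table.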
TABLE A8 Livestock production
S. No. Livestock’s name Problem description Data used Models: algorithms Outcomes Limitations/Future scope References
1. Sheep Prediction of wool quality The collected dataset EL: BG, DT: MT, RMSE value for predicting clean fleece weight, Future study is required to build an 75
and quantity in adult contains records for Regression: LM and greasy fleece weight, fiber diameter, staple “if-then” scenario in the existing
Australian sheep. 7294 sheep. ANN: FFNN. length, and staple strength in adult sheep for system through simulation to help
different models are: the farmers to achieve their
BG = 0.51, 0.62, 1.19, 8.74 & 8.52 performance goals.
MT = 0.53, 0.65, 1.24, 8.89 & 8.68
LM = 0.66, 0.92, 1.38, 9.49 & 8.92
NN = 0.82, 10.3, 1.79, 11.69 & 10.46
2. Bovine Estimation of bovine The dataset contains a SVM and Regression: Maximum MAPE(%) in predicting weights for the N/A 76
future growth, weight trajectory with LRi Angus bull, bull and cow:
exploiting weight time records for Angus SVMAngus Bull = 3.9 ± 3.0
trajectories along time. bulls (351), bulls (822), LRi Angus Bull = 8.3 ± 6.1
and beef cows (358). SVMBull = 5.3 ± 4.4
LRi Bull = 9.2 ± 9.6
SVMCow = 9.3 ± 6.7
LRi Cow = 42.5 ± 32.9
3. Sheep Automatic classification A total of 1642 images of DL: VGG16 Average accuracy = 95.8% Dataset enhancement will be done 77
of 4 sheep breeds 4 sheep breeds Standard deviation = 1.7 with more breed types and sheep
exploiting computer (Merino, Suffolk, ages in further study.
vision and ML. Whiteface Suffolk, and
Poll Dorset) were
acquired and manually
labeled.
4. Goat Classification of 12 goat Morphological traits and IBL: KNN and ANN: Correct classification rate using hierarchical (breed N/A 78
breeds utilizing 9 breeds dataset MLP and aptitude) concatenation of two models:
morphological traits composed of a total of KNN + MLP = 83.48 ± 1.69
and 3 aptitudes. 2406 female goats MLP + MLP = 83.31 ± 1.77
aged 2 to 5 years old. KNN + KNN = 89.18 ± 1.79
MLP + KNN = 89.89 ± 1.81
5. Bees Recognition of 354 pollen and 346 IBL: KNN, BM: NB, The average accuracy of baseline classifiers with To evaluate the system performance 79
pollen-carrying bees nonpollen images were SVM: SVM-RBF, PCA preprocessing over 3 feature map images: for better accuracy, the automatic
using computer vision extracted from the DL: VGG16,VGG19 KNN = 84.25%, NB = 77.53%, collection of large-scale datasets
and CNN. recorded video of and ResNet50 SVM = 81.67%, SVM-RBF = 84.68% can be utilized in future studies.
foraging bees. Accuracy of CNN models:
VGG16 = 87.2%, VGG19 = 90.2%
ResNet50 = 61.7%
6. Pig CNN-based imaging The experimental dataset DR: Fisherface, DL: Accuracy of models: Future work will be to investigate 80
system to recognize contains 1553 image VGG-Face and Fisherface = 78.4% the effect of uncertain dimensions
on-farm pig faces. samples for 10 pigs. CNN VGG-Face = 91.0% of pig aging, dirt, or tear staining
CNN = 96.7% appearances.
7. Cattle Recognition of individual For 41 beef cattle, a total of DL: CNN (Inception CNN + LSTM model accuracy for 15- and 20-frame The future work will be to verify the 81
beef cattle using the 516 videos and 10,320 v3) and LSTM video length = 88% and 91% current work over abundant and
DL framework. images were captured. CNN model accuracy for image samples = 57% complex video samples.
TABLE A9 Species breeding
S. No. Crop name Problem description Data used Models: algorithms Outcomes Limitations/Future scope References
1. Soybean Estimation of seed A total of 18,178 SVM and DL: Accuracies of models: The suggested method can be 82
count in soybean samples were CNN(VGG) SVM = 50.4% ± 1.45% pertinent to other species
pods for plant accumulated CNN = 86.2% ± 0.52% of data samples. Moreover,
breeding. during two seasons the authors are working to
for three classes of detect no. of seeds per pod
“seeds per pod” (2, for the field images.
3 & 4).
2. Maize and Wheat Prediction of crop Dataset consists of ANN: PNN and MLP Average (AUCpr) for maize dataset: N/A 83
individuals 16 and 17 Upper class
phenotypic class trait-environment MLP15% = 0.210; PNN15% = 0.270
based on genomic combinations for MLP30% = 0.390; PNN30% = 0.462
and phenotypic maize and wheat Middle class
data. respectively, with MLP40% = 0.450; PNN40% = 0.488
sample size ranges MLP70% = 0.714; PNN70% = 0.734
between 290–300 Lower class
individuals. MLP15% = 0.258; PNN15% = 0.355
MLP30% = 0.445; PNN30% = 0.528
Average (AUCpr) for wheat dataset:
Upper class
MLP15% = 0.276; PNN15% = 0.338
MLP30% = 0.461; PNN30% = 0.470
Lower class
MLP15% = 0.321; PNN15% = 0.429
MLP30% = 0.464; PNN30% = 0.553
3. Barley, corn, cotton, Recognition and Around 400 image DL: CNN(AlexNet) Accuracy of CNN for plants (in %): Own CNN architecture from 84
lentil, pepper and classification of samples were and BM: NB Barley = 77.15, Corn = 86.54, scratch will be built in
wheat plants phenological collected for each Cotton = 86.54, Lentil = 73.76, the future for plant
stages deploying a plant type, with Pepper = 87.14 and Wheat = 83.64 phenological stage
pre-trained CNN. almost 30 samples Accuracy of NB for plants (in %): classification.
from each stage Barley = 71.43, Corn = 81.86,
were considered. Cotton = 80.89, Lentil = 68.97,
Pepper = 82.41 and Wheat = 74.53
TABLE A10 Species recognition
S. No. Crop name Problem description Data used Models: algorithms Outcomes Limitations/Future scope References
1. Plant leaf Classification of 4 publicly available plant SVM, DR: LDA, DL: Best classification accuracy obtained The performance of the models 85
different plant image datasets were AlexNet and over different datasets with will be analyzed over different
species through accessed: VGG16 different models: agricultural applications
transfer learning of Plant Village = 54,306 FT-VGG16Plant Village = 99.8% shortly.
deep neural Flavia = 1907 DF-VGG16/LDAFlavia = 99.0%
network. Swedish Leaf = 1125 CNN-RNNSwedish = 98.8%
UCI Leaf = 443 DF-AlexNet/LDAUCI Leaf = 96.2%
2. Chickpeas Computer vision and A total of 1019 images of five ANN: BPNN Correct Classification Rate: Further work will ensure that 86
ANN-based Iranian chickpeas varieties BPNN = 99.3% classifier does not depend on
classification of five (Adel, Azad, Arman, ANN-PSO = 97.0% the conditions (i.e., brightness,
Iranian chickpeas Hashem, and Bevanij) were white balance of the camera,
varieties. gathered. etc.) under which images were
captured.
3. Legume DL recognition and Dataset consisting of 866 DL: CNN Accuracy (%) using CNN 5 layers: N/A 87
assortment of 3 leaf images, where 422, Soybean = 98.8 ± 0.2
legume species 272, and 172 images Red bean = 98.3 ± 0.3
(soybean, white correspond to soybean leaf, White bean = 90.2 ± 1.0
bean, and red bean) red bean leaf, and white
from leaf vein bean leaf respectively.
morphological
patterns.
4. Medicinal plants Color and Dataset for 15 medicinal DT, IBL: KNN, Classification accuracies of models: In the future, the authors 88
texture-based plant species corresponds Weighted KNN, EL: DT = 94.44% intend to expand the dataset
classification of to a sum of 287 images RF, and ANN: KNN3 = 66.39% and develop an application for
medicinal plant that were collected from MLP-BP Weighted-KNN3 = 89.56% the identification of medicinal
species. the field and websites. RF = 97.61% plant species.
MLP-BP = 97.735
5. Vegetables Classification and A novel dataset named DL: VGG16, VGG19, Classification accuracies of models: Further work will include more 89
detection of novel “CropDeep” possesses SqueezeNet, VGG16 = 98.56%, crop images and annotations
vegetable species 31,147 images and 49,765 InceptionV4, VGG19 = 98.84%, of new crop species to the
dataset annotated samples for 31 Densenet121, SqueezeNet = 93.03%, CropDeep and apply the
species of vegetables. ResNet18 and InceptionV4 = 96.89%, classification and detection
ResNet50 Densenet121 = 99.56%, framework to other
ResNet18 = 99.62% and agricultural applications.
ResNet50 = 99.81%
6. Plants seedling Classification of A dataset composed of 4275 SVM, IBL: KNN and Classification Accuracy of models: Future work will focus on 90
plants seedling at images of around 960 DL: CNNs SVM = 61.47% training models with the
distinct growth plants corresponding to 12 KNN = 56.84% enhanced dataset and testing
stages using CNN. species at different growth CNN = 92.60% the models with images having
levels. multiple plants.