Base Paper
Satellite imagery, big data, IoT and deep learning techniques for
wheat yield prediction in Morocco
Abdelouafi Boukhris *, Antari Jilali, Abderrahmane Sadiq
Laboratory of Computer Systems Engineering, Mathematics and Applications (ISIMA), Polydisciplinary Faculty of Taroudant, Ibnou Zohr University,
B.P. 8106, Morocco
A R T I C L E I N F O

Keywords: Satellite imagery; IoT; ArcGIS; Deep learning; RMSE; NoSQL Database

A B S T R A C T

In the domain of efficient resource management and nutritional consistency, accuracy in predicting crop yields is crucial.
The advancement of artificial intelligence techniques, combined with satellite imagery, has emerged as a potent approach for
forecasting crop yields. We used two types of data: spatial data and temporal data. Spatial data are gathered from satellite
imagery and processed using ArcGIS to extract data about crops based on several indices such as NDVI and NDWI. Temporal data
are gathered from agricultural sensors such as temperature, rainfall, precipitation and soil moisture sensors. In our case we
used the Sentinel-2 satellite to extract vegetation indices. We used IoT systems, specifically a Raspberry Pi B+, to collect
and process the data coming from the sensors. All collected data are then stored on a NoSQL server to be analysed and
processed. Several machine learning and deep learning algorithms were applied to the collected dataset for the crop
recommendation system, such as logistic regression, KNN, decision tree, support vector machine, LSTM, and Bi-LSTM. The GRU
deep learning model gave the best performance, with an RMSE of 0.00036 and an R2 of 0.99.
The main contribution of our paper is the development of a new system that can predict the yields of several crops, such as
wheat and maize, using IoT, satellite imagery for spatial data and sensors for temporal data. Ours is the first paper to
combine spatial data and temporal data to predict crop yield with deep learning algorithms, unlike other works that use only
remote sensing data or only temporal data. We created an E-monitoring crop yield prediction system that helps farmers track
all information about crops and shows the prediction results in a mobile application. This system helps farmers make more
efficient decisions to enhance crop production. The main production regions of wheat in Morocco are in the rainfed areas of
the plains and plateaus of Chaouia, Abda, Haouz, Tadla, Gharb and Saïs. We studied three main regions well known for wheat
production: Rabat-Salé, Fez-Meknes and Casablanca-Settat.

Abbreviations: RMSE, Root Mean Square Error; MAE, Mean Absolute Error; NDVI, Normalized Difference Vegetation Index; NDWI,
Normalized Difference Water Index; ArcGIS, Aeronautical Reconnaissance Coverage Geographic Information System; IoT, Internet
of Things; SQL, Structured Query Language; NoSQL, Not only SQL; KNN, K-nearest Neighbors; LSTM, Long short-term memory;
Bi-LSTM, Bidirectional long short-term memory; Bi-GRU, Bi-directional gated recurrent unit; GPS, Global Positioning System;
GIS, Geographic Information System; AI, Artificial intelligence; ML, Machine Learning; MODIS, Moderate Resolution Imaging
Spectroradiometer; ESA, European Space Agency; VI, Vegetation Indices; RF, Random Forest; NB, Naive Bayes; LR, Logistic
Regression; LAI, Leaf Area Index; VGG, Visual Geometry Group; DBN, Deep Belief Network; XGBoost, Extreme Gradient Boosting;
MVGF, Multi-view Gated Fusion; GRU, Gated Recurrent Unit; EVI, Enhanced Vegetation Index; NDMI, Normalized Difference Moisture
Index; NIR, Near-infrared Radiation; MIR, Mid-infrared Radiation; JSON, JavaScript Object Notation; MSE, Mean Squared Error.
* Corresponding author.
E-mail address: Abdelouafi.boukhris@edu.uiz.ac.ma (A. Boukhris).
https://doi.org/10.1016/j.rico.2024.100489
Received 29 August 2024; Received in revised form 15 October 2024; Accepted 24 October 2024
Available online 8 November 2024
2666-7207/© 2024 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/).
A. Boukhris et al. Results in Control and Optimization 17 (2024) 100489
1. Introduction
Efficient oversight of agricultural operations, food security, and sound decisions about resource allocation and market
prediction all rely on precise and timely predictions of crop yields. However, several obstacles prevent farmers
from achieving the full potential of high crop yields on their cultivated arable lands [1]. Obtaining early-stage crop growth data and
crop yield information is crucial for aligning agricultural production with the needs of both national and global food demands, thereby
ensuring food security [2]. The emergence of deep learning techniques combined with the widespread availability of satellite imagery
has brought about new opportunities for achieving more accurate and efficient estimates of agricultural yields. Understanding the
importance of large-scale yield estimation and comprehending the consequences of variability in agricultural growing conditions is
crucial [3,4], especially given the increased frequency of extreme climate events. Considering the limited availability of arable lands, a
significant portion of this increased demand will be met through the extensive use of remote sensing in precision agriculture [5]. This
will coincide with a simultaneous increase in the use of fertilizers, pesticides, water, and other inputs. Various methods are available
for predicting crop yields, ranging from field to regional scales, utilizing remote sensing techniques [6]. Sophisticated technologies
including remote sensing, global positioning systems (GPS), geographic information systems (GIS), the Internet of Things (IoT) [7], big
data analysis [8], as well as artificial intelligence (AI) and machine learning (ML) [9], serve as essential tools for improving agricultural
practices. Their usage leads to increased production while simultaneously reducing inputs and minimizing crop yield losses. These
revolutionary technologies provide farmers with timely information to identify spatial variability, such as soil variations, within farms
and large crop fields. This is crucial for addressing factors that negatively affect crop growth and yields.
Remote sensing is essential for assessing spatial differences in crop yield. Throughout the growing season, it serves as an effective
technology in precision agriculture for estimating crop status [10]. This is especially relevant for assessing the relationship between
spectral vegetation indices during crop growth and eventual crop yield. Estimating crop yields at the field level before harvesting
attracts significant attention from farmers, government agencies, traders, decision-makers, and policymakers. At the same time, early
predictions about yields play a crucial role in influencing decisions regarding the collection, processing, storage, transportation, as well
as the import and export of agricultural products [11].
Numerous earth observation systems have been developed, including the Moderate Resolution Imaging Spectroradiometer
(MODIS) and Landsat satellite imagery. These resources are commonly used in agricultural settings. The Sentinel-2 satellite, developed
by the European Space Agency (ESA) as part of the Copernicus program in 2015, is equipped with a multispectral high-resolution
instrument (MSI). This has significant potential for monitoring crop plants at a farm scale across agricultural landscapes [12].
Several methods have emerged for predicting different agricultural yields, both at regional and field levels. These approaches rely
on analyzing remotely acquired vegetation indices along with comprehensive datasets related to crop yields [13,14]. Abdikan and
Narin [15] forecasted sunflower yield by using crop phenological stages obtained from Sentinel-2 satellite images. This estimation was
achieved through the application of a linear regression algorithm. In this study, ten Vegetation Indices (VIs) were utilized, extracted
from Sentinel-2 data collected during the different growth stages of sunflowers. The most precise prediction was obtained
by using NDVI (Normalized Difference Vegetation Index), with an R2 value of 0.74 and a Root Mean Square Error (RMSE) of
10.80 kg/ha, approximately three months before the harvesting stage. Cao et al. [16] employ machine learning and remote sensing for
field-scale yield estimation. Machine learning algorithms can assist in extracting crucial features from sequences of remote sensing
data. Jiang et al. [17] employ conventional machine learning algorithms such as Random Forest (RF), Naive Bayes (NB), and Logistic
Regression (LR) extensively for estimating crop yields based on remote sensing data. Wang et al. [18] employed deep learning in
combination with remote sensing for winter wheat yield estimation. In this study, a novel Upscaled Convolutional Gated Recurrent
Unit model incorporating an attention mechanism (referred to as the UpSc-AConvGRU model) was introduced to improve the accuracy
of estimating a growth parameter of winter wheat, the Leaf Area Index (LAI). The accuracy of LAI
estimation improved, with RMSEs ranging from 0.413 to 0.699 m2 for estimated LAI and an R2 of 0.546. To predict maize yield in
China, Li et al. [19] employed deep learning with satellite data. As a result, the average root-mean-square error was 1006 kg/ha, and
the mean absolute percentage error was 17.1 %. To estimate crop yield, Vannoppen et al. [20] used deep learning models whose
inputs are common crop growth indicators, including NDVI, LAI, soil data, and meteorological data. The fluctuations in winter
wheat yield were more accurately predicted by monthly precipitation during the tillering and anthesis stages than by yield
indicators derived from NDVI from 2016 to 2018, resulting in an R-squared value of 0.66.
Table 1 illustrates the state-of-the-art methods using satellite imagery for crop yield predictions. Kale et al. proposed a VGG method
combined with DBN and reached an accuracy of 97 %. Mantri et al. used XGBoost for crop yield estimation and the R2 value was 0.94.
Saini et al. used CNN-LSTM and Random Forest machine learning algorithms and the RMSE value was 96.6 %. As a result, the best
method to predict crop yield among these six models was the model proposed by Kale et al., which combines DBN and VGG and reaches an
accuracy of 97 %.
Table 1
Methods utilizing satellite imagery for crop yield prediction.
Author Year Methods Accuracy
Table 2 shows the assessment of some methods using satellite imagery and deep learning to estimate crop yield.
Satellite imagery, IoT, sensors [31], Big data and deep learning have been combined for crop yield estimation in recent years and
the result has been encouraging. Accurate and reliable data collection serves as a fundamental initial phase in actualizing a deep
learning and satellite imagery-based agricultural production estimation system. Data collection involves acquiring satellite imagery,
ground-based yield data, and relevant auxiliary data.
The remainder of the paper is organized as follows: Section 2 presents the materials and methods used in this paper and describes
the proposed deep learning algorithm used for crop yield prediction. Section 3 discusses the data used by our model and the results
obtained after training and testing. The paper ends with a conclusion and perspectives.
Yield prediction plays a crucial role in decision making at global, regional, and field levels. Predicting crop yield accurately can
significantly enhance food production. In the past, crop yield prediction relied on the experience of farmers;
however, farmers do not have enough knowledge about new crops, and they are not aware of the environmental conditions
that affect crop production. This triggered the idea of creating a system for wheat yield estimation (see Fig. 1) based on new
technologies such as satellite imagery, IoT, machine learning and deep learning. The system can accurately predict wheat production
and gives more information about crop growth and the degree of soil moisture, which can help farmers make decisions earlier (Figs. 2 and 3).
The main contributions of this system are:
• The system can be used for several crop types: wheat, rice, maize, etc.
• Give information about crops: farmers can track plant growth.
• Using agricultural sensors to collect real-time data: temperature sensor, rainfall sensor, soil moisture sensor.
• Using spatial data based on satellite imagery: we can extract data from images of field such as NDVI indices, …
• Using mobile applications to collect data about lands: surface of land (per ha), production (per kg/ha), region, country.
• Using IoT for data collection.
• Using Big Data database to store all data collected by sensors and mobile applications.
• Developing a mobile application to help farmers visualize prediction results and data collected about crops.
1. Back-end system: the farmers can create an account to enter information about their land and visualize all data collected about
crops.
2. Big Data database: to store data collected by sensors and mobile application.
3. Build predictive model:
• Analyze and transform data.
• Create a deep learning model.
• Evaluate the model.
• Visualize result in mobile application.
4. IoT technology: agricultural sensors connected to Raspberry Pi to collect data.
Table 2
Comparison of state-of-the-art methods based on RMSE (root mean squared error) and MAE (mean absolute error).
Author Algorithms RMSE MAE
In this paper we have used two types of data: spatial data using satellite imagery and temporal data using agricultural sensors such
as temperature sensors, soil moisture sensors, and rainfall sensors.
2.2.1.1. Satellite data. Earth observation satellites equipped with multispectral sensors, such as Landsat, Sentinel-2, or MODIS, offer
valuable satellite imagery for crop monitoring. These sensors capture images at regular intervals, providing a historical record of crop
growth stages and environmental conditions. To monitor the dynamic change of vegetation we used NDVI (Normalized Difference
Vegetation Index) and EVI (Enhanced Vegetation Index), which are related to crop yields, and NDMI (Normalized Difference Moisture
Index), a vegetation index that is sensitive to the water content of vegetation and complementary to NDVI. NDVI is calculated from
the red and near-infrared spectral bands of the satellite used, and EVI is calculated from the red, near-infrared and blue bands.
Eqs. (1) and (2) show how NDVI and EVI are calculated.
$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{RED}}{\mathrm{NIR} + \mathrm{RED}} \tag{1}$$

$$\mathrm{EVI} = 2.5 \times \frac{\mathrm{NIR} - \mathrm{RED}}{\mathrm{NIR} + C_1 \cdot \mathrm{RED} - C_2 \cdot B + L} \tag{2}$$
For Sentinel-2 data, the values of L, C1 and C2 are 1, 6, and 7.5, respectively. C1 and C2 are the coefficients of the atmospheric
resistance term, and B is the value of the blue band.

$$\mathrm{NDMI} = \frac{\mathrm{NIR} - \mathrm{MIR}}{\mathrm{NIR} + \mathrm{MIR}} \tag{3}$$

NDMI is computed using Sentinel-2 Band 8 (NIR) and Band 12 (MIR).
The three vegetation indices were derived from Sentinel-2 imagery of the study area for 2015–2019, at a spatial resolution of
10 × 10 m.
For the Sentinel-2 satellite, NIR, RED and MIR correspond to bands B8, B4 and B12, respectively.
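As an illustration only, the three indices of Eqs. (1) to (3) could be evaluated per pixel as in the following Python sketch. It assumes NumPy and reflectance arrays that have already been extracted from the Sentinel-2 bands; the function names, the toy reflectance values and the handling of zero denominators are assumptions made for this example and do not describe the ArcGIS processing chain used in this work.

```python
import numpy as np

def safe_ratio(num, den):
    """Element-wise num / den, returning NaN where the denominator is zero."""
    out = np.full_like(num, np.nan, dtype=float)
    np.divide(num, den, out=out, where=den != 0)
    return out

def vegetation_indices(nir, red, blue, mir, L=1.0, C1=6.0, C2=7.5):
    """Return (NDVI, EVI, NDMI) following Eqs. (1)-(3).

    nir, red, mir: reflectance arrays (B8, B4 and B12 in the text).
    blue: reflectance array of the blue band.
    L, C1, C2: EVI coefficients quoted in the text for Sentinel-2.
    """
    nir, red, blue, mir = (np.asarray(b, dtype=float) for b in (nir, red, blue, mir))
    ndvi = safe_ratio(nir - red, nir + red)
    evi = 2.5 * safe_ratio(nir - red, nir + C1 * red - C2 * blue + L)
    ndmi = safe_ratio(nir - mir, nir + mir)
    return ndvi, evi, ndmi

# Toy reflectance values for two pixels (hypothetical, for illustration only).
nir = np.array([0.5, 0.4])
red = np.array([0.1, 0.2])
blue = np.array([0.05, 0.1])
mir = np.array([0.2, 0.3])
ndvi, evi, ndmi = vegetation_indices(nir, red, blue, mir)
```

Returning NaN for a zero denominator keeps invalid pixels visible instead of silently replacing them with a number.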
Big data is a powerful tool for storing agricultural data used by machine learning algorithms to enhance crop production in the
agriculture area. Advanced technologies can be used to collect a huge amount of agricultural data every second. Mobile applications
and agricultural sensors are employed to collect all necessary information about the agricultural field (see Fig. 2). We used two types of
data:
The above-mentioned data are filled in JSON files every minute and sent to a server to be stored using MongoDB Compass.
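A minimal sketch of how one such per-minute JSON record might be assembled is given below. The field names, units and function name are hypothetical and were chosen for this example; the actual schema used in this work is not specified here. Only the Python standard library is used, and the transfer to the MongoDB server is left out.

```python
import json
from datetime import datetime, timezone

def build_reading(temperature_c, soil_moisture_pct, rainfall_mm, timestamp=None):
    """Build one sensor record as a JSON-serializable dict (hypothetical schema)."""
    ts = timestamp or datetime.now(timezone.utc)
    return {
        "timestamp": ts.isoformat(),
        "temperature_c": float(temperature_c),
        "soil_moisture_pct": float(soil_moisture_pct),
        "rainfall_mm": float(rainfall_mm),
    }

# One record with a fixed timestamp so the output is reproducible.
record = build_reading(21.5, 34.0, 0.2,
                       timestamp=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc))
payload = json.dumps(record)  # string that could be written to a file or sent to a server
```

On the server side, a document of this shape could be stored in a MongoDB collection, for example through a driver such as pymongo.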
Sensors are the source of IoT data and can generate a huge amount of data. In agriculture, such data are of paramount
importance for increasing agricultural yield and supporting decision-making. In our case we used different agricultural sensors
such as temperature sensors, rainfall sensors and soil moisture sensors (see Table 3 and Figs. 5 and 6).
To collect and manage data, the IoT device used is a Raspberry Pi B+, a nano computer used for processing and programming.
All data collected are then transferred to the mobile application “E-agriculture”, as shown in Fig. 7. Table 4 describes the
data employed in our dataset.
Various machine learning methods are used in this paper for wheat yield prediction in Morocco, among them
Random Forest, Decision Tree, K-Neighbors and Gradient Boosting.
Table 3
Examples of sensors used in agricultural data collection.
Sensors Data collected
used for pattern recognition; for the numerical target of the K-nearest Neighbor, this algorithm can calculate the average. KNN
regression and classification use the same distance functions.
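The averaging step of KNN regression can be sketched as follows. This is a simplified illustration that assumes NumPy, Euclidean distance and a brute-force neighbour search; the function name and the toy data are hypothetical, and it is not the implementation used for the experiments.

```python
import numpy as np

def knn_regress(X_train, y_train, x_query, k=3):
    """Predict a numerical target as the mean target of the k nearest training points."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    # Euclidean distance from the query point to every training point.
    dists = np.linalg.norm(X_train - np.asarray(x_query, dtype=float), axis=1)
    nearest = np.argsort(dists)[:k]  # indices of the k closest points
    return float(y_train[nearest].mean())

# Toy one-feature example; the distant point (10.0 -> 50.0) is not among the 3 nearest.
X = [[0.0], [1.0], [2.0], [10.0]]
y = [1.0, 2.0, 3.0, 50.0]
pred = knn_regress(X, y, [1.1], k=3)
```

For classification, the same distances would be used, with a majority vote over the neighbours' labels in place of the mean.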
Where N is the number of data points, fi is the value returned by the model and yi is the actual value for data point i.
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y'_i - y_i\right)^2} \tag{5}$$
Where n is the number of observations, y’i is the predicted value and yi is the observed value.
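For reference, the three evaluation metrics used in this paper can be computed as in the sketch below. The RMSE follows Eq. (5); the MAE and R2 functions use their standard textbook definitions, which is an assumption since those formulas are not reproduced in this section. NumPy is assumed and the toy values are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error, as in Eq. (5)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error (standard definition)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_pred - y_true)))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (standard definition)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Toy observed and predicted values (hypothetical).
y_obs = [2.0, 4.0, 6.0]
y_hat = [2.5, 4.0, 5.5]
scores = {"rmse": rmse(y_obs, y_hat), "mae": mae(y_obs, y_hat), "r2": r2(y_obs, y_hat)}
```

With this definition R2 equals 1 for a perfect fit and becomes negative when the predictions are worse than always predicting the mean of the observed values.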
Table 4
Summary of collected data.
Type of data JSON
We compared various machine learning and deep learning algorithms, such as LSTM, Bi-LSTM, Random Forest and GRU, and selected
the best model based on metrics such as RMSE, MAE and R2.
The results show that the best model is GRU, with an RMSE of 0.001, an MAE of 8.0746e−04 and an R2 of 0.99, which outperforms all
other models.
3. Experiment results
The data contains different information about agriculture collected from agricultural sensors (temporal data) and satellite imagery
(spatial data). We created a mobile application to collect this data. Ours is the first paper that combines spatial data (based on
satellite imagery) and temporal data (based on agricultural sensors) for crop yield predictions. Table 5 describes the dataset
used in this paper, while Table 6 shows the summary of features.
Table 6 summarizes all features used for the prediction of wheat yield. We used seven features: temperature,
soil moisture, precipitation, NDVI of crops, weather data, pesticides, and yield. The table shows the mean, standard deviation (Std),
minimum and maximum values of all features.
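A per-feature summary of this kind (mean, standard deviation, minimum, maximum and quartiles) can be obtained as sketched below. The sketch assumes NumPy, uses the sample standard deviation (ddof=1) and linearly interpolated percentiles, and is run on hypothetical values; these conventions are assumptions and may differ from those used to produce Table 6.

```python
import numpy as np

def summarize(values):
    """Return summary statistics for one feature column."""
    v = np.asarray(values, dtype=float)
    q25, q50, q75 = np.percentile(v, [25, 50, 75])
    return {
        "mean": float(v.mean()),
        "std": float(v.std(ddof=1)),  # sample standard deviation (assumed convention)
        "min": float(v.min()),
        "max": float(v.max()),
        "25%": float(q25),
        "50%": float(q50),
        "75%": float(q75),
    }

# Hypothetical feature values, for illustration only.
stats = summarize([10.0, 20.0, 30.0, 40.0, 50.0])
```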
In this study we used machine learning and deep learning [35] for the prediction of wheat yield in Morocco based on spatial and
temporal data, using new technologies like satellite imagery and IoT. The results of the different models showed the successful
performance of deep learning algorithms for yield estimation [36]. We use three evaluation indicators: R2, RMSE and MAE. Table 7
compares the machine learning algorithms based on these three indicators.
Various deep learning techniques are used in this paper to estimate wheat yield accurately, with the same number of nodes and the
same number of layers for each algorithm. Table 8 shows the parameters of each deep learning algorithm.
After 20 epochs the experiment shows that the best model is Bi-GRU with a best RMSE value of 0.03 and Loss of 0.001. Table 9
shows the comparison of all deep learning algorithms based on loss and RMSE metrics.
On the other hand, the comparison of deep learning algorithms based on R2 and MAE metrics is needed. R2 is at most 1 and can be
negative when a model fits worse than a constant prediction of the mean; the best value of R2 is the one closest to 1. Table 10
shows the result of these comparisons.
As we can see in Table 10, the best deep learning model for wheat yield prediction in Morocco is Bi-GRU. The experiment and
the assessment demonstrate that the Bi-GRU model had the best result based on metrics like R2, MAE and RMSE. Table 10 shows that
the R2 of Bi-GRU is 0.89, which is a very good result compared with other models such as LSTM, stacked LSTM, Random Forest, and others.
The RMSE of Bi-GRU is 0.03, which outperforms the other algorithms. The second-best algorithm is the Bi-LSTM model,
with an RMSE of 0.07 and an R2 of 0.87. All the models were trained for 25 epochs, with 5 layers for each model and 100 nodes. Fig. 8
shows the loss and validation loss of the Bi-GRU model.
Fig. 9 shows the MAE and validation MAE of our GRU model.
After training all models, the experiment shows that the best model is GRU with a best RMSE value of 0.00036 and Loss of
2.6169e− 06. Table 11 illustrates the results after 20 epochs only:
After combining multiple data sources such as satellite data and IoT sensor data, the result shows an increase in R2 of
10 % to 19 % compared to the R2 obtained with IoT sensor data only. Table 12 illustrates the importance of integrating satellite data
with IoT data to increase the R2 value.
Based on the experiment, the result shows that the GRU model can accurately predict crop yield based on spatial data and temporal data.
The R2 value is 0.99, which demonstrates the accuracy of the GRU model. Fig. 10 compares the actual and predicted values.
As we can see in Fig. 10, 99 % of the points lie close to a straight line, which demonstrates that the GRU model predicts
wheat yield accurately [37].
The residuals are the differences between the predicted and actual values. The residual plot shows the residuals against the
predicted values. A good model will have residuals scattered randomly around zero. As we can see in Fig. 11, all residuals of our
GRU model are scattered randomly around the red dashed line (zero).
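The residual check described above can be sketched numerically as follows. The snippet assumes NumPy and uses hypothetical values rather than the model outputs of this study; it only shows how residuals are formed and how their mean can be inspected before plotting them against the predictions.

```python
import numpy as np

def residuals(y_true, y_pred):
    """Residuals defined as predicted minus actual values, as in the text."""
    return np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)

# Hypothetical actual and predicted yields.
actual = [3.0, 5.0, 7.0]
predicted = [3.2, 4.9, 6.9]
res = residuals(actual, predicted)
mean_residual = float(res.mean())  # close to zero when errors are not systematically biased
```

A mean residual near zero is necessary but not sufficient: the scatter should also show no visible trend against the predicted values.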
Fig. 12 shows the test data and the predicted values. As we can see, our model predicts wheat yield accurately and outperforms all
other algorithms.
Integrating remote sensing data and temporal data can improve the GRU model performance for wheat yield prediction. The results
showed that the GRU model integrating satellite imagery data (vegetation indices) and IoT sensor data achieved the highest prediction
accuracy (R2=0.99 and RMSE=0.001; Tables 13 and 14). The GRU model with temporal data only, gathered by IoT sensors (temperature,
precipitation, soil moisture), achieved only RMSE=0.03 and R2=0.89. Combining spatial data (vegetation indices) and temporal data
(gathered by sensors for temperature, soil moisture, precipitation, etc.) improves the performance of the LSTM model by 31.89 %,
of the Bi-GRU model by 4 % and of the GRU model by 12 %, which shows the large impact of combining spatial and temporal data on
model performance. The result indicates that integrating satellite data and data gathered by IoT sensors can improve wheat yield
predictions.
Table 5
Description of the agricultural dataset.
Risk factor      Description                                                    Sources
Spatial data
NDVI             Vegetation index.                                              Satellite imagery
NDMI             Soil moisture (Normalized Difference Moisture Index).          Satellite imagery
Weather data     Temperature data extracted using Sentinel-2 remote sensing.    Satellite imagery
Temporal data
Temperature      Air temperature                                                Agriculture sensors
Soil moisture    Soil moisture                                                  Agriculture sensors
Precipitation    Precipitations                                                 Agriculture sensors
Table 6
Summary of features used in the analysis.
Features Mean Std Min Max 25 % 50 % 75 %
Table 7
Comparison of machine learning algorithms.
Machine learning R2 RMSE MAE
Table 8
Summary of deep learning model parameters.
Models Number of layers Number of nodes Nb of parameters
Table 9
Deep learning benchmarks based on loss and RMSE (root mean squared error) parameters.
Models Loss RMSE
The GRU model outperforms all other models on the R2 metric when integrating vegetation indices (satellite data) and IoT
data, with R2=0.99, an increase of 10 % compared with the GRU model with only IoT data inputs. For the other models, the result
indicates a performance increase of 10 % to 19 %. Table 13 compares the models' performance based
on the R2 metric.
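Several of the increases reported for R2 are consistent with a plain difference between the two R2 values expressed in percentage points, for example 0.99 against 0.89 giving 10 %. The sketch below computes the gain under that reading; this interpretation, the function name and the rounding are assumptions made for illustration and are not stated in the text.

```python
def r2_gain_points(r2_combined, r2_iot_only):
    """Difference between two R2 values in percentage points (assumed reading)."""
    return round((r2_combined - r2_iot_only) * 100, 2)

# R2 pairs (combined data, IoT data only) as reported for Bi-GRU and LSTM.
gain_bigru = r2_gain_points(0.99, 0.89)
gain_lstm = r2_gain_points(0.99, 0.80)
```

A relative increase, (combined − IoT only) / IoT only, would give slightly larger figures, so the convention should be stated whenever such percentages are compared.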
Table 14 shows that multisource data can increase model performance by 10 % to 20 %, which is a good result. Based on the RMSE
metric, the GRU model performed well, with RMSE=0.00036, a large improvement over the same model based on IoT data
only (RMSE=0.12). Combining satellite data and temporal data significantly improves the RMSE values for all models.
The result shows that combining satellite imagery data (spatial data) and temporal data significantly improves wheat yield
prediction. These integrated data bring complementary information that helps in understanding the dynamics of crop growth and
environmental conditions. The main features that improve prediction accuracy when combining both types of data are: (1) spatial
features such as vegetation indices (NDVI and EVI) derived from satellite imagery, which can assess stress level, biomass and crop
vigor; soil variability and moisture levels also affect crop growth and hence crop yield prediction. (2) Temporal features, such
as weather and climate trends (temperature, rainfall, humidity), which can impact wheat growth dynamics over time and yield production.
Table 10
Deep learning benchmarks based on R2 (coefficient of determination) and MAE (mean absolute error) metrics.
Models R2 MAE
Table 11
Benchmark of models based on RMSE (root mean squared error) and loss.
Model                                     RMSE          Loss
Bi-GRU                                    0.0011        6.1261e−05
LSTM                                      0.0016        5.7726e−06
Stacked LSTM                              0.0015        4.9523e−06
Convolutional LSTM                        0.47          6.5323e−04
Bi-LSTM                                   0.37          0.0787
GRU                                       0.00036       2.6169e−06
Linear regression                         0.85          0.5
Random forest                             0.23          0.03
Decision tree                             0.28          0.06
Stochastic gradient descent regression    0.56          0.23
K-nearest neighbors                       0.27          0.05
Gradient Boosting regression              0.52          0.21
Neural network                            0.0000e+00    0.00
Table 12
Benchmark of models based on R2 (coefficient of determination) and MAE (mean absolute error).
Model R2 MAE
Historical and real-time weather data can inform predictions and help the model factor in weather events, such as heat waves
and droughts, that may impact wheat yield, allowing the model to adjust its wheat predictions.
Integrating both spatial and temporal data enables dynamic modeling of wheat growth and tracks how wheat grows in different
parts of the field over time. This can identify critical growth periods and determine when human interventions (irrigation,
fertilization) might be needed.
If only spatial data or only temporal data are used for wheat prediction, the model loses access to information that is critical for
accurately capturing the complex dynamics of crop growth, and the accuracy of prediction decreases. If only spatial data are used,
the model cannot account for how the crop has developed over the season or how changing weather conditions (rainfall, temperature)
affect the field, leading to inaccurate predictions. In addition, a single satellite image may not reflect whether crops experienced
a period of stress (drought, pest attack), so the model cannot track whether the crop recovers or worsens over time. Temporal data
alone cannot capture spatial variability in soil and spatial variation in crop health (e.g. areas affected by poor soil quality),
which are critical for understanding how different parts of the field respond to the same weather conditions.
The GRU model is used in yield prediction for its ability to handle time-series data (environmental factors like rainfall and
temperature, which are important in yield production) and to capture long-term dependencies in the data, such as how early-season
weather affects yield outcomes months later.
Unlike traditional machine learning models and simpler recurrent neural networks (RNNs), the GRU model has several salient features
that make it effective in handling time-series data and complex patterns. For example, gating mechanisms are used for efficient
memory management. The update gate decides the extent to which the previous information is retained, while the reset gate controls how
much of the previous hidden state should be ignored. Traditional machine learning models (linear regression, decision tree, etc.)
have no such mechanism to selectively retain or forget information over time, which makes them less effective than GRU at handling
long-term patterns.
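To make the role of the two gates concrete, a single GRU step can be written in a few lines of NumPy, as in the sketch below. It follows one common formulation in which the update gate z weights the previous hidden state, matching the description above; the parameter names, sizes, random initialization and input sequence are hypothetical, and this is not the trained model evaluated in this paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(n_in, n_hidden, seed=0):
    """Randomly initialize weights for the update (z), reset (r) and candidate (h) parts."""
    rng = np.random.default_rng(seed)
    p = {}
    for g in "zrh":
        p["W" + g] = rng.normal(scale=0.1, size=(n_hidden, n_in))      # input weights
        p["U" + g] = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # recurrent weights
        p["b" + g] = np.zeros(n_hidden)
    return p

def gru_step(x, h_prev, p):
    """One GRU time step: returns the new hidden state."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h_prev + p["bz"])  # update gate: how much of h_prev to keep
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h_prev + p["br"])  # reset gate: how much of h_prev to ignore
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h_prev) + p["bh"])  # candidate state
    return z * h_prev + (1.0 - z) * h_cand

# Run over a toy sequence: 5 time steps with 3 features each (e.g. sensor-like inputs).
params = init_params(n_in=3, n_hidden=4)
sequence = np.random.default_rng(1).normal(size=(5, 3))
h = np.zeros(4)
for x_t in sequence:
    h = gru_step(x_t, h, params)
```

When z is close to 1 the previous state passes through almost unchanged, which is how information from early time steps can survive over long sequences. Some libraries swap the roles of z and 1 − z; the two conventions are equivalent up to relabeling.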
4. Conclusion
Big data, IoT, satellite imagery and deep learning are very beneficial for crop yield prediction and for enhancing agricultural
production. This paper proposes an E-monitoring system for crop yield prediction. We collect two types of data: spatial data and
temporal data. First, we use satellite imagery to collect spatial data, and ArcGIS to generate and process agricultural data.
Second, we use agricultural sensors to collect temporal data. To reduce the response time, we use a Raspberry Pi to collect and
process data, then we send the result to the mobile application.
Fig. 10. True values vs. predicted values for model accuracy assessment.
Furthermore, the dataset used in this paper combines two types of data, spatial and temporal. Ours is the first paper
that uses both types of data to predict crop yield. The result shows that combining satellite imagery data and temporal data (gathered
by sensors such as temperature and soil moisture sensors) can improve the accuracy of wheat yield predictions. With RMSE, MAE
and R2 of 0.00036, 8.0746e−04, and 0.99, respectively, the GRU model predicts wheat yield accurately and
outperforms all other machine learning and deep learning algorithms tested. The GRU model is used in yield prediction because it can
effectively handle time-series data, such as environmental factors like rainfall and temperature, and capture long-term dependencies,
such as the impact of early-season weather on yield outcomes months later. On the dataset that combines spatial and temporal data,
the GRU model increases wheat yield prediction performance by 14 %, outperforming the other machine learning models.
Table 13
Percentage increase in performance of deep learning model based on different input data types.
Model                                     R2 (satellite data and IoT data combined)    R2 (IoT data only)    Percentage increase
Bi-GRU                                    0.99                                         0.89                  10 %
LSTM                                      0.99                                         0.80                  19 %
Stacked LSTM                              0.99                                         0.32                  67 %
Convolutional LSTM                        −1.06                                        −2.13                 –
Bi-LSTM                                   0.88                                         0.84                  4 %
GRU                                       0.999                                        0.85                  14 %
Linear regression                         0.71                                         0.69                  2.81 %
Random forest                             0.95                                         0.93                  2.81 %
Decision tree                             0.93                                         0.90                  3 %
Stochastic gradient descent regression    0.73                                         0.71                  2 %
K-nearest neighbors                       0.93                                         0.89                  4 %
Gradient Boosting regression              0.76                                         0.72                  4 %
Neural network                            0.0000e+00                                   0.0000e+00            –
Table 14
Comparison of model performance using RMSE (root mean squared error) metrics across different data sources.
Satellite data and IoT data combined IoT data only
Future work could apply other deep learning and machine learning algorithms to our dataset to improve accuracy further. We propose
to deploy our system with several farmers to test the effectiveness of our method. We believe that combining technologies such as
artificial intelligence, satellite imagery, big data and IoT can benefit agricultural areas and enable accurate prediction of
different crop yields. In addition, other vegetation indices such as DVI (Difference Vegetation Index) and PVI
(Perpendicular Vegetation Index) could be used to enhance the accuracy of our model. Moreover, other satellites with different resolutions,
such as Landsat and MODIS, could be tested.
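For reference, the two indices named above have simple closed forms: DVI is the difference between near-infrared and red reflectance, and PVI is the perpendicular distance of a pixel from the soil line NIR = a * Red + b in red/NIR space. The sketch below assumes Sentinel-2 band 8 (NIR) and band 4 (red) reflectances; the function names, the sample reflectances and the soil-line coefficients are illustrative assumptions, not values from this study.

```python
import math

def dvi(nir, red):
    """Difference Vegetation Index: NIR minus red reflectance."""
    return nir - red

def pvi(nir, red, slope, intercept):
    """Perpendicular Vegetation Index: signed distance from the soil line
    NIR = slope * red + intercept, positive on the vegetated side."""
    return (nir - slope * red - intercept) / math.sqrt(1.0 + slope ** 2)

# Hypothetical surface reflectances for one pixel (Sentinel-2 B8 = NIR, B4 = red)
nir, red = 0.45, 0.08
# Hypothetical soil-line coefficients; in practice they are fitted on bare-soil pixels
soil_slope, soil_intercept = 1.2, 0.04

print(f"DVI = {dvi(nir, red):.3f}")
print(f"PVI = {pvi(nir, red, soil_slope, soil_intercept):.3f}")
```

A pixel lying exactly on the soil line has a PVI of zero, so the index separates vegetation signal from soil brightness in a way the plain difference does not.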
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to
influence the work reported in this paper.
Data availability
References
[1] Mishra V, Cruise JF, Mecikalski JR. Assimilation of coupled microwave/thermal infrared soil moisture profiles into a crop model for robust maize yield estimates
over Southeast United States. Eur J Agron 2021;123:126208. https://doi.org/10.1016/j.eja.2020.126208.
[2] Amankulova K, Farmonov N, Mucsi L. Time-series analysis of Sentinel-2 satellite images for sunflower yield estimation. Smart Agricult Technol 2023;3:100098.
[3] Jiang H, Hu H, Zhong R, Xu J, Xu J, Huang J, et al. A deep learning approach to conflating heterogeneous geospatial data for corn yield estimation: A case study
of the US Corn Belt at the county level. Glob change bio 2020;26(3):1754–66.
[4] Kogan F, et al. Winter wheat yield forecasting in Ukraine based on Earth observation, meteorological data and biophysical models. Internat J Appl Earth
Observat Geoinformat 2013;23:192–203. https://doi.org/10.1016/j.jag.2013.01.002.
[5] Sishodia RP, Ray RL, Singh SK. Applications of remote sensing in precision agriculture: a review. Remote Sens 2020;12:3136. https://doi.org/10.3390/
rs12193136.
[6] Leroux L, Castets M, Baron C, Escorihuela M-J, Bégué A, Seen DLo. Maize yield estimation in West Africa from crop process-induced combinations of multi-
domain remote sensing indices. Eur J Agron 2019;108:11–26. https://doi.org/10.1016/j.eja.2019.04.007.
[7] Gupta A, Nahar P. Classification and yield prediction in smart agriculture system using IoT. J Ambient Intell Humaniz Comput 2023;14(8):10235–44.
[8] Gupta R, Sharma AK, Garg O, Modi K, Kasim S, Baharum Z, Mostafa SA. WB-CPI: weather based crop prediction in India using big data analytics. IEEE access
2021;9:137869–85.
[9] Burdett H, Wellen C. Statistical and machine learning methods for crop yield prediction in the context of precision agriculture. Precision Agric 2022;23:
1553–74. https://doi.org/10.1007/s11119-022-09897-0.
[10] Gitelson AA. 15 remote sensing estimation of crop biophysical characteristics at various scales. Hyperspectral Remote Sens. Veget. 2016;20:329.
[11] Ji Z, Pan Y, Zhu X, Wang J, Li Q. Prediction of crop yield using phenological information extracted from remote sensing vegetation index. Sensors 2021;21:1406.
https://doi.org/10.3390/s21041406.
[12] Goffart D, Curnel Y, Planchon V, Goffart JP. Defourny: field-scale assessment of Belgian winter cover crops biomass based on Sentinel-2 data. Eur J Agron 2021;
126:126278. https://doi.org/10.1016/j.eja.2021.126278.
[13] Nagy A, Szabó A, Adeniyi OD, Tamás J. Wheat yield forecasting for the Tisza river catchment using landsat 8 NDVI and SAVI time series and reported crop
statistics. Agronomy 2021;11:652. https://doi.org/10.3390/agronomy11040652.
[14] Schwalbert RA, Amado T, Corassa G, Pott LP, Prasad PVV, Ciampitti IA. Satellite-based soybean yield forecast: integrating machine learning and weather data
for improving crop yield prediction in southern Brazil. Agric. For. Meteorol. 2020;284:107886. https://doi.org/10.1016/j.agrformet.2019.107886.
[15] Narin OG, Abdikan S. Monitoring of phenological stage and yield estimation of sunflower plant using Sentinel-2 satellite images. Geocarto. Int. 2022;32:
1378–92. https://doi.org/10.1080/10106049.2020.1765886.
[16] Cao J, et al. Wheat yield predictions at a county and field scale with deep learning, machine learning, and google earth engine. Eur J Agron 2021;123:126204.
[17] Jiang H, et al. A deep learning approach to conflating heterogeneous geospatial data for corn yield estimation: a case study of the US Corn Belt at the county
level. Glob Chang Biol 2020;26:1754–66.
[18] Han Dong, Wang Pengxin, Tansey Kevin, Liu Junming, Zhang Yue, Tian Huiren, et al. Integrating an attention-based deep learning framework and the SAFY-V
model for winter wheat yield estimation using time series SAR and optical data. Comput Electron Agricult 2022;201:107334. https://doi.org/10.1016/j.
compag.2022.107334. ISSN 0168-1699.
[19] Li Xingang, Geng Hao, Zhang Liqiang, Peng Shuwen, Xin Qi, Huang Jianxi, Li Xuecao, et al. Improving maize yield prediction at the county level from 2002 to
2015 in China using a novel deep learning approach. Comput Electron Agric 2022;202(Nov 2022). https://doi.org/10.1016/j.compag.2022.107356. C.
[20] Vannoppen A, Gobin A. Estimating farm wheat yields from NDVI and meteorological data. Agronomy 2021;11:946. https://doi.org/10.3390/
agronomy11050946.
[21] Kale N, Gunjal SN, Bhalerao M, Khodke HE, Gore S, Dange BJ. Crop yield estimation using deep learning and satellite imagery. Int J Intelligent System Appl
Engineer 2023;11(10s):464–71. Retrieved from, https://ijisae.org/index.php/IJISAE/article/view/3301.
[22] Moussaid A, El Fkihi S, Zennayi Y, Lahlou O, Kassou I, Bourzeix F, et al. Machine learning applied to tree crop yield prediction using field data and satellite
imagery: a case study in a citrus orchard. Informatics 2022;9(4):80. https://doi.org/10.3390/informatics9040080.
[23] Saini P, Nagpal B. Spatiotemporal landsat-sentinel-2 satellite imagery-based hybrid deep neural network for paddy crop prediction using google earth engine.
Adv Space Res 2024;73:4988–5004.
[24] Mena, F., Pathak, D., Najjar, H., Sanchez, C., Helber, P., Bischke, B., et al. (2024). Adaptive fusion of multi-view remote sensing data for optimal sub-field crop
yield prediction. arXiv preprint arXiv:2401.11844.
[25] Mantri, S., & Purohit, S. (2023). Satellite imagery solution for rice crop yield estimation using machine learning models.
[26] Bisht B. Yield prediction using spatial and temporal deep learning algorithms and data fusion (Doctoral dissertation. Université d’Ottawa/University of Ottawa;
2023.
[27] Pargaien S, Prakash R, Dubey VP, Singh D. Machine learning techniques in wheat crop yield prediction using NDVI indices and meteorological parameters. In:
2023 International Conference on Sustainable Communication Networks and Application (ICSCNA). IEEE; 2023. p. 1165–9.
[28] Zhou H, Yang J, Li D. Improving grain yield prediction through fusion of multi-temporal spectral features and agronomic trait parameters derived from UAV
imagery. Front Plant Sci 2023;14:1217448.
[29] Khaki S, Wang L. Crop yield prediction using deep neural networks. Front Plant Sci 2019;10:452963.
15
A. Boukhris et al. Results in Control and Optimization 17 (2024) 100489
[30] Khaki S, Wang L, Archontoulis SV. A CNN-RNN framework for crop yield prediction. Front Plant Sci 2020;10:492736.
[31] Kumar M, Singh PK, Maurya MK, Shivhare A. A survey on event detection approaches for sensor based IoT. Int Things 2023;22:100720.
[32] Jamali M, Bakhshandeh E, Yeganeh B, Özdoğan M. Development of machine learning models for estimating wheat biophysical variables using satellite-based
vegetation indices. Adv Space Res 2024;73(1):498–513.
[33] Jamali M, Bakhshandeh E, Yeganeh B, Özdoğan M. Development of machine learning models for estimating wheat biophysical variables using satellite-based
vegetation indices. Adv Space Res 2024;73(1):498–513.
[34] Nie J, Jiang J, Li Y, Li J, Qiao Y, Ercisli S. UAVEC-FLchain: distributed multi-regional jujube orchard joint yield estimation for secure agricultural-IoT
applications. Int Thing 2024:101143.
[35] Morales-García J, Bueno-Crespo A, Martínez-España R, García FJ, Ros S, Fernández-Pedauyé J, et al. SEPARATE: a tightly coupled, seamless IoT infrastructure
for deploying AI algorithms in smart agriculture environments. Internet of Things 2023;22:100734.
[36] Sachithra V, Subhashini LDCS. How artificial intelligence uses to achieve the agriculture sustainability: systematic review. Artific Intell Agricult 2023;8:46–59.
[37] Jubair S, Tremblay-Savard O, Domaratzki M. Gxenet: novel fully connected neural network based approaches to incorporate gxe for predicting wheat yield.
Artific Intell Agricult 2023;8:60–76.
16