GIS MA Thesis
BY
Dr. K.V.SURYABHAGAVAN
Asst. Professor, School of Earth Sciences
Addis Ababa University, Addis Ababa
Chairman, Department
ACKNOWLEDGEMENTS
First and foremost, I would like to thank the Almighty God, who made it possible to
I am very much grateful to my advisor, Dr. K.V. Suryabhagavan for his collaboration in
sharing knowledge and invaluable comments and his usual advice, guidance and
I would like to express my deepest gratitude to Carolien Tote and her colleagues from
VITO for their commitment to solve my problem during software failure. Thank you very
much for the role you have played for the successful completion of the study.
I am also thankful to the Central Statistical Agency (CSA) for provision of financial support
My warm and special thanks go to Atireshewal Girma (Atir), Sisay Guta and all CSA
staff for their unreserved and tireless support, sharing of knowledge and invaluable
My colleagues, Abay Amare and Adane Tadesse, have helped me in a number of ways and
deserve many thanks. Abe, your editing helped me a lot. I am grateful to all my family
members (Atse, Abe, Kure, Mestu and Yemane). Adi, Endale and Sele, your
encouragement is unforgettable.
Last but not least, I would like to extend my appreciation to my batch students for their
Had it not been just for a matter of space, I would have been happy to mention the names
of all those, who, in one way or another, have contributed to the accomplishment of this
Table of Contents
ABSTRACT
1 INTRODUCTION
3.1.3 Rainfall
3.1.5 Soil
3.3.1 Selection of date of satellite imagery and acquiring the time series imagery
CHAPTER FOUR
4 RESULTS
4.3 Comparison of conventional crop yield forecast with the developed model
5 DISCUSSION
List of Tables
Table 3.1 Area, production and yield of cereal crops for private peasant holdings for Meher season 2012/13
Table 3.2 Summary of equipment and materials used for data collection and analysis
Table 3.3 Accuracy assessment table
Table 3.4 Observed yield and independent variables
Table 4.1 Correlation result of different NDVI values
Table 4.2 Correlation result of rainfall
Table 4.3 Correlation result of WRSI values
Table 4.4 Correlation result of different ETa values
Table 4.5 Parameter estimates for the maize forecast model
Table 4.6 Maize yield forecast model variance
Table 4.7 Maize production level of the year 2013 for South Tigray zone
List of Figures
Figure 3.14 Smoothed S10s data
Figure 3.15 Spectral profile of smoothed data
Figure 3.16 SPOT image of the study area
Figure 3.17 Interpreted image of the study area
Figure 3.18 Agricultural land of the study area
Figure 3.19 Random points generated for accuracy assessment
Figure 3.20 Crop mask data for maize
Figure 3.21 Crop coefficient of maize in different stages (Planting-Flowering)
Figure 3.22 NDVI value for the month of July, 2003
Figure 3.23 Mean ETa for the month of May, 2003
Figure 3.24 PET for the month of May, 2003
Figure 3.25 RFE for the month of May, 2003
Figure 4.1 Graph showing yield and NDVIa
Figure 4.2 Graph showing yield and NDVIc
Figure 4.3 Graph showing yield and NDVIx
Figure 4.4 Graph showing yield and rainfall
Figure 4.5 Graph showing yield and WRSI
Figure 4.6 Graph showing yield and ETa
Figure 4.7 Graph showing yield and ETa total
Figure 4.8 Comparison between the maize yield estimated by the spectro-agrometeorological model and the observed yields for the study area
Figure 4.9 Comparison between maize yield estimated by the model and the observed yield
Figure 4.10 Maize yield forecast map of 2013
List of Abbreviations
MODIS - Moderate Resolution Imaging Spectroradiometer
ABSTRACT
For a country like Ethiopia, whose economy depends strongly on rainfed agriculture,
reliable, accurate and timely information on the types of crops grown, their acreage, crop
growth and yield forecasts is vital for planning the efficient management of
resources. Remote-sensing data acquired by satellite have wide scope for agricultural
applications owing to their synoptic and repetitive coverage. This study reports the
development of an operational spectro-agrometeorological yield model for maize,
derived from time series of SPOT-VEGETATION data, actual and potential
evapotranspiration and satellite rainfall estimates for the years 2003-2012. These data were
used as inputs for the indices, while official grain yield data produced by the
Central Statistical Agency (CSA) of Ethiopia were used to validate the strength of the indices in
explaining yield (quintal per hectare). One obstacle to successful modeling and prediction
of crop yields from remotely sensed imagery is the identification of image masks, which
restrict the analysis to information pertaining to the crop of interest. Therefore,
crop masking of the cropland area was applied and further refined using the agro-ecological
zone suitable for the crop of interest (maize). Correlation analyses were used to determine
associations between crop yield, spectral indices and agrometeorological variables for the
maize crop of the longest rainy season (Meher). Indices highly correlated with maize
yield were identified and retained for further analysis; rainfall and the average
Normalized Difference Vegetation Index (NDVIa) showed high correlation with yield (85%
and 80%, respectively). Many studies report that linear regression is the most
common method of producing yield predictions from remote-sensing-derived indicators
together with bioclimatic information. A multiple linear regression model was therefore
developed using the variables that were highly correlated with yield. Accordingly,
NDVIa and rainfall were entered into the regression, and a model with a
P-value of less than 0.05 (95% confidence level) was obtained. The developed
spectro-agrometeorological yield model was validated by comparing the predicted zone-level
yields (quintal per hectare) with those estimated by CSA. Very
encouraging results were obtained (r² = 0.88, RMSE = 1.4 quintal/ha and CV = 21%).
The study shows that crop yield forecasting is possible using remote
sensing and GIS in the fragmented agricultural lands of South Tigray. Because the
time series used for the analysis was short, we recommend that the model be tested
with newly available data covering a longer time series before it is used for
operational purposes.
Key Words: Remote Sensing, Yield Prediction, NDVI, Maize Yield
CHAPTER ONE
1 INTRODUCTION
1.1 Background
In Ethiopia, there are two methods of monitoring and forecasting crop yield in advance of
harvest. The first is the Crop Yield Monitoring and Forecasting System (CYMFS), run by the
Ethiopian National Meteorological Agency (NMA) in conjunction with the European
Union (EU) Joint Research Centre (JRC) and the Food and Agriculture
Organization (FAO). This system relies on the empirical Crop Specific Water Balance
(CSWB) model of the FAO rather than on a process-based
crop simulation model. The CSWB model is less likely to capture complex nonlinear
interactions between crop and climate, which is considered a shortcoming of the CYMFS
(Greatrex, 2012).
The second method involves collecting data on crop yield based on stakeholders'
assessment of the crop field compared with the previous year's yield estimate as collected
by CSA. The result of this method is used by the Ethiopian government as the official
statistic and is regarded as the conventional technique. Beyene and Meissner (2010)
concluded that the procedure CSA uses to forecast yield is highly subjective and
dependent on the agenda of the stakeholders, since the data are collected from
stakeholder discussions; nevertheless, the data are widely used by the country's
decision makers (Greatrex, 2012).
The introduction of remote sensing and the derived vegetation indices in the early 1980s
was considered a potential tool for improving simulations in real time. NDVI has been used
as an indicator of the vigor of vegetative activity, as represented by indirectly observable
chlorophyll activity. Remote sensing products alone have been used in different parts of
the world to estimate crop yield (Hastings and Emery, 1992).
Potdar et al. (1999) as cited in Rojas (2006) observed for some cereal crops grown in
rain-fed conditions that rainfall distribution parameters in space and time need to be
incorporated into crop yield models in addition to vegetation indices deduced from
remote sensing data. Such hybrid models show higher correlation and predictive
capability than the simple models. The agro-meteorological models introduce information
about solar radiation, temperature, air humidity and soil water availability while the
spectral component introduces information about crop management, varieties and stresses
not taken into consideration by the agro-meteorological models.
A large range of satellite sensors provide us regularly with data covering a wide spectral
range (from optical through microwave) and these data are acquired from various orbits
and in different spatial and temporal resolutions namely high resolution and low
resolution imagery (Rembold et al., 2013).
The large number of existing studies carried out throughout the world demonstrates the relevance
of low resolution satellite images for crop monitoring and yield prediction at the regional
level and under different environmental circumstances. The relatively low cost
generally associated with acquiring low resolution satellite images makes them an
attractive instrument for crop monitoring and yield forecasting (Rembold et al., 2013).
In Ethiopia, researchers have shown that the field survey underlying the conventional
forecasting method is subjective; it is therefore worthwhile investigating cheaper and more
timely methods, based on remote sensing and GIS, to substitute for it.
Therefore, this research addresses the development of an operational
spectro-agrometeorological yield model for maize in the South Tigray zone of Tigray regional state
using a spectral index, the Normalized Difference Vegetation Index (NDVI) derived from
SPOT-VEGETATION; meteorological data obtained from the Rainfall Estimate (RFE 2.0)
model; and the official figures produced by the Government of Ethiopia, namely the CSA yield
data.
In contrast, remote sensing can provide accurate and timely crop production
statistics; most studies have established that there is a correlation between the
Normalized Difference Vegetation Index (NDVI), agro-meteorological data, and green
biomass and yield (Rojas, 2006).
In Ethiopia, such studies were mostly done at regional or national level, covering large
areas using low-resolution imagery, and few studies have been conducted at lower
administrative levels. This study therefore forecasts crop yield at zonal level using remote
sensing and GIS.
The research problem was identified through the researcher's practical concern with the area
and personal communication with CSA officials in June 2013. The aim is to supplement the existing
approach with remote sensing technology and to serve as a springboard for relatively accurate
and timely agricultural forecasts.
1.3 Objectives
The general objective of the research is to develop a maize yield forecast model for
Southern zone of Tigray region using remote sensing and GIS techniques.
CHAPTER TWO
2 LITERATURE REVIEW
2.1 Theoretical framework
Remote sensing is defined as the science of acquiring information about an object
through the analysis of data obtained by a device that is not in contact with the object.
The instruments used for measuring electromagnetic radiation are called sensors. These
sensors record the radiation reflected from the surface of the earth, which can be used for
many analyses, one of which is agricultural analysis (Lillesand and Kiefer, 1994).
These reflected wavelengths may be detected by a sensor positioned above the crop. As a
result of the above mechanism, healthy vegetation shows high reflectance in
the near-infrared (NIR) and low reflectance in the visible spectrum. In the visible region, leaf
reflectance is lower than soil reflectance, whereas in the NIR leaf reflectance is higher than soil
reflectance. This behavior explains the utility of these reflectance
measurements in agricultural applications and for the separation of crops from soil
(George and Hanuschak, 2010).
Already in the early 80s, it was shown by Tucker and co-workers (1980, as cited in
Atzberger, 2013) that green vegetation can be monitored through its spectral reflectance
properties. Today, a large range of satellite sensors provide us regularly with data
covering a wide spectral range (from optical through microwave) and these data are
acquired from various orbits and in different spatial and temporal resolutions.
Low resolution satellite images essentially refer to a spatial resolution between 250
meters and several kilometers. Most of the early studies (e.g., from the 1980s and 1990s)
relate to the use of different sensors of the NOAA AVHRR series. These images were
typically available at the national and multinational level with a 1 km resolution (LAC or
Local Area Coverage) and, at the continental and global level, with a 4 - 6 km resolution
(GAC or Global Area Coverage) or below. It was only at the end of the 1990s that the
French–Belgian–Swedish satellite, SPOT, was equipped with a 1-km resolution sensor
for vegetation monitoring at the global scale, called VEGETATION. In addition, several
so-called medium resolution sensors (maximum 250 m) have become operational since
the year 2000; amongst the best known are the Moderate Resolution Imaging
Spectroradiometer (MODIS) and Medium Resolution Imaging Spectrometer (MERIS) sensors
belonging to the TERRA/AQUA and ENVISAT platforms, respectively. All the low and
medium resolution sensors that have proven their validity for land surface observation
and vegetation analysis normally also find applications in agriculture (Rembold et
al., 2013).
Low resolution satellite imagery has been extensively used for crop monitoring and yield
forecasting for over 30 years and plays an important role in a growing number of
operational systems. The relationship between the spectral properties of crops and their
biomass/yield has been recognized since the first spectrometric field experiments
(Rembold et al., 2013).
stages the effects of soil reflectance influence the values of some vegetation indices for
the detection of crop stress (FAO, 2010). The most commonly used vegetation indices
include the Ratio Vegetation Index (RVI) and the Normalized Difference Vegetation Index
(NDVI).
The Ratio Vegetation Index (RVI) is the simplest of the ratio-based vegetation indices,
calculated from the infrared and red bands of the electromagnetic spectrum.
It is calculated as RVI = IR / R, where IR is the infrared band and R is the red band of the
electromagnetic spectrum. The NDVI is calculated from the same two bands as
NDVI = (IR - R) / (IR + R)
Of all VIs, NDVI stands out and is regarded as an all-purpose index. It
is the most widely used and best understood vegetation index (Sawasawa, 2003).
The well-known NDVI was proposed in 1978 by Deering. The index subsequently
became the most popular indicator for studying vegetation health and crop
production. The success of the NDVI stems from its close relation to the canopy Leaf
Area Index (LAI) and fAPAR (fraction of Absorbed Photosynthetically Active
Radiation). Due to its almost linear relation with fAPAR, the NDVI can readily be used
as an indirect measure of primary productivity (Atzberger, 2013).
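The two index formulas given in this section can be illustrated with a minimal Python sketch. The reflectance values below are hypothetical and chosen only for illustration, and the function names are not taken from any particular software package. Pixels for which the denominator is zero are returned as missing values, which is an assumption of this sketch rather than a rule stated in the sources cited here.

```python
import numpy as np

def rvi(ir, red):
    """Ratio Vegetation Index: RVI = IR / R (NaN where R is zero)."""
    ir = np.asarray(ir, dtype=float)
    red = np.asarray(red, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(red != 0, ir / red, np.nan)

def ndvi(ir, red):
    """Normalized Difference Vegetation Index: (IR - R) / (IR + R)."""
    ir = np.asarray(ir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = ir + red
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(total != 0, (ir - red) / total, np.nan)

# Hypothetical reflectances: dense vegetation, sparse vegetation, bare surface
ir_band = np.array([0.50, 0.30, 0.25])
red_band = np.array([0.08, 0.20, 0.25])

print("NDVI:", ndvi(ir_band, red_band))
print("RVI: ", rvi(ir_band, red_band))
```

Because NDVI is normalized by the sum of the two bands it stays between -1 and 1, whereas RVI is unbounded, which is one practical reason the normalized form is easier to compare across images.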
Figure 2.1 Computation of NDVI.
Source: Fewsnet (2007)
The Water Requirement Satisfaction Index (WRSI) is a geospatial model that was developed
by the Food and Agriculture Organization (FAO) for use with satellite data to monitor water
supply and demand for rainfed crops throughout the growing season. It is also a crop
performance index based on the availability of water in the soil (Legesse and
Suryabhagavan, 2014).
Currently, moisture stress on grain crops can be monitored using WRSI, a
satellite-based crop performance index. This index indicates the extent to which the water
requirement of the crop has been satisfied during the growing season (Tewlde Yideg, 2012).
Technically, WRSI is the ratio of seasonal actual crop evapotranspiration (ETac) to the
seasonal crop water requirement, which is the same as the potential crop
evapotranspiration (PETc). Originally developed by FAO, the WRSI has been adapted
and extended by USGS in a geospatial application to support FEWS NET monitoring
requirements. As a monitoring tool, the crop performance indicator can be assessed at the
end of every 10-day period during the growing season (GeoWRSI v2.0 manual).
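The ratio described above can be sketched in a few lines of Python. The dekadal values are hypothetical, the function name is illustrative, and expressing the ratio as a percentage is an assumption of this sketch (a common convention) rather than something stated in the text.

```python
def wrsi(etac_dekads, petc_dekads):
    """Seasonal WRSI as a percentage: 100 * sum(ETac) / sum(PETc).

    Both arguments are sequences of per-dekad values (e.g. in mm)
    covering the same dekads of the growing season.
    """
    if len(etac_dekads) != len(petc_dekads):
        raise ValueError("ETac and PETc must cover the same dekads")
    seasonal_requirement = sum(petc_dekads)
    if seasonal_requirement <= 0:
        raise ValueError("seasonal water requirement must be positive")
    return 100.0 * sum(etac_dekads) / seasonal_requirement

# Hypothetical dekadal values in mm for a four-dekad window
etac = [20.0, 30.0, 40.0, 30.0]   # actual crop evapotranspiration
petc = [25.0, 40.0, 50.0, 35.0]   # crop water requirement

print(wrsi(etac, petc))  # 120 / 150 of the requirement was met
```

Calling the function on the dekads elapsed so far gives the kind of end-of-dekad assessment mentioned above; a value of 100 would mean the crop water requirement was fully satisfied.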
Solar radiation is one of the most important indicators of the relationship between crop
and weather. This radiation is used directly in crop photosynthesis. In the tropics, if a
crop is not water limited, yields will be higher in a cloudless season than in a wet season.
The effect of temperature is primarily on the development of the crop. In addition, plants
are often sensitive to heat stress during certain development stages (Greatrex, 2012).
Timely and accurate rainfall estimation is of great importance when forecasting crop
yields, as are real-time rainfall observations. Rain gauge networks have traditionally
provided a simple and inexpensive method for daily and dekadal rainfall estimation. In
recent years, these have been complemented by the development of precipitation radar
networks, satellite rainfall estimates (SRFEs) and output from numerical weather
prediction (NWP) models, which have been particularly successful in increasing the
temporal and spatial resolution of the estimates (Novella and Thiaw, 2012).
Rainfall estimates (RFE) are produced specifically with the aim of monitoring African
drought and rainfall. The algorithm uses a mix of microwave and infrared sensors plus
daily rain gauge observations to produce daily rainfall estimates at a scale of 0.1°. The RFE
2.0 combines three satellite rainfall datasets and one rain gauge dataset.
The inputs are: the GOES Precipitation Index (GPI), with 4 km spatial and half-hourly
temporal resolution; the Special Sensor Microwave/Imager (SSM/I),
with 15 km spatial and four-times-daily (6-hour) temporal resolution; the Advanced
Microwave Sounding Unit (AMSU), with 1/3 degree (37 km) spatial and 5-day temporal
resolution; and station rainfall data from the Global Telecommunication System
(GTS), which have uneven spatial coverage and temporal resolutions ranging from minutes
to daily (Novella and Thiaw, 2012).
There has been a limited attempt to validate or compare satellite products over Africa.
Jobard et al. (2011) as cited in Greatrex (2012) compared and validated all of the satellite
products at a 10 day time scale over the Sahel and found that the regionally calibrated
TAMSAT and RFE 2.0 had higher skill.
According to Reynolds et al. (2000) as cited in Greatrex (2012) there are several methods
of yield forecasting. The traditional method of yield forecasting is the evaluation of crop
status by experts. Observations and measurements are made throughout the growing
season such as tiller number, spikelet number and their fertility percentage, percentage of
damage from pests and fungi etc. From the data obtained in this way yield can be
forecasted using regression method. These reports are often subjective, costly, time
consuming and are prone to large errors due to incomplete ground observation, leading to
poor crop yield assessment and crop area estimation. In most countries the data become
available too late for appropriate actions to be taken to avert food shortage. Other
methods used to forecast crop yield are the use of remote sensing and crop simulation
models.
In the first method, with the development of satellites, remote sensing images provide
access to spatial information at global scale and to features and phenomena on earth on
an almost real-time basis. They have the potential not only to identify crop classes but
also to estimate crop yield. As outlined by Becker-Reshef et al. (2010), preliminary
research and development on satellite monitoring of agriculture started with the Landsat 1
system (ERTS) in the early 1970s, when an unanticipated severe wheat shortage in
Russia drew attention to the importance of timely and accurate prediction of world food
supplies. As a result, in 1974, the United States Department of Agriculture (USDA), together
with NASA and NOAA, initiated the Large Area Crop Inventory Experiment (LACIE).
The goal of this experiment was to improve domestic and international crop forecasting
methods.
With the enhancements that became available from the NOAA-AVHRR sensor (Advanced
Very High Resolution Radiometer), allowing daily global monitoring, the
AGRISTARS (Agriculture and Resource Inventory Surveys through Aerospace Remote
Sensing) program was initiated in the early 1980s. One of the most recent efforts that
NASA and the USDA Foreign Agricultural Service (FAS) have initiated is the Global
Agriculture Monitoring (GLAM) project. Besides the GLAM system, there are currently
several other regional to global operational agricultural monitoring systems providing
critical agricultural information at a range of scales, such as the USAID Famine Early Warning
Systems Network (FEWS NET) and the UN FAO Global Information and Early Warning System (GIEWS).
However, the USDA FAS with its GLAM system is currently the only provider of
regular, timely, objective crop production forecasts at a global scale (Atzberger, 2013).
The second method is yield forecasting using agro-meteorological inputs to a statistical
regression, which is used in much yield forecasting research. In general, a simple statistical
model is built using a matrix of historical yields and several agro-meteorological
parameters. A regression equation is then derived expressing yield as a function of one or
more agro-meteorological parameters. The meteorological models used for forecasting
yield are mainly based on two variables, temperature and precipitation, because they are
related to crop yields and can easily be obtained from meteorological stations or satellite
measurements. In rainfed agricultural regions, rainfall is the most important factor
affecting crop growth and yield (George and Hanuschak, 2010).
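The regression step described in this paragraph can be sketched as an ordinary least-squares fit in Python. Everything numeric below is synthetic: the seasonal rainfall and temperature values are invented, and the yields are generated without noise from a made-up linear relation so that the fitted coefficients can be checked by eye. The function names are illustrative and the sketch is not the model developed in this study.

```python
import numpy as np

def fit_linear_model(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk; returns [b0, ..., bk]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # add intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply fitted coefficients to a matrix of predictors."""
    X = np.asarray(X, dtype=float)
    return coef[0] + X @ coef[1:]

def r_squared(y, y_hat):
    """Coefficient of determination of the fit."""
    y = np.asarray(y, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def rmse(y, y_hat):
    """Root mean square error, in the units of y."""
    return float(np.sqrt(np.mean((np.asarray(y, dtype=float) - y_hat) ** 2)))

# Synthetic historical matrix: seasonal rainfall (mm) and mean temperature (deg C)
rain = np.array([300.0, 450.0, 380.0, 520.0, 410.0, 350.0])
temp = np.array([22.0, 20.0, 21.0, 19.0, 20.5, 23.0])
X = np.column_stack([rain, temp])

# Synthetic yields (quintal/ha) from an invented relation, no noise added
y = 5.0 + 0.02 * rain - 0.1 * temp

coef = fit_linear_model(X, y)
y_hat = predict(coef, X)
print("coefficients:", coef)
print("r2:", r_squared(y, y_hat), "RMSE:", rmse(y, y_hat))
```

With real data the fit would not be exact, and the same r² and RMSE functions would summarize how well the regression explains the observed yields.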
According to Rudorff and Batista (1990, as cited in George and Hanuschak, 2010), when
such models are applied at regional level they cannot fully simulate the different crop
growing conditions within the region. Agro-meteorological models are
more commonly applied when integrated with remote sensing.
Research has been carried out throughout the world on the use of remote sensing and
crop simulation models. The methodologies and results are presented as follows.
Benedetti and Rossini (1993) used AVHRR satellite-derived NDVI data for wheat
forecasting in a region of Italy. They derived a simple linear regression model for wheat
yield estimation and forecasting based on NDVI images during the grain season. They
validated their results against official data and found good correlation between the two.
In Mediterranean African countries, Rembold and Maseli (2004) used the NDVI derived
from the AVHRR platform to estimate cereal production and also obtained good results.
However, it has been argued that remote sensing might not be suitable in developing
countries because of their stratified agricultural systems and very small farm sizes.
Meanwhile, the increased availability of high spatial resolution imagery makes this technique a
possible and interesting alternative for yield forecasting.
Many studies have shown that yield forecasts can be obtained using NDVI data of
specific periods, which depend on the climatic conditions of the area and the type of crop
grown. One important limitation of yield/NDVI regression is that most of the
aforementioned studies are linked to the environmental characteristics of specific geographic
areas or are limited by the availability of large and homogeneous datasets of low
resolution data; it is therefore difficult to extend locally calibrated forecasting methods
to other areas or other scales (Rembold et al., 2013).
According to Rembold et al. (2013), where the crop area is not
known the NDVI/yield relationship does not provide information on final crop
production, and the relationship can be used only under specific conditions, such as a stable
crop area over the observed period. In many cases the predictive power of remotely
sensed indicators can be improved by adding independent meteorological (bioclimatic)
variables to the regression model. Several bioclimatic and remote-sensing-based
indicators have proven to be highly correlated with yield for certain crops in specific
areas. These variables can be measured directly (like rainfall from synoptic
stations), estimated by satellites (such as rainfall estimates), or be the result of other models,
like ETa (actual evapotranspiration) or soil moisture.
For crop forecasting, satellite-derived point-specific rainfall estimates were input into a
crop water balance model to calculate WRSI. When these WRSI values were regressed
against historical yield data, the results showed that relatively high-skill yield forecasts can
be made even when the crops are at early stages of growth (Sawasawa, 2003).
According to Victor (1988, as cited in Senay and Verdin, 2003), rainfed crop
performance can be assessed using the water requirement satisfaction index (WRSI). The
bioclimatic variables introduce information about solar radiation, temperature, air
humidity and soil water availability, while the spectral component introduces information
about crop management, varieties and stresses not taken into consideration by the
agro-meteorological models. Such hybrid models show higher correlation and predictive
capability than models using remote sensing indicators only.
Rojas (2006), in his research entitled “crop yield model development in eastern Africa.
Study case of Kenya using spectro agro meteorological model”, concluded that it is
possible to conduct operational maize yield forecasts using CNDVI derived from SPOT
VEGETATION and ETa from the FAO CSWB model. CNDVI was shown to improve the
spectral signal of the maize crop areas compared with the simple spatially averaged
NDVI using the general crop mask. CNDVI proved to be a simple and valid method of
NDVI extraction with low resolution satellite images and highly fragmented high
resolution land cover classes. Owing to this predictive capability, it is possible to obtain an
early forecast using the CNDVI and ETa accumulated from the planting dekad to the end of
the flowering phenological phase. A more accurate estimate is possible when the
maize crop cycle reaches its end, using the CNDVI and ETa accumulated over the whole
length of the maize crop cycle. The results reveal that it is possible to have reliable
predictions 3 to 4 months earlier than the official estimates provided by national
authorities and based on traditional field sampling surveys. As the time series of yield
data was limited, some reservations about the model must be made until a longer series of
yield data becomes available. The simplicity of the proposed regression yield model
should allow operational implementation in developing countries. He recommended
that, based on these encouraging results, regression models be developed for other
geographical areas in Eastern Africa.
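The contrast between a simple spatial average of NDVI and a crop-specific NDVI can be illustrated with a short Python sketch. It assumes that the crop-specific value is an average of pixel NDVI weighted by the fraction of each low resolution pixel occupied by the crop; that weighting scheme, the function name and all values below are assumptions made for illustration and are not taken from Rojas (2006).

```python
import numpy as np

def crop_weighted_ndvi(ndvi, crop_fraction):
    """Average NDVI weighted by the crop area fraction of each pixel.

    Pixels with missing NDVI (NaN) or zero crop fraction are ignored.
    Returns NaN if no pixel contributes.
    """
    ndvi = np.asarray(ndvi, dtype=float)
    weights = np.asarray(crop_fraction, dtype=float)
    if ndvi.shape != weights.shape:
        raise ValueError("NDVI and crop fraction grids must have the same shape")
    valid = ~np.isnan(ndvi) & (weights > 0)
    if not valid.any():
        return float("nan")
    return float(np.sum(ndvi[valid] * weights[valid]) / np.sum(weights[valid]))

# Hypothetical 2 x 2 window of 1 km pixels
ndvi_grid = np.array([[0.6, 0.2],
                      [0.4, np.nan]])     # NaN marks a cloud-flagged pixel
maize_fraction = np.array([[0.5, 0.0],
                           [0.5, 1.0]])   # share of each pixel under maize

print("simple mean:  ", float(np.nanmean(ndvi_grid)))
print("weighted mean:", crop_weighted_ndvi(ndvi_grid, maize_fraction))
```

In this toy window the pixel with no maize pulls the simple mean down, while the weighted mean reflects only the pixels that actually contain the crop.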
Ethiopia currently has two methods of monitoring and forecasting crop yield in advance
of harvest. The first is the CYMFS, run by the Ethiopian National Meteorological Agency (NMA) in
conjunction with the European Union Joint Research Centre (JRC) and the Food and
Agriculture Organization (FAO). It works by combining a geographic information
system (GIS), the FAO crop specific water balance (CSWB) model, the JRC crop
production system zones database and meteorological information from the European
Centre for Medium-Range Weather Forecasts (ECMWF). Additional and independent
real-time Normalized Difference Vegetation Index (NDVI) satellite data from SPOT
VEGETATION are also incorporated, using a specific crop mask to concentrate the
analysis on agricultural areas only. This approach has several potential shortcomings
according to Greatrex (2012). The first is its reliance on an empirical CSWB model rather
than a process-based crop simulation model. Empirical crop simulation models are less
likely to capture complex non-linear interactions between crop and climate (Greatrex,
2012).
Teo (2006) as cited in Greatrex (2012) concluded that a CSWB model performed less
well than a process based model when forecasting groundnut yield over the Gambia and
also ECMWF model shows some exaggeration in rainfall estimation in east Africa.
The second method involves collecting data on crop yield based on stakeholders’
assessment of the crop field compared to the previous year’s yield estimate as collected
by CSA. This method is used by the Ethiopian authorities to monitor crop yield as
official statistics. At the time of harvest for either cropping season, CSA invites key
stakeholders such as farmers’ unions, NGOs and external organizations (FAO) to a meeting
to discuss how the season has developed. These stakeholders are asked to agree on a
percentage change in perceived crop yield from the previous season, recorded in the
questionnaire prepared for the field. For example, it might be decided that the crop yield in Tigray
during 2009 is 80 % of the crop yield in Tigray during 2008. This number is then
multiplied by the previous year’s yield to give a ‘pre-harvest estimate’ (Beyene and
Meissner, 2010).
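The arithmetic behind this pre-harvest estimate can be illustrated with a minimal Python sketch. The function name and the previous-season yield of 15.0 quintal per hectare are hypothetical and serve only as an illustration; the 80 % figure is the one used in the example above.

```python
def pre_harvest_estimate(previous_yield, percent_of_previous):
    """Scale the previous season's yield by the percentage agreed by stakeholders."""
    if previous_yield < 0 or percent_of_previous < 0:
        raise ValueError("yield and percentage must be non-negative")
    return previous_yield * percent_of_previous / 100.0


# Hypothetical previous-season yield (quintal/hectare); 80 % is the agreed ratio.
estimate = pre_harvest_estimate(15.0, 80.0)
print(f"Pre-harvest estimate: {estimate:.1f} quintal/hectare")
```

Because the estimate is a simple product, any error in the agreed percentage carries over proportionally into the reported yield.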
Greatrex (2012) carried out research on Ethiopia using the crop simulation model called
the General Large Area Model for annual crops (GLAM). GLAM is a process-based
model designed to simulate tropical crop production in regions where there is an
observed relationship between climate and crop yield, and it has been shown in studies to
capture the crop/weather relationship at large scales. She found that the model
exhibited the correct sensitivity to climate and performed reasonably when compared
with observed crop yields. The limitations of the approach with the current calibration
dataset are as follows. Ethiopian agriculture is extremely complex, with many different varieties of
maize used by farmers to adapt to their climate; the use of a single cultivar in GLAM
MAIZE therefore led to unrealistic results in some high altitude locations. Moreover, the time a plant takes
to develop to maturity is one of the most important determinants of final yield and is
strongly dependent on growing season temperature. In turn, temperature is highly
dependent on altitude, thus maize grown at high altitudes will experience lower
growing season temperatures, resulting in longer development times and higher yields. In
the study, GLAM MAIZE was run using a single lowland maize cultivar that needs
relatively high temperatures in order to develop. This means that for high altitude, low
temperature pixels, GLAM MAIZE took an unrealistically long time to
develop, which resulted in an unrealistic pattern. Lastly, she recommended that an operational
crop yield forecasting system needs to be able to work at a regional scale if it is to be of
use to policy makers and to substitute for the existing conventional method, which has
proved to be subjective, time consuming, unreliable and untimely.
Currently, maize is the preferred staple for over 900 million poor consumers and is
widely regarded as the most important staple food crop in Africa. The importance of
maize in developing countries can be seen clearly from the analysis done by the
International Maize and Wheat Improvement Center (CIMMYT) for the maize global
mega Programme (Greatrex, 2012).
Among cereals, maize accounts for the largest share in total production and the total
number of farm holdings involved in Ethiopia. In 2010/11, maize accounted for 28
percent of the total cereal production, compared to 20 percent for teff and 22 percent for
sorghum, the second and third most cultivated crops. About 8 million smallholders were
involved in maize production in 2010/11, compared to 6.2 million for teff and 5.1 million
for sorghum. It should be noted that in Ethiopia, smallholder farms account for 95 percent
of the total agricultural production, with large farms contributing to only 5 percent of
total production and to only 2.6 percent of cereal production in particular. The average
farm size is less than one hectare, with 40 percent of the farmers cultivating less than 0.52
hectares. Maize is the largest and most productive crop in Ethiopia. The fastest growth
rates in area cultivated, production and yield were also recorded in the case of maize:
between 2003/04 and 2007/08, maize production expanded by 103 percent; and area
under maize increased by 51 percent while yield increased by 32 percent. The share of
maize in total area has increased by 6 percent between 2003/04 and 2007/08. According
to the data obtained from FAOSTAT, Ethiopia is the second largest producer of maize in
Eastern and Southern Africa, following South Africa. Between 2000 and 2010, it
accounted for 12.3 percent of the total maize production in the region, compared to 36.3
percent for South Africa (Demeke, 2012).
Generally speaking, there are two main agricultural seasons in Ethiopia, namely the short
rainy season (Belg) and the long rainy season (Meher). The Belg
season occurs primarily during February, March and April. It generally consists of short
season crops and, out of the six staple food crops, short cycle maize is considered the
most important. Although the Belg season is important as a hunger
breaker, it accounts for only 3% of total production. The Meher season
is the main agricultural season in Ethiopia and is dominated by teff and maize grown
during the summer rains (Greatrex, 2012).
Figure 2.2 Seasonal calendars of crops.
Source: Livelihood profile for Tigray (2007)
CHAPTER THREE
3 MATERIALS AND METHODS
3.1 Description of the study area
The study was conducted in South Tigray Zone, Tigray Regional State of Ethiopia.
Geographically, it lies between latitudes 12°15′16″N and 13°38′45″N and longitudes 38°59′33″E
and 39°53′20″E, covering a total area of about 9432 km² (Fig 3.1). The altitude of the study
area ranges from 1156 to 3671 m asl.
3.1.1 Population
Based on the 2007 Census conducted by the Central Statistical Agency of Ethiopia
(CSA), South Tigray Zone has a total population of 1,006,504, of which 497,280 are men
and 509,224 women. According to a projection made five years later, the total
population of the zone is 1,166,578, of which 575,797 are men and 590,781 are women.
The population density of the zone is 61 persons per square kilometer according to
the CSA Abstract of 2012 (CSA, 2012).
3.1.2 Temperature
The climatic variables of the study area are largely governed by its topography,
mainly by altitude. The monthly maximum temperature of the zone ranges from
24.5°C in January to 29.5°C in June and its minimum temperature ranges from 10.2°C in
December to 14.8°C in June, according to NMA data reconstructed from station
observations, remote sensing and other proxies for the years 1981 to 2010
(Fig 3.2 and 3.3).
Figure 3.2 Maximum temperature of the study area (1981-2010)
Figure 3.3 Minimum temperature of the study area (1981-2010)
3.1.3 Rainfall
The area is characterized by a bimodal rainfall pattern with a short rainy season “Belg”
from March to April and a long rainy season “Meher” from June to September with a
peak in August. The mean monthly rainfall varies from 10 mm in November to 210 mm in
August, as shown in Figure 3.4.
Figure 3.4 Mean monthly rainfall for the year 1983 - 2010
3.1.4 Cropping condition
The planting or sowing time of different crops varies depending on the onset and
continuity of the rainfall. There are two distinct (bimodal) and traditionally
used cropping seasons. The first is the short cropping season (Belg), which starts as soon as
the last harvest of the previous long rainy (Meher) season crop is over. Successful short
rainy season crops must free the land for the second cropping season and hence are
harvested around May, allowing enough time for land preparation and sowing of the
long rainy season (Meher) crops. The second cropping season is the long rainy season,
as practiced in most parts of the country and in the study area. As shown in Table 3.1,
cereal crops grown in the study area include teff, wheat, maize, barley and sorghum
(Dagnew Belay, 2007).
Table 3.1 Area, production and yield of cereal crops for private peasant holdings, Meher season
2012/13.
Cereal crop    Area (hectare)    Production (quintal)    Yield (quintal/hectare)
Vertisols are clay-rich soils that shrink and swell with changes in moisture content.
During dry periods, the soil volume shrinks, and deep wide cracks form. The soil volume
then expands as it wets up. This shrink/swell action creates serious engineering problems
and generally prevents formation of distinct, well-developed horizons in these soils (Soil
Taxonomy, 2013)².
Cambisols are also a dominant soil type, characterized by the absence
of a layer of accumulated clay, humus, soluble salts, or iron and aluminum oxides. They
differ from unweathered parent material in their aggregate structure, color, clay content,
carbonate content, or other properties that give some evidence of soil-forming processes.
Because of their favorable aggregate structure and high content of weatherable minerals,
they usually can be exploited for agriculture subject to the limitations of terrain and
climate. Cambisols are the second most extensive soil group on earth (Cambisol, 2013)³.
1 http://www.isric.org/about-soils/world-soil-distribution/leptosols accessed on 11/18/2013.
2 http://www.cals.uidaho.edu/soilorders/vertisols.htm accessed on 11/18/2013.
3 http://www.britannica.com/EBchecked/topic/707510/Cambisol accessed on 11/18/2013.
Calcisols occur in regions with distinct dry seasons, as well as in dry areas
where carbonate-rich groundwater comes near the surface. They are soils having a (petro-)calcic
horizon, that is, a horizon with accumulation of secondary calcium carbonates. In addition, they
have no diagnostic horizons other than an ochric or cambic horizon, a calcareous argic
horizon, or a gypsic horizon beneath a petrocalcic horizon (ISRIC, 2013)⁴.
4 http://www.isric.org/about-soils/world-soil-distribution/calcisols accessed on 11/18/2013.
3.2 Data acquisition and software packages
The data used in this study were collected from both primary and secondary sources.
Primary data comprised information captured from satellite imagery and field
observations. Secondary data sources include published and unpublished materials such
as books, topographic and thematic layers, journals, reports of the Meteorological Agency
and the Central Statistical Agency as well as other publications and scientific works. Different
software packages were used to manipulate and analyze these data sets. Details of
the methodology are presented in Figure 3.7.
The data is available from 2001. Compared to other rainfall data like ECMWF, RFE
shows a better estimation (Rijks et al., 2007).
SPOT VEGETATION dekadal (S10) synthesis images became regularly
available from the first of January 2003 and can therefore be used in this research for NDVI
analysis (2003-2013). The SPIRITS software was used to analyze the time series
imagery. The daily synthesis (S1) and ten-day synthesis (S10) products are mosaics of acquired
image segments, respectively for 24h periods and for the last 10 days. A Maximum Value
Composite (MVC) synthesis can be delivered with several spatial resolutions. These three
products are called S10 for 1 km data, S10.4 for 4 km, and S10.8 for a resolution of 8
km (Eerens et al., 2014).
WRSI for a season is based on the water supply and demand a crop experiences during a
growing season. It is calculated as the ratio of seasonal actual evapotranspiration (ETa) to
the seasonal crop water requirement (WR), where WR is the potential evapotranspiration
(PET) scaled by the crop coefficient (Kc):
WR = PET * Kc.
ETa, also written AET, represents the actual (as opposed to the potential) amount of water withdrawn from
the soil water reservoir ("bucket"). Whenever the soil water content is above the
maximum allowable depletion (MAD) level (based on crop type), the AET will remain
the same as WR, i.e., there is no water stress. But when the soil water level is below the MAD
level, the AET will be lower than WR in proportion to the remaining soil water content
(Tinebeb, 2012).
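The bucket rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the reduction of AET below the MAD level is assumed to be linear (AET = WR * soil water / MAD level), the index is expressed in percent, and all input values are invented.

```python
def water_requirement(pet, kc):
    """Crop water requirement for one time step: WR = PET * Kc."""
    return pet * kc


def actual_evapotranspiration(wr, soil_water, mad_level):
    """AET equals WR at or above the MAD level; below it, AET shrinks with soil water.

    Assumption: the reduction is linear, AET = WR * soil_water / mad_level.
    """
    if mad_level <= 0:
        raise ValueError("mad_level must be positive")
    if soil_water >= mad_level:
        return wr  # no water stress
    return wr * max(soil_water, 0.0) / mad_level


def wrsi_percent(aet_steps, wr_steps):
    """Seasonal WRSI as the ratio of total AET to total WR, in percent."""
    total_wr = sum(wr_steps)
    if total_wr <= 0:
        raise ValueError("total water requirement must be positive")
    return 100.0 * sum(aet_steps) / total_wr


# Hypothetical three-step season (all values in mm).
pet = [100.0, 120.0, 90.0]
kc = [0.75, 1.2, 0.6]
soil_water = [80.0, 40.0, 60.0]
mad_level = 50.0

wr = [water_requirement(p, k) for p, k in zip(pet, kc)]
aet = [actual_evapotranspiration(w, s, mad_level) for w, s in zip(wr, soil_water)]
print(f"WRSI = {wrsi_percent(aet, wr):.1f} %")
```

In this invented season only the second step falls below the MAD level, so the index drops below 100 % solely because of that step.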
A multispectral image is an image that contains more than one spectral band. It is formed
by a sensor which is capable of separating light reflected from the earth into discrete
spectral bands, while a panchromatic image contains only one wide band of reflectance
data. This band usually represents a range of ‘n’ bands and wavelengths, such as
visible or thermal infrared; that is, it combines many colors, so it is “pan” chromatic.
Figure 3.6 Trend of Maize crop yield (2004 – 2012) in south Tigray Zone.
Source: CSA annual agricultural report.
3.2.1.7 Materials
Table 3.2 presents the materials and software used in the study, together with their
sources and purposes.
Table 3.2 Summary of equipment and materials used for data collection and analysis.
No Equipment Source Purpose
To derive rainfall
RFE 2.0 (8 km)
this reason, a flexible and user-friendly interface, targeting both national and international
agriculture and food security experts, is highly desirable (Eerens et al., 2014).
Figure 3.7 Methodological flow chart of the spectro-agrometeorological model.
3.3 Data processing and analysis methods
There are two approaches to quantifying crop yield using remote sensing: the
purely remote sensing approach (direct approach) and the mixed approach, in which
additional bioclimatic predictor variables are used. In many cases, the predictive power
of remotely sensed indicators can be improved by adding independent meteorological
(bioclimatic) variables to the regression models, but where the crop
area is stable over the observed period and the area is large and homogeneous, a direct NDVI / production
regression can be applied (Rembold et al., 2013).
Potdar et al. (1999), as cited in Rembold et al. (2013), observed that the spatiotemporal
rainfall distribution needs to be incorporated into crop yield models, in addition to
vegetation indices deduced from remote sensing data, to predict the yield of different
cereal crops grown under rainfed conditions. Such hybrid models show higher correlation and
predictive capabilities than models using remote sensing indicators only as input
variables. Because of these advantages, this research uses the mixed approach,
which is also referred to as the spectro-agrometeorological approach.
3.3.1 Selection of date of satellite imagery and acquiring the time series imagery
The choice of the date of the image is based on the analysis of the information
given by the farmers on the date of transplanting and date of harvesting. The date
was chosen so that the image coincides with the peak vegetation period of
the farmers’ fields (Rembold et al., 2013).
Accordingly, planting in the study area was found to start in the middle of May; this was
cross-checked with the zone livelihood profile, which also states that May is the planting
date for maize in the study area. In general, maize in South Tigray Zone is
planted in May, growth in biomass occurs from June to July and flowering occurs in September,
based on the information obtained from interviews with local farmers.
Following this, SPOT-VGT NDVI dekadal images were freely downloaded from the
website http://www.vito-eodata.be/PDF/portal/Application.html for the months May
to September of 2003 to 2012 (a ten-year time series).
After acquiring the data (which consists of several HDF layers joined in one ZIP file), the
following steps were performed:
1. Extraction of the NDVI product and the so-called ‘Status Map’ using VGT Extract
software. This software is freely available for download on www.agricab.info.
2. Applying the Status Map on the NDVI image in SPIRITS software using the ‘Flag
VGT NDVI’ tool.
After the above two processes were carried out, 150 dekadal (S10) images of SPOT
VGT NDVI (Fig 3.8) were ready for further analysis, with raster values ranging from 0 to
255.
In the same manner, RFE 2.0 dekadal satellite rainfall estimates, available at
http://earlywarning.usgs.gov/fews/downloads/index.php?regionID, were freely downloaded
for the months May to September of 2003 to 2012 (a ten-year time series),
with rainfall estimates ranging from 0 up to the maximum rainfall estimated (Fig
3.9).
Figure 3.8 SPOT VGT image of Ethiopia, 1st dekad of May 2003.
Figure 3.9 RFE 2.0 image of Ethiopia, mean of 2003 (May - September).
Actual evapotranspiration (ETa) and potential evapotranspiration (PET) are further
inputs for the model computation. They were downloaded freely from FEWSNET
(http://earlywarning.usgs.gov/fews/downloads/index.php?) at monthly and annual level, respectively,
for May to September of 2003 to 2012 (Fig 3.10 and 3.11).
Figure 3.10 ETa imagery of September, 2010.
Figure 3.11 PET imagery (2003 mean annual).
3.3.2 Preprocessing of satellite images
A prerequisite for using time series of remote sensing data for agricultural applications is
atmospheric correction and geometric rectification of the dataset. In this study, the S10
images were used. The S10 images represent maximum S1 values within a ten-day
period, which minimizes the effects of clouds and atmospheric optical depth. Atmospheric
corrections for ozone are done on the images before they are delivered to users
(Sawasawa, 2003).
The products of SPOT VEGETATION acquired by MARS are 10-day NDVI synthesis
(S10) images, obtained through Maximum Value Compositing (MVC). The images are
corrected for radiometry, geometry and atmospheric effects and the same is carried out
for RFE 2.0 results (Rojas et al., 2005). However, other preprocessing steps were carried out,
such as projecting the layers, extracting the region of interest, smoothing and filtering.
3.3.2.1 Projection of the data
All data used in this study, including satellite imagery from different sources, topographic
maps and thematic layers such as roads and towns, were projected to the geographic coordinate
system GCS_WGS_1984 with the World Geodetic System 1984 (WGS 84) datum, ensuring
consistency between datasets. Topographic maps were used to check the geometric
rectification of the satellite imagery.
The software used for the analysis of time series data (SPIRITS) includes a
low pass filter tool, which is designed to emphasize larger, homogeneous areas of similar
tone and reduce the smaller detail in an image (Eerens et al., 2014).
A good way of looking at time series at pixel level is the Z-profile tool of ENVI. In order
to export time series to ENVI, MTA (meta) files were created to visualize the time profile
of smoothed and non-smoothed images. Figures 3.12 to 3.15 show the difference
between the smoothed and unsmoothed images and how smoothing corrects values
by removing outliers (Spirit manual, 2013).
Figure 3.12 Unsmoothed S10s time series
Figure 3.13 Spectral profile of unsmoothed data
Figure 3.14 Smoothed S10s time series
Figure 3.15 Spectral profile of smoothed data
3.3.3 Maize crop masking
One obstacle to successful modeling and prediction of crop yields using remotely sensed
imagery is the identification of image masks. Image masking restricts an
analysis to a subset of the image. With cropland masking, all sufficiently cropped pixels are
included in the mask, but the ideal approach would be to use crop specific masks, which
would allow one to consider only information pertaining to the crop of interest. However,
this approach is not applicable to areas such as the study area,
where crop rotation is practiced. Therefore, masking at cropland level is acceptable for
such areas (Rijks et al., 2007). For this reason, pan-sharpened SPOT 5 imagery was used
to carry out land cover classification for the study area.
A pan-sharpened SPOT 5 image of 2006 was acquired from CSA and used for image
classification by visual image interpretation (Fig 3.16). The pan-sharpened SPOT 5
image was then processed in Definiens software for object based classification.
Figure 3.16 SPOT image of the study area.
Object based image analysis requires the creation of objects or separated regions in an
image. One established way to do so is image segmentation. Depending on its
application, different approaches exist for image segmentation, ranging from very simple
to highly sophisticated algorithms (Kindu et al., 2013).
based on defined parameters. These parameters are scale, shape and compactness, and they are
defined through trial and error to successfully segment objects in an image (Kindu et al.,
2013).
Using the identified target LULC classes, object based classification was applied to the
segmented image in order to assign a class to each of the segments, using MaDCAT
software and visual interpretation. This approach attempts to
assign objects that are generated through image segmentation to a specific class of
interest, in our case the agriculture and non-agriculture classes (Fig 3.17).
Figure 3.17 Interpreted image of the study area.
Figure 3.18 Agricultural land of the study area.
3.3.3.2. Accuracy assessment
Land cover accuracy is commonly defined as the degree to which the derived
classification agrees with reality, and the accuracy of a map largely determines
its usefulness (Ashenafi Burqa, 2008).
Accuracy assessment is critical for a map generated from any remote sensing data. The error
matrix is the most common way to present the accuracy of classification results.
Overall accuracy, user's and producer's accuracies, and the Kappa statistic are
derived from the error matrix. The Kappa statistic incorporates the off-diagonal
elements of the error matrix and represents the agreement obtained after removing the
proportion of agreement that could be expected to occur by chance.
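How these measures follow from an error matrix can be shown with a short Python sketch. The counts below are hypothetical and are not the error matrix of this study; the layout (rows as classified classes, columns as reference classes) and the function name are assumptions made for the illustration.

```python
def accuracy_metrics(matrix):
    """Derive accuracy measures from a square error matrix.

    Assumed layout: rows are classified (map) classes, columns are reference classes.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    diagonal = [matrix[i][i] for i in range(k)]

    overall = sum(diagonal) / n  # observed agreement
    # Agreement expected by chance, from the row and column totals
    chance = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (overall - chance) / (1.0 - chance)
    users = [d / r for d, r in zip(diagonal, row_totals)]
    producers = [d / c for d, c in zip(diagonal, col_totals)]
    return overall, kappa, users, producers


# Hypothetical counts for two classes, for illustration only.
error_matrix = [
    [45, 5],   # classified as agriculture
    [10, 40],  # classified as non-agriculture
]
overall, kappa, users, producers = accuracy_metrics(error_matrix)
print(f"overall accuracy = {overall:.2f}, kappa = {kappa:.2f}")
```

With these invented counts the overall accuracy is 0.85 while kappa is 0.70, which shows how kappa discounts the half of the agreement that two balanced classes would reach by chance.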
For an accuracy assessment of a map being produced by object based image analysis
(OBIA) the units to be tested are the image objects. Therefore, the OBIA classification is
validated with the representation of the whole polygon majority class. Accordingly, the
above interpreted classes (i.e. agriculture and non-agriculture) were equally represented.
A sufficient number of samples that represent the thematic classes and are well
distributed across the map is important to test the attribute accuracy. A rule of thumb is 50
samples per map class; alternatively, the number can be derived using the formula devised by Grenier et al.
(2008).
Accordingly, the sample size for the accuracy assessment was found to be 288, that is, 144
sample points for each class. These points were randomly generated
for each class and their coordinates were uploaded to a GPS receiver for the field accuracy
assessment (Fig 3.19).
These points were checked in two ways: those that were accessible were observed in the
field, and the rest were checked using Google Earth as a reference. The resulting
error matrix for the 288 sample points is presented in Table 3.3.
Overall accuracy and kappa analysis were used to assess the classification
accuracy. The overall accuracy is 87% and the kappa coefficient
is 0.74, so the interpretation can be taken as
sufficiently accurate for further analysis. Detailed calculations of user's and producer's accuracy and
sample photos are given in Appendices 2 and 3.
The land cover interpretation result, restricted to the agricultural land area, was
intersected with the crop agro-ecological zone suitable for maize to refine the
interpretation (Fig 3.20) and to arrive at a more crop specific mask for the study area
(Rijks et al., 2007).
3.3.4. Preparing independent variables using mask data
To determine the predictive capability of the independent variables, all variables were
extracted with the crop mask for correlation analysis, in order to identify those highly
correlated with the dependent variable, maize yield.
The NDVI time series (150 dekadal images), which had passed through the image
preprocessing steps in one go, was ready for monthly maximum value compositing (MVC), and
50 monthly composited NDVI images were prepared. These monthly NDVI images were
then extracted using the crop mask to focus only on the crop of interest, and the average
NDVI, cumulated NDVI and maximum NDVI values for each year were computed. The
calculated values are raster values ranging from 0 to 255 and need to be converted
to NDVI values. Thus, the formula NDVI = (RAW*0.004) - 0.1 was applied, and the results
were ready for correlation with maize yield (Fig 3.22) (Eerens and Haesen, 2013).
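The chain of operations described above (monthly maximum value compositing, extraction with the crop mask, conversion of raster values to NDVI, and the yearly average, cumulated and maximum values) can be sketched in plain Python. The study carried out these steps in SPIRITS; the function names, the four-pixel images and the monthly values below are hypothetical and serve only to illustrate the arithmetic.

```python
def max_value_composite(images):
    """Per-pixel maximum over a list of equally sized images (flat lists of raster values)."""
    if not images or any(len(img) != len(images[0]) for img in images):
        raise ValueError("images must be non-empty and of equal size")
    return [max(pixel) for pixel in zip(*images)]


def masked_mean(image, mask):
    """Mean of the pixels where the crop mask equals 1."""
    selected = [value for value, flag in zip(image, mask) if flag == 1]
    if not selected:
        raise ValueError("mask selects no pixels")
    return sum(selected) / len(selected)


def raw_to_ndvi(raw):
    """Convert a 0-255 raster value to NDVI: NDVI = (RAW * 0.004) - 0.1."""
    return raw * 0.004 - 0.1


def seasonal_indices(monthly_ndvi):
    """Average (NDVIa), cumulated (NDVIc) and maximum (NDVIx) NDVI for one year."""
    return {
        "NDVIa": sum(monthly_ndvi) / len(monthly_ndvi),
        "NDVIc": sum(monthly_ndvi),
        "NDVIx": max(monthly_ndvi),
    }


# Three hypothetical dekadal images of one month, four pixels each.
dekads = [[100, 150, 90, 30], [120, 140, 95, 25], [110, 160, 85, 20]]
crop_mask = [1, 1, 1, 0]  # last pixel is not cropland

monthly_raw = max_value_composite(dekads)
monthly_ndvi = raw_to_ndvi(masked_mean(monthly_raw, crop_mask))
print(f"monthly NDVI over the crop mask = {monthly_ndvi:.2f}")
print(seasonal_indices([0.2, 0.3, 0.5, 0.4]))  # hypothetical monthly values
```

Because the conversion is linear, converting the masked mean of the raster values gives the same result as averaging the converted pixels.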
The RFE 2.0 dekadal time series was also composited at monthly level using
MVC and extracted with the crop mask, and the yearly average was computed from the
extracted results for further analysis (Fig 3.25).
The WRSI is the ratio of seasonal actual crop evapotranspiration (ETac) to
the seasonal crop water requirement, which is the same as the potential crop
evapotranspiration (PETc). Here the maize crop coefficients from LEAP software were
adopted (Fig 3.21) for each phenological stage, as follows:
July – 0.75
August – 1.2
September – 0.6
Figure 3.21 Crop coefficient of maize at different stages (Planting-Flowering).
(Source: LEAP software)
Monthly ETa values were multiplied by their respective coefficients, extracted using the crop
mask and averaged to give ETac for each year. The same procedure was
followed for the water requirement, resulting in PETc. The ratio of ETac to PETc
gives the WRSI, which was prepared for further analysis (Fig 3.23 and 3.24).
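A minimal Python sketch of this computation for a single year is given below. It uses the three crop coefficients listed above; the mask-averaged monthly ETa and PET values are hypothetical, the function names are illustrative, and expressing the ratio in percent is an assumption made here because the WRSI values reported in Table 3.4 lie between 38 and 70.5.

```python
# Maize crop coefficients quoted in the text (adopted from LEAP).
KC = {"July": 0.75, "August": 1.2, "September": 0.6}


def crop_adjusted_mean(monthly_values, kc=KC):
    """Average of the monthly values after multiplying each month by its Kc."""
    weighted = [monthly_values[month] * kc[month] for month in kc]
    return sum(weighted) / len(weighted)


def wrsi(monthly_eta, monthly_pet, kc=KC):
    """WRSI as the ratio ETac / PETc, expressed in percent (assumed scaling)."""
    petc = crop_adjusted_mean(monthly_pet, kc)
    if petc <= 0:
        raise ValueError("PETc must be positive")
    return 100.0 * crop_adjusted_mean(monthly_eta, kc) / petc


# Hypothetical mask-averaged monthly values (mm) for one year.
eta = {"July": 80.0, "August": 100.0, "September": 60.0}
pet = {"July": 120.0, "August": 130.0, "September": 110.0}
print(f"ETac = {crop_adjusted_mean(eta):.1f}, PETc = {crop_adjusted_mean(pet):.1f}")
print(f"WRSI = {wrsi(eta, pet):.1f} %")
```

The August coefficient of 1.2 gives that month the largest weight in both ETac and PETc, so a water deficit in August lowers the index more than the same deficit in September.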
Figure 3.22 NDVI value for the month of July, 2003.
Figure 3.23 Mean ETa for the month of May, 2003.
Figure 3.24 PET for the month of May, 2003.
Figure 3.25 RFE for the month of May, 2003.
3.3.5 Multiple linear regression analysis
Spectro-agrometeorological yield forecasting using multiple linear regression starts with
a table of data containing yields as the dependent variable and a series of agrometeorological and
other variables which are thought to determine the yields (Gommes, 2001).
Before correlating the indices with maize yield, data quality control was carried out. This
is a checking mechanism for the evaluation of collected data before it is used for model
development. The statistical check applied is the identification of “outliers” within the collected
data. Based on the scatter plot, the 2005 values, which lie far from
the fit line, were considered outliers and removed from the correlation and hence from
model development (JMP manual, 2009). Table 3.4 shows the values of yield and
independent variables excluding the outliers.
Table 3.4 Observed yield and independent variables.
No.  Year (Meher season)  Yield (Qt/hectare)  May   June  July  August  Sept  NDVIc  ETa    ETa Total  WRSI  RFE2.0
1    2004                 10.3                0.25  0.2   0.3   0.48    0.49  1.23   100    351.6      42.2  15.912
2    2006                 17.29               0.29  0.24  0.25  0.11    0.04  0.89   139.9  539.1      69.2  33.03333
3    2007                 17.93               0.22  0.18  0.39  0.54    0.53  1.33   148.6  574        51.4  31.12667
4    2008                 14.66               0.2   0.21  0.25  0.42    0.47  1.08   113.1  462.6      70.5  18.09
5    2009                 15.68               0.22  0.18  0.28  0.32    0.04  1      107.2  402.4      38    18.06667
6    2010                 21.64               0.25  0.2   0.31  0.46    0.21  1.22   150.6  588.3      67.4  40.75
7    2011                 15.38               0.22  0.2   0.14  0.47    0.53  1.03   156.2  568.8      43.5  21.91
8    2012                 12.27               0.3   0.23  0.32  0.5     0.53  1.35   157.7  581.3      56.9  23.95333
The objective of multiple linear regression analysis is to predict a single dependent
variable from a set of independent variables. The method rests on some assumptions:
(a) the criterion variable is assumed to be a random variable; (b) the relationship is
statistical (estimating an average value) rather than functional
(calculating an exact value); (c) the relationship between the
predictors and the criterion variable is linear. Multiple regression analysis
provides a predictive equation:
$Y = a + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$
where a = constant and
b_1, b_2, …, b_n = partial regression coefficients.
The b's are the regression coefficients, representing the amount the dependent variable Y
changes when the corresponding independent variable changes by 1 unit. The constant a is where
the regression line intercepts the y axis, representing the value the dependent variable Y takes
when all the independent variables are 0. The standardized versions of the b coefficients are
the beta weights, which reflect the relative impact on the criterion variable, and the ratio of
the beta coefficients is the ratio of the relative predictive
power of the independent variables (JMP manual, 2009). Lastly, the developed model
predicts the average value of one variable (Y) from the values of the other variables (X).
The X variables are also called predictors. Generally, this model is called
a regression model.
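How the constant a and the coefficients b are obtained can be illustrated with a small least-squares sketch in plain Python. The study itself fitted the model in JMP; the code below is an independent illustration that solves the normal equations by Gauss-Jordan elimination, and the function names and the synthetic data (generated from Y = 1 + 2x1 + 3x2) are assumptions made for the example.

```python
def fit_linear_regression(predictors, response):
    """Least-squares fit of Y = a + b1*x1 + ... + bn*xn via the normal equations.

    Returns the list [a, b1, ..., bn].
    """
    design = [[1.0] + list(row) for row in predictors]  # leading 1 carries the constant a
    p = len(design[0])
    if len(design) != len(response) or len(design) < p:
        raise ValueError("need at least as many observations as coefficients")

    # Normal equations: (X^T X) coef = X^T y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(p)]

    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(xtx[r][col]))
        if abs(xtx[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; the system is singular")
        xtx[col], xtx[pivot] = xtx[pivot], xtx[col]
        xty[col], xty[pivot] = xty[pivot], xty[col]
        for row in range(p):
            if row != col:
                factor = xtx[row][col] / xtx[col][col]
                xtx[row] = [a - factor * b for a, b in zip(xtx[row], xtx[col])]
                xty[row] -= factor * xty[col]
    return [xty[i] / xtx[i][i] for i in range(p)]


def predict(coefficients, x_values):
    """Evaluate Y = a + b1*x1 + ... + bn*xn for one observation."""
    a, *b = coefficients
    return a + sum(bi * xi for bi, xi in zip(b, x_values))


# Synthetic data generated from Y = 1 + 2*x1 + 3*x2 (not study data).
x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [1.0, 3.0, 4.0, 6.0, 8.0]
coefficients = fit_linear_regression(x, y)
print([round(c, 6) for c in coefficients])
print(round(predict(coefficients, (3.0, 2.0)), 6))
```

With two predictors, x1 and x2 would stand for variables such as rainfall and an NDVI index, and the fitted coefficients would then be read exactly as described in the paragraph above.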
CHAPTER FOUR
4 RESULTS
4.1 Correlation analysis of different indices with maize yield
The first step in developing a model is to correlate the independent variables with the
dependent variable. The predictive capability of each independent
variable is judged from the correlation results (coefficient of correlation, R
square, R square adjusted, P value, RMSE). In addition, the assumptions are checked, and a
variable that meets them is considered for model development. All statistical analyses were undertaken
in JMP statistical discovery software from SAS.
R square 0.63, R square adjusted 0.57,
RMSE 2.28, P value 0.02
R square 0.00, R square adjusted -0.17,
RMSE 3.76, P value 0.9452
NDVIa shows the highest correlation coefficient among the NDVI variables (r
= 0.79) with a significant P value (0.0183), followed by NDVIc (r = 0.44), while NDVIx
shows r = -0.023, indicating that its relationship with yield is not linear. This violates one of the
assumptions of the multiple linear regression model, so NDVIx was rejected from model
development. Comparing NDVIa and NDVIc, both of which
satisfy the assumptions for multiple linear regression, their correlation
coefficients suggest that average NDVI is a better indicator for crop yield forecasting than
NDVIc, and it was selected for model development from the NDVI categories.
Table 4.2 Correlation result of rainfall
                              Yield in quintal per hectare  Rainfall
Yield in quintal per hectare  1.0000                        0.8475
The result shows a high correlation (r = 0.85) with a significant P value of 0.0079, and
it also satisfies the assumptions for multiple linear regression analysis. From this, it can
be observed that rainfall is a determinant factor for crop production in areas like South
Tigray where rainfed agriculture is practiced, and this variable was taken for model
development together with the previously identified NDVIa.
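The rainfall correlation can be recomputed from the yield and RFE2.0 columns of Table 3.4 with a few lines of Python. The statistics in this study were produced in JMP; the sketch below is only an independent check, with illustrative function names, and it also shows the usual adjustment of R square for the number of observations and predictors.

```python
import math

# Values transcribed from the yield and RFE2.0 columns of Table 3.4 (2005 excluded).
yield_qt_ha = [10.3, 17.29, 17.93, 14.66, 15.68, 21.64, 15.38, 12.27]
rfe = [15.912, 33.03333, 31.12667, 18.09, 18.06667, 40.75, 21.91, 23.95333]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("samples must have equal length of at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjusted_r_square(r_square, n, predictors=1):
    """R square adjusted for sample size n and the number of predictors."""
    return 1.0 - (1.0 - r_square) * (n - 1) / (n - predictors - 1)


r = pearson_r(rfe, yield_qt_ha)
r_square = r * r
print(f"r = {r:.4f}, R square = {r_square:.2f}, "
      f"R square adjusted = {adjusted_r_square(r_square, len(rfe)):.2f}")
```

The coefficient obtained this way is close to the 0.8475 reported in Table 4.2. The same adjustment applied to an R square of 0.63 with eight observations gives about 0.57, in line with the values reported for the NDVI regression.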
Table 4.3 Correlation result of WRSI values
                              Yield in quintal per hectare  WRSI
Yield in quintal per hectare  1.0000                        0.4350
The result shows no significant correlation between WRSI and yield in the study
area (the P value is not significant), and thus this variable could not be considered
for model development even though it satisfies the assumptions for the multiple linear
regression model. This may be related to the fact that the WRSI model is
particularly successful in capturing the response of the crop during a relatively dry year.
In areas that never experienced water deficit during the study period, it was possible to
infer the magnitude of yield variability that was caused by factors other than water
supply.
4.1.4 Correlation of ETa with maize yield
In a related study carried out in Kenya, ETa accumulated over the whole
crop cycle showed a high correlation with maize yield. Likewise, ETa
variables were evaluated here by computing the cumulated and average ETa for the whole cycle.
Table 4.4 and Figures 4.6 and 4.7 show the correlation of the different ETa variables with
maize yield.
R square 0.33, R square adjusted 0.23,
RMSE 3.1, P value 0.13
The correlation results show no significant correlation between
either average or cumulated ETa and maize yield in the study area, even though their
correlations satisfy the assumptions for the multiple linear regression model. Therefore, these
variables could not be considered for model development.
4.2 Multiple linear regression model for yield forecasting
The correlation tables above present the correlation of maize yield with the independent
variables. Among the seven independent variables, which comprise variables derived from
remote sensing and climatic variables, those that satisfy the assumptions for the multiple
linear regression model and have a significant P value at the 95 % confidence level were
selected for model development. Accordingly, rainfall shows the highest correlation
coefficient (r = 0.85) with a significant P value (0.0079) at the 95 % confidence level.
NDVIa, which is the average of the monthly maximum value composites (MVC) of NDVI from
the planting date to the end of the crop cycle, gives a correlation coefficient of 0.80
with a significant P value of 0.0183 at the 95 % confidence level. Other variables, such
as ETa (total) with a correlation coefficient of 0.58 and WRSI (r = 0.44), have P values
greater than 0.05, which is beyond the acceptable range at the 95 % confidence level, and
were therefore excluded from model development.
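The NDVIa variable described above can be illustrated with a short sketch. It assumes a
dekadal NDVI series with three dekads per month covering whole months of the crop cycle;
the function names and the sample values are hypothetical and are not taken from the
thesis data or from the SPIRIT software.

```python
def monthly_mvc(dekadal_ndvi, dekads_per_month=3):
    """Maximum value composite (MVC) per month from a dekadal NDVI series."""
    if not dekadal_ndvi or len(dekadal_ndvi) % dekads_per_month != 0:
        raise ValueError("series must be non-empty and cover whole months")
    return [
        max(dekadal_ndvi[i:i + dekads_per_month])
        for i in range(0, len(dekadal_ndvi), dekads_per_month)
    ]

def ndvi_average(dekadal_ndvi):
    """Average of the monthly MVC values over the crop cycle (NDVIa)."""
    composites = monthly_mvc(dekadal_ndvi)
    return sum(composites) / len(composites)

# Hypothetical dekadal NDVI values for a two-month window (not thesis data).
example_series = [0.20, 0.30, 0.25, 0.50, 0.40, 0.45]
example_ndvia = ndvi_average(example_series)
print(monthly_mvc(example_series), round(example_ndvia, 3))
```

In practice the same operation would be applied per pixel or per administrative unit from
planting date to the end of the cycle; the list-based form here only shows the arithmetic.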
The researcher selected the two most correlated variables, rainfall and NDVIa, to create a
multiple linear regression model, since many studies report that linear regression
modeling is the most common method for producing yield predictions from remote sensing
derived indicators together with bioclimatic information.
For this study, maize yield data and data derived from the different indices were prepared
for multiple linear regression analysis. The data were organized in an Excel sheet and
imported into the JMP statistical software, which was used to build a multiple linear
regression model using the two most correlated variables.
Scatterplots, such as those presented earlier in this chapter, help to visualize
relationships between variables. Once a relationship is visualized, the next step is to
analyze it so that it can be described numerically; this numerical description is called
a model.
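A two-predictor linear model of the form yield = b0 + b1 * rainfall + b2 * NDVIa can be
fitted by ordinary least squares. The sketch below solves the normal equations in plain
Python; it is an illustration of the technique, not the JMP implementation used in the
study. All input values and the coefficients used to generate them are hypothetical, and
the function names are assumptions.

```python
def fit_two_predictor_ols(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    n = len(y)
    if not (len(x1) == len(x2) == n) or n < 3:
        raise ValueError("need at least 3 observations of equal length")
    rows = [(1.0, a, b) for a, b in zip(x1, x2)]
    # Build the augmented 3x4 system (X'X | X'y).
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(aug[k][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(3):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [a - factor * b for a, b in zip(aug[k], aug[col])]
    return tuple(aug[i][3] / aug[i][i] for i in range(3))

def predict(coeffs, rainfall, ndvia):
    """Predicted yield for one observation."""
    b0, b1, b2 = coeffs
    return b0 + b1 * rainfall + b2 * ndvia

# Hypothetical inputs (not thesis data): yields are generated from known
# coefficients so that the fit can be checked against them.
rain = [300.0, 350.0, 420.0, 380.0, 500.0, 460.0]
ndvi = [0.35, 0.42, 0.40, 0.50, 0.55, 0.48]
true_coeffs = (1.0, 0.02, 10.0)
yield_qt = [predict(true_coeffs, r, v) for r, v in zip(rain, ndvi)]

coeffs = fit_two_predictor_ols(rain, ndvi, yield_qt)
print([round(c, 4) for c in coeffs])
```

Because the synthetic yields contain no noise, the fitted coefficients should reproduce
the generating ones up to rounding error; with real data the residuals would feed the
validation statistics discussed in this section.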
As a result of the above processes, the highly correlated variables (NDVIa and RFE
rainfall) were used to develop a model. The model was validated based on its coefficient
of determination (R2), root mean square error (RMSE) and coefficient of variation (CV),
as shown in Figure 4.8.
Figure 4.8. Comparison between the maize yield estimated by the spectro-agrometeorological
model and the observed yields for the study area, with the regression line.
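The three validation statistics named above (R2, RMSE and CV) can be computed from paired
observed and predicted yields. The sketch below is illustrative only. It assumes that
RMSE is the square root of the residual mean square (residual sum of squares divided by
n - p - 1, the usual regression-report convention) and that CV is RMSE expressed as a
percentage of the mean observed yield; both are assumptions about the definitions, and
the data are hypothetical.

```python
import math

def validation_metrics(observed, predicted, n_predictors):
    """Return (R2, RMSE, CV in percent) for a fitted regression model."""
    n = len(observed)
    if n != len(predicted) or n <= n_predictors + 1:
        raise ValueError("need more observations than model parameters")
    mean_obs = sum(observed) / n
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - sse / sst
    # Residual degrees of freedom: n minus predictors minus the intercept.
    rmse = math.sqrt(sse / (n - n_predictors - 1))
    cv = 100.0 * rmse / mean_obs
    return r2, rmse, cv

# Hypothetical observed and predicted yields in quintal per hectare.
observed_yield = [10.0, 12.0, 14.0, 16.0, 18.0]
predicted_yield = [11.0, 11.0, 15.0, 15.0, 18.0]
r2, rmse, cv = validation_metrics(observed_yield, predicted_yield, n_predictors=2)
print(f"R2 = {r2:.2f}, RMSE = {rmse:.3f} qt/ha, CV = {cv:.1f} %")
```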
Examining the overall fit of the model by plotting the actual yield per hectare against
the predicted yield per hectare reveals that most points lie fairly close to the 45° line
(exact prediction line). The R square value of the model is 0.88 and the adjusted R square
is 0.84, with a root mean square error of 1.405277 quintal per hectare. The P value of the
model is very small (0.0046), which is significant at the 95 % confidence level and is
good evidence that at least one independent variable is a significant predictor of maize
yield in quintal per hectare. From this P value alone, however, it is unclear which
independent variable is a good predictor and which is a poor one.
Table 4.5, which shows the parameter estimates of the model, reveals that rainfall has a
significant probability value and hence a higher predictive capability than NDVIa.
Table 4.5 Parameter estimates for the maize forecast model
Term Estimate Std Error t Ratio Prob>|t|
Intercept -1.063621 3.465188 -0.31 0.7713
The analysis of variance shown in Table 4.6 indicates that the maize yield forecast model
has an observed significance probability (Prob>F) of 0.0046, which is significant at the
0.05 level.
Table 4.6 Maize yield forecast model variance analysis
Source DF Sum of Squares Mean Square F Ratio
Model 2 74.994575 37.4973 18.9879
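The figures in the model row of Table 4.6 can be cross-checked against the RMSE reported
for the model. The sketch below uses only the three values printed in the table and the
standard ANOVA identities (mean square = sum of squares / degrees of freedom, and
F ratio = model mean square / error mean square). The variable names are illustrative.

```python
import math

# Values reported in the model row of Table 4.6.
model_df = 2
model_sum_of_squares = 74.994575
f_ratio = 18.9879

# Mean square = sum of squares / degrees of freedom.
model_mean_square = model_sum_of_squares / model_df

# F ratio = model mean square / error mean square, so the error mean
# square can be recovered from the two reported numbers.
error_mean_square = model_mean_square / f_ratio

# RMSE taken as the square root of the error mean square.
rmse = math.sqrt(error_mean_square)

print(f"model mean square = {model_mean_square:.4f}")
print(f"error mean square = {error_mean_square:.4f}")
print(f"RMSE              = {rmse:.4f} quintal per hectare")
```

The recovered RMSE is about 1.405 quintal per hectare, which agrees with the root mean
square error reported for the model.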
4.3 Comparison of conventional crop yield forecast with the developed
model
As is the case in many countries, CSA estimates crop production by convening
knowledgeable people and stakeholders. With the objective of maintaining and improving
data quality in terms of reliability and accuracy, the Central Statistical Agency (CSA)
increased the number of stakeholder groups from which data on condition factors are
collected from one, as was the practice prior to the year 2005/06 (1998 E.C.), to five.
Since then, the Annual Crop Production Forecast Survey has included stakeholders (sampled
households, development agents, chairpersons of the rural kebeles, community leaders, and
observations from highly qualified professionals from CSA and FAO) as the ultimate
statistical units for collecting “condition factors” (CSA, 2013).
Under such circumstances, any system which will avoid bias and ensure at least a
reasonable degree of consistency from year to year and from place to place should be
preferred (Gommes, 2001).
The other point of difference is that the remote sensing supported approach provides
location information: after the forecast is made, the result can be verified by taking GPS
readings and navigating to the areas concerned. This approach therefore makes it possible
to indicate exactly which areas have high yield and which have low yield in a tangible
manner, something the conventional approaches lack.
Therefore, maize yield forecasting using remote sensing and GIS improves both the quality
and the timeliness of the data; moreover, it considerably reduces subjectivity. The remote
sensing supported approach can also identify areas (lower administrative areas) with
relatively high, medium and low production, which makes intervention easier for decision
makers.
Figure 4.9 illustrates the comparison between the conventional yield estimate and the
result of the remote sensing supported approach.
Figure 4.9. Comparison between maize yield (quintal/hectare) estimated by the model and the
observed yield.
4.4 A Maize crop forecast for the year 2013
Using the developed model, the 2013 maize crop forecast was made. Accordingly an
average of 16.2 quintal per hectare is expected with 20.63 quintal per hectare as the
highest and 11.84 quintal per hectare as the lowest. Table 4.7 shows the productivity
level of maize crop for the year 2013 in the study area.
Table 4.7 Maize production level of the year 2013 for south Tigray zone
Level of production   Quintal per hectare   Area coverage (%)
I                     19 - 21               9.6
II                    17 - 18               26.3
III                   12 - 16               64.1
The above table indicates that 64.1% of the area covered with maize falls in the 12 – 16
quintal/ha production level, while 9.6% of the area falls in the 19 – 21 quintal/ha
level. The remaining 26.3% of the area is within the 17 – 18 quintal/ha range.
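The grouping of forecast yields into the three production levels of Table 4.7 can be
sketched as a simple classification. The class limits below are taken from the table;
rounding each forecast to the nearest whole quintal before classification is an
assumption of this sketch, as are the function names and the sample values.

```python
def production_level(yield_qt_per_ha):
    """Map a forecast yield (quintal/ha) to a production level of Table 4.7."""
    # Round half up to the nearest whole quintal (an assumption of this sketch).
    rounded = int(yield_qt_per_ha + 0.5)
    if 19 <= rounded <= 21:
        return "I"
    if 17 <= rounded <= 18:
        return "II"
    if 12 <= rounded <= 16:
        return "III"
    return None  # outside the ranges listed in Table 4.7

def area_percentages(yields):
    """Percentage of equally sized units falling in each production level."""
    counts = {"I": 0, "II": 0, "III": 0}
    for value in yields:
        level = production_level(value)
        if level is not None:
            counts[level] += 1
    total = sum(counts.values())
    if total == 0:
        return counts
    return {level: 100.0 * count / total for level, count in counts.items()}

# The first three values are the highest, lowest and average forecast
# reported in Section 4.4; the fourth is hypothetical.
sample_forecasts = [20.63, 11.84, 16.2, 17.4]
print([production_level(v) for v in sample_forecasts])
print(area_percentages(sample_forecasts))
```

Applied to every maize pixel of the forecast map, counts of this kind would give the area
coverage percentages of the table, provided all pixels represent equal areas.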
As the above table illustrates the productivity of maize crop in south Tigray zone in three
categories, the following figure 4.10 demonstrates the spatial situation of the production
level.
Figure 4.10 Maize yield forecast map of 2013.
The 2013 maize forecast was also compared with zone agricultural office data through
personal communication with Mr. Nebeyiu Legesse. The comparison revealed that the south
western part of the study area (Ofla Woreda), which falls in the most productive area
according to the forecast, was also confirmed as the most productive Woreda by the
agriculture office.
CHAPTER FIVE
5 DISCUSSION
The regression-based model developed in this study used official crop statistics from CSA
to derive a relationship between reported maize yield statistics and variables derived
from remotely sensed parameters. A correlation analysis was carried out for the seven
identified variables, and the variables with high correlation coefficients and significant
probability values, rainfall (r = 0.85) and NDVI average (r = 0.80), were selected. These
variables were then used for model building with multiple linear regression.
The model showed promising results in that it has high prediction capability (R2 0.88 and
RMSE 1.4 qt/ha). Compared with other studies, some findings align with this result and
others do not.
According to the observation of Potdar et al. (1999), as cited in Greatrex (2012), rainfall
distribution needs to be incorporated into crop yield models in addition to vegetation
indices deduced from remote sensing data. This observation agrees with the result of this
research, in which rainfall showed a high correlation with maize yield in the analysis
identifying determinant factors for the maize crop forecast.
A related study applied remote sensing and agrometeorological variables, namely actual
evapotranspiration (ETa) calculated by the FAO CSWB model and NDVI, as independent
variables in a regression analysis to estimate maize yield in Kenya (Rojas, 2006). The two
most correlated variables, ETa total and CNDVI, with correlation coefficients of 0.73 and
0.87 respectively, were combined in a model that explained 83% of the maize yield variance
with an RMSE of 0.333 t/ha and a coefficient of variation of 21 %. Compared with the
result for south Tigray zone, the model developed by Rojas used ETa instead of rainfall,
and ETa is not the most correlated variable in this case. Nevertheless, that result showed
that a spectro-agrometeorological model is feasible in fragmented agricultural land, which
strengthens the acceptability of the spectro-agrometeorological maize model developed for
south Tigray zone, since the model was devised for fragmented land.
Rijks et al. (2007) demonstrate that GEOWRSI is a tool that can be used for reliable and
early estimation of maize production in Kenya. According to their findings, WRSI can help
improve yield and production estimates. The findings of this research in south Tigray do
not agree with theirs, because WRSI has a small correlation coefficient and an
insignificant P value with maize yield.
The correlation coefficient between the observed and predicted yield values was computed
for the study area and found to be 0.94. This result agrees with the findings of Prasad et
al. (2007), who also observed a high correlation coefficient of 0.9 between predicted and
observed values in research on wheat and rice yield forecasting in India.
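As a consistency check, a standard property of least-squares regression with an intercept
is that the squared correlation between observed and fitted values equals the coefficient
of determination. Assuming the observed and predicted yields compared here are those of
the fitting sample, and using the rounded R2 of 0.88 reported in Section 4.2:

```latex
r_{y,\hat{y}} = \sqrt{R^{2}} = \sqrt{0.88} \approx 0.94
```

This is in line with the correlation coefficient of 0.94 between observed and predicted
yields.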
Manatasa et al. (2011), in their research entitled Maize Yield Forecasting for Zimbabwe
Farming Sectors using Satellite Rainfall Estimates, showed that when WRSI values were
regressed against historical yield data, relatively high skill yield forecasts could be
made even when the crops were at an early stage. This finding differs from the south
Tigray research, in which WRSI is not highly correlated with yield and has an
insignificant P value.
The findings of this research demonstrate a clear potential of spectro-agrometeorological
factors for maize yield forecasting, which aligns with the findings of other researchers.
For example, Rojas et al. (2005) used meteorological information from the CSWB model,
CPSZ and real time satellite data for crop yield forecasting, which shows the potential
of spectro-agrometeorological factors for crop yield forecasting.
CHAPTER SIX
6 CONCLUSION AND RECOMMENDATIONS
6.1. Conclusion
Policy makers need accurate and timely information on crop production and area at the
lower administrative level as early as possible. Such information should be available
before the harvest so that preparations can be made, which is the very reason for
forecasting crop yield using different approaches.
The major objective of this study was to develop a yield forecast model for maize using
remote sensing and GIS. Accordingly, crop statistical data were used as the dependent
variable, different predictor variables were derived from remotely sensed imagery, and the
variables with higher correlation and significant P values were selected for model
development. The analysis confirmed that rainfall and NDVIa of the study area have good
correlation with maize yield (r = 0.85 and r = 0.80 respectively) with significant P
values.
Using the regression model developed for the study area, yield can be forecast well before
the date of harvest. A maize yield forecast map for the year 2013 was also prepared using
the developed model. An average yield of 16.2 quintal per hectare was forecast, with the
south western part of the zone showing high productivity per hectare. The map can be used
by decision makers to identify relatively productive areas prior to harvest at the lower
administrative level.
Generally, it is possible to conduct maize yield forecasting using NDVI derived from SPOT
VEGETATION and rainfall from RFE 2.0 for areas similar to south Tigray zone.
6.2. Recommendations
Based on the encouraging results of this research, the developed model can be tested in
areas other than south Tigray by following the procedures stated in the methodology of
this paper; meanwhile, more research and broader testing are necessary. As an initial
effort, this application of the agrometeorological yield model appears promising.
Some further effort is necessary to operationalize the results of this research,
including:
- A relatively longer time series of data should be analyzed in order to reach
operational application.
- The crop mask data should be improved in order to obtain a more refined crop mask.
References
Allbed, A., Kumar, L. and Sinha, P. (2014). Mapping and Modeling Spatial Variation in
Soil Salinity in the Al Hassa Oasis Based on Remote Sensing Indicators and
Regression Techniques. Remote Sens. 6: 1137–1157.
Becker-Reshef, I., Justice, C.O., Sullivan, M., Vermote, E.F., Tucker, C., Anyamba, A.,
Small, J., Pak, E., Masuoka, E. and Schmaltz, J. (2010). Monitoring global
croplands with coarse resolution Earth observation: The Global Agriculture
Monitoring (GLAM) project. Remote Sens. 2: 1589–1609.
Benedetti, R. and Rossini, P. (1993). On the Use of NDVI Profiles as a Tool for
Agricultural Statistics: The Case Study of Wheat Yield Estimate and
Forecast in Emilia Romagna. Remote Sens. Environ. 45: 311–326.
Eerens, H. and Haesen, D. (2013). Software for the Processing and Interpretation of
Remotely Sensed Image Time Series. Vito, Belgium, pp288.
Eerens, H., Haesen, D., Rembold, F., Urbano, F., Tote, C. and Bydekerke, L. (2014).
Image time series processing for Agricultural monitoring.
Envt. Mod. Software. 53: 154–162.
Food and Agriculture Organization (FAO) (2010). FAO Global Information and Early
Warning System on Food and Agriculture Special Report. Unpublished
working report, FAO, Addis Ababa, Ethiopia, 21 pp.
George, A. and Hanuschak, S. (2010). Timely and Accurate Crop Yield Forecasting and
Estimation, History and Initial Gap Analysis.
http://www.fao.org/fileadmin/templates/ess/documents/meetings_and_workshops/GS_SAC_2013/Improvi
ng_methods_for_crops_estimates/Crop_Yield_Forecasting_and_Estimation_Lit_review.pdf
accessed on 10/08/2013
Grenier, M., Labrecque, S., Benoit, M. and Allard, M. (2008). Accuracy assessment
method for wet land object based classification. In : Proceedings of the
XXXVIII-4c1. ISPRS, Aug 5-8,2008.Calgary,Canada.
Hastings, D.A. and Emery, W.J. (1992). The Advanced Very High Resolution
Radiometer (AVHRR): A brief reference guide. Photogram. Eng. Remote
Sensing. 58: 1183–1888.
http://earlywarning.usgs.gov/fews/downloads/index.php?regionID accessed on
10/12/2013
http://www.ecmwf.int/products/data/archive/descriptions/od/oper/an/sfc/index.html
accessed on 11/18/2013
International Food Policy Research Institute (IFPRI) (2010). Maize value chain in
Ethiopia. Constraints and opportunities for enhancing the system.
Unpublished working report, IFPRI, Washington D.C., USA, 42 pp.
JMP Manual. (2009). Introductory Guide, Second edition. JMP, A business unit of SAS
version 8.0.2.146 pp.
Kindu, M., Thomas, S., Teketay, D. and Thomas, K. (2013). Landuse/Landcover change
analysis using Object based classification approach in Munesa Shashemene
Landscape of the Ethiopian highlands. Remote Sens. 5: 2411–2435.
LEAP software Manual. (2012). LEAP Version 2.61 for Ethiopia, 103 pp. Addis
Ababa, Ethiopia.
Lillesand, T. and Kiefer, R.W. (1994). Remote sensing and Image interpretation. Third
ed., John Wiley & Sons, Inc., 750 pp.
Manatasa, D., Nyakudya, W., Mukwada, G. and Matsikwa, H. (2011). Maize Yield
Forecasting for Zimbabwe farming Sectors using Satellite Rainfall
Estimates. Nat.Hazards. 10:1007/5.
Novella, N.S. and Thiaw, W.M. (2012). African Rainfall Climatology Version 2 for
Famine Early Warning Systems. Journal of Applied Meteorology and
Climatology. 52: 588–606.
National Meteorological Agency (NMA)
http://www.ethiometmaprooms.gov.et:8082/maproom/ accessed on 10/03/14
Prasad, k., Chai, L., Singh, P. and Kafatos, M. (2007). Use of Vegetation index and
Meteorological parameters for the prediction of crop yield in India. Int. J.
Remote Sensing. 28 (23): 5207–5235.
Rembold, F. and Maseli, F. (2004). Estimating Inter annual Crop Area Variation using
Multi Resolution Satellite sensor images. Int. J. Remote Sensing. 25: 2641–
2647.
Rembold, F., Atzberger, C., Savin, I. and Rojas, O. (2013). Using Low Resolution
Satellite Imagery for Yield Prediction and Yield Anomaly Detection.
Remote Sens. 5: 1704–1733.
Rijks, O., Massart, M., Rembold, F., Gommes, R. and Leo, O. (2007). Crop and
rangeland monitoring in eastern Africa. In: Proceedings of the 2nd
International workshop, pp.95–104. Nairobi, Kenya.
Rojas, O. (2006). Operational maize yield model development and validation based on
remote sensing and Agrometereological data in kenya. In: proceedings of
remote sensing support to crop yield forecast and area estimates workshop,
pp. 325. ISPRS Archives xxxvi, ISPARA, ITALY.
Rojas, O., Rembold, F., Royer, A. and Negere, T. (2005). Real time agrometereological
crop yield monitoring in eastern Africa. Agron. Sustain. Dev. 25: 63–77.
Sawasawa H.L.A. ( 2003). Crop Yield Estimation: Integrating Remote Sensing ,GIS and
Management Factors: A Case Study of BIRKOOR and HORTGIRI
MANDALS – Nizambad District, India. Unpublished MSc Thesis, ITC,
Enschede, The Netherlands.
Senay, G.B. and Verdin, J. (2003). Characterization of Yield reduction in Ethiopia using
a GIS based crop water balance model. Remote Sensing. 29 (6): 687–692.
SPIRIT Manual. (2013). Software for the Processing and interpretation of Remotely
Sensed Image Time Series. User’s Manual, Version:1.1.1. 288 pp. Vito,
Belgium.
Tigray, Ethiopia. Unpublished MSc Thesis, University of Twente,
Enschede, The Netherlands.
Washington, R., Todd, C., Lizcano, G., Tegen, L., Flamant, C., Koren, L.,Ginoux, P.,
Engelstaedter, S., Bristow, S., Zender, S. Goudie, S., Warren, A. and
Prospero, M. (2006). Links between topography, wind, deflation, lakes and
dust: The case of the Bodélé Depression, Chad. Geophysical Research
Letters. 3: pp4.
Appendix 1: Sample GPS readings for accuracy assessment
Agriculture class
CID POINT_X POINT_Y MAP CLASS Ground Truth
1 572364 1407353 Agriculture
2 540030 1458297 Agriculture
3 556241 1398341 Agriculture
4 577254 1373260 Agriculture
5 560309 1478257 Agriculture
6 524197 1480418 Agriculture
7 548230 1472973 Agriculture
8 543006 1476323 Agriculture
9 584690 1400727 Agriculture
10 546073 1497742 Agriculture
11 565592 1438329 Agriculture
12 542128 1480989 Agriculture
13 538367 1414586 Agriculture
14 583470 1418315 Agriculture
15 561089 1449430 Agriculture
16 570908 1376916 Agriculture
17 535134 1426832 Agriculture
Appendix 2: Sample pictures from field
X= 0580169 X= 0576709
Y=1400857 Y=1405400
X= 0558104 X= 0561400
Y=1424619 Y=1412102
APPENDIX 3: Accuracy assessment matrix result
REPORT
-----------------------------------------
Image File : c:/abiy/south_unsup.img
User Name : user
Date : Thu Apr 10 11:47:01 2014
ERROR MATRIX
-------------
Reference Data
--------------
Classified Data Unclassifi Class 1 Class 2 Row Total
--------------- ---------- ---------- ---------- ----------
Unclassified 0 0 0 0
Class 1 0 130 14 144
Class 2 0 23 121 144
ACCURACY TOTALS
----------------
KAPPA (K^) STATISTICS
---------------------
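The accuracy totals and kappa values printed by the original report are not reproduced in
this extract. The following sketch shows how overall accuracy and Cohen's kappa can be
derived from the Class 1 and Class 2 counts of the error matrix above, using the standard
definitions. The function name is hypothetical, and the resulting figures are derived
here, not quoted from the report.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square error matrix.

    Rows are classified data and columns are reference data.
    """
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    if total == 0 or any(len(row) != size for row in matrix):
        raise ValueError("matrix must be square and non-empty")
    observed = sum(matrix[i][i] for i in range(size)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    # Chance agreement expected from the row and column totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Class 1 and Class 2 counts from the error matrix above
# (the empty Unclassified row and column are omitted).
error_matrix = [
    [130, 14],
    [23, 121],
]
overall_accuracy, kappa = accuracy_and_kappa(error_matrix)
print(f"overall accuracy = {overall_accuracy:.4f}, kappa = {kappa:.4f}")
```

With these counts, 251 of 288 reference points are correctly classified, giving an overall
accuracy of about 87 % and a kappa of about 0.74.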
Appendix 4: Sample dekadal SPOT VEG image output of SPIRIT software
DECLARATION
I hereby declare that the thesis entitled SPECTRO AGROMETEREOLOGICAL MAIZE YIELD
FORECAST MODEL USING REMOTE SENSING AND GIS IN SOUTH TIGRAY ZONE, ETHIOPIA
has been carried out by me under the supervision of Dr. K. V. Suryabhagavan,
Department of Earth Sciences, Addis Ababa University, Addis Ababa during the year
2014 as a part of Master of Science program in Remote Sensing and GIS. I further
declare that this work has not been submitted to any other University or Institution for
the award of any degree or diploma.
Signature: _______________________
Addis Ababa University
Addis Ababa
Date: June, 2014
CERTIFICATE
This is to certify that the thesis entitled SPECTRO AGROMETEREOLOGICAL MAIZE YIELD
FORECAST MODEL USING REMOTE SENSING AND GIS IN SOUTH TIGRAY ZONE, ETHIOPIA
is a bona fide work carried out by Abiy Wogderes Zinna under my guidance and
supervision. This is the actual work done by Abiy Wogderes Zinna in partial
fulfillment of the requirements for the award of the Degree of Master of Science in
Remote Sensing and GIS.
Dr. K. V. Suryabhagavan
Assistant Professor
Signature: _______________________
Department of Earth Science
Addis Ababa University