7. Data mining RDBMS Spatial analysis and modelling
SPATIAL ANALYSIS
BLOCK 1: BASICS OF SPATIAL ANALYSIS
BLOCK 2: SPATIAL ANALYSIS
BLOCK 3: MODELLING, PROJECT MANAGEMENT AND PROGRAMMING
MGY-006: SPATIAL ANALYSIS AND MODELLING
Programme Design Committee
Prof. Sujatha Verma, Former Director, School of Sciences, IGNOU, New Delhi
Dr. Shailesh Nayak, Director, National Institute of Advanced Studies, Bengaluru, Karnataka
Dr. P.S. Acharya, Head, NRDMS, NSDI Division, Department of Science and Technology, Ministry of Science & Technology, New Delhi
Dr. Debapriya Dutta, Scientist ‘G’ & Associate Head, National Geospatial Programme, Department of Science and Technology, Ministry of Science & Technology, New Delhi
Dr. L.K. Sinha, Former Director, Defence Terrain Research Lab. (DTRL), Delhi & Defence Geoinformatics Research Establishment (DGRE), Defence R&D Organisation (DRDO), Chandigarh
Prof. P.K. Garg, Civil Engineering Department, IIT Roorkee, Roorkee, U.K.
Prof. P.K. Verma, School of Studies in Earth Science, Vikram University, Ujjain, M.P.
Dr. I. M. Bahuguna, Deputy Director (Rtd.), Space Applications Centre, Indian Space Research Organisation (ISRO), Ahmedabad, Gujarat
Prof. Shamita Kumar, Institute of Environment Education and Research, Bharati Vidyapeeth University, Pune, Maharashtra
Ms. Asima Misra, Associate Director, ES & e-Governance Group, Centre for Development of Advanced Computing (C-DAC), Ministry of Electronics and Information Technology (MeitY), Pune, Maharashtra
Dr. Sameer Saran, Head, Geoinformatics Department, Indian Institute of Remote Sensing, Dehradun, U.K.
Prof. Daljeet Singh, Department of Geography, Swami Shraddhanand College, University of Delhi, New Delhi
Dr. D.R. Rajak, Scientist, Space Applications Centre (ISRO), Ahmedabad, Gujarat
Mr. Manish Parmar, Scientist, Space Applications Centre, Ahmedabad, Gujarat
Dr. Akella V.S. Aswani, ESRI India Technologies Pvt. Ltd., Hyderabad, Telangana
Dr. O.M. Murali, GIS Consultant, Chennai, Tamil Nadu
Prof. Manish Trivedi, School of Sciences, IGNOU, New Delhi
Dr. Rajesh Kaliraman, School of Sciences, IGNOU, New Delhi
Dr. V. Venkat Ramanan, School of Inter-Disciplinary and Trans-Disciplinary Studies, IGNOU, New Delhi
Faculty of Geology Discipline, School of Sciences, IGNOU
Prof. Meenal Mishra
Prof. Benidhar Deshmukh
Prof. R. Baskar
Dr. M. Prashanth
Dr. Kakoli Gogoi
Dr. Omkar Verma
MGY-006: SPATIAL ANALYSIS AND MODELLING
Preparation Team
Course Contributors
Dr. Nikhil Lele (Unit 1), Space Applications Centre, ISRO, Ahmedabad, Gujarat
Dr. Shukla Acharjee (Units 2 and 3), Centre for Studies in Geography, Dibrugarh University, Assam
Dr. O. M. Murali (Unit 4), GIS Consultant, Chennai, Tamil Nadu
Prof. Sarat Phukan (Units 5 and 6), Department of Geological Sciences, Gauhati University, Assam
Ms. Swati Grover (Unit 7), GIS Specialist, Ghaziabad, NCR Delhi
Dr. M. Prashanth (Unit 8), School of Sciences, IGNOU, New Delhi
Dr. Sunil L. Londhe (Unit 9), General Manager, Deepak Fertilisers and Petrochemicals Corporation Limited, Pune
Prof. Kiranmay Sarma (Unit 10), Department of Geology, GGSIPU, Delhi
Dr. Maya Kumari (Unit 10), Assistant Professor, Amity School of Natural Resources & Sustainable Development, Amity University, Noida
Prof. Benidhar Deshmukh (Unit 11), School of Sciences, IGNOU, New Delhi
Dr. Akella V.S. Aswani (Unit 12), ESRI India Technologies Pvt. Ltd., Hyderabad, Telangana
Dr. R. K. Chingkhei (Unit 12), Department of Earth Science, Manipur University, Imphal, Manipur
Mr. I Prabu (Unit 13), Joint Director, ES & e-Governance Group, Centre for Development of Advanced Computing (C-DAC), MeitY, Pune, Maharashtra
Content Editors
Prof. Sarat Phukan (Block 1), Department of Geological Sciences, Gauhati University, Assam
Shri G. Sajeevan (Block 2), Emerging Solutions and e-Governance Group, C-DAC, Pune
Prof. Benidhar Deshmukh (Block 3), School of Sciences, IGNOU, New Delhi
MGY-006: SPATIAL ANALYSIS AND MODELLING
Block 1 : Basics of Spatial Analysis
Unit 1 : Integration of RS and GIS
Unit 2 : Data Mining and Spatial Data Management
Unit 3 : Overview of Geostatistics and Spatial Data Measurements
Block 2 : Spatial Analysis
Unit 4 : Vector Data Analysis
Unit 5 : Raster Data Analysis
Unit 6 : Connectivity Analysis
Unit 7 : Analysis of 3-Dimensional Data
Unit 8 : Viewshed and Watershed Analysis
Unit 9 : Multicriteria Analysis
Block 3 : Modelling, Project Management and Programming
Unit 10 : Modelling Spatial Data
Unit 11 : Dynamic Modelling
Unit 12 : GIS Implementation and Project Management
Unit 13 : Introduction to GIS Programming
MGY-006: SPATIAL ANALYSIS AND MODELLING
You have studied remote sensing and image classification techniques in the course
MGY-105 ‘Techniques in Remote Sensing and Digital Image Processing’. Spatial analysis
and modelling are important components of the functionality of a GIS. Spatial analysis
lets you combine independent data sources and derive results through a set of spatial tools.
Its outcome is geographical data that is more informative than the unorganised data
originally collected. Spatial modelling is an essential part of spatial analysis: applying
models, or a set of rules and procedures, to spatial data in a GIS leads to a better
understanding and representation of information, with enhanced accuracy.
This course comprises three blocks. The first block introduces spatial analysis and the
second block describes it comprehensively. The third block deals with GIS modelling,
implementation and project management.
Block 1 “Basics of Spatial Analysis” discusses the raster and vector integration, methods
of integration and software and hardware considerations. It also introduces you to spatial
data mining, database modelling and data organisation. Further, it provides an overview of
the geostatistics and data measurement.
Block 2 “Spatial Analysis” introduces vector data analysis along with proximity analysis,
buffering and overlay analysis. It also discusses the basics of raster analysis and
connectivity analysis. In addition, it provides an overview of analysis of 3-dimensional data,
viewshed and watershed analysis and multicriteria analysis.
Block 3 “Modelling, Project Management and Programming”, as its name suggests,
deals with modelling of spatial data, GIS design and project management, and GIS
programming. The first unit of the block gives an account of spatial modelling in general,
with emphasis on static modelling, and the second unit covers dynamic modelling. GIS
design, implementation and project management are discussed in the third unit. The last
unit of the block provides an overview of GIS programming, a comparison of various
languages and its scope.
Expected Learning Outcomes
After studying this course, you should be able to:
discuss integration of remote sensing and GIS;
describe concepts of data mining and spatial data mining;
explain fundamentals of geostatistics and spatial data measurements;
describe vector data analysis, raster data analysis and connectivity analysis;
discuss analysis of 3-dimensional data, viewshed and watershed and multicriteria
analysis;
illustrate modelling spatial data and dynamic modelling; and
elucidate GIS design, implementation and project management and basics of GIS
programming.
We wish you all the best and hope you will enjoy reading this course!
MGY-006
SPATIAL ANALYSIS AND MODELLING
Indira Gandhi National Open University
School of Sciences
Block 1
BASICS OF SPATIAL ANALYSIS
UNIT 1 Integration of RS and GIS
UNIT 2 Data Mining and Spatial Data Management
UNIT 3 Overview of Geostatistics and Spatial Data Measurements
Glossary
BLOCK 1: BASICS OF SPATIAL ANALYSIS
GIS is one of the basic means of mapping and analysis across a wide range of applications.
It is well suited to storing and manipulating large volumes of complex, spatially referenced
data. Turning such data into a meaningful form is challenging and generally requires
simplifying the data enough to make them coherent. GIS acts as a major tool in spatial data
integration, which plays a major role in mapping and analysis. Spatial data integration is the
method of combining multiple spatial data types and providing applications for their storage,
retrieval, analysis and display. One basic way of integrating data is to spatially extract an
area of a raster dataset using a polygon boundary from vector data. Another simple way is
to extract the value of a given raster cell by overlaying a point vector layer. This interaction
between layers uses the concept of spatial overlay, which allows data to be transferred
between objects of different types and different layers according to their spatial relationship
with each other. Spatial data mining is a specialisation of general data mining that must
balance computational scalability against mathematical accuracy in managing spatial data.
It is the procedure of discovering potentially useful patterns in large spatial datasets.
Geostatistics is a branch of statistics used to analyse and predict the values associated with
spatial or spatiotemporal phenomena. Its analyses incorporate the spatial coordinates of the
data. Many geostatistical tools were originally created as a practical way to characterise
spatial patterns and estimate values for areas where samples were not gathered. Since
then, these tools and techniques have developed to offer not just interpolated values but
also measures of uncertainty for those values. Thus, spatial analysis in GIS helps in the
processing and management of geospatial data, which makes it a significant tool in
widespread applications.
Unit 1 “Integration of RS and GIS” introduces you to the key concepts of raster and vector
data integration, its methods and techniques, its uses, and the software and hardware
requirements for performing raster and vector data integration.
Unit 2 “Data Mining and Spatial Data Management” familiarises you with the basic concepts
of data mining. You will learn about spatial data mining, with emphasis on spatial data mining
techniques, the concepts of DBMS and SDBMS, data modelling and data organisation.
Unit 3 “Overview of Geostatistics and Spatial Data Measurements” discusses the basics of
geostatistics and its techniques, and distance and length measurements, which include
spatial distance measurements and spectral distance measurements. You will learn to
measure polygon perimeter and area, and to recognise spatial relationships between
variables and mathematical operations.
Expected Learning Outcomes
After studying this block, you should be able to:
know the basics of raster and vector data integration, methods of raster and vector data
integration;
explain software and hardware requirements for raster and vector data integration;
describe the spatial data mining covering spatial data mining techniques, data modelling and
data organisation;
identify essentials of geostatistics and its techniques and distance and length
measurements;
illustrate the polygon perimeter and area measurement; and
We wish you all the best and hope you will enjoy reading this course.
UNIT 1
Structure
1.1 Introduction
Expected Learning Outcomes
1.2 Raster and Vector Data Integration
Data Integration
Stages of Integration
GIS Integration
1.3 Methods of Integration
Contributions of RS in Integration with GIS
1.4 Software and Hardware Considerations
1.5 Summary
1.6 Activity
1.7 Terminal Questions
1.8 References
1.9 Further/Suggested Readings
1.10 Answers
1.1 INTRODUCTION
Data interpretation and analysis have become common in today’s world with the availability of large
volumes of digital data in various formats. Data integration refers to the various approaches that
combine or merge data obtained from different sources to extract better and more accurate
information. It covers data of different resolutions, as well as multi-temporal, multi-sensor and
multi-type data. Data integration opens up many more applications by supporting the design of
models, the running of simulations and a wider scope for effective decision making. Although
techniques for integrating raster and vector data have existed for a couple of decades, recent
large-scale developments in analytical methods, such as machine-learning-based algorithms and
visualisation techniques, have contributed to delivering solutions to complex problems. In this unit,
we shall discuss raster and vector data integration, methods of integration, and software and
hardware considerations.
Block 1 Basics of Spatial Analysis
…………………………………………………………………….…………………………………………………
Fig. 1.1: Three stages in remote sensing and GIS integration. (Source: modified
after Lo and Yeung 2009)
SAQ I
a) What is data integration in GIS?
b) What is the Stage II of GIS integration?
c) Define Internet GIS.
iii) Digital RS data that is classified using automated method is retained in its
digital form and used as input in GIS. On the other hand, digital RS data
can be directly entered in its raw form for further analyses.
b) Cartographic Information Extraction
The automatic procedures adopted for extracting cartographic information
such as lines, polylines, polygons and other geographical entities are one of the
major achievements in the input of RS data to GIS. Geographic feature
extraction is accomplished by applying pattern recognition, edge extraction and
segmentation algorithms. Hence, RS images aid in producing better base maps
and in improving existing ones. Further, the extracted cartographic information
can be applied to enhance the process of image classification.
Method Features
Road density
road coverage
Census data
Multisource data usage: Spectral, texture, and ancillary data (like DEM, geology, soil,
existing GIS-based maps)
GIS applications and GIS-based projects form the core of the work of many
industries and small-scale companies in the geospatial domain. In addition,
a large number of service providers acquire satellite datasets, generate vector
and attribute datasets, and use various automated techniques for data
processing and data extraction in order to deliver the desired outputs. These
individuals and industries need infrastructure to handle databases efficiently.
Hardware and software considerations can vary significantly depending on the
task in hand. Recent trends show that data volumes are increasing day by day.
Hence, the system requirements for storing and processing large volumes of
data are also growing. The configurations below are a minimum that allows
modern applications and their sub-components to function properly. These
specifications are recommendations; the packages may also run on systems
below these specifications when working on very small spatial extents.
Hardware Requirements
The following are the essential hardware requirements for a well-equipped GIS
Lab:
Computer Workstations: High-performance desktop or laptop computers
with sufficient processing power, RAM, and storage are essential. For
intensive GIS tasks, a multi-core processor (e.g., Intel Core i7 or AMD
Ryzen series) with at least 16 GB of RAM is recommended. Additionally,
solid-state drives (SSD) provide faster data access and reduce loading
times.
Graphics Processing Unit (GPU): A dedicated GPU with good processing
capabilities can significantly enhance GIS performance, especially when
working with large datasets and 3D visualizations. NVIDIA GeForce or AMD
Radeon graphics cards are popular choices for GIS applications.
Display Monitors: High-resolution monitors (e.g., 24 inches or larger) with
accurate color reproduction are necessary for better visualization and data
analysis. Dual monitors can improve productivity by allowing users to view
multiple maps and applications simultaneously.
Storage Solutions: Ample storage capacity is crucial to accommodate GIS
data and projects. A combination of fast SSDs for the operating system and
applications, along with larger capacity HDDs for data storage, is a
recommended setup.
Peripherals: Standard peripherals like keyboard, mouse, and speakers are
required. Additionally, a digitizing tablet can be beneficial for precise
mapping and digitization tasks. Large format scanners play a vital role in
digitizing large, old paper maps, or hard copy satellite images, enabling their
conversion into digital formats for further processing.
Network Infrastructure: A reliable and high-speed network connection is
essential to enable data sharing and collaboration within the GIS Lab. A
high-speed internet connection is crucial for leveraging cloud computing,
SAQ II
a) List the methods of GIS integration.
b) What is the important role played by GIS in integration with RS?
c) What is QGIS?
1.5 SUMMARY
Let us summarise what we have studied in this unit.
GIS is found to be one of the basic sources for mapping and analysis used
for various applications. It is found to be a major tool in spatial data
integration. For mapping and analysis spatial data integration plays a major
role.
Data integration plays an important role in enhancing the usage of data.
One of the basic ways of integration is to spatially extract an area of a
raster dataset using a polygon boundary from vector data. The other simple
way is to extract the value of a given raster cell by overlaying a point
vector layer.
There are three stages in RS and GIS integration. In Stage I, the GIS and
image processing are treated as separate systems, connected by a data
exchange format that lets them exchange data. Stage II is known as the
stage of seamless integration, and Stage III is considered the stage of
total integration.
Integrating GPS with GIS combines their data and provides capabilities
that neither the GIS nor the GPS can offer individually. The terms WebGIS
and Internet GIS are normally used synonymously, although the Internet
supports many services, of which the Web is only one.
Internet Map Server (IMS) application allows custodians of GIS database to
easily make the spatial data accessible for end users through a web
browser. Integration of geospatial data with wireless technology provides a
wide variety of services, and industries keep adding innovations to widen
the business options available to end users.
Web services that are offered via internet technology have become a cheap
and easy way of disseminating geospatial data and processing tools.
Integration of GIS with the Internet technology has revolutionary effects like
interactive access to geospatial data, real-time data integration and
transmission, and access to platform-independent GIS analysis tools.
There are three ways in which RS and GIS technologies can be combined
to support and improve each other: contributions of RS in integration with
GIS, contributions of GIS in integration with RS, and integration of RS and
GIS for analysis and modelling.
GIS applications and GIS-based projects form core of the work for many
industries and small-scale companies that work in geo-spatial domain.
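The point-overlay form of integration summarised above can be illustrated with a small sketch. It assumes a north-up raster whose top-left corner and cell size are known; the function name `sample_raster`, the elevation values and the well coordinates are hypothetical and serve only as an illustration, not as the method of any particular GIS package.

```python
import numpy as np

def sample_raster(raster, origin_x, origin_y, cell_size, points):
    """Return the raster value under each (x, y) point, or None if outside.

    Assumes a north-up grid whose top-left corner is (origin_x, origin_y).
    """
    n_rows, n_cols = raster.shape
    values = []
    for x, y in points:
        # Columns grow eastwards, rows grow southwards from the top-left corner
        col = int((x - origin_x) // cell_size)
        row = int((origin_y - y) // cell_size)
        inside = 0 <= row < n_rows and 0 <= col < n_cols
        values.append(raster[row, col].item() if inside else None)
    return values

# Hypothetical 3 x 3 elevation raster with 10 m cells, top-left corner at (100, 230)
elevation = np.array([[5, 6, 7],
                      [8, 9, 10],
                      [11, 12, 13]])
wells = [(105.0, 225.0), (125.0, 205.0), (500.0, 500.0)]
print(sample_raster(elevation, 100.0, 230.0, 10.0, wells))  # [5, 13, None]
```

The third point lies outside the raster extent, so no value is transferred to it.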
1.6 ACTIVITY
You have read in this Unit that QGIS works on the basis of user developed
plugins. List the popularly used plugins in QGIS.
1.8 REFERENCES
Davis Jr., C. and Lacerda Alves, L. (2008) Web Services, Geospatial, In:
Shekhar, S., Xiong, H., (eds.), Encyclopedia of GIS. Springer, Boston, MA.
https://doi.org/10.1007/978-0-387-35973-1_1490
Gao, J. (2002) Integration of GPS with Remote Sensing and GIS: Reality
and Prospect. Photogrammetric Engineering and Remote Sensing 68(5):
447-453.
Karnatak, H.C., Shukla, R., Sharma, V.K., Murthy, Y.V.S., Bhanumurthy, V.
(2012) Spatial mashup technology and real time data integration in geo-web
application using open source GIS-a case study for disaster management.
Geocarto International 27(6): 499-514.
Lo, C.P. and Yeung, A. K.W. (2009) Concepts and techniques of
geographic information system. PHI Learning Private Limited, New Delhi,
532p.
Weng, Q. (2010). Remote Sensing and GIS Integration: Theories, Methods,
and Applications. New York: McGraw-Hill, 416p.
https://desktop.arcgis.com/en/arcmap/10.3/tools/analysis-toolbox/clip.htm
https://enterprise.arcgis.com/en/get-started/10.9.1/windows/what-is-arcgis-enterprise-.htm
https://slideplayer.com/slide/5822638/
https://www.geospatialworld.net/blogs/how-gnss-works/
https://doi.org/10.1080/10106049.2011.650651
https://gis4africa.files.wordpress.com/2013/06/gisforafrica-soft-and-hardware-requiremenst.pdf (Retrieved on 24/4/2023)
1.10 ANSWERS
SAQ I
a) Data integration in GIS is the method of combining spatial data procured from
different sources and in different formats to create an integrated dataset used
for analysis and decision making.
b) Stage II of GIS integration is also known as stage of seamless integration.
In this stage GIS and image processing system share the same user
interface, but act individually and are complementary to each other in
seamless exchange of data.
c) A GIS that uses many services of the Internet, including web services, is
known as Internet GIS.
SAQ II
a) The three methods of integration are i) Contributions of RS in integration
with GIS ii) Contributions of GIS in integration with RS iii) Integration of RS
and GIS for analysis and modelling.
b) In the process of integration, GIS data plays an important role by enhancing
the functionality of RS at various stages of image processing, such as
selection of the area of interest for processing, pre-processing, and
image classification.
c) QGIS is open source GIS software that works on the basis of user-developed
plug-ins.
Terminal Questions
1. Please refer to subsections 1.2.1 and 1.2.2.
2. Please refer to subsection 1.3.1.
3. Please refer to subsection 1.3.2.
4. Please refer to section 1.4.
Structure
2.1 Introduction
Characteristics of Good DBMS
2.1 INTRODUCTION
Spatial data mining (SDM) is a specialisation of general data mining that must balance
computational scalability against mathematical accuracy in managing spatial data. It is the
procedure of discovering potentially useful patterns in large spatial datasets. SDM is important
because it has vast potential in applications such as climate change studies, epidemiology and
public safety. For instance, SDM is useful for identifying locations with a high concentration of
disease outbreaks, such as cholera, so that suitable measures can be taken to control the further
spread of the disease.
In the previous unit, raster and vector data integration, methods of integration,
and software and hardware considerations were discussed. In this unit, we will
discuss the basic concepts of data mining and spatial data mining. We will also
discuss the concepts of DBMS and SDBMS, database modelling and database
organisation.
Fig. 2.1: Steps followed in data mining process for extraction of hidden
information from a given data set. (Source: modified after Lo and Yeung
2009)
Nowadays, machine learning techniques are being adapted for data mining.
These techniques first apply a learning algorithm to learn the characteristics
of a training data set. The resulting model is then applied to new data sets
in order to classify them, detect patterns and trends, and make predictions.
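The train-then-apply workflow described above can be sketched with a very simple learner. The example below uses a nearest-centroid classifier purely as an illustration; the choice of algorithm, the function names and the two-band reflectance values are assumptions and are not taken from this unit.

```python
import numpy as np

def train_centroids(samples, labels):
    """Learn one mean vector (centroid) per class from the training data."""
    samples = np.asarray(samples, dtype=float)
    labels = np.asarray(labels)
    return {str(c): samples[labels == c].mean(axis=0) for c in np.unique(labels)}

def predict(model, sample):
    """Assign a new sample to the class whose centroid is closest."""
    sample = np.asarray(sample, dtype=float)
    return min(model, key=lambda c: np.linalg.norm(sample - model[c]))

# Hypothetical training set: (red, near-infrared) reflectance per labelled pixel
training_samples = [[0.05, 0.02], [0.07, 0.03], [0.04, 0.45], [0.06, 0.55]]
training_labels = ["water", "water", "vegetation", "vegetation"]

model = train_centroids(training_samples, training_labels)
print(predict(model, [0.05, 0.40]))  # vegetation
print(predict(model, [0.08, 0.05]))  # water
```

The first step learns the characteristics of the training data (here, class means); the second maps new data onto the learned model to classify it.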
SAQ I
a) What is a spatial database?
b) Define data mining.
c) List the different classes of SDM techniques.
Entity-relationship model
Similar to the network model, this model also records relationships between
real-world items, although it is less closely tied to the physical structure of
the database. Instead, it is frequently used for designing a database
conceptually (Fig. 2.5). Entities are the people, places and things about which
data are kept. Each entity has a set of attributes that together make up its
domain. Additionally, the cardinality, that is, the nature of the relationships
between entities, is mapped. The star schema, in which a central fact table
relates to numerous dimension tables, is a popular variation of the ER model.
A variety of other database models have been, or still are, in use.
Let us discuss them in detail.
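A star schema of the kind mentioned above can be sketched with Python's built-in sqlite3 module. The table names, column names and rainfall figures below are hypothetical and chosen only to show one fact table joined to its dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the entities (hypothetical districts and years)
CREATE TABLE dim_district (district_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_year (year_id INTEGER PRIMARY KEY, year INTEGER);
-- The central fact table holds the measurements and points to each dimension
CREATE TABLE fact_rainfall (
    district_id INTEGER REFERENCES dim_district(district_id),
    year_id INTEGER REFERENCES dim_year(year_id),
    rainfall_mm REAL
);
INSERT INTO dim_district VALUES (1, 'District A'), (2, 'District B');
INSERT INTO dim_year VALUES (1, 2021), (2, 2022);
INSERT INTO fact_rainfall VALUES
    (1, 1, 900.0), (1, 2, 1100.0), (2, 1, 600.0), (2, 2, 800.0);
""")

# Join the fact table to a dimension table to summarise rainfall per district
rows = conn.execute("""
SELECT d.name, AVG(f.rainfall_mm)
FROM fact_rainfall AS f
JOIN dim_district AS d ON d.district_id = f.district_id
GROUP BY d.name
ORDER BY d.name
""").fetchall()
print(rows)  # [('District A', 1000.0), ('District B', 700.0)]
```

Each row of the fact table relates one district to one year (a many-to-one cardinality towards each dimension), which is the relationship an ER diagram would record.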
Inverted file model
A database built on the inverted file model is intended to make fast
full-text searches possible. In this architecture, the data content is indexed
as the keys of a lookup table, and the values stored against each key point to
the locations of the associated files. For example, this structure can support
big data and analytics reporting that is almost instantaneous. The Software AG
ADABAS database management system has used this approach since 1970 and is
still supported today.
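The lookup-table idea behind the inverted file model can be sketched in a few lines. This is a minimal illustration with hypothetical record texts; it does not describe how ADABAS or any other product implements its index.

```python
from collections import defaultdict

def build_inverted_index(records):
    """Map each word (key) to the set of record ids that contain it (values)."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for word in text.lower().split():
            index[word].add(record_id)
    return index

def search(index, *terms):
    """Return the ids of records that contain every search term."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []

# Hypothetical metadata records of map documents
records = {
    1: "flood hazard map of Assam",
    2: "soil map of Gujarat",
    3: "flood inundation in Gujarat",
}
index = build_inverted_index(records)
print(search(index, "flood"))           # [1, 3]
print(search(index, "map", "Gujarat"))  # [2]
```

A search only touches the index entries for the requested words rather than scanning every record, which is why full-text queries are fast.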
Flat model
The earliest and most basic data model is the flat model. It merely lists every
piece of information in a single table with columns and rows. This technique is
inefficient except for small data sets since the computer must read the full flat
file into memory in order to access or change the data.
Multidimensional model
If necessary, this model can combine components from other database models.
It combines components from network models, object-oriented models, and
semi-structured models.
SAQ II
a) Define DBMS.
b) List the database management functions performed by a typical DBMS.
c) What is a semantic model?
2.7 SUMMARY
Let us summarise what you have studied in this unit.
2.8 ACTIVITY
Match the following
2.12 ANSWERS
SAQ I
a) A spatial database is a database that provides support for finding
objects in a spatial domain.
SAQ II
a) DBMS is defined as the system software used to create and administer
databases.
b) There are various database management functions performed by a DBMS.
They are 1) data definition 2) data manipulation 3) data recovery and
concurrency 4) data dictionary maintenance 5) performance.
c) The semantic model is a less common database model that includes
information about how the stored data relates to the real world.
Terminal Questions
1. Please refer to section 2.2.1.
2. Please refer to section 2.3.
3. Please refer to section 2.4.
4. Please refer to section 2.5.
Structure
3.1 Introduction
Expected Learning Outcomes
3.2 Overview of Geostatistics and Techniques
Geostatistics Tools and Techniques
3.3 Distance and Length Measurements
Spatial Distance Measurements
Spectral Distance Measurements
3.4 Perimeter and Area Measurement
3.5 Exploring Spatial Relationships between Variables
3.6 Mathematical Operations
3.7 Summary
3.8 Activity
3.9 Terminal Questions
3.10 References
3.11 Further/Suggested Readings
3.12 Answers
3.1 INTRODUCTION
The spatial data measurements generally performed in a typical GIS project are carried out
automatically and efficiently by algorithms that form part of the GIS software. However, it is
important to know the nature of the computations that run in the background while performing a
particular operation in a GIS environment. This familiarity will help you apply suitable spatial
analysis functions.
In the previous unit, the basic concepts of data mining and spatial data mining, the concepts of
DBMS and SDBMS, database modelling and database organisation were discussed. In this unit,
we will give an overview of geostatistics and its techniques, and discuss distance and length
measurements, which include spatial distance measurements and spectral distance
measurements, as well as polygon perimeter and area measurement. We will also discuss
exploring spatial relationships between variables and mathematical operations.
Let us discuss how geostatistical tools can be used by taking an example.
Suppose you have collected soil samples from certain locations in a
particular area. Geostatistical tools can help answer queries
such as:
What is the probable soil moisture at locations where sampling was not
undertaken?
To what extent is the spatial prediction of soil moisture correct?
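The first query, estimating soil moisture at an unsampled location, can be approximated with a deterministic interpolator such as inverse distance weighting (IDW). The sketch below is a minimal illustration under stated assumptions: the function name `idw`, the power of 2 and the sample values are hypothetical, and unlike kriging it gives no measure of how reliable the estimate is.

```python
import numpy as np

def idw(known_xy, known_values, target_xy, power=2.0):
    """Estimate a value at target_xy as a distance-weighted mean of the samples."""
    known_xy = np.asarray(known_xy, dtype=float)
    known_values = np.asarray(known_values, dtype=float)
    distances = np.linalg.norm(known_xy - np.asarray(target_xy, dtype=float), axis=1)
    # A target that coincides with a sample simply takes that sample's value
    if np.any(distances == 0):
        return float(known_values[distances == 0][0])
    # Nearer samples get larger weights: w = 1 / d**power
    weights = 1.0 / distances ** power
    return float(np.sum(weights * known_values) / np.sum(weights))

# Hypothetical soil moisture samples (percent) at three locations (x, y in metres)
sample_xy = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
sample_moisture = [20.0, 30.0, 40.0]

# The target is equally far from all samples, so the estimate is close to their mean (30)
print(idw(sample_xy, sample_moisture, (5.0, 5.0)))
```

Raising `power` makes the estimate follow the nearest sample more closely; lowering it smooths the surface.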
A geostatistical tool such as the kriging method differs from a deterministic
interpolation method like Inverse Distance Weighting (IDW), although both
estimate values at unsampled locations (Fig. 3.1).
Generally, the IDW model uses a predetermined mathematical power function.
The kriging method, however, applies mathematical and statistical functions
such as semi-variograms, which we will discuss in detail in the preceding sub-
Fig. 3.2: Semi-variogram representing nugget, range and sill. (Source: modified
after https://gisgeography.com/kriging-interpolation-prediction/)
The semi-variogram shows that samples remain associated with one another up
to the distance at which the curve reaches the sill; samples farther apart are no
longer associated. The goal is to mathematically fit a function that represents
the semi-variogram's trend (Spherical, Circular, Exponential, Gaussian, Linear).
For example, you can select a semi-variogram as shown in Fig. 3.3.
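As an illustration of one of the model types listed above, the spherical model can be written as a short function of lag distance. The sketch assumes the common convention in which the sill is the total sill (nugget plus partial sill) and the semi-variance at zero lag is zero; the parameter values are hypothetical.

```python
import numpy as np

def spherical_semivariogram(h, nugget, sill, range_):
    """Spherical semi-variogram model evaluated at lag distance(s) h."""
    h = np.asarray(h, dtype=float)
    # Beyond the range the ratio is capped at 1, so the curve stays at the sill
    ratio = np.minimum(h / range_, 1.0)
    gamma = nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)
    # By definition the semi-variance at exactly zero lag is zero
    return np.where(h == 0, 0.0, gamma)

# Hypothetical parameters: nugget 0.1, sill 1.0, range 100 m
lags = np.array([0.0, 50.0, 100.0, 200.0])
print(spherical_semivariogram(lags, nugget=0.1, sill=1.0, range_=100.0))
```

The curve starts at the nugget just above zero lag, rises until the lag equals the range, and is flat at the sill beyond it, matching the three quantities labelled in Fig. 3.2.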
When there are few observations, the standard error is typically larger. The
variogram procedure benefits from expert knowledge when the error rises above
a key level.
Fig. 3.5: Higher standard of error with sparse amount of observations. (Source:
https://gisgeography.com/kriging-interpolation-prediction/)
Fig. 3.6: Higher standard of error with sparse amount of observation. (Source:
modified after https://gisgeography.com/kriging-interpolation-prediction/)
Fig. 3.7: Calculation of Euclidean distance between two points (Point #1 and
Point #2) in an X, Y Cartesian coordinate system by applying
Pythagorean Theorem. (Source: modified after Jensen and Jensen 2018)
Fig. 3.8: Calculation of Manhattan distance between two points (Point #1 and
Point #2) in an X, Y Cartesian coordinate system. (Source: modified after
Jensen and Jensen 2018)
52 Contributor: Dr. (Ms.) Shukla Acharjee
Unit 3 Overview of Geostatistics and Spatial Data Measurement
The Manhattan distance (sometimes called the “round-the-block” or “city
block” distance) between two points uses the sum of the lengths of the two
sides of the right triangle instead of its hypotenuse (Fig. 3.6). For instance,
in an urban area, going in a straight line from Point #1 to Point #2 would mean
walking through houses or climbing over buildings. The better option is to walk
around the block to get from Point #1 to Point #2 (Fig. 3.8).
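The two linear distance measures can be compared with a short sketch. The function names and the coordinates of the two points are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math


def euclidean_distance(p1, p2):
    """Straight-line distance: the hypotenuse of the right triangle."""
    dx = p2[0] - p1[0]
    dy = p2[1] - p1[1]
    return math.hypot(dx, dy)  # sqrt(dx**2 + dy**2)


def manhattan_distance(p1, p2):
    """Round-the-block distance: the sum of the two sides of the triangle."""
    return abs(p2[0] - p1[0]) + abs(p2[1] - p1[1])


# Hypothetical coordinates for Point #1 and Point #2.
point_1 = (1.0, 1.0)
point_2 = (4.0, 5.0)

print(euclidean_distance(point_1, point_2))  # 5.0
print(manhattan_distance(point_1, point_2))  # 7.0
```

For any pair of points the Manhattan distance is at least as large as the Euclidean distance, and the two are equal only when the points share an X or a Y coordinate.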
3.3.2 Spectral Distance Measurements
Spectral distance is a spectral measure generally used in unsupervised
classification and supervised minimum distance classification. The formula
used to determine the spectral separation between a pixel spectrum and a
reference spectrum is as given below.
$$d = \sqrt{\sum_{i=1}^{n} (y_i - r_i)^2}$$
where $d$ is the spectral distance, $y_i$ is the reflectance in band $i$ for a
pixel, $r_i$ is the reflectance in band $i$ for a reference and $n$ is the
number of bands in the image.
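A minimal sketch of this calculation follows. It assumes the spectral distance takes the Euclidean form, that is, the square root of the sum over all bands of the squared difference between pixel and reference reflectance. The function name and the four-band reflectance values are hypothetical.

```python
import math


def spectral_distance(pixel, reference):
    """Euclidean spectral distance between a pixel and a reference spectrum.

    pixel, reference: sequences of reflectance values, one per band.
    """
    if len(pixel) != len(reference):
        raise ValueError("pixel and reference must have the same number of bands")
    return math.sqrt(sum((y - r) ** 2 for y, r in zip(pixel, reference)))


# Hypothetical reflectance values in four bands (not measured data).
reference = [0.05, 0.08, 0.04, 0.50]   # pure healthy canopy
healthy   = [0.06, 0.09, 0.05, 0.48]
stressed  = [0.10, 0.12, 0.12, 0.30]

print(spectral_distance(healthy, reference))
print(spectral_distance(stressed, reference))
```

With these illustrative values the stressed pixel lies farther from the healthy reference than the healthy pixel does, which is the behaviour a minimum distance classifier relies on.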
The spectral distance has the potential to quantify variation in crop growth
and yield. Consider an example. For grain sorghum, the best phenological stage
for yield estimation is around peak vegetative development (Yang and Everitt
2002). If a pure healthy crop canopy is used as the reference, the spectral
distance between healthy plants and the reference will be small, whereas the
spectral distance between stressed plants and the reference will be large. As
a result, spectral distance can be used as a proxy indicator of the health and
abundance of plants.
SAQ I
a) What are the three main tools of geostatistics used in interpolation and
uncertainty models?
b) What is Tobler’s First Law of Geography?
c) What is kriging interpolation?
d) Based on which theorem does Euclidean distance measurement work?
The length associated with each line segment is calculated by applying the
Pythagorean Theorem as discussed in the previous section (Section 3.3).
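The segment-by-segment calculation can be sketched as follows, assuming the line or polygon boundary is stored as an ordered list of (X, Y) vertices. The function name and the `closed` parameter are hypothetical.

```python
import math


def path_length(vertices, closed=False):
    """Sum the Pythagorean length of every segment between ordered vertices.

    closed=True also adds the segment from the last vertex back to the
    first, which gives the perimeter of a polygon.
    """
    points = list(vertices)
    if closed and points:
        points.append(points[0])
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


# Hypothetical vertices forming a 3-4-5 right triangle.
triangle = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
print(path_length(triangle))               # 7.0  (open line: 3 + 4)
print(path_length(triangle, closed=True))  # 12.0 (perimeter: 3 + 4 + 5)
```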
b) Area Measurement
The polygon area is a measurement of the geographic area enclosed by a
polygon. It is simple to calculate when the polygon has a regular geometric
shape such as a rectangle, circle, square or right-angled triangle. The areas
of these regular shapes are calculated using the equations given in Table 3.1.
Interestingly, such regular shapes are mostly found in man-made environments,
for example rectangular or square buildings, water harvesting structures, road
networks and farm lands. Regular geometric shapes are uncommon in natural
landscapes, however, where irregularly shaped polygons require more complex
calculations.
Table 3.1: Equations used for calculating the area of regular geometric shapes.
Shape      Equation for area
Square     side²
Circle     π × radius²
To calculate the area of a complex polygon, the Cartesian coordinates (X1, Y1),
(X2, Y2)..., (Xn, Yn) of all its vertices, listed in order, must be known.
The equation for calculation is as given below:
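One standard formula that works from ordered vertex coordinates in this way is the shoelace (surveyor's) formula, in which the area is half the absolute value of the sum of Xi·Yi+1 − Xi+1·Yi over all vertices, with the last vertex wrapping back to the first. The sketch below assumes this is the intended calculation; it is valid for simple polygons whose boundary does not cross itself. The function name and the sample coordinates are hypothetical.

```python
def polygon_area(vertices):
    """Area of a simple polygon from its ordered (X, Y) vertices.

    Uses the shoelace formula; the vertices may be listed clockwise or
    counter-clockwise, and the first vertex need not be repeated at the end.
    """
    n = len(vertices)
    if n < 3:
        return 0.0  # fewer than three vertices enclose no area
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to the first vertex
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical shapes used as a check against the regular-shape equations.
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
print(polygon_area(square))    # 16.0, the same as side squared
print(polygon_area(triangle))  # 6.0, the same as half base times height
```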
Logarithmic functions
The logarithmic functions work with input rasters or values to execute
exponential and logarithmic calculations. The natural (Ln), base 2 (Log2), and
base 10 (Log10) logarithmic functions as well as the base e (Exp), base 2
(Exp2), and base 10 (Exp10) exponential functions are accessible. For
example, the result of taking the log of the values in a raster is shown in Fig.
3.11.
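The cell-by-cell behaviour of these functions can be imitated in plain Python. The sketch below is not the API of any GIS package: the raster is assumed to be a small nested list, the helper name is hypothetical, and cells where the function is undefined (for example the logarithm of zero or of a negative value) are assumed to become `None` as a stand-in for NoData.

```python
import math


def apply_cellwise(raster, func):
    """Apply func to every cell of a raster stored as a list of rows.

    Cells for which func is undefined are returned as None.
    """
    result = []
    for row in raster:
        new_row = []
        for value in row:
            try:
                new_row.append(func(value))
            except (ValueError, TypeError):
                new_row.append(None)
        result.append(new_row)
    return result


# Hypothetical 2 x 2 input raster; the zero cell has no logarithm.
raster = [[1.0, 10.0], [100.0, 0.0]]

ln_raster = apply_cellwise(raster, math.log)       # natural log (Ln)
log2_raster = apply_cellwise(raster, math.log2)    # base 2 (Log2)
log10_raster = apply_cellwise(raster, math.log10)  # base 10 (Log10)
exp_raster = apply_cellwise(raster, math.exp)      # base e (Exp)
exp10_raster = apply_cellwise(raster, lambda v: 10.0 ** v)  # base 10 (Exp10)

print(log10_raster)
```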
Power functions
The general math tools incorporate three distinct power functions. These
functions enable the manipulation of numbers within the input raster in various
ways: by calculating the square root (Square Root), squaring (Square), or
SAQ II
a) How is polygon perimeter measured?
b) In what way is the Poisson model useful?
c) What are the functions of the arithmetic operators?
d) What are the power functions available in General Math tools?
3.7 SUMMARY
Let us summarise what we have studied in this unit.
Geostatistics, which is considered a part of statistics, is employed to
examine and forecast the values connected to spatial or spatiotemporal
occurrences. The analyses include the spatial (and, in some cases,
temporal) coordinates of the data.
The three main tools that geostatistics provides are i) Semi-variograms to
model the relationship between all pairs of points; ii) Kriging modelling to
predict values at unsampled locations; and iii) Standard error to measure
confidence in the predictions at unsampled locations.
Commonly used linear distance measurement tools in GIS are the
Euclidean (Pythagorean Theorem) and Manhattan distance methods. The
Euclidean distance method is applied to know the distance between
immediate neighbours or the number of points covered under a particular
buffer distance from a point of interest.
The Manhattan distance, sometimes referred to as the 'round-the-block' or
'city block' distance between two points, replaces the hypotenuse of the
right triangle with the sum of the lengths of the two sides. The spectral
distance, on the other hand, serves as a spectral metric commonly
3.12 ANSWERS
SAQ I
a) The three main tools of geostatistics used in interpolation and uncertainty
models are 1) semi-variograms, 2) kriging technique, and 3) standard error.
b) Tobler’s First Law of Geography states that everything is related to
everything else, but near things are more related than distant things.
c) Kriging interpolation is a method of interpolation that uses the spatial
correlation between samples to forecast values at unsurveyed locations.
d) Euclidean distance measurement works on the basis of the Pythagorean
Theorem.