Principal of Geography Information - Unit1-Unit5
Acuity Educare
PRINCIPLES OF
GEOGRAPHIC
INFORMATION SYSTEM
SEM : VI
SEM VI : UNIT 1- 5
UNIT – 1
A Gentle Introduction to GIS
Fundamental Observations:
Many aspects of our daily lives and our environment are constantly changing,
and not always for the better. Some of these changes appear to have natural
causes (e.g. volcanic eruptions, meteorite impacts), while others are the result of
human modification of the environment (e.g. land use changes or land
reclamation from the sea).
There are also a large number of global changes for which the cause remains
unclear: these include global warming, landslides and soil erosion.
Defining GIS:
A GIS user can expect support from the system to enter (georeferenced) data,
to analyse it in various ways, and to produce presentations (including maps
and other types) from the data.
This includes support for various kinds of coordinate systems and
transformations between them, and options for analysis of the georeferenced data.
GISystem
Page 1 of 110
YouTube - Abhay More | Telegram - abhay_more
607A, 6th floor, Ecstasy business park, city of joy, JSD road, mulund (W) | 8591065589/022-25600622
TRAINING -> CERTIFICATION -> PLACEMENT BSC IT : SEM - VI : PGIS U1 - U5
(1) Hardware
(2) Software
(3) Data/Information
(4) Users/People
(5) Procedures/Methods and
(6) Network
GIScience
GIS Application
A geographic information application is a service that deals with geographic
information, such as the design and development of a GIS, or geographic information
retrieval and analysis. For example, MapQuest (www.mapquest.com) provides a
routing service for people to find the best driving route between two points.
A GIService allows GIS users to access specific functions that are provided by remote
sites through the internet.
Some examples are: MapQuest, Google Maps, Bing Maps, Yahoo Maps, Apple
Maps, Yandex Maps, OpenStreetMap and WikiMapia.
Maps
Maps are perhaps the best known (conventional) models of the real world.
Maps have been used for thousands of years to represent information about the
real world, and continue to be extremely useful for many applications in various
domains.
A disadvantage of the traditional paper map is that it is generally restricted to
two-dimensional static representations, and that it is always displayed at a fixed scale.
The map scale determines the spatial resolution of the graphic feature
representation.
Databases
A database is a repository for storing large amounts of data. It comes with a number
of useful functions:
• A database can be used by multiple users at the same time, i.e. it allows
concurrent use.
• A database offers a number of techniques for storing data and allows the use of the
most efficient one, i.e. it supports storage optimization.
• A database allows the imposition of rules on the stored data; rules that will be
automatically checked after each update to the data, i.e. it supports data integrity.
• A database offers an easy to use data manipulation language, which allows the
execution of all sorts of data extraction and data updates, i.e. it has a query facility.
• A database will try to execute each query in the data manipulation language in the
most efficient way, i.e. it offers query optimization.
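Two of these functions, data integrity and the query facility, can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name, its columns and the rule on the area value are hypothetical and do not come from these notes.

```python
import sqlite3

# In-memory database; the "parcel" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parcel ("
    " id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " area_m2 REAL CHECK (area_m2 > 0)"  # integrity rule checked on every update
    ")"
)
conn.execute("INSERT INTO parcel (owner, area_m2) VALUES (?, ?)", ("owner_a", 420.0))
conn.execute("INSERT INTO parcel (owner, area_m2) VALUES (?, ?)", ("owner_b", 1250.5))

# Data integrity: a row that violates the rule is rejected by the database.
try:
    conn.execute("INSERT INTO parcel (owner, area_m2) VALUES (?, ?)", ("owner_c", -5.0))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Query facility: extract data with the data manipulation language (SQL).
large_parcels = conn.execute(
    "SELECT owner FROM parcel WHERE area_m2 > 1000 ORDER BY owner"
).fetchall()
```

Here the database itself refuses the negative area, so the rule does not have to be re-implemented in every program that writes to the table.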
A GIS must store its data in some way. For this purpose the previous generation of
software was equipped with relatively rudimentary facilities.
Since the 1990s there has been an increasing trend in GIS applications that used a
GIS for spatial analysis, and used a database for storage.
In more recent years, spatial databases (also known as geodatabases) have
emerged. Besides traditional administrative data, they can store representations of
real world geographic phenomena for use in a GIS.
Spatial databases are special because they use additional techniques, different from
tables, to store these spatial representations.
Spatial analysis is the generic term for all manipulations of spatial data carried out
to improve one’s understanding of the geographic phenomena that the data
represents.
It involves questions about how the data in various layers might relate to each
other, and how it varies over space.
For example, in the El Niño case, we may want to identify the steepest gradient
in water temperature.
The aim of spatial analysis is usually to gain a better understanding of geographic
phenomena through discovering patterns that were previously unknown to us, or to
build arguments on which to base important decisions.
It should be noted that some GIS functions for spatial analysis are simple and
easy to use, while others are much more sophisticated and demand higher levels of
analytical and operating skills.
Successful spatial analysis requires appropriate software, hardware, and perhaps
most importantly, a competent user.
As discussed in the previous chapter, we use GISs to help analyse and understand
more about processes and phenomena in the real world.
Modelling is the process of producing an abstraction of the ‘real world’ so that some
part of it can be more easily handled.
In other words, it is the process of building a representation which has certain
characteristics in common with the real world.
In practical terms, this refers to the process of representing key aspects of the real
world digitally (inside a computer). These representations are made up of spatial
data, stored in memory in the form of bits and bytes, on media such as the hard drive
of a computer.
This digital representation can then be subjected to various analytical functions
(computations) in the GIS, and the output can be visualized in various ways.
A GIS operates under the assumption that the relevant spatial phenomena occur in
a two- or three-dimensional Euclidean space, unless otherwise specified.
Euclidean space can be informally defined as a model of space in which locations
are represented by coordinates, (x, y) in 2D and (x, y, z) in 3D, and distance and
direction can be defined with geometric formulas. In the 2D case, this is known as the
Euclidean plane, which is the most common Euclidean space in GIS use.
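The geometric formulas meant here can be sketched in a few lines of Python. The function names are made up for this illustration, and the convention of measuring direction counter-clockwise from the positive x-axis is an assumption (compass bearings use a different convention).

```python
import math

def distance(p, q):
    """Euclidean distance between two points given as (x, y) or (x, y, z)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def direction_deg(p, q):
    """Direction from p to q in the 2D plane, in degrees measured
    counter-clockwise from the positive x-axis (assumed convention)."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))
```

For instance, distance((0, 0), (3, 4)) evaluates the familiar 3-4-5 right triangle, and the same function works unchanged for (x, y, z) triplets.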
In order to be able to represent relevant aspects of real world phenomena inside a
GIS, we first need to define what it is we are referring to. A geographic
phenomenon is something of interest that:
• Can be named or described, and
• Can be georeferenced.
Geographic phenomena come in many different ‘flavours’, which we will try
to categorize below. Before doing so, we must make two further observations.
Firstly, representing a phenomenon in a GIS requires us to state
what it is, and where it is. We must provide a description, or at least a name, on
the one hand, and a georeference on the other hand.
Secondly, some phenomena manifest themselves essentially everywhere in the
study area, while others only do so in certain localities. If we define our study area
as the equatorial Pacific Ocean, we can say that Sea Surface Temperature can be
measured anywhere in the study area. Therefore, it is a typical example of a
(geographic) field.
A (geographic) field is a geographic phenomenon for which, for every point in the
study area, a value can be determined.
Some common examples of geographic fields are air temperature, barometric
pressure and elevation.
Elevation in the Falset study area, Tarragona province, Spain. The area is
approximately 25 × 20 km. The illustration has been aesthetically improved by a
technique known as ‘hillshading’. In this case, it is as if the sun shines from the
north-west, giving a shadow effect towards the south-east. Thus, colour alone
is not a good indicator of elevation; observe that elevation is a continuous
function over the space.
Figure 2.2: A continuous field example, namely the elevation in the study area of
Falset, Spain.
Data source: Department of Earth Systems Analysis (ESA, ITC)
Geographic fields
A field is a geographic phenomenon that has a value ‘everywhere’ in the study area.
We can therefore think of a field as a mathematical function f that associates a
specific value with any position in the study area.
Hence if (x, y) is a position in the study area, then f(x, y) stands for the value of the
field f at locality (x, y).
Fields can be discrete or continuous. In a continuous field, the underlying function is
assumed to be ‘mathematically smooth’, meaning that the field values along any
path through the study area do not change abruptly, but only gradually.
Good examples of continuous fields are air temperature, barometric pressure, soil
salinity and elevation. Continuity means that all changes in field values are gradual.
Discrete fields divide the study space into mutually exclusive, bounded parts, with all
locations in one part having the same field value.
Typical examples are land classifications, for instance, using either geological
classes, soil type, land use type, crop type or natural vegetation type.
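The difference between the two kinds of field can be sketched as two Python functions of position. Both functions, their coefficients and their class boundaries are invented for illustration only; real fields are measured, not given by such simple rules.

```python
def elevation(x, y):
    """Hypothetical continuous field: the value changes gradually with position."""
    return 100.0 + 0.5 * x + 0.2 * y

def land_use(x, y):
    """Hypothetical discrete field: the study area is split into bounded parts,
    and every location inside one part has the same value."""
    if x < 10:
        return "forest"
    if y < 5:
        return "cropland"
    return "urban"
```

Moving a small step in the first field changes the value only slightly, whereas the second field keeps one value within a part and jumps to another value at the part's boundary.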
Since we have now differentiated between continuous and discrete fields, we may
also look at different kinds of data values which we can use to represent our
‘phenomena’. It is important to note that some of these data types limit the types of
analyses that we can do on the data itself:
Nominal data values are values that provide a name or identifier so that we can
discriminate between different values, but that is about all we can do. Specifically,
we cannot do true computations with these values. An example is the names of
geological units.
Ordinal data values are data values that can be put in some natural sequence but
that do not allow any other type of computation. Household income, for instance,
could be classified as being either ‘low’, ‘average’ or ‘high’.
Interval data values are quantitative, in that they allow simple forms of computation
like addition and subtraction. However, interval data has no arithmetic zero value,
and does not support multiplication or division.
Ratio data values allow most, if not all, forms of arithmetic computation. Ratio data
have a natural zero value, and multiplication and division of values are possible
operators (distances measured in metres are an example).
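The four kinds of data value can be illustrated by the operations that make sense on each. The following Python sketch uses made-up values; the temperature example is an addition for illustration (temperatures in degrees Celsius are a standard example of interval data).

```python
# Nominal: names can only be compared for equality.
unit_a, unit_b = "granite", "basalt"
same_unit = (unit_a == unit_b)

# Ordinal: values have a natural sequence, expressed here by an explicit rank.
income_rank = {"low": 0, "average": 1, "high": 2}
is_higher = income_rank["high"] > income_rank["low"]

# Interval: differences are meaningful, ratios are not (0 degrees Celsius is
# an arbitrary zero, so 20 degrees is not "twice as warm" as 10 degrees).
temp_morning_c, temp_noon_c = 10.0, 20.0
warming = temp_noon_c - temp_morning_c

# Ratio: a natural zero makes multiplication and division meaningful.
dist_a_m, dist_b_m = 250.0, 1000.0
times_longer = dist_b_m / dist_a_m  # distance b is four times distance a
```

Python will happily divide two Celsius values as well; it is the data type of the phenomenon, not the programming language, that tells us the result would be meaningless.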
Geographic objects
When a geographic phenomenon is not present everywhere in the study area, but
somehow ‘sparsely’ populates it, we look at it as a collection of geographic objects.
Such objects are usually easily distinguished and named, and their position in
space is determined by a combination of one or more of the following parameters:
• Location (where is it?),
• Shape (what form is it?),
• Size (how big is it?), and
• Orientation (in which direction is it facing?).
Boundaries
Where shape and/or size of contiguous areas matter, the notion of boundary comes
into play. This is true for geographic objects but also for the constituents of a
discrete geographic field. Location, shape and size are fully determined if we know
an area’s boundary, so the boundary is a good candidate for representing it.
This is especially true for areas that have naturally crisp boundaries.
Fuzzy boundaries contrast with crisp boundaries in that the boundary is not a
precise line, but rather itself an area of transition.
In order to represent a continuous field such as elevation faithfully in computer
memory, we could either:
• Try to store as many (location, elevation) observation pairs as possible, or
• Try to find a symbolic representation of the elevation field function, as a formula in
x and y, like (3.0678x² + 20.08x − 7.34y) or so, which can be evaluated to give us
the elevation at any given (x, y) location.
Both of these approaches have their drawbacks. The first suffers from the fact that
we will never be able to store all elevation values for all locations; after all, there
are infinitely many locations.
The second approach suffers from the fact that we do not know just what this
function should look like, and that it would be extremely difficult to derive such a
function for larger areas.
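The two approaches can be contrasted in a small Python sketch. The sample values are made up, nearest-sample lookup is only one simple way to answer for unmeasured locations, and the minus sign before 7.34y in the example formula is assumed, since the operator is lost in the source text.

```python
# Approach 1: store a finite set of (location, elevation) observation pairs.
# The sample values are invented for illustration.
samples = {(0.0, 0.0): 12.0, (10.0, 0.0): 15.5, (0.0, 10.0): 9.0}

def elevation_from_samples(x, y):
    """Return the stored elevation of the nearest sample location."""
    nearest = min(samples, key=lambda p: (p[0] - x) ** 2 + (p[1] - y) ** 2)
    return samples[nearest]

# Approach 2: a symbolic formula in x and y (the example formula from the
# text, with the sign of the last term assumed to be minus).
def elevation_from_formula(x, y):
    return 3.0678 * x ** 2 + 20.08 * x - 7.34 * y
```

The first function can only ever return one of the stored values, which shows the drawback of a finite sample; the second answers everywhere, but only if a suitable formula exists in the first place.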
Regular tessellations
A tessellation (or tiling) is a partitioning of space into mutually exclusive cells that
together make up the complete study space. With each cell, some (thematic) value
is associated to characterize that part of space.
In a regular tessellation, the cells are the same shape and size. The simplest
example is a rectangular raster of unit squares, represented in a computer in the
2D case as an array of n × m elements.
Figure 2.5: The three most common regular tessellation types: square cells,
hexagonal cells, and triangular cells.
In all regular tessellations, the cells are of the same shape and size, and the field
attribute value assigned to a cell is associated with the entire area occupied by
the cell. The square cell tessellation is by far the most commonly used, mainly
because georeferencing a cell is so straightforward. These tessellations are known
under various names in different GIS packages, but most frequently as rasters.
A raster is a set of regularly spaced (and contiguous) cells with associated (field)
values. The associated values represent cell values, not point values. This means
that the value for a cell is assumed to be valid for all locations within the cell.
Irregular tessellations
Irregular tessellations are more complex than the regular ones, but they are also
more adaptive, which typically leads to a reduction in the amount of memory used
to store the data.
A well-known data structure in this family, upon which many more variations
have been based, is the region quadtree. It is based on a regular tessellation of
square cells, but takes advantage of cases where neighbouring cells have the
same field value, so that they can together be represented as one bigger cell.
Consider a small 8 × 8 raster with three possible field values: white, green and
blue. The quadtree that represents this raster is constructed by repeatedly
splitting up the area into four quadrants, which are called NW, NE, SE, SW for
obvious reasons. This procedure stops when all the cells in a quadrant have
the same field value.
The procedure produces an upside-down, tree-like structure, known as a
quadtree. In main memory, the nodes of a quadtree (both circles and squares
in the figure) are represented as records. The links between them are
pointers, a programming technique to address (i.e. to point to) other records.
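The splitting procedure can be sketched in Python as follows. This is a minimal illustration, not the implementation of any particular GIS: it assumes a square raster whose side is a power of two, uses a nested dict where the text speaks of records and pointers, and the 4 × 4 example raster (W, G, B for white, green, blue) is made up.

```python
def build_quadtree(raster, row=0, col=0, size=None):
    """Build a region quadtree for a square raster whose side is a power of two.
    A leaf is the single field value of a uniform quadrant; an internal node
    is a dict holding the four children NW, NE, SE, SW."""
    if size is None:
        size = len(raster)
    first = raster[row][col]
    if all(raster[r][c] == first
           for r in range(row, row + size)
           for c in range(col, col + size)):
        return first  # all cells have the same value: stop splitting
    half = size // 2
    return {
        "NW": build_quadtree(raster, row, col, half),
        "NE": build_quadtree(raster, row, col + half, half),
        "SE": build_quadtree(raster, row + half, col + half, half),
        "SW": build_quadtree(raster, row + half, col, half),
    }

def count_leaves(node):
    """Number of leaf nodes, i.e. of uniform blocks stored in the tree."""
    if isinstance(node, dict):
        return sum(count_leaves(child) for child in node.values())
    return 1

# Hypothetical 4 x 4 raster; row 0 is the northern edge.
raster = [
    "WWGG",
    "WWGG",
    "BBGW",
    "BBWW",
]
tree = build_quadtree(raster)
```

In this example three of the four top-level quadrants are uniform and become single leaves, so the tree stores fewer values than the 16 cells of the raster.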
Vector representations
A commonly used data structure in GIS software is the triangulated irregular
network, or TIN.
It is one of the standard implementation techniques for digital terrain models, but
it can be used to represent any continuous field. The principles behind a TIN
are simple.
It is built from a set of locations for which we have a measurement, for instance
an elevation. The locations can be arbitrarily scattered in space, and are usually
not on a nice regular grid. Any location together with its elevation value can be
viewed as a point in three-dimensional space.
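One common way to use the triangles of a TIN is to treat each triangle as a tilted plane through its three measured points, and to read the elevation of any location inside the triangle from that plane. The Python sketch below assumes this planar interpolation; the function name and the sample coordinates used with it are hypothetical.

```python
def interpolate_in_triangle(p1, p2, p3, x, y):
    """Each p is a measured point (x, y, z). Return the z value of the plane
    through p1, p2 and p3 at location (x, y), or None when (x, y) lies
    outside the triangle. Uses barycentric weights."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = p1, p2, p3
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("the three locations are collinear")
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None  # a negative weight means the location is outside
    return w1 * z1 + w2 * z2 + w3 * z3
```

At a corner of the triangle the function returns exactly the measured elevation of that corner, since one weight is 1 and the other two are 0.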
Figure 2.9: Two triangulations based on the input locations of Figure 2.8.
(a) one with many ‘stretched’ triangles; (b) the triangles are more equilateral;
this is a Delaunay triangulation.
Point representations
Points are defined as single coordinate pairs (x, y) when we work in 2D, or
coordinate triplets (x, y, z) when we work in 3D. The choice of coordinate system is
another matter, which we will discuss in Chapter 4.
Points are used to represent objects that are best described as shape- and
sizeless, zero-dimensional features. Whether this is the case really depends on the
purposes of the spatial application and also on the spatial extent of the objects
compared to the scale applied in the application. For a tourist city map, a park
will not usually be considered a point feature, but perhaps a museum will, and
certainly a public phone booth might be represented as a point.
Line representations
Line data are used to represent one-dimensional objects such as roads, railroads,
canals, rivers and power lines. Again, there is an issue of relevance for the
application and the scale that the application requires. For the example application of
mapping tourist information, bus, subway and streetcar routes are likely to be
relevant line features. Some cadastral systems, on the other hand, may consider
roads to be two-dimensional features, i.e. having a width as well.
Above, we discussed the notion that arbitrary, continuous curvilinear features are
just as difficult to represent as continuous fields. GISs therefore approximate
such features (finitely!) as lists of nodes. The two end nodes and zero or more
internal nodes or vertices define a line. Other terms for ‘line’ that are commonly
used in some GISs are polyline, arc or edge. A node or vertex is like a point (as
discussed above) but it only serves to define the line, and provide shape in order
to obtain a better approximation of the actual feature.
The straight parts of a line between two consecutive vertices or end nodes are
called line segments.
Figure 2.10: A line is defined by its two end nodes and zero or more internal
nodes, also known as vertices. This line representation has three vertices, and
therefore four line segments.
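A line stored as a list of nodes can be sketched in Python as follows. The coordinates are made up; like the line in Figure 2.10, this one has two end nodes and three internal vertices, and therefore four line segments.

```python
import math

# A line as an ordered list of nodes: two end nodes and three vertices.
# The coordinates are invented for illustration.
line = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0), (6.0, 8.0), (9.0, 12.0)]

def segments(nodes):
    """Pair up consecutive nodes into straight line segments."""
    return list(zip(nodes[:-1], nodes[1:]))

def length(nodes):
    """Total length as the sum of the Euclidean lengths of all segments."""
    return sum(math.dist(a, b) for a, b in segments(nodes))
```

Adding more vertices along a curved feature gives a closer approximation of its shape and length, at the cost of storing more nodes.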
Area representations
When area objects are stored using a vector approach, the usual technique is to
apply a boundary model. This means that each area feature is represented by
some arc/node structure that determines a polygon as the area’s boundary.
Common sense dictates that area features of the same kind are best stored in a
single data layer, represented by mutually non-overlapping polygons. In essence,
what we then get is an application-determined (i.e. adaptive) partition of space.
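Because a polygon's boundary fixes its shape and size, properties such as its area can be computed from the boundary vertices alone. The Python sketch below uses the shoelace formula for this; it assumes a simple (non-self-intersecting) polygon given as an ordered list of vertices, and the function name is hypothetical.

```python
def polygon_area(boundary):
    """Area enclosed by a simple polygon, given its boundary vertices in
    order (shoelace formula). The first vertex is not repeated at the end."""
    total = 0.0
    n = len(boundary)
    for i in range(n):
        x1, y1 = boundary[i]
        x2, y2 = boundary[(i + 1) % n]  # wrap around to close the boundary
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0
```

The absolute value makes the result independent of whether the vertices are listed clockwise or counter-clockwise.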
Topological relationships are built from simple elements into more complex
elements: nodes define line segments, and line segments connect to define lines,
which in turn define polygons.
Topological relationships
The mathematical properties of the geometric space used for spatial data can be
described as follows:
We can use the topological properties of interior and boundary to define relationships
between spatial features.
Suppose we consider a spatial region A. It has a boundary and an interior, both seen as
(infinite) sets of points, and which are denoted by boundary(A) and interior(A), respectively.
We consider all possible combinations of intersections (∩) between the boundary and the
interior of A with those of another region B, and test whether they are the empty set (∅) or not.
From these intersection patterns, we can derive eight (mutually exclusive) spatial relationships
between two regions. If, for instance, the interiors of A and B do not intersect, but their
boundaries do, yet a boundary of one does not intersect the interior of the other, we say that
A and B meet. In mathematics, we can therefore define the meets relationship using set
theory, as
A meets B ⇔ interior(A) ∩ interior(B) = ∅ ∧ boundary(A) ∩ boundary(B) ≠ ∅ ∧
interior(A) ∩ boundary(B) = ∅ ∧ boundary(A) ∩ interior(B) = ∅.
Figure 2.15 shows all eight spatial relationships: disjoint, meets, equals, inside,
covered by, contains, covers, and overlaps. These relationships can be used in queries
against a spatial database, and represent the ‘building blocks’ of more complex spatial
queries.
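The intersection tests behind these relationships can be sketched with Python sets. Real regions are infinite point sets, so this illustration makes a strong simplifying assumption: each region is an axis-aligned rectangle represented only by its integer lattice points. The function names are hypothetical, and only two of the eight relationships are shown.

```python
def lattice_region(x0, y0, x1, y1):
    """Return (interior, boundary) of an axis-aligned rectangle as sets of
    integer lattice points: a finite stand-in for the infinite point sets."""
    interior, boundary = set(), set()
    for x in range(x0, x1 + 1):
        for y in range(y0, y1 + 1):
            on_edge = x in (x0, x1) or y in (y0, y1)
            (boundary if on_edge else interior).add((x, y))
    return interior, boundary

def meets(a, b):
    """Interiors do not intersect, boundaries do, and neither boundary
    intersects the other region's interior."""
    int_a, bnd_a = a
    int_b, bnd_b = b
    return (not (int_a & int_b)
            and bool(bnd_a & bnd_b)
            and not (int_a & bnd_b)
            and not (bnd_a & int_b))

def disjoint(a, b):
    """No point of one region belongs to the other."""
    int_a, bnd_a = a
    int_b, bnd_b = b
    return not ((int_a | bnd_a) & (int_b | bnd_b))
```

For example, lattice_region(0, 0, 2, 2) and lattice_region(2, 0, 4, 2) share only the edge x = 2, so they meet; a spatial database would evaluate the same pattern on exact geometries rather than on lattice points.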
It is not without reason that our discussion of vector representations and spatial
topology has focused mostly on objects in two-dimensional space. The history of
spatial data handling is almost purely 2D, and this remains the case for the
majority of present-day GIS applications. Many application domains make use of
elevational data, but these are usually accommodated by so-called 2½D data
structures. These 2½D data structures are similar to the (above discussed) 2D
data structures using points, lines and areas.
There is, on the other hand, one important aspect in which 2½D data does
differ from standard 2D data, and that is in their association of an additional z-value
with each 0-simplex (‘node’). Thus, nodes also have an elevation value
associated with them. Essentially, this allows the GIS user to represent 1- and
2-simplices that are non-horizontal, and therefore, a piecewise planar, ‘wrinkled
surface’ can be constructed as well, much like a TIN. Note however, that one
cannot have two different nodes with identical x- and y-coordinates, but different
z-values. Such nodes would constitute a perfectly vertical feature, and this is not
allowed. Consequently, true solids cannot be represented in a 2½D GIS.
In Figure 2.17, we illustrate how a raster represents a continuous field like elevation. Different
shades of blue indicate different elevation values, with darker blues indicating higher
elevations. The choice of a blue colour spectrum is only to make the illustration aesthetically
pleasing; real elevation values are stored in the raster, so instead we could have printed a
real number value in each cell. This would not have made the figure very legible, however.
A raster can be thought of as a long list of field values: actually, there should be
m × n such values. The list is preceded with some extra information, like a single
georeference as the origin of the whole raster, a cell size indicator, the integer
values for m and n, and a data type indicator for interpreting cell values. Rasters
and quadtrees do not store the georeference of each cell, but infer it from the
above information about the raster.
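How a cell's georeference can be inferred from this header information is sketched below in Python. Several conventions are assumed here and are not stated in the text: the origin is the upper-left corner of the raster, m counts rows and n counts columns, values are stored row by row, and y decreases from one row to the next. The header values are made up.

```python
# Hypothetical raster header plus the long list of m * n field values.
header = {
    "origin_x": 1000.0,  # georeference of the raster origin (assumed upper-left)
    "origin_y": 2000.0,
    "cell_size": 10.0,   # side length of a square cell, in map units
    "m": 2,              # number of rows (assumed meaning of m)
    "n": 3,              # number of columns (assumed meaning of n)
}
values = [1, 2, 3, 4, 5, 6]  # stored row by row

def value_at(row, col):
    """Look up the field value of a cell in the flat list."""
    return values[row * header["n"] + col]

def cell_centre(row, col):
    """Infer the georeference of a cell centre from the header alone."""
    x = header["origin_x"] + (col + 0.5) * header["cell_size"]
    y = header["origin_y"] - (row + 0.5) * header["cell_size"]
    return x, y
```

Only five header numbers are stored, yet the position of every one of the m × n cells follows from them, which is why rasters need no per-cell coordinates.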
We briefly mention a final representation for fields like elevation, but using a vector
representation. This technique uses isolines of the field. An isoline is a linear feature that
connects the points with equal field value. When the field is elevation, we also speak of
contour lines. The elevation of the Falset study area is represented with contour lines in
Figure 2.18. Both TINs and isoline representations use vectors.
Remotely sensed images are an important data source for GIS applications.
Unprocessed digital images contain many pixels, with each pixel carrying a
reflectance value. Various techniques exist to process digital images into
classified images that can be stored in a GIS as a raster. Image classification attempts
to characterize each pixel into one of a finite list of classes, thereby obtaining an
interpretation of the contents of the image. The classes recognized can be crop
types as in the case of Figure 2.19 or urban land use classes as in the case of
Figure 2.20.
The main principle of data organization applied in GISs is that of a spatial
data layer. A spatial data layer is either a representation of a continuous or discrete
field, or a collection of objects of the same kind. Usually, the data is organized so
that similar elements are in a single data layer. For example, all telephone booth
point objects would be in one layer, and all road line objects in another. A data layer
contains spatial data, of any of the types discussed above, as well as attribute (or:
thematic) data, which further describes the field or objects in the layer. Attribute data
is quite often arranged in tabular form, maintained in some kind of geodatabase, as
we will see in Chapter 3. An example of two field data layers is provided in
Figure 2.23.
Data layers can be overlaid with each other, inside the GIS package, so as to study
combinations of geographic phenomena. We shall see later that a GIS can be used to study
UNIT – 2
Data management and processing systems
Hardware and software trends
Computers are also becoming increasingly affordable. Hand-held computers are now
commonplace in business and personal use, equipping field surveyors with powerful
tools, complete with GPS capabilities for instantaneous georeferencing.
In general, software technology has developed somewhat more slowly and often cannot
fully utilise the possibilities offered by the exponentially growing hardware
capabilities.
Existing software obviously performs better when run on faster computers.
Alongside these trends, there have also been significant developments in computer
networks. In essence, today almost any computer on Earth can connect to some
network, and contact computers virtually anywhere else, allowing fast and reliable
exchange of (spatial) data.
Mobile phones are more and more frequently being used to connect to computers on
the Internet. The UMTS protocol (Universal Mobile Telecommunications System)
allows digital communication of text, audio, and video at a rate of approximately 2
Mbps.
Bluetooth version 2.0 is a standard that offers up to 3 Mbps connections, especially
between palm- and laptop computers and their peripheral devices, such as a mobile
phone, GPS or printer at short range.
Wireless LANs (Local Area Networks), under the so-called WiFi standard, nowadays
For many years, analogue data sources were used, processing was done
manually, and paper maps were produced. The introduction of modern techniques has
led to an increased use of computers and digital information in all aspects of spatial
data handling. The software technology used in this domain is centered around
geographic information systems.
Typical planning projects require data sources, both spatial and non-spatial, from
different national institutes, like national mapping agencies, geological, soil, and
forest survey institutes, and national census bureaus.
GIS software
GIS can be considered to be a data store (i.e. a system that stores spatial data), a
toolbox, a technology, an information source or a field of science. The main
characteristics of a GIS software package are its analytical functions that provide
means for deriving new geoinformation from existing spatial and attribute data.
The use of tools for problem solving is one thing, but the production of these tools is
something quite different. Not all tools are equally well-suited for a particular
application, and they can be improved and perfected to better serve a particular
need or application.
The discipline of geographic information science is driven by the use of our GIS
tools, and these are in turn improved by new insights and information gained through
their application in various scientific fields.
All GIS packages available on the market have their strengths and weaknesses,
typically resulting from the development history and/or intended application domain(s)
of the package.
Well-known, full-fledged GIS packages include ILWIS, Intergraph’s GeoMedia,
ESRI’s ArcGIS, and MapInfo from MapInfo Corp.
Typically, an SDI provides its users with different facilities for finding, viewing,
downloading and processing data. Because the organizations in an SDI are normally
widely distributed over space, computer networks are used as the means of
communication.
With the development of the internet, the functional components of GIS have
gradually become available as web-based applications.
The functions for capturing data are closely related to the disciplines of surveying
engineering, photogrammetry, remote sensing, and the processes of digitizing, i.e.
the conversion of analogue data into digital representations.
Remote sensing, in particular, is the field that provides photographs and images as
the raw base data from which spatial data sets are derived.
Surveys of the study area often need to be conducted for data that cannot be
obtained with remote sensing techniques, or to validate data thus obtained.
Traditional techniques for obtaining spatial data, typically from paper sources,
included manual digitizing and scanning.
Table 3.2 lists the main methods and devices used for data capture. In recent years
there has been a significant increase in the availability and sharing of digital
(geospatial) data.
The data, once obtained in some digital format, may not be quite ready for use in the
system. This may be because the format obtained from the capturing process is not
quite the format required for storage and further use, which means that some type of
data conversion is required.
In part, this problem may also arise when the captured data represents only raw
base data, out of which the real data objects of interest to the
system still need to be constructed.
The way that data is stored plays a central role in the processing and the eventual
understanding of that data. In most of the available systems, spatial data is
organized in layers by theme and/or scale.
For instance, the data may be organized in thematic categories, such as land use,
topography and administrative subdivisions, or according to map scale.
In a GIS, the geometry of features is represented with primitives of
the respective dimension: a windmill probably as a point, an agricultural field as a
polygon. The primitives follow either the vector approach, as in this example, or the
raster approach.
Vector data types describe an object through its boundary, thus dividing the space
into parts that are occupied by the respective objects. The raster approach subdivides
space into (regular) cells, mostly as a square tessellation of dimension two or three.
These cells are called cells or pixels in 2D, and voxels in 3D. For a discrete field,
the data indicates for every cell which real-world feature it covers.
For a continuous field, the cell holds a representative value for that field.
Table 3.3 lists advantages and disadvantages of raster and vector representations.
GIS software packages provide support for both spatial and attribute data, i.e. they
accommodate spatial data storage using a vector approach, and attribute data using
tables. Historically, however, database management systems (DBMSs) have been
based on the notion of tables for data storage.
For some time, substantial GIS applications have been able to link to an external
database to store attribute data and make use of its superior data management
functions.
Currently, all major GIS packages provide facilities to link with a DBMS and
exchange attribute data with it.
The most distinguishing parts of a GIS are its functions for spatial analysis, i.e.
operators that use spatial data to derive new geoinformation.
Spatial queries and process models play an important role in this functionality. One
of the key uses of GISs has been to support spatial decisions.
Spatial decision support systems (SDSS) are a category of information systems
composed of a database, GIS software, models, and a so-called knowledge engine
which allow users to deal specifically with locational problems.
The analysis functions of a GIS use the spatial and non-spatial attributes of the
data in a spatial database to provide answers to user questions. GIS functions
are used for maintenance of the data, and for analysing the data in order to infer
information from it.
Table 3.4 lists several different methods and devices used for the presentation of
spatial data. Cartography and scientific visualization make use of these methods
and devices to produce their products.
Secondly, one needs to identify the available data sources and define the format
in which the data will be organized within the database. This format is usually
called the database structure. Lastly, data can be entered into the database.
There are various reasons why one would want to use a DBMS for data storage and processing.
• A DBMS supports the storage and manipulation of very large data sets.
• A DBMS can be instructed to guard over data correctness.
• A DBMS supports the concurrent use of the same data set by many users.
The decision whether or not to use a DBMS will depend, among other things, on
how much data there is or will be, what type of use will be made of it, and how
many users might be involved.
On the small-scale side of the spectrum (when the data set is small, its use
relatively simple, and with just one user) we might use simple text files and a text
processor. Think of a personal address book as an example, or a small set of
simple field observations. Text files offer no support for data analysis
whatsoever, except perhaps for alphabetical sorting.
If our data set is still small and numeric by nature, and we have a single type of
use in mind, a spreadsheet program will suffice. This might be the case if we
have a number of field observations with measurements that we want to prepare
for statistical analysis, for example. However, if we carry out region- or nationwide
censuses, with many observation stations and/or field observers and all
sorts of different measurements, one quickly needs a database to keep track of
all the data. It should also be noted that spreadsheets do not accommodate
concurrent use of the data set well, although they do support some data analysis,
especially when it comes to calculations over a single table, like averages, sums,
minimum and maximum values.
All such computations are usually restricted to just a single table of data. When
one wants to relate the values in the table with values of another nature in some
other table, some expertise and significant amounts of time are usually required
to make this happen.
When a relation is created, we need to indicate what type of tuples it will store. This means
that we must
Database systems are particularly good at storing large quantities of data. The
DBMS must support quick searches amongst many tuples. This is why the
relational data model uses the notion of a key.
A key of a relation comprises one or more attributes. A value for these attributes
uniquely identifies a tuple. If we have a value for each of the key attributes we are
guaranteed to find no more than one tuple in the table with that combination of
values; it remains possible that there is no tuple for the given combination. Every
relation has a key.
A tuple can refer to another tuple by storing that other tuple's key value. The
attribute that stores this value is called a foreign key because it refers to the
primary key of another relation. Two tuples of the same relation instance can have
identical foreign key values.
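The key and foreign key ideas can be sketched with plain Python dictionaries. This is only an illustration: the relation names (owners, parcels), the attribute names and all values below are made up and do not come from a real database.

```python
# Hypothetical relation of owners; TaxId is the key attribute.
owners = {
    101: {"TaxId": 101, "Surname": "Garcia"},
    102: {"TaxId": 102, "Surname": "Patel"},
}

# Hypothetical relation of parcels; ParcelId is the key attribute.
# OwnerTaxId is a foreign key: it stores the key value of an owner tuple.
parcels = {
    3421: {"ParcelId": 3421, "AreaSize": 435, "OwnerTaxId": 101},
    8871: {"ParcelId": 8871, "AreaSize": 550, "OwnerTaxId": 101},
    2109: {"ParcelId": 2109, "AreaSize": 245, "OwnerTaxId": 102},
}


def find_owner(parcel_id):
    """Follow the foreign key from a parcel to its owner tuple, or None."""
    parcel = parcels.get(parcel_id)  # a key value finds at most one tuple
    if parcel is None:
        return None                  # there may be no tuple for a given key
    return owners.get(parcel["OwnerTaxId"])
```

Parcels 3421 and 8871 carry the same foreign key value, which is allowed; only the key of each relation has to be unique.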
A query is a computer program that extracts data from the database that meet
the conditions indicated in the query. The first query operator is called tuple
selection. Tuple selection works like a filter: it allows tuples that meet the selection
condition to pass, and disallows tuples that do not meet the condition.
The operator is given some input relation, as well as a selection condition about
tuples in the input relation. A selection condition is a truth statement about a tuple's
attribute values, such as Distance < 1000.
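Tuple selection can be sketched as a filter over a list of tuples. The relation below and its attribute names are hypothetical, and the function name select is an illustrative choice, not part of any DBMS.

```python
def select(relation, condition):
    """Tuple selection: keep only the tuples for which condition(t) is true."""
    return [t for t in relation if condition(t)]


# Hypothetical input relation with a Distance attribute.
stations = [
    {"Name": "A", "Distance": 250},
    {"Name": "B", "Distance": 1000},
    {"Name": "C", "Distance": 999},
]

# Selection condition from the text: Distance < 1000.
nearby = select(stations, lambda t: t["Distance"] < 1000)
```

The output has the same attributes as the input; only the number of tuples changes.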
The second operator is called attribute projection. Besides an input relation, this
operator requires a list of attributes, all of which should be attributes of the
schema of the input relation. Attribute projection works like a tuple formatter: it
passes through all tuples of the input, but reshapes each of them in the same way.
The output relation of this operator has as its schema only the list of attributes
given, and we say that the operator projects onto these attributes. The most
common way of defining queries in a relational database is through the SQL
language. SQL stands for Structured Query Language.
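Attribute projection can be sketched in the same style. The Parcel relation, its attributes and the function name project are assumptions made for this example only.

```python
def project(relation, attributes):
    """Attribute projection: reshape every tuple to the listed attributes."""
    return [{a: t[a] for a in attributes} for t in relation]


# Hypothetical input relation.
parcels = [
    {"PId": 3421, "Location": 2001, "AreaSize": 435},
    {"PId": 8871, "Location": 1462, "AreaSize": 550},
]

# Roughly what "SELECT PId, AreaSize FROM Parcel" would return in SQL.
result = project(parcels, ["PId", "AreaSize"])
```

Every input tuple passes through, but each output tuple carries only the attributes that were projected onto.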
'understanding' of geographic space and all functions that derive from this, for
purposes such as storage, analysis, and map production.
GIS packages themselves can store tabular data; however, they do not
always provide a full-fledged query language to operate on the tables.
DBMSs have a long tradition in handling attribute (i.e. administrative, non-
spatial, tabular, thematic) data in a secure way, for multiple users at the
same time.
DBMSs offer much better table functionality, since they are specifically
designed for this purpose. A lot of the data in GIS applications is attribute
data, so it made sense to use a DBMS for it. For this reason, many GIS
applications have made use of an external DBMS for data support.
With raster representations, each raster cell stores a characteristic value.
This value can be used to look up attribute data in an accompanying
database table.
For instance, the land use raster of Figure 3.7 indicates the land use class for
each of its cells, while an accompanying table provides full descriptions for all
classes, including perhaps some statistical information for each of the types.
Observe the similarity with the key/foreign key concept in relational
databases.
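A minimal sketch of this lookup, assuming a tiny 3 x 3 raster and an invented class table (the codes and class names are not those of Figure 3.7):

```python
# Hypothetical land use raster: each cell stores a class code.
land_use = [
    [10, 10, 20],
    [10, 30, 20],
    [30, 30, 20],
]

# Accompanying attribute table, keyed by the same class code.
classes = {
    10: {"Class": "Forest"},
    20: {"Class": "Agriculture"},
    30: {"Class": "Urban"},
}


def describe_cell(row, col):
    """Use the cell value as a key into the attribute table."""
    return classes[land_use[row][col]]["Class"]


def cell_counts():
    """Count cells per class code: a simple per-class statistic."""
    counts = {}
    for row in land_use:
        for code in row:
            counts[code] = counts.get(code, 0) + 1
    return counts
```

The cell value plays the role of a foreign key, and the class code in the table plays the role of the key.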
DBMS vendors have recognized the need for storing more complex data, like
spatial data. The main problem was that a DBMS needs additional functionality
in order to process and manage spatial data. Object-oriented and
object-relational data models were developed for just this purpose. These extend
standard relational models with support for objects, including 'spatial' objects.
GIS software packages are able to store spatial data using a range of commercial
and open source DBMSs such as Oracle, Informix, IBM DB2, Sybase, and
PostgreSQL, with the help of spatial extensions. Some GIS software have
integrated database 'engines', and therefore do not need these extensions.
ESRI's ArcGIS and QGIS, for example, have database software built in. This
means that the designer of a GIS application can choose whether to store the
application data in the GIS or in the DBMS. Spatial databases, also known as
geodatabases, are implemented directly on existing DBMSs, using extension
software to allow them to handle spatial objects.
A spatial database allows users to store, query and manipulate collections of
spatial data.
There are several advantages in doing this. Spatial data can be stored in a
special database column, known as the geometry column (or feature or shape,
depending on the specific software package). This means GISs can rely fully on
DBMS support for spatial data, making use of a DBMS for data query and
storage (and multi-user support), and GIS for spatial functionality. Small-scale
GIS applications may not require a multi-user capability, and can be supported by
spatial data support from a personal database.
A geodatabase allows a wide variety of users to access large data sets (both
geographic and alphanumeric), and the management of their relations,
guaranteeing their integrity. The Open Geospatial Consortium (OGC) has
released a series of standards relating to geodatabases that (amongst other
things), define:
❖ The data formats, called 'Simple Features' (i.e. point, line, polygon, etc.)
As a result one is able to use functions for 'spatial query' (exploring spatial
relationships). To illustrate, consider a spatial query in SQL that finds all metro
cities within 20 km of the river Ganga.
In this case the WHERE clause uses the ST_Intersects function to perform a
spatial join between a 20000 m buffer of the selected River and the selected
subset of Cities. The Geometry column carries the spatial data.
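The query described above can be sketched as follows, held in a Python string so that it could be handed to the driver of a spatially enabled DBMS. This is a reconstruction under assumptions: the table names Cities and Rivers and the columns Name, Type and Geometry are hypothetical, the function names follow the ST_ naming used by PostGIS, and the buffer distance is only in metres if the geometries use a metric coordinate system.

```python
# Hypothetical schema: Cities(Name, Type, Geometry), Rivers(Name, Geometry).
# ST_Buffer builds the 20000 m zone around the river; ST_Intersects keeps
# the cities whose geometry touches or overlaps that zone.
QUERY = """
SELECT C.Name
FROM Cities AS C, Rivers AS R
WHERE C.Type = 'Metro'
  AND R.Name = 'GANGA'
  AND ST_Intersects(C.Geometry, ST_Buffer(R.Geometry, 20000));
"""
```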
UNIT: 3
SPATIAL REFERENCING AND POSITIONING
SPATIAL REFERENCING
One of the defining features of GIS is their ability to combine spatially referenced
data. A frequently occurring issue is the need to combine spatial data from different
sources that use different spatial reference systems. This section provides a broad
background of relevant concepts relating to the nature of spatial reference
systems and the translation of data from one spatial referencing system into
another.
The surface of the Earth is anything but uniform. The oceans can be treated as
reasonably uniform, but the surface or topography of the land masses exhibits large
vertical variations between mountains and valleys.
These variations make it impossible to approximate the shape of the Earth with
any reasonably simple mathematical model. Two main reference surfaces have
been established to approximate the shape of the Earth. One reference surface is
called the Geoid, the other reference surface is the ellipsoid as shown in the figure
below.
Imagine that the entire Earth's surface is covered by water. If we ignore tidal and
current effects on this 'global ocean', the resultant water surface is affected only by
gravity. This has an effect on the shape of this surface because the direction of
gravity, more commonly known as the plumb line, is dependent on the mass distribution
inside the Earth.
Due to irregularities or mass anomalies in this distribution the 'global ocean' results
in an undulated surface. This surface is called the Geoid. The plumb line through
any surface point is always perpendicular to it.
The Geoid is used to describe heights. In order to establish the Geoid as reference
for heights, the ocean's water level is registered at coastal places over several
years using tide gauges (mareographs).
Averaging the registrations largely eliminates variations of the sea level with time.
The resulting water level represents an approximation to the Geoid and is called
the mean sea level.
The ellipsoid
The physical surface, called the Geoid, is used as a reference surface for heights. A
reference surface is also required for the description of the horizontal coordinates
of points of interest.
Since these horizontal coordinates will later be projected onto a mapping plane,
the reference surface for horizontal coordinates requires a mathematical definition
and description. The most convenient geometric reference is the oblate ellipsoid.
It provides a relatively simple figure which fits the Geoid to a first order
approximation, though for small scale mapping purposes a sphere may be used. An
ellipsoid is formed when an ellipse is rotated about its minor axis. This ellipse which
defines an ellipsoid or spheroid is called a meridian ellipse.
Local horizontal datums have been established to fit the Geoid well over the area
of local interest, which in the past was never larger than a continent. With
increasing demands for global surveying, activities are underway to establish global
reference surfaces.
The objective is to make geodetic results mutually comparable and to provide
coherent results also to other disciplines like astronomy and geophysics.
The most important global (geocentric) spatial reference system for the GIS
community is the International Terrestrial Reference System (ITRS).
It is a three dimensional coordinate system with a well-defined origin (the centre of
mass of the Earth) and three orthogonal coordinate axes (X, Y, Z).
The Z-axis points towards a mean Earth north pole. The X-axis is oriented towards
a mean Greenwich meridian and is orthogonal to the Z-axis. The Y-axis completes
the right-handed reference coordinate system.
We can easily transform ITRF coordinates (X, Y and Z in metres) into geographic
coordinates (φ, λ, h) with respect to the GRS80 ellipsoid without
loss of accuracy. However, the ellipsoidal height h, obtained through this
straightforward transformation, has no physical meaning and does not
correspond to intuitive human perception of height. We therefore use the
height H above the Geoid (see Figure 4.8).
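The transformation between geocentric (X, Y, Z) and geographic (φ, λ, h) coordinates can be sketched as below. This is one common iterative formulation, not necessarily the one used by any particular GIS package. The GRS80 constants are the published ones; the function names and the sample geoid separation in the usage note are assumptions, and the inverse routine is not valid exactly at the poles.

```python
import math

# GRS80 ellipsoid: semi-major axis (metres) and flattening.
A = 6378137.0
F = 1.0 / 298.257222101
E2 = F * (2.0 - F)  # first eccentricity squared


def geographic_to_geocentric(phi, lam, h):
    """(phi, lam) in radians, h in metres -> (X, Y, Z) in metres."""
    n = A / math.sqrt(1.0 - E2 * math.sin(phi) ** 2)  # prime vertical radius
    x = (n + h) * math.cos(phi) * math.cos(lam)
    y = (n + h) * math.cos(phi) * math.sin(lam)
    z = (n * (1.0 - E2) + h) * math.sin(phi)
    return x, y, z


def geocentric_to_geographic(x, y, z, iterations=10):
    """(X, Y, Z) in metres -> (phi, lam, h), by fixed-point iteration."""
    lam = math.atan2(y, x)
    p = math.hypot(x, y)                 # distance from the rotation axis
    phi = math.atan2(z, p * (1.0 - E2))  # initial guess, assuming h = 0
    h = 0.0
    for _ in range(iterations):
        n = A / math.sqrt(1.0 - E2 * math.sin(phi) ** 2)
        h = p / math.cos(phi) - n
        phi = math.atan2(z, p * (1.0 - E2 * n / (n + h)))
    return phi, lam, h


def orthometric_height(h, geoid_separation):
    """H = h - N, with N the height of the Geoid above the ellipsoid."""
    return h - geoid_separation
```

For example, with an ellipsoidal height of 100 m and an assumed geoid separation of 40 m at that location, orthometric_height(100.0, 40.0) gives a height H of 60 m above the Geoid.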
Coordinate systems
Different kinds of coordinate systems are used to position data in space. Spatial
(or global) coordinate systems are used to locate data either on the Earth's surface
in a 3D space, or on the Earth's reference surface in a 2D space. They include the
geographic coordinate system in 2D and 3D space and the geocentric coordinate system,
also known as the 3D Cartesian coordinate system. Planar coordinate systems, on the
other hand, are used to locate data on the flat surface of the map in a 2D space.
1. 2D Geographic coordinates (φ, λ)
The most widely used global coordinate system consists of lines of geographic
latitude (phi or φ) and longitude (lambda or λ). Lines of equal latitude are
called parallels. They form circles on the surface of the ellipsoid. Lines of equal
longitude are called meridians and they form ellipses (meridian ellipses) on the
ellipsoid.
1. The latitude (φ) of a point P is the angle between the ellipsoidal normal through P
and the equatorial plane. Latitude is zero on the equator (φ = 0°), and
increases towards the two poles to maximum values of φ = +90° (N 90°) at the
North Pole and φ = -90° (S 90°) at the South Pole.
2. The longitude (λ) is the angle between the meridian ellipse which passes through
Greenwich and the meridian ellipse containing the point in question. It is
measured in the equatorial plane from the meridian of Greenwich (λ = 0°) either
eastwards through λ = +180° (E 180°) or westwards through λ = -180° (W 180°).
It should be noted that the rotational axis of the earth changes its position over time
(referred to as polar motion). To compensate for this, the mean position of the pole
in the year 1903 (based on observations between 1900 and 1905) has been used
to define the so-called Conventional International Origin (CIO).
A flat map has only two dimensions: width (left to right) and length (bottom to
top). Transforming the three-dimensional Earth into a two-dimensional map is
the subject of map projections and coordinate transformations. As in several other
cartographic applications, two-dimensional Cartesian coordinates (x, y), also
known as planar rectangular coordinates, are used to describe the location of any
point unambiguously. The two coordinates x and y specify the location of any point
P on the map.
Polar coordinates are the distance d from the origin to the point concerned and the
angle α between a fixed (or zero) direction and the direction to the point. The angle
α is called azimuth or bearing and is measured in a clockwise direction.
It is given in angular units while the distance d is expressed in length units.
Bearings are always related to a fixed direction (initial bearing) or a datum line.
In principle, this reference line can be chosen freely. However, in practice three
different directions are widely used: True North, Grid North and Magnetic North. The
corresponding bearings are called: true (or geodetic) bearing, grid bearing and
magnetic (or compass) bearing.
Map projections
A large number of map projections have been developed, each with its own specific
qualities. These qualities in turn make resulting maps useful for certain purposes.
By definition, any map projection is associated with scale distortions.
There is simply no way to flatten out a piece of ellipsoidal or spherical surface
without stretching some parts of the surface more than others. Some map
projections can be visualized as true geometric projections directly onto the
mapping plane, in which case we call it an azimuthal projection, or onto an
intermediate surface, which is then rolled out into the mapping plane.
Typical choices for such intermediate surfaces are cones and cylinders. Such map
projections are then called conical, and cylindrical, respectively.
Coordinate transformations
Map and GIS users are mostly confronted in their work with transformations from
one two-dimensional coordinate system to another. This includes the transformation
of polar coordinates delivered by the surveyor into Cartesian map
coordinates or the transformation from one 2D Cartesian (x, y) system of a specific
map projection into another 2D Cartesian (x', y') system of a defined map
projection.
Datum transformations are transformations from a 3D coordinate system (i.e.
horizontal datum) into another 3D coordinate system. These kinds of transformations
are also important for map and GIS users, who usually collect
spatial data in the field using satellite navigation technology and need to represent
this data on a published map with a local horizontal datum.
$x = d \sin(\alpha)$
$y = d \cos(\alpha)$
The inverse equations are: $\alpha = \tan^{-1}(x / y)$
$d^2 = x^2 + y^2$
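These equations can be sketched in code as follows, assuming the bearing is given in degrees and measured clockwise from the positive y-axis (grid north). The function names are illustrative. The inverse uses atan2 rather than a plain arctangent so that the bearing lands in the correct quadrant and y = 0 does not cause a division by zero.

```python
import math


def polar_to_cartesian(d, alpha_deg):
    """Distance d and clockwise bearing alpha (degrees) -> (x, y)."""
    a = math.radians(alpha_deg)
    return d * math.sin(a), d * math.cos(a)


def cartesian_to_polar(x, y):
    """(x, y) -> distance d and clockwise bearing in [0, 360) degrees."""
    d = math.hypot(x, y)                           # d^2 = x^2 + y^2
    alpha = math.degrees(math.atan2(x, y)) % 360.0
    return d, alpha
```

A point 100 m away on a bearing of 90° lies due east of the origin, so polar_to_cartesian(100, 90) is (100, 0) up to rounding.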
A more realistic case makes use of a translation and a rotation to transform one
system to the other.
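A minimal sketch of such a transformation, assuming a counter-clockwise rotation by an angle theta about the origin followed by a shift (x0, y0). The rotation direction, the order of the two steps and the parameter names are conventions chosen for this example; real data sets may use a different convention.

```python
import math


def rotate_and_translate(x, y, theta_deg, x0, y0):
    """Rotate (x, y) counter-clockwise by theta, then shift by (x0, y0)."""
    t = math.radians(theta_deg)
    xp = x * math.cos(t) - y * math.sin(t) + x0
    yp = x * math.sin(t) + y * math.cos(t) + y0
    return xp, yp
```

With theta = 90° and a shift of (10, 20), the point (1, 0) maps to approximately (10, 21).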
Forward and inverse mapping equations are normally used to transform data from
one map projection to another. The inverse equation of the source projection is
used first to transform source projection coordinates (x, y) to geographic
coordinates (φ, λ).
Next, the forward equation of the target projection is used to transform the
geographic coordinates (φ, λ) into target projection coordinates (x', y').
The first equation takes us from a projection A into geographic coordinates. The
second takes us from geographic coordinates (φ, λ) to another map projection B.
These principles are illustrated in Figure 4.22.
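The two-step procedure can be sketched with two simple projections. Everything here is an assumption made for illustration: projection A is a spherical Mercator, projection B is an equirectangular (plate carrée) projection, and the Earth is treated as a sphere of radius 6371 km rather than an ellipsoid.

```python
import math

R = 6371000.0  # assumed sphere radius in metres


def mercator_forward(phi, lam):
    """Geographic (radians) -> spherical Mercator (x, y)."""
    return R * lam, R * math.log(math.tan(math.pi / 4.0 + phi / 2.0))


def mercator_inverse(x, y):
    """Spherical Mercator (x, y) -> geographic (phi, lam) in radians."""
    phi = 2.0 * math.atan(math.exp(y / R)) - math.pi / 2.0
    return phi, x / R


def equirectangular_forward(phi, lam):
    """Geographic (radians) -> equirectangular (x', y')."""
    return R * lam, R * phi


def reproject(x, y):
    """Projection A -> geographic -> projection B."""
    phi, lam = mercator_inverse(x, y)         # inverse equation of the source
    return equirectangular_forward(phi, lam)  # forward equation of the target
```

Note that both projections here share the same reference surface, so no datum transformation is involved.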
Datum transformations
A change of map projection may also include a change of the horizontal datum.
This is the case when the source projection is based upon a different horizontal
datum than the target projection. If the difference in horizontal datums is ignored,
there will not be a perfect match between adjacent maps of neighboring countries or
between overlaid maps originating from different projections.
It may result in up to several hundred meters difference in the resulting coordinates.
Therefore, spatial data with different underlying horizontal datums may need a so-
called datum transformation.
Suppose we wish to transform spatial data from the UTM projection to the Dutch
RD system, and that the data in the UTM system are related to the European
Datum 1950 (ED50), while the Dutch RD system is based on the Amersfoort datum.
In this example the change of map projection should be combined with a datum
transformation step for a perfect match. This is illustrated in Figure 4.23.
Satellite-based positioning
Satellites are used in geocentric reference systems, and increase the level of spatial
accuracy substantially. They are critical tools in geodetic engineering for the
maintenance of the ITRF. They also play a key role in mapping, surveying, and in
a growing number of applications requiring positioning techniques.
Nowadays, for fieldwork that includes spatial data acquisition, the use of satellite-
based positioning is considered indispensable. Satellite-based positioning was
developed and implemented to address military needs, somewhat analogously to
the early development of the internet.
The technology is now widely available for civilian use. The requirements for
the development of the positioning system were:
• Suitability for all kinds of military use: ground troops and vehicles, aircraft and missiles, ships;
• Requiring only low-cost equipment with low energy consumption at the receiver end;
• Provision of results in real time for an unlimited number of users concurrently;
• Support for different levels of accuracy (military versus civilian);
• Around-the-clock and weather-proof availability;
• Use of a single geodetic datum;
• Protection against intentional and unintentional disturbance, for instance, through a design allowing for redundancy.
A satellite-based positioning system set-up involves implementation of three
hardware segments:
1. The space segment, i.e. the satellites that orbit the Earth, and the radio signals that they emit,
2. The control segment, i.e. the ground stations that monitor and maintain the space segment components, and
3. The user segment, i.e. the users with their hardware and software to conduct positioning.
Absolute positioning
The working principles of absolute, satellite-based positioning are fairly simple:
1. A satellite, equipped with a clock, at a specific moment sends a radio message that
includes:
a) The satellite identifier,
b) Its position in orbit, and
c) Its clock reading.
2. A receiver on or above the planet, also equipped with a clock, receives the message
slightly later, and reads its own clock.
3. From the time delay observed between the two clock readings, and knowing the
speed of radio transmission through the medium between (satellite) sender and
receiver, the receiver can compute the distance to the sender, also known as the
satellite's pseudorange.
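Step 3 can be sketched as a one-line computation. The sketch assumes the signal travels at the vacuum speed of light and that both clock readings are expressed in seconds on the same time scale; the function name and the sample delay are made up for illustration.

```python
C = 299_792_458.0  # speed of light in vacuum, metres per second


def pseudorange(t_sent, t_received):
    """Distance implied by the delay between the two clock readings."""
    return C * (t_received - t_sent)


# Hypothetical delay of 0.07 s between sending and receiving.
distance_m = pseudorange(0.0, 0.07)
```

Any error in either clock goes straight into the result, which is why the value is called a pseudorange rather than a range.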
While latitude was determined with a sextant from the position of the Sun in the
sky, navigators carried clocks with them to determine the longitude of their position.
Early ship clocks were unreliable, having a drift of multiple seconds a day, which could
result in a positional error of a few kilometers.
Before any notion of standard time existed, villages and cities simply kept track of
their local time determined from position of the Sun in the sky. When trains became
an important means of transportation, these local time systems became
problematic as the schedules required a single time system.
Such a time system needed the definition of time zones: typically as 24 geographic
strips between certain longitudes that are multiples of 15°. This all gave rise to
Greenwich Mean Time (GMT). GMT was the world time standard of choice. It was
a system based on the mean solar time at the meridian of Greenwich, United
Kingdom, which is the conventional 0-meridian in geography.
Even atomic clocks can be off by a small margin, and since Einstein, we know
that travelling clocks are slower than resident clocks, due to a so-called
relativistic effect. If one understands that a clock that is off by 0.000001 sec
causes a computation error in the satellite's pseudorange of approximately 300
m, it is clear that these satellite clocks require very strict monitoring.
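The 300 m figure can be checked with a short calculation: the range error is the clock error multiplied by the signal speed. The sketch below assumes the vacuum speed of light; the function name and the additional offsets are illustrative only.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # vacuum value, assumed for the whole path


def range_error_m(clock_error_s: float) -> float:
    """Pseudorange error caused by a clock that is off by clock_error_s seconds."""
    return SPEED_OF_LIGHT_M_PER_S * abs(clock_error_s)


offsets = [("1 nanosecond", 1e-9), ("1 microsecond", 1e-6), ("1 millisecond", 1e-3)]
for label, offset_s in offsets:
    print(f"{label:>14}: {range_error_m(offset_s):,.1f} m")
```

An offset of one microsecond gives an error of just under 300 m, which matches the value quoted in the text.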
The medium between sender and receiver may influence the radio
signals. The middle atmospheric layers, the stratosphere and mesosphere, are
relatively harmless and of little hindrance to radio waves, but this is not true of the
lowest and uppermost layers. They are, respectively:
• The troposphere: the approximately 14 km high airspace just above the
Earth's surface, which holds much of the atmosphere's oxygen and which
envelops all phenomena that we call the weather. It is an obstacle that
delays radio waves in a rather variable way.
• The ionosphere: the outermost part of the atmosphere, which starts
at an altitude of 90 km and holds many electrically charged atoms,
thereby forming a protection against various forms of radiation from
space, including to some extent radio waves. The degree of ionization
shows a distinct night and day rhythm, and also depends on solar
activity.
An error also occurs when a radio signal is received via two or more paths
between sender and receiver, some of which typically involve a bounce off some
nearby surface, like a building or rock face. The term applied to this phenomenon
is multipath; when it occurs, the multiple receptions of the same signal may
interfere with each other. Multipath is an error source that is difficult to avoid.
There is one more source of error that is unrelated to individual radio signal
characteristics, but that rather depends on the combination of the satellite
signals used for positioning. Of importance is their constellation in the sky from
the receiver's perspective.
Referring to Figure 4.27, one will understand that the sphere intersection
technique of positioning will provide more precise results when the four
satellites are nicely spread over the sky, and thus that the satellite constellation
of Figure 4.27(b) is preferred over the one of Figure 4.27(a).
Relative positioning
Network positioning
So far, we have assumed that the receiver determines the range of a satellite
by measuring time delay on the received ranging code. There exists a more
advanced range determination technique known as carrier phase
measurement. This typically requires more advanced receiver technology, and
longer observation sessions.
Carrier phase measurement can currently only be used with relative
positioning, as absolute positioning using this method is not yet well developed.
The technique aims to determine the number of cycles of the (sine-shaped)
radio signal between sender and receiver.
Each cycle corresponds to one wavelength of the signal, which in the applied
L-band frequencies is 19-24 cm. Since this number of cycles cannot be directly
measured, it is determined, in a long observation session, from the change in
carrier phase with time. This change occurs because the satellite itself is moving
along its orbit. From its orbit parameters and the change in phase over time, the
number of cycles can be derived.
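The 19-24 cm range can be reproduced from the relation wavelength = signal speed / frequency. The sketch below uses the GPS L1 and L2 carrier frequencies (1575.42 MHz and 1227.60 MHz) as example inputs and assumes the vacuum speed of light; the function name is illustrative.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def wavelength_cm(frequency_hz: float) -> float:
    """Length of one carrier cycle, in centimetres."""
    return SPEED_OF_LIGHT_M_PER_S / frequency_hz * 100.0


# GPS carrier frequencies in hertz.
L1_HZ = 1575.42e6
L2_HZ = 1227.60e6

l1_cm = wavelength_cm(L1_HZ)
l2_cm = wavelength_cm(L2_HZ)
print(f"L1: {l1_cm:.2f} cm, L2: {l2_cm:.2f} cm")
```

Because one cycle is only about 19 to 24 cm long, counting whole cycles between sender and receiver allows much finer distance resolution than timing the ranging code.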
Positioning technology
1) GPS
2) GLONASS
3) Galileo
In the 1990s, the European Union (EU) judged that it needed to have its own
satellite-based positioning system, to become independent of the GPS monopoly
and to support its own economic growth by providing services of high reliability
under civilian control.
Galileo is the name of this EU system. The vision is that satellite-based positioning
will become even bigger due to the emergence of mobile phones equipped with
receivers, perhaps with some 400 million users by the year 2015.
Development of the system has experienced substantial delays, and at the time
of writing European ministers insist that Galileo should be up and running by the
end of 2013. The completed system will have 27 satellites, with three in reserve,
orbiting in one of three, equally spaced, circular orbits at an elevation of 23,222
km, inclined 56° with the equator. This higher inclination, when compared to that of
GPS, has been chosen to provide better positioning coverage at high latitudes,
such as northern Scandinavia where GPS performs rather poorly.
One way to obtain spatial data is by direct observation of the relevant geographic
phenomena. This can be done through ground-based field surveys, or by using
remote sensors in satellites or airplanes.
Many Earth sciences have developed their own survey techniques, as ground-
based techniques remain the most important source for reliable data in many
cases.
Data which is captured directly from the environment is known as primary data.
Remotely sensed imagery is usually not fit for immediate use, as various sources
of error and distortion may have been present, and the imagery should first be
freed from these.
This is the domain of remote sensing, and these issues are discussed further in
Principles of Remote Sensing.
An image refers to raw data produced by an electronic sensor, which are not
pictorial, but arrays of digital numbers related to some property of an object or
scene, such as the amount of reflected light.
Factors of cost and available time may be a hindrance in using existing remotely
sensed images because previous projects sometimes have acquired data that may
not fit the current project's purpose.
3) Digitizing
A traditional method of obtaining spatial data is through digitizing existing paper
maps. This can be done using various techniques. Before adopting this approach,
one must be aware that positional errors already in the paper map will further
accumulate, and one must be willing to accept these errors.
There are two forms of digitizing: on-tablet and on-screen manual digitizing. In on-
tablet digitizing, the original map is fitted on a special surface (the tablet), while in
on-screen digitizing, a scanned image of the map (or some other image) is shown
on the computer screen.
In both of these forms, an operator follows the map's features with a mouse device,
thereby tracing the lines, and storing location coordinates relative to a number of
previously defined control points.
The function of these points is to 'lock' a coordinate system onto the digitized data:
the control points on the map have known coordinates, and by digitizing them we
tell the system implicitly where all other digitized locations are. At least three
control points are needed, but preferably more should be digitized to allow a check
on the positional errors made.
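One common way to 'lock' a coordinate system onto digitized data is an affine transformation estimated from the control points by least squares. The sketch below is an assumption about how this could be done, not a description of any particular GIS package; all function names and the control point values are hypothetical.

```python
def solve3(m, b):
    """Solve a 3x3 linear system m * x = b by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (a[r][3] - s) / a[r][r]
    return x


def fit_affine(control_points):
    """Least-squares affine transform from device (u, v) to world (x, y).

    control_points: list of ((u, v), (x, y)) pairs; at least three, not collinear.
    Returns two coefficient triples, for x = a*u + b*v + c and y = d*u + e*v + f.
    """
    if len(control_points) < 3:
        raise ValueError("at least three control points are needed")
    rows = [(u, v, 1.0) for (u, v), _ in control_points]
    # Normal equations (A^T A) p = A^T t, solved once for x and once for y.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    result = []
    for k in (0, 1):
        atb = [sum(r[i] * world[k] for r, (_, world) in zip(rows, control_points))
               for i in range(3)]
        result.append(solve3(ata, atb))
    return tuple(result)


def to_world(coeffs, u, v):
    """Apply the fitted transform to one digitized device coordinate."""
    (a, b, c), (d, e, f) = coeffs
    return a * u + b * v + c, d * u + e * v + f


# Four hypothetical control points: device coordinates and known world coordinates.
points = [((0.0, 0.0), (1000.0, 2000.0)), ((10.0, 0.0), (1100.0, 2000.0)),
          ((0.0, 10.0), (1000.0, 2100.0)), ((10.0, 10.0), (1100.0, 2100.0))]
coeffs = fit_affine(points)
x, y = to_world(coeffs, 5.0, 5.0)
print(f"device (5, 5) maps to world ({x:.1f}, {y:.1f})")
```

With exactly three control points the fit passes through all of them, so no error can be detected. With a fourth or further point, the differences between the transformed and the known coordinates give the check on positional errors that the text recommends.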
4) Scanning
A scanner is an input device that illuminates a document and measures the
intensity of the reflected light with a CCD array. The result is an image as a matrix
of pixels, each of which holds an intensity value.
Office scanners have a fixed maximum resolution, expressed as the highest
number of pixels they can identify per inch; the unit is dots per inch (dpi). For
manual on-screen digitizing of a paper map, a resolution of 200-300 dpi is usually
sufficient, depending on the thickness of the thinnest lines. For manual on-screen
digitizing of aerial photographs, higher resolutions are recommended: typically,
at least 800 dpi.
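To see why the required resolution depends on line thickness, it helps to convert dpi into pixel sizes. The sketch below is a rough aid only; the map scale of 1:50,000 and the line width of 0.1 mm are assumed example values, and the function names are hypothetical.

```python
INCH_IN_METRES = 0.0254
INCH_IN_MM = 25.4


def ground_pixel_size_m(dpi: float, scale_denominator: float) -> float:
    """Ground distance covered by one scanned pixel of a map at 1:scale_denominator."""
    return INCH_IN_METRES / dpi * scale_denominator


def line_width_in_pixels(line_width_mm: float, dpi: float) -> float:
    """Number of pixels across a printed line of the given width."""
    return line_width_mm / INCH_IN_MM * dpi


for dpi in (200, 300, 800):
    print(f"{dpi} dpi: pixel covers {ground_pixel_size_m(dpi, 50_000):.2f} m on the ground, "
          f"0.1 mm line is {line_width_in_pixels(0.1, dpi):.2f} pixels wide")
```

At 200 dpi a 0.1 mm line is narrower than one pixel, so it may be rendered faintly or broken up; at 300 dpi it spans slightly more than one pixel.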
After scanning, the resulting image can be improved with various image processing
techniques. It is important to understand that scanning does not result in a
structured data set of classified and coded objects. Additional work is required to
recognize features and to associate categories and other thematic attributes with
them.
5) Vectorization
The process of distilling points, lines and polygons from a scanned image is called
vectorization. As scanned lines may be several pixels wide, they are often first
thinned to retain only the centreline. The remaining centreline pixels are converted
to series of (x, y) coordinate pairs, defining a polyline.
Subsequently, features are formed and attributes are attached to them. This
process may be entirely automated or performed semi-automatically, with the
assistance of an operator. Pattern recognition methods, like Optical Character
Recognition (OCR) for text, can be used for the automatic detection of graphic
symbols and text.
Vectorization causes errors such as small spikes along lines, rounded corners,
errors in T- and X-junctions, displaced lines or jagged curves. These errors are
corrected in an automatic or interactive post-processing phase. The phases of
the vectorization process are illustrated in the figure below.
The choice of digitizing technique depends on the quality, complexity and
contents of the input document. Complex images are better manually digitized; simple
images are better automatically digitized. Images that are full of detail and
symbols, like topographic maps and aerial photographs, are therefore better
manually digitized.
The optimal choice may be a combination of methods. For example, contour line
film separations can be automatically digitized and used to produce a DEM.
Existing topographic maps must be digitized manually, but new, geometrically
corrected aerial photographs, with vector data from the topographic maps displayed
directly over them, can be used for updating existing data files by means of manual
on-screen digitizing.
Spatial data has been collected in digital form at an increasing rate, and stored in
various databases by the individual producers for their own use and for
commercial purposes. More and more of this data is being shared among GIS
users. There are several reasons for this: high-quality data remain both costly and
time-consuming to collect and verify, and more and more GIS applications are
looking at not just local, but national or even global processes.
Some of this data is freely available, although other data is only available
commercially, as is the case for most satellite imagery.
Metadata
Metadata is defined as background information that describes all necessary
information about the data itself. More generally, it is known as 'data about data'.
This includes:
• Identification information: data source(s), time of acquisition, etc.
• Data quality information: positional, attribute and temporal accuracy, lineage, etc.
• Entity and attribute information: related attributes, units of measure, etc.
DATA QUALITY
GIS is being increasingly used for geospatial decision support applications, with
increasing reliance on secondary data sourced through data providers or via the
internet, through geo-webservices.
The implications of using low-quality data in important decisions are potentially
severe. There is also a danger that uninformed GIS users introduce errors by
incorrectly applying geometric and other transformations to the spatial data held in
their database.
The main issues related to data quality in spatial data are positional, temporal and
attribute accuracy, lineage, completeness, and logical consistency.
2) Positional accuracy
The surveying and mapping profession has a long tradition of determining and
minimizing errors. This applies particularly to land surveying and photogrammetry,
both of which tend to regard positional and height errors as undesirable.
Cartographers also strive to reduce geometric and attribute errors in their products,
and, in addition, define quality in specifically cartographic terms, for example
quality of linework, layout, and clarity of text. It must be stressed that all
measurements made with surveying and photogrammetric instruments are subject
to error.
These errors include:
1. Human errors in measurement (e.g. reading errors), generally referred to as gross
errors or blunders. These are usually large errors resulting from carelessness,
which could be avoided through careful observation, although it is never absolutely
certain that all blunders have been avoided or eliminated.
3. Random errors caused by natural variations in the quantity being measured. These
are effectively the errors that remain after blunders and systematic errors have
been removed. They are usually small, and are dealt with in least-squares adjustment.
A more general way of quantifying positional accuracy is the root mean square
error (RMSE).
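As an illustration of the RMSE mentioned above, the sketch below compares measured positions with reference positions of the same points. It assumes planar (x, y) coordinates in the same units and combines the x and y errors into one horizontal figure; the function name and the coordinates are hypothetical.

```python
import math


def positional_rmse(measured, reference):
    """Root mean square of the distances between measured and reference positions."""
    squared = [(mx - rx) ** 2 + (my - ry) ** 2
               for (mx, my), (rx, ry) in zip(measured, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical check points: each measured position is 5 units from its reference.
measured = [(103.0, 204.0), (300.0, 400.0)]
reference = [(100.0, 200.0), (303.0, 404.0)]
rmse = positional_rmse(measured, reference)
print(f"RMSE = {rmse:.2f}")
```

Because the differences are squared before averaging, a single large blunder raises the RMSE considerably, which is one reason blunders are removed first.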
3) Accuracy tolerances
Any probability density function p has the characteristic that the area between its
curve and the horizontal axis has size 1. Probabilities P can be inferred from p as
the size of an area under p's curve. The figure above, for instance, depicts
P(μ - σ < Y < μ + σ), i.e. the probability that the value for Y is within distance σ
(the standard deviation) from the mean μ. In a normal distribution this specific
probability for Y is always approximately 0.6827.
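The probability that a normally distributed value lies within one standard deviation of its mean can be verified numerically with the error function from Python's standard library. The function name below is illustrative.

```python
import math


def prob_within(k: float) -> float:
    """P(mean - k*sd < Y < mean + k*sd) for a normally distributed Y."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")
```

The same function shows how accuracy tolerances widen: about 95% of values fall within two standard deviations and more than 99% within three.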
4) Attribute Accuracy
There are two types of attribute accuracy, which relate to the type of data being dealt with:
❖ For nominal or categorical data, the accuracy of labeling (for example the type
of land cover, road surface, etc).
❖ For numerical data, numerical accuracy (such as the concentration of
pollutants in the soil, height of trees in forests, etc).
It follows that depending on the data type, assessment of attribute accuracy may
range from a simple check on the labelling of features—for example, is a road
classified as a metalled road actually surfaced or not?—to complex statistical
procedures for assessing the accuracy of numerical data, such as the percentage
of pollutants present in the soil.
5) Temporal Accuracy
The volume of spatial data captured through remote sensing has increased
enormously over the last decade. These data can provide useful temporal
information such as changes in land ownership and the monitoring of
6) Lineage
Lineage describes the history of a data set. In the case of published maps, some
lineage information may be provided as part of the metadata, in the form of a note
on the data sources and procedures used in the compilation of the data.
Examples include the date and scale of aerial photography, and the date of field
verification. For digital data sets, however, lineage may be defined as: "that part of
the data quality statement that contains information that describes the source of
observations or materials, data acquisition and compilation methods, conversions,
transformations, analyses and derivations that the data has been subjected to, and
the assumptions and criteria applied at any stage of its life."
7) Completeness
Completeness refers to whether there are data lacking in the database compared
to what exists in the real world. Essentially, it is important to be able to assess
what does and what does not belong to a complete dataset as intended by its
producer.
A data set might be incomplete (i.e. it is 'missing' features which exist in the real
world), or overcomplete (i.e. it contains 'extra' features which do not belong within
the scope of the data set as it is defined). Completeness can relate to spatial,
temporal, or thematic aspects of a data set.
For example, a data set of property boundaries might be spatially incomplete
because it contains only 10 out of 12 suburbs; it might be temporally incomplete
because it does not include recently subdivided properties; and it might be
thematically overcomplete because it also includes building footprints.
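The spatial part of the property boundary example can be expressed as a comparison between what the data set should contain and what it does contain. This is a minimal sketch using Python sets; the suburb identifiers and the function name are made up for illustration.

```python
def completeness_report(expected: set, present: set) -> dict:
    """Compare the items a data set should contain with those it actually contains."""
    return {
        "missing": expected - present,   # in scope, but absent from the data set
        "extra": present - expected,     # in the data set, but outside its scope
        "coverage": len(expected & present) / len(expected),
    }


# Hypothetical identifiers: 12 suburbs are in scope, only 10 were captured.
expected_suburbs = {f"suburb_{i:02d}" for i in range(1, 13)}
captured_suburbs = {f"suburb_{i:02d}" for i in range(1, 11)}

report = completeness_report(expected_suburbs, captured_suburbs)
print(sorted(report["missing"]), f"coverage = {report['coverage']:.0%}")
```

A non-empty "missing" set indicates an incomplete data set, and a non-empty "extra" set an overcomplete one.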
8) Logical consistency
For any particular application, (predefined) logical rules concern:
❖ The compatibility of data with other data in a data set (e.g. in terms of data
format),
DATA PREPARATION
Spatial data preparation aims to make the acquired spatial data fit for use. Images
may require enhancements and corrections of the classification scheme of the
data.
Vector data also may require editing, such as the trimming of overshoots of
lines at intersections, deleting duplicate lines, closing gaps in lines, and
generating polygons.
Data may require conversion to either vector format or raster format to match other
data sets which will be used in the analysis. Additionally, the data preparation
process includes associating attribute data with the spatial features through either
manual input or reading digital attribute files into the GIS/DBMS.
Acquired data sets must be checked for quality in terms of the accuracy,
consistency and completeness parameters discussed above. Often, errors can
be identified automatically, after which manual editing methods can be applied to
correct the errors. Alternatively, some software may identify and automatically
correct certain types of errors.
Below, we focus on the geometric, topological, and attribute components of
spatial data.
'Clean-up' operations are often performed in a standard sequence. For example,
crossing lines are split before dangling lines are erased, and nodes are created
at intersections before polygons are generated. These are illustrated in the table
below.
Rasterization or vectorization
Vectorization produces a vector data set from a raster. We have looked at this in
some sense already: namely in the production of a vector set from a scanned
image. Another form of vectorization takes place when we want to identify features
or patterns in remotely sensed imagery. The keywords here are feature extraction
and pattern recognition, which are dealt with in Principles of Remote Sensing.
If much or all of the subsequent spatial data analysis is to be carried out on
raster data, one may want to convert vector data sets to raster data. This
process is known as rasterization.
It involves assigning point, line and polygon attribute values to raster cells that
overlap with the respective point, line or polygon. To avoid information loss, the
raster resolution should be carefully chosen on the basis of the geometric
resolution.
A cell size which is too large may result in cells that cover parts of multiple
vector features, and then ambiguity arises as to what value to assign to the
cell. If, on the other hand, the cell size is too small, the file size of the raster
may increase significantly.
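The effect of cell size on rasterization can be demonstrated with point features. The sketch below is a simplified illustration, not a full rasterizer: it handles points only, takes the raster origin as the top-left corner, and lets the last point win when two points fall in the same cell. All names and coordinates are hypothetical.

```python
def rasterize_points(points, origin_x, origin_y, cell_size, n_rows, n_cols):
    """Assign point attribute values to raster cells.

    points: list of (x, y, value); (origin_x, origin_y) is the top-left corner.
    Returns (grid, conflicts). grid holds one value per cell (None if empty);
    conflicts is the set of (row, col) cells hit by points with different values.
    """
    grid = [[None] * n_cols for _ in range(n_rows)]
    conflicts = set()
    for x, y, value in points:
        col = int((x - origin_x) // cell_size)
        row = int((origin_y - y) // cell_size)
        if not (0 <= row < n_rows and 0 <= col < n_cols):
            continue  # point lies outside the raster extent
        if grid[row][col] is not None and grid[row][col] != value:
            conflicts.add((row, col))
        grid[row][col] = value  # last point wins: an arbitrary tie-breaking rule
    return grid, conflicts


points = [(5.0, 95.0, "A"), (12.0, 88.0, "B"), (55.0, 45.0, "C")]

# A coarse 2 x 2 raster with 50-unit cells, then a fine 20 x 20 raster with 5-unit cells.
coarse_grid, coarse_conflicts = rasterize_points(points, 0.0, 100.0, 50.0, 2, 2)
fine_grid, fine_conflicts = rasterize_points(points, 0.0, 100.0, 5.0, 20, 20)
print("coarse conflicts:", coarse_conflicts)
print("fine conflicts:", fine_conflicts)
```

In the coarse raster, points A and B share a cell and one value is lost; in the fine raster each point gets its own cell, at the cost of 400 cells instead of 4.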
Topology generation
Topological relations may sometimes be needed, for instance in networks, e.g. the
questions of line connectivity, flow direction, and which lines have over- and
underpasses. For polygons, questions that may arise involve polygon inclusion: Is
a polygon inside another one, or is the outer polygon simply around the inner
polygon? Many of these questions are mostly questions of data semantics, and
can therefore usually only be answered by a human operator.
A GIS project usually involves multiple data sets, so the next step addresses the
issue of how these multiple sets relate to each other. There are four fundamental
cases to be considered in the combination of data from different sources:
1. They may be about the same area, but differ in accuracy,
2. They may be about the same area, but differ in choice of representation,
3. They may be about adjacent areas, and have to be merged into a single data
set.
4. They may be about the same or adjacent areas, but referenced in different
coordinate systems.
Differences in accuracy
These are clearly relevant in any combination of data sets which may themselves
have varying levels of accuracy. Images come at a certain resolution, and paper
maps at a certain scale. This typically results in differences of resolution of acquired
data sets, all the more since map features are sometimes intentionally displaced
to improve readability of the map.
For instance, the course of a river will only be approximated roughly on a small-
scale map, and a village on its northern bank should be depicted north of the
river, even if this means it has to be displaced on the map a little bit.
Differences in representation
Some advanced GIS applications require the possibility of representing the same
geographic phenomenon in different ways. These are called multi-representation
systems. The production of maps at various scales is an example, but there are
numerous others.
The commonality is that phenomena must sometimes be viewed as points, and at
other times as polygons. For example, a small-scale national road network analysis
may represent villages as point objects, but a nation-wide urban population density
study should regard all municipalities as represented by polygons.
The links that the system maintains between the various representations of the same
object allow switching between them, and many advanced applications of their use
seem possible. A comparison is illustrated in Figure 5.11.
When individual data sets have been prepared as described above, they sometimes
have to be matched into a single 'seamless' data set, whilst ensuring
that the appearance of the integrated geometry is as homogeneous as possible.
Edge matching is the process of joining two or more map sheets, for instance,
after they have separately been digitized.
Map projections provide means to map geographic coordinates onto a flat surface
(for map production), and vice versa. It may be the case that data layers which are
to be combined or merged in some way are referenced in different coordinate
systems, or are based upon different datums.
As a result, data may need coordinate transformation, or both a coordinate
transformation and datum transformation. It may also be the case that data has
been digitized from an existing map or data layer. In this case, geometric
transformations help to transform device coordinates (coordinates from digitizing
tablets or screen coordinates) into world coordinates (geographic coordinates,
meters, etc.).
A simple example is given in Figure 5.13. Our field survey has taken only two
measurements, one at P and one at Q. The values obtained in these two
locations are represented by a dark and light green tint, respectively. If we are
dealing with qualitative data, and we have no further knowledge, the only
assumption we can make for other locations is that those nearer to P probably
have P's value, whereas those nearer to Q have Q's value. This is illustrated in
part (a).
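The "nearest measurement wins" assumption for qualitative data can be written down directly. The sketch below assumes planar coordinates; the positions chosen for P and Q and the function name are hypothetical.

```python
import math


def nearest_value(location, samples):
    """Return the value of the sample closest to location.

    samples: list of ((x, y), value) pairs.
    """
    return min(samples, key=lambda s: math.dist(location, s[0]))[1]


# Hypothetical survey: P at (2, 5) and Q at (8, 5), with qualitative values.
samples = [((2.0, 5.0), "dark green"), ((8.0, 5.0), "light green")]

print(nearest_value((3.0, 1.0), samples))
print(nearest_value((7.0, 9.0), samples))
```

Applying this rule to every location splits the study area along the line that is equally far from P and Q.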
The main alternative for continuous field representation is a polyline vector layer,
in which the lines are isolines. We will also address these issues of representation
below.
The aim is to use measurements to obtain a representation of the entire field using
point samples. In this section we outline four techniques to do so:
a. Trend surface fitting using regression,
b. Triangulation,
c. Spatial moving averages using inverse distance weighting,
d. Kriging.
In trend surface fitting, the assumption is that the entire study area can be
represented by a formula f(x, y) that for a given location with coordinates (x, y) will
give us the approximated value of the field in that location. The key objective in
trend surface fitting is to derive a formula that best describes the field. Various
classes of formulae exist, with the simplest being the one that describes a flat, but
tilted plane: f(x, y) = c1 • x + c2 • y + c3.
If the field under consideration can best be approximated by a tilted plane, then the
problem of finding the best plane is the problem of determining the best values for the
coefficients c1, c2 and c3.
In Figure 5.15, we have used the same set of point measurements, with four
different approximation functions. Part (a) has been determined under the
assumption that the field can be approximated by a tilted plane, in this case with
a downward slope to the southeast. The values found by regression techniques
were: c1 = -1.83934, c2 = 1.61645 and c3 = 70.8782, giving f(x, y) = -1.83934 • x
+ 1.61645 • y + 70.8782.
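Once the coefficients are known, the trend surface can be evaluated at any location. The sketch below uses the coefficients quoted above and assumes that x increases eastwards and y increases northwards; the sample locations are arbitrary.

```python
# Coefficients of the tilted plane quoted in the text for part (a).
C1, C2, C3 = -1.83934, 1.61645, 70.8782


def trend_surface(x: float, y: float) -> float:
    """Evaluate the fitted plane f(x, y) = c1*x + c2*y + c3."""
    return C1 * x + C2 * y + C3


here = trend_surface(10.0, 20.0)
# One unit to the east and one to the south (under the axis assumption above).
southeast = trend_surface(11.0, 19.0)
print(f"f(10, 20) = {here:.4f}")
print(f"f(11, 19) = {southeast:.4f}")
```

The value drops when moving east (negative c1) and when moving south (positive c2), which is consistent with a downward slope to the southeast.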
Triangulation
Moving window averaging attempts to directly derive a raster dataset from a set of
sample points. This is why it is sometimes also called 'gridding'. The principle
behind this technique is illustrated in Figure below.
The cell values for the output raster are computed one by one. To achieve this, a
'window' (also known as a kernel) is defined, and initially placed over the top left
raster cell. Measurement points falling inside the window contribute to the
averaging computation, those outside the window do not.
In part (b) of the figure, the 295th cell value out of the 418 in total, is being
computed. This computation is based on eleven measurements, while that of the
first cell had no measurements available. Where this is the case, the cell should
be assigned a value that signals this 'non-availability of measurements'.
The principle of spatial autocorrelation suggests that measurements closer to the
cell centre should have greater influence on the predicted value than those further
away. In order to account for this, a distance factor can be brought into the
averaging function. Functions that do this are called inverse distance weighting
functions (IDW). This is one of the most commonly used functions in interpolating
spatial data.
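The computation for a single output cell can be sketched as follows. This is one possible form of inverse distance weighting, under stated assumptions: a circular window around the cell centre, weights of 1 / distance^power with a default power of 2, and None as the 'no measurements' signal. The function name and sample values are hypothetical.

```python
import math


def idw_cell_value(cell_centre, samples, window_radius, power=2.0, nodata=None):
    """Inverse-distance-weighted average of the samples inside the window.

    samples: list of ((x, y), value) pairs.
    Returns nodata when no sample falls inside the window.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for point, value in samples:
        d = math.dist(cell_centre, point)
        if d > window_radius:
            continue  # outside the window: does not contribute
        if d == 0.0:
            return value  # a sample exactly at the cell centre decides the value
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    if weight_total == 0.0:
        return nodata
    return weighted_sum / weight_total


samples = [((1.0, 0.0), 10.0), ((2.0, 0.0), 20.0), ((10.0, 0.0), 100.0)]
near = idw_cell_value((0.0, 0.0), samples, window_radius=5.0)
empty = idw_cell_value((50.0, 50.0), samples, window_radius=5.0)
print(near, empty)
```

The sample at distance 1 receives four times the weight of the sample at distance 2, and the sample at distance 10 lies outside the window and is ignored. Repeating the call for every cell centre produces the output raster.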
Kriging
Kriging was originally developed by mining geologists attempting to derive
accurate estimates of mineral deposits in a given area from limited sample
measurements. It is an advanced interpolation technique belonging to the field of
geostatistics, which can deliver good results if applied properly and with enough
sample points.
Kriging is usually used when the variation of an attribute and/or the density of
sample points is such that simple methods of interpolation may give unreliable
predictions.
The first step in the kriging procedure is to compare successive pairs of point
measurements to generate a semi-variogram.
In the second step, the semi-variogram is used to calculate the weights used in
interpolation. Although kriging is a powerful technique, it should not be applied
without a good understanding of geostatistics, including the principle of spatial
autocorrelation. It should be noted that there is no single best interpolation method,
since each method has advantages and disadvantages in particular contexts.
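The first step of the kriging procedure, building a semi-variogram from pairs of measurements, can be sketched with the usual empirical estimator: for each distance class, half the mean squared difference between the values of all point pairs in that class. The code below covers only this first step, not the calculation of weights, and its names, lag classes and sample values are illustrative assumptions.

```python
import math
from itertools import combinations


def empirical_semivariogram(samples, lag_width, n_lags):
    """Empirical semi-variogram of point measurements.

    samples: list of ((x, y), z) pairs.
    Returns a list of (lag_centre, semivariance, pair_count) for non-empty lags.
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for (p1, z1), (p2, z2) in combinations(samples, 2):
        lag = int(math.dist(p1, p2) // lag_width)
        if lag < n_lags:
            sums[lag] += (z1 - z2) ** 2
            counts[lag] += 1
    return [((i + 0.5) * lag_width, sums[i] / (2 * counts[i]), counts[i])
            for i in range(n_lags) if counts[i] > 0]


# Hypothetical measurements along a line, one unit apart.
samples = [((0.0, 0.0), 1.0), ((1.0, 0.0), 2.0), ((2.0, 0.0), 4.0), ((3.0, 0.0), 7.0)]
variogram = empirical_semivariogram(samples, lag_width=1.0, n_lags=3)
for lag_centre, gamma, pairs in variogram:
    print(f"lag {lag_centre:.1f}: semivariance {gamma:.3f} from {pairs} pairs")
```

In this example the semivariance grows with distance, which is the signature of spatial autocorrelation: nearby measurements are more alike than distant ones.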
As a general guide, the following questions should be considered in selecting an
appropriate method of interpolation:
* For what type of application will the results be used?
* What data type is being interpolated (e.g. categorical or continuous)?
* What is the nature of the surface (for example, is it a 'simple' or complex
surface)?
* What is the scale and resolution of the data (for example, the distance
between sample points)?
UNIT – 4
Spatial Data Analysis
2) Overlay functions
These belong to the most frequently used functions in a GIS application. They
allow the combination of two (or more) spatial data layers comparing them position
by position, and treating areas of overlap—and of non-overlap —in distinct ways.
In this way, we can find
❖ The cotton fields on black soils (select the 'cotton' cover in the crop data
layer and the 'black' cover in the soil data layer and perform an intersection),
❖ The fields where cotton or jowar is the crop (select both areas of 'cotton' and
'jowar' cover in the crop data layer and take their union),
❖ The cotton fields not on red soils (perform a difference operator of areas
with 'cotton' cover with the areas having red soil),
❖ The fields that do not have wheat as crop (take the complement of the wheat
areas).
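The four overlay operators listed above map directly onto set operations when each layer is a raster over the same grid. The sketch below is a toy illustration with a hypothetical 2 x 2 grid stored as dictionaries keyed by (row, column); real GIS software works on much larger rasters or on polygon geometry.

```python
# Two hypothetical raster layers over the same 2 x 2 grid.
crop = {(0, 0): "cotton", (0, 1): "cotton", (1, 0): "jowar", (1, 1): "wheat"}
soil = {(0, 0): "black", (0, 1): "red", (1, 0): "black", (1, 1): "red"}


def select(layer, *classes):
    """Cells of the layer whose value is one of the given classes."""
    return {cell for cell, value in layer.items() if value in classes}


all_cells = set(crop)

cotton_on_black = select(crop, "cotton") & select(soil, "black")   # intersection
cotton_or_jowar = select(crop, "cotton") | select(crop, "jowar")   # union
cotton_not_on_red = select(crop, "cotton") - select(soil, "red")   # difference
not_wheat = all_cells - select(crop, "wheat")                      # complement

print(sorted(cotton_on_black), sorted(cotton_or_jowar))
print(sorted(cotton_not_on_red), sorted(not_wheat))
```

Comparing the layers "position by position" works here because both dictionaries use the same cell keys; layers on different grids would first have to be resampled to a common one.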
3) Neighborhood functions
❖ Search functions allow the retrieval of features that fall within a given search
window. This window may be a rectangle, circle, or polygon.
❖ Buffer zone generation (or buffering) is one of the best known neighborhood
functions. It determines a spatial envelope (buffer) around (a) given feature(s). The
created buffer may have a fixed width, or a variable width that depends on
characteristics of the area.
❖ Interpolation functions predict unknown values using the known values at
nearby locations. This typically occurs for continuous fields, like elevation, when the
data actually stored does not provide the direct answer for the location(s) of interest.
❖ Topographic functions determine characteristics of an area by looking at the
immediate neighborhood as well. Typical examples are slope computations on
digital terrain models (i.e. continuous spatial fields). The slope in a location is
defined as the plane tangent to the topography in that location. Various
computations can be performed, such as determination of slope angle, slope
aspect, slope length and contour lines. Contour lines connect points with the
same value (for elevation, depth, temperature, barometric pressure, water
salinity, etc.).
4) Connectivity functions
These functions work on the basis of networks, including road networks, water
courses in coastal zones, and communication lines in mobile telephony. These
networks represent spatial linkages between features. Main functions of this type
include:
The primitives of vector data sets are point, (poly)line and polygon. Related
geometric measurements are location, length, distance and area size. Some of
these are geometric properties of a feature in isolation (location, length, area size);
others (distance) require two features to be identified.
The location property of a vector feature is always stored by the GIS: a single
coordinate pair for a point, or a list of pairs for a polyline or polygon boundary.
Occasionally, there is a need to obtain the location of the centroid of a polygon;
some GISs store these also, others compute them 'on-the-fly'.
Length is a geometric property associated with polylines, by themselves, or in their
function as polygon boundary.
Area size is associated with polygon features. Again, it can be computed, but
usually is stored with the polygon as an extra attribute value. This speeds up the
computation of other functions that require area size values.
Another geometric measurement used by the GIS is the minimal bounding box
computation. It applies to polylines and polygons, and determines the minimal
rectangle (with sides parallel to the axes of the spatial reference system) that
covers the feature. This is illustrated in Figure 6.1.
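The vector measurements described above can be sketched in a few lines. This is a minimal illustration, assuming planar coordinates stored as lists of (x, y) pairs; the function names are hypothetical and the polygon area uses the standard shoelace formula, which the notes do not prescribe.

```python
import math


def polyline_length(coords):
    """Sum of the straight-line distances between consecutive vertices."""
    return sum(math.dist(p, q) for p, q in zip(coords, coords[1:]))


def polygon_area(ring):
    """Shoelace formula; ring lists each vertex once (no repeated closing vertex)."""
    n = len(ring)
    twice_area = sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


def bounding_box(coords):
    """Minimal axis-parallel rectangle as (xmin, ymin, xmax, ymax)."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]
print(polygon_area(rectangle))                       # area size
print(polyline_length(rectangle + [rectangle[0]]))   # boundary length
print(bounding_box([(1, 5), (3, 2), (7, 4)]))        # minimal bounding box
```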
Measurements on raster data layers are simpler because of the regularity of the
cells. The area size of a cell is constant, and is determined by the cell resolution.
Horizontal and vertical resolution may differ, but typically do not. Together with the
location of a so-called anchor point, this is the only geometric information stored
with the raster data, so all other measurements by the GIS are computed. The
anchor point is fixed by convention to be the lower left (or sometimes upper left)
location of the raster.
Location of an individual cell derives from the raster's anchor point, the cell
resolution, and the position of the cell in the raster.
Again, there are two conventions: the cell's location can be its lower left corner, or
the cell's midpoint. These conventions are set by the software in use, and it is
more important to be aware of them when working with low-resolution data. The area size
of a selected part of the raster (a group of cells) is calculated as the number of cells
multiplied by the cell area size.
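As a minimal sketch of these raster measurements, the code below assumes the anchor point is the lower left corner of the raster and that columns and rows are counted from that corner starting at 0. Both are assumptions for illustration; actual software may count rows from the top.

```python
def cell_location(anchor, resolution, col, row, midpoint=False):
    """Location of a cell from the anchor point, the resolution and the cell position.

    anchor      (x, y) of the raster's lower left corner (assumed convention)
    resolution  (horizontal, vertical) cell size
    midpoint    False -> lower left corner of the cell, True -> cell midpoint
    """
    x0, y0 = anchor
    res_x, res_y = resolution
    offset = 0.5 if midpoint else 0.0
    return (x0 + (col + offset) * res_x, y0 + (row + offset) * res_y)


def selection_area(cell_count, resolution):
    """Area of a group of cells: number of cells times the cell area size."""
    res_x, res_y = resolution
    return cell_count * res_x * res_y


corner = cell_location((1000.0, 2000.0), (10.0, 10.0), col=3, row=2)
middle = cell_location((1000.0, 2000.0), (10.0, 10.0), col=3, row=2, midpoint=True)
area = selection_area(25, (10.0, 10.0))
print(corner, middle, area)
```

With a 10 m resolution the two conventions differ by 5 m in each direction, which shows why the convention matters more for coarse rasters.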
When exploring a spatial data set, the first thing one usually wants is to select
certain features, to (temporarily) restrict the exploration. Such selections can be
made on geometric/spatial grounds, or on the basis of attribute data associated
with the spatial features.
Interactive spatial selection
When multiple criteria have to be used for selection, we need to carefully express
all of these in a single composite condition. The tools for this come from a field of
mathematical logic, known as propositional calculus.
Atomic conditions are expressions such as Area < 400000 and LandUse = 80.
Atomic conditions use a predicate symbol, such as < (less than) or = (equals).
Other possibilities are <= (less than or equal), > (greater than), >= (greater
than or equal) and <> (does not equal). Any of these symbols is combined with an
expression on the left and one on the right.
Atomic conditions can be combined into composite conditions using logical
connectives. The most important ones are AND, OR, NOT and the bracket pair
(...). If we write a composite condition like Area < 400000 AND LandUse = 80,
a feature is selected only when both atomic conditions hold for it.
Selecting features that are inside selection objects: This type of query uses the
containment relationship between spatial objects. Obviously, polygons can contain
polygons, lines or points, and lines can contain lines or points, but no other
containment relationships are possible.
Selecting features that intersect: The intersect operator identifies features that
are not disjoint from the selection objects; it also applies to points and lines.
Selecting features adjacent to selection objects: Adjacency is the meet
relationship. It expresses that features share boundaries, and therefore it applies
only to line and polygon features.
Selecting features based on their distance: One may also want to use the
distance function of the GIS as a tool in selecting features.
Afterthought on selecting features: The selection conditions on attribute values
can be combined using logical connectives like AND, OR and NOT. The other
techniques of selecting features can usually be combined in the same way.
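Selection by attribute conditions can be sketched as follows. The parcel records and the helper function are hypothetical; the attribute names Area and LandUse follow the example condition used earlier, and Python's and, or and not stand in for the connectives AND, OR and NOT.

```python
# Hypothetical parcels with the two attributes used in the example condition.
parcels = [
    {"id": 1, "Area": 350000, "LandUse": 80},
    {"id": 2, "Area": 450000, "LandUse": 80},
    {"id": 3, "Area": 120000, "LandUse": 70},
]


def select_ids(features, condition):
    """Return the ids of the features for which the condition is true."""
    return [f["id"] for f in features if condition(f)]


# Area < 400000 AND LandUse = 80
both = select_ids(parcels, lambda f: f["Area"] < 400000 and f["LandUse"] == 80)
# Area < 400000 OR LandUse = 80
either = select_ids(parcels, lambda f: f["Area"] < 400000 or f["LandUse"] == 80)
# NOT (LandUse = 80)
not_80 = select_ids(parcels, lambda f: not (f["LandUse"] == 80))

print(both, either, not_80)
```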
3. Classification
Classification is a technique of purposefully removing detail from an input data set,
in the hope of revealing important patterns (of spatial distribution). In the process,
we produce an output data set, so that the input set can be left intact.
We do so by assigning a characteristic value to each element in the input set, which
is usually a collection of spatial features that can be raster cells or points, lines or
polygons. If the number of characteristic values is small in comparison to the size
of the input set, we have classified the input set.
The pattern that we look for may be the distribution of household income in a city.
Household income is then called the classification parameter. If we know, for each
ward in the city, the associated average household income, we will have many different
values.
We can define three categories (or classes), 'low', 'moderate' and 'high', and
provide value ranges for each category. If these three categories are mapped in
a sensible color scheme, this may reveal interesting information. This has been
done for Dar es Salaam in Figure 6.9 in two ways.
User-controlled classification
In user-controlled classification, a user selects the attribute(s) that will be used as
the classification parameter(s) and defines the classification method. The
latter involves declaring the number of classes as well as the correspondence
between the old attribute values and the new classes. This is usually done via a
classification table.
Another case exists when the classification parameter is nominal or at least
discrete. Such an example is given in Figure 6.10.
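A classification table for a continuous parameter can be sketched as a list of value ranges with a class name for each. The ranges, ward names and values below are invented for illustration only; each range is assumed to include its lower bound and exclude its upper bound.

```python
# Hypothetical classification table: (lower bound, upper bound, new class).
classification_table = [
    (0, 20, "low"),
    (20, 30, "moderate"),
    (30, float("inf"), "high"),
]


def classify(value, table):
    """Return the class whose range contains the value, or None if no range fits."""
    for lower, upper, label in table:
        if lower <= value < upper:
            return label
    return None


# Hypothetical average values of the classification parameter per ward.
ward_values = {"Ward A": 18.5, "Ward B": 24.0, "Ward C": 31.2, "Ward D": 29.9}
ward_classes = {ward: classify(v, classification_table) for ward, v in ward_values.items()}
print(ward_classes)
```

Four different input values collapse into three classes here; with many wards the reduction in detail becomes much more pronounced.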
Automatic classification
User-controlled classifications require a classification table or user interaction. GIS
software can also perform automatic classification, in which a user only specifies
the number of classes in the output data set. The system automatically
determines the class break points. Two main techniques of determining break
points are in use.
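The notes do not name the two techniques at this point. The sketch below assumes they are the equal interval and equal frequency techniques, which are commonly offered by GIS software; the function names and the sample values are hypothetical. Each function returns the break points as the lower bounds of the second and later classes.

```python
def equal_interval_breaks(values, n_classes):
    """Split the value range (min to max) into classes of equal width."""
    vmin, vmax = min(values), max(values)
    width = (vmax - vmin) / n_classes
    return [vmin + i * width for i in range(1, n_classes)]


def equal_frequency_breaks(values, n_classes):
    """Choose breaks so that each class holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_classes] for i in range(1, n_classes)]


values = [1, 2, 3, 4, 5, 6, 7, 8, 100]
interval_breaks = equal_interval_breaks(values, 3)
frequency_breaks = equal_frequency_breaks(values, 3)
print(interval_breaks, frequency_breaks)
```

With one extreme value in the data, equal intervals put almost all values into the first class, while equal frequency puts three values into each class.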
OVERLAY FUNCTIONS
Overlay is a technique of combining two spatial data layers and producing a third
from them. The binary operators that we discuss are known as spatial overlay
operators. We will firstly discuss vector overlay operators, and then focus on the
raster case.
Standard overlay operators take two input data layers, and assume they are
georeferenced in the same system, and overlap in study area. If either of these
requirements is not met, the use of an overlay operator is senseless.
The principle of spatial overlay is to compare the characteristics of the same
location in both data layers, and to produce a result for each location in the output
data layer. The specific result to produce is determined by the user. It might involve
a calculation, or some other logical function to be applied to every area or location.
The expression on the right is evaluated by the GIS, and the raster in which it
results is then stored under the name on the left. The expression may contain
references to existing rasters, operators and functions; the format is made clear
below. The raster names and constants that are used in the expression are
called its operands.
Arithmetic operators
Various arithmetic operators are supported. The standard ones are multiplication
(*), division (/), subtraction (-) and addition (+). Other arithmetic operators may
include modulo division (MOD) and integer division (DIV). Modulo division returns
the remainder of division: for example, 11 MOD 5 will return 1, as 11 - 5 * 2 = 1.
Integer division returns the quotient without its remainder: 11 DIV 5 will return
2 and, similarly, 10 DIV 2 will return 5.
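The arithmetic operators act cell by cell. The following is a minimal sketch, assuming rasters are stored as nested lists of equal size; the helper name local_op is hypothetical, and Python's % and // operators play the role of MOD and DIV for the non-negative values used here.

```python
import operator


def local_op(op, a, b):
    """Apply a binary operator cell by cell; b may be a raster or a constant."""
    b_is_raster = isinstance(b, list)
    return [
        [op(a[r][c], b[r][c] if b_is_raster else b) for c in range(len(a[0]))]
        for r in range(len(a))
    ]


A = [[11, 10], [7, 4]]
B = [[5, 2], [2, 4]]

C_sum = local_op(operator.add, A, B)        # C := A + B
C_mod = local_op(operator.mod, A, 5)        # C := A MOD 5
C_div = local_op(operator.floordiv, A, B)   # C := A DIV B
print(C_sum, C_mod, C_div)
```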
Map algebra also allows the comparison of rasters cell by cell. To this end, we may
use the standard comparison operators (<, <=, =, >=, > and <>) that we introduced
before. A simple raster comparison assignment is C := A <> B, which stores a truth
value, either true or false, in each cell of the output raster C. Logical
connectives like AND, OR, XOR and NOT are also supported in map algebra.
Conditional expressions
The above comparison and logical operators produce rasters with the truth value
true and false. In practice, we often need a conditional expression with them that
allows us to test whether a condition is fulfilled. The general format is:
Output raster := CON(condition, then expression, else expression).
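A raster comparison followed by a conditional expression can be sketched as below. Rasters are again assumed to be nested lists; the function con is a hypothetical stand-in for the CON function of the general format and accepts either a raster or a constant for the then and else expressions.

```python
def value_at(operand, r, c):
    """Cell value of a raster operand, or the constant itself."""
    return operand[r][c] if isinstance(operand, list) else operand


def con(condition, then_expr, else_expr):
    """Per cell: take the 'then' value where the condition is true, else the 'else' value."""
    return [
        [
            value_at(then_expr, r, c) if condition[r][c] else value_at(else_expr, r, c)
            for c in range(len(condition[0]))
        ]
        for r in range(len(condition))
    ]


A = [[1, 5], [7, 3]]
B = [[1, 2], [9, 3]]

# C := A <> B
C = [[a != b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]
# D := CON(C, A, 0)
D = con(C, A, 0)
print(C, D)
```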
NEIGHBORHOOD FUNCTIONS
The principle in Neighborhood function is to find out the characteristics of the
vicinity, here called neighborhood, of a location. After all, many suitability
questions, for instance, depend not only on what is at the location, but also on
what is near the location. Thus, the GIS must allow us 'to look around locally'.
To perform neighborhood analysis, we must:
1. State which target locations are of interest to us, and define their spatial
extent,
2. Define how to determine the neighborhood for each target,
3. Define which characteristic(s) must be computed for each neighborhood.
For instance, our target might be a nearby ATM. Its neighborhood could be
defined as:
❖ An area within 100 m walking distance of a State Bank ATM, or
❖ An area within 2 km travel distance, or
❖ All roads within 500 m travel distance, or
❖ All other bank ATMs within 5 minutes travel time, or
❖ All banks for which the ATM is the closest.
1. Proximity Computations
In proximity computations, we use geometric distance to define the neighborhood
of one or more target locations. The most common and useful technique is buffer
zone generation.
The principle of buffer zone generation is simple : we select one or more target
locations, and then determine the area around them, within a certain distance. In
Figure below, the main roads were selected as targets, and a 75 meter buffer was
computed from them.
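Whether a location falls inside a buffer can be tested without constructing the buffer polygon itself. The sketch below assumes planar coordinates in meters and a road stored as a list of vertices; the function names and the sample road are hypothetical, and the 75 m width follows the example above.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:              # degenerate segment: a single point
        return math.hypot(px - ax, py - ay)
    # Position of the perpendicular foot along the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def in_buffer(p, polyline, width):
    """True if p lies within the given distance of any segment of the polyline."""
    return any(
        point_segment_distance(p, a, b) <= width
        for a, b in zip(polyline, polyline[1:])
    )


road = [(0, 0), (100, 0), (100, 100)]
print(in_buffer((50, 60), road, 75))    # 60 m from the first segment
print(in_buffer((0, 200), road, 75))    # far from both segments
```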
Thiessen polygon generation is a related technique. It will generate a polygon
around each target location that identifies all those locations that 'belong to'
that target. We have already seen the use of Thiessen polygons in the context of
interpolation of point data.
Given an input point set that will be the polygon's midpoints, it is not difficult to
The figure below repeats the Delaunay triangulation; the Thiessen polygon
partition constructed from it is on the right.
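The idea that every location 'belongs to' its nearest target can be shown with a brute-force sketch on a small grid. This does not construct polygon boundaries and does not use a Delaunay triangulation; it only assigns each cell midpoint to the closest target. The target names and coordinates are hypothetical.

```python
import math


def nearest_target(location, targets):
    """Name of the target point closest to the given location."""
    return min(targets, key=lambda name: math.dist(location, targets[name]))


def partition(cols, rows, targets):
    """Assign every cell of a cols x rows grid (unit cells) to its nearest target."""
    return [
        [nearest_target((c + 0.5, r + 0.5), targets) for c in range(cols)]
        for r in range(rows)
    ]


atms = {"A": (0.0, 0.0), "B": (4.0, 0.0)}
grid = partition(4, 2, atms)
for row in grid:
    print(row)
```

The boundary between the cells labelled A and those labelled B approximates the Thiessen polygon edge, which lies halfway between the two targets.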
2. Computation of Diffusion
The determination of neighborhood of one or more target locations may depend
not only on distance—cases which we discussed above—but also on direction and
differences in the terrain in different directions. This typically is the case when the
target location contains a 'source material' that spreads over time, referred to as
diffusion.
This 'source material' may be air, water or soil pollution, commuters exiting a train
station, people from an opened-up refugee camp, a water spring uphill, or the radio
waves emitted from a radio relay station. In all these cases, one will not expect the
spread to occur evenly in all directions. There will be local terrain factors that
influence the spread, making it easier or more difficult.
Diffusion computation involves one or more target locations, which are better
called source locations in this context. They are the locations of the source of
whatever spreads. The computation also involves a local resistance raster, which
for each cell provides a value that indicates how difficult it is for the 'source
material' to pass by that cell.
The value in the cell must be normalized: i.e. valid for a standardized length
(usually the cell's width) of spread path. From the source location(s) and the local
resistance raster, the GIS will be able to compute a new raster that indicates how
much minimal total resistance the spread has witnessed for reaching a raster cell.
This process is illustrated in Figure below.
While computing total resistances, the GIS takes proper care of the path lengths.
Obviously, the diffusion path from a cell c_src to its neighbor cell to the east,
c_e, is shorter than the path to its northeast neighbor, c_ne.
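A minimal sketch of such a computation is given below. It makes two assumptions that the notes do not state: the resistance of a step is the average of the resistance values of the two cells involved, and a diagonal step is multiplied by the square root of 2 to account for its longer path. The minimal total resistance is then found with Dijkstra's shortest path algorithm over the eight neighbors of each cell.

```python
import heapq
import math


def accumulated_resistance(resistance, sources):
    """Minimal total resistance from the nearest source cell to every cell."""
    rows, cols = len(resistance), len(resistance[0])
    total = [[math.inf] * cols for _ in range(rows)]
    heap = []
    for r, c in sources:
        total[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        cost, r, c = heapq.heappop(heap)
        if cost > total[r][c]:
            continue                       # outdated queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr == 0 and dc == 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                # Diagonal neighbors are further away than east/west/north/south ones.
                step = math.sqrt(2) if dr and dc else 1.0
                new_cost = cost + step * (resistance[r][c] + resistance[nr][nc]) / 2
                if new_cost < total[nr][nc]:
                    total[nr][nc] = new_cost
                    heapq.heappush(heap, (new_cost, nr, nc))
    return total


resistance = [[1, 3],
              [1, 1]]
total = accumulated_resistance(resistance, sources=[(0, 0)])
for row in total:
    print(row)
```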
3. Flow Computation
Diffusion computations determine how a phenomenon spreads over the area, in
principle in all directions, though with varying difficulty or resistance. There are also
cases where a phenomenon does not spread in all directions, but moves or 'flows'
along a given least-cost path, determined again by local terrain characteristics.
The typical case arises when we want to determine the drainage patterns in a
catchment: the rainfall water 'chooses' a way to leave the area.
Cells with a high accumulated flow count represent areas of concentrated flow, and
thus may belong to a stream. By using some appropriately chosen threshold value
in a map algebra expression, we may decide whether they do. Cells with an
accumulated flow count of zero are local topographic highs, and can be used to
identify ridges.
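The accumulated flow count can be sketched as follows, assuming the simple rule that each cell drains to the steepest lower cell among its eight neighbors (often called the D8 rule, which the notes do not name). The elevation values and the threshold are invented. Cells are processed from high to low, so every cell has received all of its upstream flow before passing it on.

```python
import math


def flow_accumulation(dem):
    """Number of upstream cells that drain through each cell."""
    rows, cols = len(dem), len(dem[0])
    acc = [[0] * cols for _ in range(rows)]
    cells = sorted(
        ((dem[r][c], r, c) for r in range(rows) for c in range(cols)),
        reverse=True,
    )
    for z, r, c in cells:
        best_slope, target = 0.0, None
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                    # Drop in elevation divided by the distance to the neighbor.
                    slope = (z - dem[nr][nc]) / math.hypot(dr, dc)
                    if slope > best_slope:
                        best_slope, target = slope, (nr, nc)
        if target is not None:             # cells without a lower neighbor keep their flow
            acc[target[0]][target[1]] += acc[r][c] + 1
    return acc


dem = [[9, 8, 9],
       [8, 5, 8],
       [7, 3, 7]]
acc = flow_accumulation(dem)
threshold = 3                               # hypothetical threshold value
streams = [[count >= threshold for count in row] for row in acc]
for row in acc:
    print(row)
```

In this small valley the two central cells collect the flow and pass the threshold, while all cells with a count of zero lie on the valley rim.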
Applications
There are numerous examples where more advanced computations on
continuous field representations are needed. A short list is provided below.
❖ Slope angle calculation
The calculation of the slope steepness, expressed as an angle in degrees or as a
percentage, for any or all locations.
❖ Slope aspect calculation
The calculation of the aspect (or orientation) of the slope in degrees (between 0
and 360 degrees), for any or all locations.
❖ Slope length calculation
With the use of neighborhood operations, it is possible to calculate for each cell
the nearest distance to a watershed boundary (the upslope length) and to the
nearest stream (the downslope length).
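Slope angle and aspect for one cell can be estimated from its 3x3 neighborhood. The sketch below is one simple possibility, not necessarily what a given GIS implements: it uses central differences on the four direct neighbors, assumes that the first row of the window is the northern one, and reports aspect as the downslope direction in degrees clockwise from north.

```python
import math


def slope_and_aspect(window, cell_size):
    """Return (slope in degrees, slope in percent, aspect in degrees or None)."""
    z_north, z_south = window[0][1], window[2][1]
    z_west, z_east = window[1][0], window[1][2]
    dz_dx = (z_east - z_west) / (2 * cell_size)     # rise towards the east
    dz_dy = (z_north - z_south) / (2 * cell_size)   # rise towards the north
    gradient = math.hypot(dz_dx, dz_dy)
    slope_deg = math.degrees(math.atan(gradient))
    slope_pct = 100 * gradient
    if gradient == 0:
        aspect = None                               # flat cell: aspect is undefined
    else:
        # Downslope vector is (-dz_dx, -dz_dy); azimuth is measured from north.
        aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360
    return slope_deg, slope_pct, aspect


# Elevation rises by 1 m per 10 m cell towards the east, so the slope faces west.
window = [[0, 1, 2],
          [0, 1, 2],
          [0, 1, 2]]
slope_deg, slope_pct, aspect = slope_and_aspect(window, cell_size=10)
print(round(slope_deg, 2), round(slope_pct, 1), round(aspect, 1))
```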
❖ Dynamic modeling
DEMs are increasingly used in GIS-based dynamic modeling, such as the
computation of surface run-off and erosion, groundwater flow, the delineation of
areas affected by pollution, the computation of areas that will be covered by
processes such as debris flows and lava flows.
❖ Visibility analysis
A viewshed is the area that can be 'seen', i.e. in the direct line-of-sight from a
specified target location.
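The line-of-sight idea can be illustrated along a single terrain profile. The sketch assumes evenly spaced elevation samples with the observer standing at the first sample; a full viewshed would repeat this test along many directions. The profile values and observer height are hypothetical.

```python
def is_visible(profile, observer_height, target_index):
    """True if no intermediate sample rises above the sight line to the target."""
    eye = profile[0] + observer_height
    target = profile[target_index]
    for j in range(1, target_index):
        # Height of the straight sight line above sample j.
        sight_line = eye + (target - eye) * j / target_index
        if profile[j] > sight_line:
            return False
    return True


profile = [100, 102, 110, 105, 120]
visibility = [is_visible(profile, 2, i) for i in range(1, len(profile))]
print(visibility)
```

The sample at 105 m is hidden behind the 110 m crest in front of it, while the higher sample at 120 m further away remains in view.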
Filtering
The principle of filtering is quite similar to that of moving window averaging. We
define a window and let the GIS move it over the raster cell-by-cell. For each cell,
the system performs some computation, and assigns the result of this
computation to the cell in the output raster.
The difference with moving window averaging is that the moving window in filtering
is itself a little raster, which contains cell values that are used in the computation
for the output cell value.
This little raster is a filter, also known as a kernel, which may be square (such as a
3x3 kernel), but it does not have to be. The values in the filter are used as weight
factors.
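A minimal filtering sketch is shown below. It assumes a kernel with an odd number of rows and columns, computes a weighted average (the weighted sum divided by the sum of the weights), and simply copies the input value for border cells where the kernel does not fit. These choices are illustrative; GIS packages handle borders and normalization in different ways.

```python
def apply_filter(raster, kernel):
    """Move the kernel over the raster and store the weighted average per cell."""
    rows, cols = len(raster), len(raster[0])
    half_r, half_c = len(kernel) // 2, len(kernel[0]) // 2
    weight_sum = sum(sum(row) for row in kernel)
    output = [row[:] for row in raster]          # border cells keep their input value
    for r in range(half_r, rows - half_r):
        for c in range(half_c, cols - half_c):
            weighted = sum(
                kernel[i][j] * raster[r - half_r + i][c - half_c + j]
                for i in range(len(kernel))
                for j in range(len(kernel[0]))
            )
            # Divide by the total weight unless the weights sum to zero.
            output[r][c] = weighted / weight_sum if weight_sum else weighted
    return output


raster = [[1, 1, 1],
          [1, 10, 1],
          [1, 1, 1]]
mean_kernel = [[1, 1, 1],
               [1, 1, 1],
               [1, 1, 1]]
smoothed = apply_filter(raster, mean_kernel)
print(smoothed[1][1])
```

With all weights equal to 1, the filter reduces to moving window averaging; unequal weights let the center cell or particular directions count more.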
NETWORK ANALYSIS
❖ Network partitioning
Notice that it is possible to travel on line b in Figure above, then take a U-turn at
node N, and return along a to where one came from. The question is whether doing
this makes sense in optimal path finding.
Network partitioning
In network partitioning, the purpose is to assign lines and/or nodes of the network,
in a mutually exclusive way, to a number of target locations. Typically, the target
locations play the role of service centre for the network. This may be any type of
❖ The capacity with which a centre can produce the resources (whether they
are medical operations, school pupil positions, kilowatts, or bottles of milk), and
❖ The consumption of the resources, which may vary amongst lines or line
segments. After all, some streets have more accidents, more children who live
there, more industry in high demand of electricity or just more thirsty workers.
Trace analysis
Trace analysis is performed when we want to understand which part of a network
is 'conditionally connected' to a chosen node on the network, known as the trace
origin. For a node or line to be conditionally connected, it means that a path exists
from the node/line to the trace origin, and that the connecting path fulfills the
conditions set.
What these conditions are depends on the application, and they may involve
direction of the path, capacity, length, or resource consumption along it. The
condition typically is a logical expression, as we have seen before, for example:
❖ The path must be directed from the node/line to the trace origin,
❖ Its capacity (defined as the minimum capacity of the lines that constitute
the path) must be above a given threshold, and
❖ The path's length must not exceed a given maximum length.
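The three example conditions can be combined in one small search. The sketch assumes a hypothetical network given as directed lines with a length and a capacity. Because the capacity of a path is the minimum capacity of its lines, lines at or below the threshold can be discarded first; a shortest path search against the direction of the lines then finds every node whose path to the trace origin is short enough.

```python
import heapq


def trace(lines, origin, min_capacity, max_length):
    """Nodes conditionally connected to the origin, with their shortest path length.

    lines: list of (from_node, to_node, length, capacity), directed from -> to.
    """
    incoming = {}
    for start, end, length, capacity in lines:
        if capacity > min_capacity:                 # capacity condition
            incoming.setdefault(end, []).append((start, length))
    distance = {origin: 0.0}
    heap = [(0.0, origin)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > distance[node]:
            continue                                # outdated queue entry
        for previous, length in incoming.get(node, []):
            new_d = d + length
            # Length condition, and keep only the shortest path found so far.
            if new_d <= max_length and new_d < distance.get(previous, float("inf")):
                distance[previous] = new_d
                heapq.heappush(heap, (new_d, previous))
    return distance


lines = [
    ("A", "O", 100, 10),
    ("B", "A", 100, 10),
    ("C", "O", 50, 2),      # capacity too low
    ("D", "B", 400, 10),    # path to the origin becomes too long
    ("O", "E", 10, 10),     # directed away from the origin
]
connected = trace(lines, origin="O", min_capacity=5, max_length=300)
print(connected)
```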
It is important to note that the categories above are merely different characteristics
of any given application model. Any model can be described according to these
characteristics. Each is briefly discussed below.
Modeling of error propagation has been defined by Veregin as: "the application of
formal mathematical models that describe the mechanisms whereby errors in
source data layers are modified by particular data transformation operations."
UNIT – 5
Data Visualization
GIS AND MAPS
A map is "a representation or abstraction of geographic reality. A tool for presenting
geographic information in a way that is visual, digital or tactile."
The definition holds three key words. The "geographic reality" represents the
object of study, our world. "Representation" and "abstraction" refer to models of
these geographic phenomena. The second sentence reflects the appearance of
the map. A map is a reduced and simplified representation of the Earth's surface
on a plane.
Maps and GIS are closely related to each other. Maps can be used as input for a
GIS. They also play a key role in relation to all the functional components of a GIS.
A map is often the most suitable tool for answering a question that contains
"where", such as "Where do I find the GPO?" or "Where are the B.Sc. IT colleges
located?". The answers could be given in non-map form, like "in the Fort region"
or "all over Mumbai". These answers could be satisfying; however, they do not
give the full picture.
A map would put these answers in a spatial context. It could show where in the
Netherlands Enschede is to be found and where it is located with respect to
Schiphol-Amsterdam airport, where most students arrive. A world map would
refine the answer “from all over the world,” since it reveals that most students arrive
from Africa and Asia, and only a few come from the Americas, Australia and
Europe, as can be seen in Figure 7.1.
❖ What will be the scale of the map: large, small, other? This introduces the
problem of generalization. Generalization addresses the meaningful reduction of
the map content during scale reduction.
❖ Are we dealing with topographic or thematic data? These two categories
traditionally resulted in different design approaches as was explained in the
previous section.
❖ More important for the design is the question of whether the data to be
represented are of a quantitative or qualitative nature.
Qualitative data is also called nominal or categorical data. This data exists as
discrete, named values without a natural order amongst the values. Examples are
the different languages (e.g. English, Hindi, Marathi, Tamil), the different soil types
(e.g. sand, clay, peat) or the different land use categories (e.g. arable land,
pasture). In the map, qualitative data are classified according to disciplinary
insights such as a soil classification system represented as basic geographic units:
homogeneous areas associated with a single soil type, recognized by the soil
classification.
Quantitative data can be measured, either along an interval or ratio scale. For
data measured on an interval scale, the exact distance between values is known,
but there is no absolute zero on the scale. Temperature is an example: 40° C is not
twice as warm as 20° C, and 0° C is not an absolute zero. Quantitative data with a
ratio scale does have a known absolute zero. An example is income: someone
earning ₹ 1000 earns twice as much as someone with an income of ₹ 500. In order
to generate maps, quantitative data are often classified into categories according
to some mathematical method.
These visual variables can be used to make one symbol different from another.
In doing this, map makers in principle have free choice, provided they do not
violate the rules of cartographic grammar. They do not have that choice when
deciding where to locate the symbol in the map. The symbol should be located
where features belong. Visual variables influence the map user’s perception in
different ways. What is perceived depends on the human capacity to see or
perceive:
HOW TO MAP...?
The application of colour would be the best solution since it has characteristics
that allow one to quickly differentiate between different geographic units.
However, since none of the watersheds is more important than the others, the
colours used have to be of equal visual weight or brightness. Figure 7.12 gives
an example of a correct map.
The fact that it is easy to make errors can be seen in Figure 7.15. In Figure 7.15(a),
different tints of green (the visual variable ‘value’) have been used to represent
absolute population numbers. The reader might get a reasonable impression of the
individual amounts but not of the actual geographic distribution of the population,
as the size of the geographic units will influence the perceptional properties too
much. Imagine a small and a large unit having the same number of inhabitants.
The large unit would visually attract more attention, giving the impression there are
more people than in the small unit. Another issue is that the population is not
necessarily homogeneously distributed within the geographic units. Colour has
also been misused in Figure 7.15(b).
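The effect of unit size can be made concrete with a small calculation. The two units and their figures below are invented; relating the absolute numbers to area (population density) is one common way to make units of different size comparable, offered here as an illustration only.

```python
# Hypothetical geographic units with the same number of inhabitants.
units = [
    {"name": "small unit", "population": 50000, "area_km2": 10.0},
    {"name": "large unit", "population": 50000, "area_km2": 200.0},
]

# Inhabitants per square kilometer for each unit.
density = {u["name"]: u["population"] / u["area_km2"] for u in units}
print(density)
```

Both units would receive the same tint when absolute numbers are mapped, although the small unit is twenty times more densely populated.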
The shaded relief map uses the full three-dimensional information to create
shading effects. This map, represented on a two-dimensional surface, can also
be floated in three-dimensional space to give it a real three-dimensional
appearance of a 'virtual world', as shown in Figure (d). Looking at such a
representation one can immediately imagine that it will not always be effective.
Certain (low) objects in the map will easily disappear behind other (higher)
objects.
Socio-economic data can also be viewed in three dimensions. This may result in
dramatic images, which will be long remembered by the map user. Figure 7.19
shows the absolute population figures of Overijssel in three dimensions.
Single static map: Specific graphic variables and symbols are used to indicate
change or represent an event. Figure 7.20(a) applies the visual variable value to
represent the age of the built-up areas;
Series of static maps: A single map in the series represents a ‘snapshot’ in time.
Together, the maps depict a process of change. Change is perceived by the
succession of individual maps depicting the situation in successive snapshots. It
could be said that the temporal sequence is represented by a spatial sequence,
which the user has to follow, to perceive the temporal variation. The number of
images should be limited since it is difficult for the human eye to follow long series
of maps (Figure 7.20(b));
Animated map: Change is perceived to happen in a single image by displaying
several snapshots after each other, just like a video cut with successive frames.
The difference with the series of maps is that the variation can be deduced from
real ‘change’ in the image itself, not from a spatial sequence (Figure 7.20(c)).
MAP COSMETICS
Most maps in this chapter are correct from a cartographic grammar perspective.
However, many of them lack the additional information needed to be fully
understood that is usually placed in the margin of printed maps. Each map should
have, next to the map image, a title, informing the user about the topic visualized.
A legend is necessary to understand how the topic is depicted.
Additional marginal information to be found on a map is a scale indicator, a north
arrow for orientation, the map datum and map projection used, and some lineage
information, (such as data sources, dates of data collection, methods used, etc.).
Further information can be added that indicates when the map was issued, and by
whom (author / publisher). All this information allows the user to obtain an
impression of the quality of the map, and is comparable with metadata describing
the contents of a database or data layer.
Figure below illustrates these map elements. On paper maps, these elements (if
all relevant) have to appear next to the map face itself. Maps presented on screen
often go without marginal information, partly because of space constraints.
However, on-screen maps are often interactive, and clicking on a map element
may reveal additional information from the database. Legends and titles are often
available on demand as well.
Text is used to transfer information in addition to the symbols used. This can be
done by the application of the visual variables to the text as well. Italics—cf. the
visual variable of orientation—have been used for building names to distinguish
them from road names. Another common example is the use of colour to
differentiate (at nominal level) between hydrographic names (in blue) and other
names (in black). The text should also be placed in a proper position with respect
to the object to which it refers.
MAP DISSEMINATION
The map design will not only be influenced by the nature of the data to be mapped
or the intended audience (the 'what' and 'whom' from "How do I say What to Whom,
and is it Effective"); the output medium also plays a role. Traditionally, maps were
produced on paper, and many still are. Currently, most maps are presented on
screen, for a quick view, for an internal presentation or for presentation on the
WWW.
Compared to maps on paper, on-screen maps have to be smaller, and therefore
their contents should be carefully selected. This might seem a disadvantage, but
presenting maps on-screen offers very interesting alternatives. A mouse click could
open a link to a database and reveal much more information than a paper
map could ever offer. Links to data other than tables or maps could also be made
available.
Maps and multimedia (photography, sound, video or animation) can be integrated.
Some of today's electronic atlases, such as the Encarta World Atlas, are good
examples of how multimedia elements can be integrated with the map. Pointing to
a country on a world map starts the national anthem of the country or shows its
flag. The atlas can also be used to explore a country's language: moving the mouse
over a region plays a short sentence in that region's dialect.
The World Wide Web is a popular medium used to present and disseminate spatial
data. Here, maps can play their traditional role, for instance to show the location of
objects, or provide insight into spatial patterns, but because of the nature of the
internet, the map can also function as an interface to additional information.
Geographic locations on the map can be linked to photographs, text, sound or other
maps, perhaps even functions such as on-line booking services. Maps can also be
used as 'previews' of spatial data products to be acquired through a spatial data
clearinghouse that is part of a Spatial Data Infrastructure. For that purpose we can
make use of geo-webservices, which can provide interactive map views as an
intermediary between the data and the web browser.
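One widely used kind of geo-webservice is the OGC Web Map Service (WMS), which returns a map image in response to a URL request. The text above does not name a specific service, so WMS is an assumption made here for illustration. The sketch below builds a WMS 1.3.0 GetMap request; the endpoint https://example.org/wms and the layer name landuse are hypothetical placeholders, and the bounding box is an arbitrary example.

```python
from urllib.parse import urlencode


def getmap_url(endpoint, layer, bbox, width=600, height=400):
    """Build a WMS 1.3.0 GetMap URL for one layer.

    bbox is (min_lon, min_lat, max_lon, max_lat); CRS:84 is used so that
    the axis order is longitude first, then latitude.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",              # empty value = the server's default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and layer name, for illustration only.
url = getmap_url("https://example.org/wms", "landuse",
                 (72.8, 18.9, 73.0, 19.3))
print(url)
```

A web browser or map client that requests such a URL receives a rendered image rather than the underlying data, which is what makes the service suitable as a 'preview' before the data product itself is acquired.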