Unit-2 - GIS Data Model
GIS does not store a map in any conventional sense. Instead, GIS stores the data from which we can draw a desired view to suit a particular purpose.
There are two types of data in GIS:
Spatial data (location of a particular feature)
Attribute data (information about features, e.g. name of roads, forest type etc.)
The two data models common in GIS are the Vector data model and the Raster data model.
2.1 Spatial Information:
Spatial characteristics of information can be broadly distinguished between:
a) Those that describe where things are, using locations consisting of reference positions, spatial units and spatial relationships.
b) Those that describe the form of phenomena, using qualitative and quantitative descriptions of shape and structure.
c) Those that describe associations and interactions between different phenomena.
Basic Concepts:
1) All geographic data can be represented by three basic entities:
Point, Line, Area or Polygon, plus a label saying what it is. E.g.
Oil well: could be represented by a single point consisting of X, Y coordinates.
Road: represented by a series of X, Y coordinates.
Forest: represented by a set of X, Y coordinates plus the label forest. The label could be an actual name or a special symbol.
2) Layers and Coverages:
GIS organizes spatial data into layers or coverages.
Typical layers represent information belonging to a particular class. E.g. Roads, Rivers, Vegetation types are different layers.
All the layers or coverages pertaining to an area are referenced to a common projection system.
The layers can be combined with each other in various ways to create new layers that are functions of the individual layers. E.g. land use, settlement, drainage, road.
3) Data Model:
In order to represent spatial information and its attributes, a Data Model (a set of logical definitions or rules for characterizing the geographical data) is adopted.
The Data Model represents the linkages between the real world domain of geographical data and the computer and GIS representation of these features.
2.2 Conceptual Models of Spatial Information:
There are different models which have influenced the way in which data are organized and processed within GIS.
They are based on Objects, Networks and Fields.
Object Based Model:
Object based spatial models emphasize individual phenomena that are to be studied in isolation or in terms of their relationship with other phenomena.
An object-based view is appropriate to phenomena that have a well-defined boundary.
Network Model:
Network based spatial models share some aspects of the object based model in that "they often deal with discrete phenomena".
But the essential characteristic is the need to consider interaction between multiple objects, often along discrete paths or routes that connect them.
What is important is some measure of distance and impedance (interaction) between specified phenomena.
Examples of the network based model: studies of traffic on roads, analysis of the flow of water, flow of electricity etc.
Field Model:
The field based model is appropriate for modeling phenomena that are regarded as continuously variable across some region of space.
E.g. concentration of pollutants in the air, temperature of the ground surface, moisture level of soil, elevation of ground etc.
The field model may represent either 2 or 3 dimensions depending upon the application.
2.3 Vector Based Model:
A vector based GIS is defined by the vectorial representation of its geographic data. According to the characteristics of this data model, geographic objects are explicitly represented and, within the spatial characteristics, the thematic aspects are associated.
The vector representation of an object is an attempt to represent the object as exactly as possible.
The geographical phenomena are represented by three basic entities along with their attributes.
Point (e.g. City): population, no. of schools, no. of houses etc.
Line (e.g. Road): type of road, road name etc.
Area (e.g. Landuse): class, soil type etc.
The coordinate space is assumed to be continuous, allowing all positions, lengths and dimensions to be defined precisely.
The vector data structure represents each geographical feature by a set of coordinates.
The basic idea is to define a 2D space where coordinates on the two axes represent features.
Point Features:
A zero-dimensional abstraction of an object, represented by a single X, Y coordinate. A point normally represents a geographic feature too small to be displayed as a line or area; for example, the location of a building on a small-scale map, or the location of a service cover on a medium-scale map.
Besides the X, Y coordinate, other data must be stored to indicate what kind of point it is and other information associated with it. Fig. 1 shows typical point data stored in GIS.
Line Features:
A set of ordered coordinates that represents the shape of geographic features too narrow to be displayed as an area at the given scale (contours, street centrelines, or streams), or linear features with no area (county boundary lines).
A line is synonymous with an arc.
The simplest line requires the storage of a begin point and an end point (two X, Y coordinates plus a possible record). An arc, chain or string is a set of n X, Y coordinate pairs describing a continuous complex line.
The shorter the line segments and the larger the number of X, Y coordinate pairs, the closer the chain will approximate a complex curve (Fig. 2).
Area Features (Polygon Features):
A feature used to represent areas. A polygon is defined by the lines that make up its boundary and a point inside its boundary for identification. Polygons have attributes that describe the geographic feature they represent.
The boundary of an area feature separates the interior area from the exterior area.
An area feature may be isolated or connected (Fig. 3).
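The point, line and area entities of the vector model, each stored as X, Y coordinates plus attributes, can be sketched in code. This is only a minimal illustration in Python, not how any particular GIS stores its data; the class name Feature, the helper functions line_length and polygon_area, and all coordinates and attribute values are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A vector feature: geometry as X, Y pairs plus descriptive attributes."""
    kind: str                      # "point", "line" or "polygon"
    coords: list                   # list of (x, y) tuples
    attributes: dict = field(default_factory=dict)


def line_length(coords):
    """Length of a chain: sum of the straight segments between successive pairs."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def polygon_area(coords):
    """Area enclosed by a boundary ring (shoelace formula, ring closed implicitly)."""
    total = 0.0
    n = len(coords)
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical features: one of each basic entity, each carrying a label.
oil_well = Feature("point", [(3.0, 4.0)], {"label": "oil well"})
road = Feature("line", [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)],
               {"name": "Main Road", "type": "highway"})
forest = Feature("polygon", [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)],
                 {"label": "forest"})

road_length = line_length(road.coords)
forest_area = polygon_area(forest.coords)
```

Adding more coordinate pairs to road.coords would let the chain follow a curved road more closely, at the cost of storing more points.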
2.4 Raster Based Model:
Raster based spatial models regard space as a tessellation (resembling a mosaic) of cells, each of which is associated with a record of the classification or identity of the phenomenon that occupies it.
The raster model represents the 2D location of phenomena as a matrix of grid cells.
Each cell is known as a pixel (short form of Picture Element).
Since the cells are of fixed size and location, rasters tend to represent natural and human-made objects in a blocky fashion.
The information content of one cell depends upon the size of the cell. The smaller the cells, the more detail the raster records. This is called the resolution of the image.
The raster model or grid cell is a relatively simple approach to data representation, both conceptually and operationally, and hence has been popular since the earliest days of GIS development.
The simplest raster data structure consists of an array of grid cells. A row and column number references each grid cell, and the cell contains a number representing the type or value of the attribute being mapped. Fig. 4a, 4b and 4c explain the raster model, raster representation of location and raster resolution respectively.
In a raster structure a single cell represents a point, a line is represented by a number of neighboring cells strung out in a given direction, and an area by an agglomeration (mass) of neighboring cells. Fig. 4d shows the raster representation of the discrete features Point, Line and Area.
Since each cell is associated with a value called the cell value or pixel value, it is very easy to carry out overlay operations to compare attributes recorded in different layers.
Each attribute associated with a grid cell can be combined logically or arithmetically with attributes in the corresponding cells of the other layers to create a new attribute value for the resulting overlay.
Transitional areas are poorly represented by the raster-based model.
The Choice between Raster and Vector Models
There is always scope to convert one form to the other, i.e. raster to vector or vector to raster.
The raster method for spatial data structure requires large memory space as compared to vector data.
Certain kinds of data manipulation such as polygon intersection, union, clipping, merging etc. are complex in the raster data model as compared to vector.
However, multi-theme overlay operations are easier in the raster data model.
Similarly, representation of surfaces is more common in the raster-based model.
Vector Data Model:
Advantages:
Good and real representation of geographic data
Compact data structure
Topology can be completely described
Accurate graphic output
Less storage space.
Disadvantages:
Data structure is complex
Combination of several vector polygons creates difficulties in handling
Simulation is difficult because each unit has a different topological form
Display and plotting are expensive.
Raster Data Model:
Advantages:
Simple data structure
The overlay of mapped data with remote sensing data is easy
Simulation is easy because each spatial unit has the same size and shape
Good for multilayer overlay.
Disadvantages:
Data is voluminous and requires large storage space
Use of large cells to reduce data volume loses significant information
Crude raster maps have an ugly look
Network linkages are difficult to establish.
2.5 Conceptual Model for Non-Spatial Information:
Non-spatial information, also known as attribute data, is descriptive data that defines spatial data.
They are gathered and assembled into records and files.
A database is a collection of data that can be shared by different users. It is a group of records and files that are organized so that there is little or no redundancy.
A database consists of data in many files. In order to access data from one or more files easily, it is necessary to have some kind of structure or organization.
A Data Base Management System (DBMS) is a tool for representing, in the computer, a real world oriented model of a set of data in a predefined structure and organized manner.
This high level representation or abstraction is referred to as the Conceptual Model, which ensures data linking, data security, sub-setting, querying using logical / arithmetic syntax etc.
Most commercial DBMS software like Oracle, dBase, MS Access etc. is implemented using three types of data models, namely Hierarchical data structure, Network structure and Relational structure.
1. Hierarchical Data Model:
It is a tree-based structure. The tree is composed of nodes; the uppermost node is called the root.
With the exception of this root, every node is related to a node at a higher level called its parent. The node at the lower level is called the child.
This approach is efficient if all desired access paths follow the parent-child linkage.
However, it requires a relatively inflexible structure, and hence linkage with other branches of the database is cumbersome. That is why this database structure is not very common in flexible GIS.
2. Network Structure:
A network structure exists when a child in a data relationship has more than one parent.
An item in such a structure can be linked to any other item.
It is good for network-based analysis.
3. Relational Structure:
In this case data are organized in 2D tables consisting of rows and columns. The rows are called records and the columns are called items or fields.
Such tables are easy to develop and understand.
Different sets of tables are created within the database and a relationship is established between each table.
Because of this, it is easy to create a subset of data for one user or to join two tables for another user to form a larger table.
The structure can be described mathematically, hence mathematics provides the basis for extracting some columns from a table and for joining various columns.
This capability to manipulate relations provides flexibility that is normally not available in hierarchical and network structures.
Relational Operators:
Retrieval of data sets from a relational model involves the creation of a new relation, which is derived from the permanently stored relations.
There are several relational algebra operators that can be used to search and manipulate relations.
These operators are implemented by means of Structured Query Language (SQL) using a number of commands.
Important Features of Relational database:
i) Primary and Foreign Keys
ii) Relational Joins
i) Primary and Foreign Keys:
The relational approach is used to design database tables.
Since each table or relation represents a set, it cannot have any rows whose entire contents are duplicated.
Secondly, as each row must be different from every other, a value in a single column or a combination of values in multiple columns can be used to define a primary key for the table, which allows each row to be uniquely identified.
The uniqueness allows the primary key to serve as the sole row level addressing mechanism in the relational database model.
A field that stores the key of another table is called a foreign key.
ii) Relational join:
The mechanism for linking data in different tables is called a relational join.
Values in a column or columns in one table are matched to corresponding values in a column in the second table.
Matching is frequently based on the primary key in one table and the foreign key in the second table.
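The primary key, foreign key and relational join described in these notes can be illustrated with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the table names (soil_types, parcels), column names and all values are invented for illustration and do not come from any real data set.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# soil_id is the primary key: it uniquely identifies each row of soil_types.
cur.execute(
    "CREATE TABLE soil_types (soil_id INTEGER PRIMARY KEY, soil_name TEXT)"
)

# parcels.soil_id is a foreign key: it stores the key of a row in soil_types.
cur.execute(
    """CREATE TABLE parcels (
           parcel_id INTEGER PRIMARY KEY,
           landuse   TEXT,
           soil_id   INTEGER REFERENCES soil_types(soil_id)
       )"""
)

cur.executemany("INSERT INTO soil_types VALUES (?, ?)",
                [(1, "clay"), (2, "loam")])
cur.executemany("INSERT INTO parcels VALUES (?, ?, ?)",
                [(101, "forest", 2),
                 (102, "settlement", 1),
                 (103, "agriculture", 2)])

# Relational join: match the foreign key in parcels to the primary key
# in soil_types, producing a new relation with columns from both tables.
rows = cur.execute(
    """SELECT p.parcel_id, p.landuse, s.soil_name
       FROM parcels AS p
       JOIN soil_types AS s ON p.soil_id = s.soil_id
       ORDER BY p.parcel_id"""
).fetchall()

conn.close()
```

Each tuple in rows pairs a parcel with the name of its soil type, although the soil name is stored only once in soil_types. This is how the relational structure keeps redundancy low.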