Geographic Information System
History of development
The first known use of the term "geographic information system" was by Roger Tomlinson in 1968, in his paper "A Geographic Information System for Regional Planning".[5] Tomlinson is also acknowledged as the "father of GIS".[6]
E. W. Gilbert's version (1958) of John Snow's 1855 map of the Soho cholera outbreak showing the clusters
of cholera cases in the London epidemic of 1854
One of the earliest applications of spatial analysis in epidemiology is the 1832 "Rapport
sur la marche et les effets du choléra dans Paris et le département de la Seine".[7] The French
geographer Charles Picquet represented the 48 districts of the city of Paris with halftone color
gradients according to the number of deaths from cholera per 1,000 inhabitants. In 1854 John
Snow determined the source of a cholera outbreak in London by marking points on a map
depicting where the cholera victims lived, and connecting the cluster that he found with a nearby
water source. This was one of the earliest successful uses of a geographic methodology in
epidemiology. While the basic elements of topography and theme existed previously
in cartography, the John Snow map was unique, using cartographic methods not only to depict
but also to analyze clusters of geographically dependent phenomena.
The early 20th century saw the development of photozincography, which allowed maps to be
split into layers, for example one layer for vegetation and another for water. This was particularly
used for printing contours – drawing these was a labour-intensive task but having them on a
separate layer meant they could be worked on without the other layers to confuse
the draughtsman. This work was originally drawn on glass plates but later plastic film was
introduced, with the advantages of being lighter, using less storage space and being less brittle,
among others. When all the layers were finished, they were combined into one image using a
large process camera. Once color printing came in, the layers idea was also used for creating
separate printing plates for each color. While the use of layers much later became one of the
main typical features of a contemporary GIS, the photographic process just described is not
considered to be a GIS in itself – as the maps were just images with no database to link them to.
Computer hardware development spurred by nuclear weapon research led to general-purpose
computer "mapping" applications by the early 1960s. [8]
The year 1960 saw the development of the world's first true operational GIS in Ottawa, Ontario,
Canada by the federal Department of Forestry and Rural Development. Developed by Dr. Roger
Tomlinson, it was called the Canada Geographic Information System (CGIS) and was used to
store, analyze, and manipulate data collected for the Canada Land Inventory – an effort to
determine the land capability for rural Canada by mapping information about soils, agriculture,
recreation, wildlife, waterfowl, forestry and land use at a scale of 1:50,000. A rating classification
factor was also added to permit analysis.
CGIS was an improvement over "computer mapping" applications as it provided capabilities for
overlay, measurement, and digitizing/scanning. It supported a national coordinate system that
spanned the continent, coded lines as arcs having a true embedded topology and it stored the
attribute and locational information in separate files. As a result of this, Tomlinson has become
known as the "father of GIS", particularly for his use of overlays in promoting the spatial analysis
of convergent geographic data.[9]
CGIS lasted into the 1990s and built a large digital land resource database in Canada. It was
developed as a mainframe-based system in support of federal and provincial resource planning
and management. Its strength was continent-wide analysis of complex datasets. The CGIS was
never available commercially.
In 1964 Howard T. Fisher formed the Laboratory for Computer Graphics and Spatial Analysis at
the Harvard Graduate School of Design (LCGSA 1965–1991), where a number of important
theoretical concepts in spatial data handling were developed, and which by the 1970s had
distributed seminal software code and systems, such as SYMAP, GRID, and ODYSSEY (which
served as sources for subsequent commercial development), to universities, research centers
and corporations worldwide.[10]
By the late 1970s two public domain GIS systems (MOSS and GRASS GIS) were in
development, and by the early 1980s, M&S Computing (later Intergraph) along with Bentley
Systems Incorporated for the CAD platform, Environmental Systems Research Institute
(ESRI), CARIS (Computer Aided Resource Information System), MapInfo Corporation and
ERDAS (Earth Resource Data Analysis System) emerged as commercial vendors of
GIS software, successfully incorporating many of the CGIS features, combining the first
generation approach to separation of spatial and attribute information with a second generation
approach to organizing attribute data into database structures.[11]
In 1986, Mapping Display and Analysis System (MIDAS), the first desktop GIS product, was
released for the DOS operating system. It was renamed MapInfo for Windows in 1990, when it
was ported to the Microsoft Windows platform. This began the process of moving GIS from the
research department into the business environment.
By the end of the 20th century, the rapid growth in various systems had been consolidated and
standardized on relatively few platforms and users were beginning to explore viewing GIS data
over the Internet, requiring data format and transfer standards. More recently, a growing number
of free, open-source GIS packages run on a range of operating systems and can be customized
to perform specific tasks. Increasingly geospatial data and mapping applications are being made
available via the world wide web.[12]
Several articles on the history of GIS have been published. [13][14]
GIS techniques and technology
Modern GIS technologies use digital information, for which various digitized data creation
methods are used. The most common method of data creation is digitization, where a hard
copy map or survey plan is transferred into a digital medium through the use of a CAD program,
and geo-referencing capabilities. With the wide availability of ortho-rectified imagery (from
satellites, aircraft, Helikites and UAVs), heads-up digitizing is becoming the main avenue through
which geographic data is extracted. Heads-up digitizing involves tracing geographic data
directly on top of the aerial imagery, instead of the traditional method of tracing the geographic
form on a separate digitizing tablet (heads-down digitizing).
Relating information from different sources[edit]
GIS uses spatio-temporal (space-time) location as the key index variable for all other information.
Just as a relational database containing text or numbers can relate many different tables using
common key index variables, GIS can relate otherwise unrelated information by using location as
the key index variable. The key is the location and/or extent in space-time.
Any variable that can be located spatially, and increasingly also temporally, can be referenced
using a GIS. Locations or extents in Earth space–time may be recorded as dates/times of
occurrence, and x, y, and z coordinates representing longitude, latitude, and elevation,
respectively. These GIS coordinates may represent other quantified systems of temporo-spatial
reference (for example, film frame number, stream gage station, highway mile-marker, surveyor
benchmark, building address, street intersection, entrance gate, water depth
sounding, POS or CAD drawing origin/units). Units applied to recorded temporal-spatial data can
vary widely (even when using exactly the same data, see map projections), but all Earth-based
spatial–temporal location and extent references should, ideally, be relatable to one another and
ultimately to a "real" physical location or extent in space–time.
Related by accurate spatial information, an incredible variety of real-world and projected past or
future data can be analyzed, interpreted and represented. [15] This key characteristic of GIS has
begun to open new avenues of scientific inquiry into behaviors and patterns of real-world
information that previously had not been systematically correlated.
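As a sketch of this key idea, location can serve the same role as a key column in a relational join. The following Python fragment relates two otherwise unrelated tables through a shared, quantized coordinate key; all datasets, field names, and coordinates are invented for illustration.

```python
# Relate two otherwise unrelated datasets by location, the way a relational
# database relates tables by a common key. All data here is invented.

def location_key(lat, lon, precision=3):
    """Quantize a coordinate so nearby observations share the same key."""
    return (round(lat, precision), round(lon, precision))

rainfall = {location_key(51.507, -0.128): {"rain_mm": 12.4}}
land_use = {location_key(51.507, -0.128): {"use": "urban"}}

# Join on the spatial key, merging attributes from both sources.
joined = {
    key: {**rainfall[key], **land_use[key]}
    for key in rainfall.keys() & land_use.keys()
}

print(joined)
```

A real GIS performs this join against geometries and coordinate reference systems rather than rounded pairs, but the indexing principle is the same.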
GIS uncertainties
GIS accuracy depends upon the source data and how it is encoded and referenced. Land
surveyors have been able to provide a high level of positional accuracy using GPS-derived
positions.[16] High-resolution digital terrain models and aerial imagery,[17] powerful computers and Web
technology are changing the quality, utility, and expectations of GIS to serve society on a grand
scale, but other source data, such as paper maps, also affect overall GIS accuracy and may be
of limited use in achieving a desired level of accuracy.
In developing a digital topographic database for a GIS, topographical maps are the main source,
and aerial photography and satellite imagery are additional sources for collecting data and
identifying attributes that can be mapped in layers over a scaled representation of the location.
The scale of a map and the type of geographical rendering are very important aspects, since the
information content depends mainly on the scale set and the resulting locatability of the map's
representations. To digitize a map, it has to be checked against its theoretical dimensions,
scanned into a raster format, and the resulting raster data given theoretical dimensions through
a rubber-sheeting (warping) process.
A quantitative analysis of maps brings accuracy issues into focus. The electronic and other
equipment used to make measurements for GIS is far more precise than the machines of
conventional map analysis. All geographical data are inherently inaccurate, and these
inaccuracies will propagate through GIS operations in ways that are difficult to predict.
Data representation
Main article: GIS file formats
GIS data represents real objects (such as roads, land use, elevation, trees, and waterways) as
digital data. Real objects can be divided into two abstractions: discrete objects (e.g., a house)
and continuous fields (such as rainfall amount or elevation). Traditionally, two broad methods
are used to store data in a GIS for both kinds of abstraction: raster images and vector data.
Points, lines, and polygons are the basic elements of vector-mapped location attribute
references. A newer hybrid method of storing data is the point cloud, which combines
three-dimensional points with RGB information at each point, returning a "3D color image". GIS
thematic maps are thus becoming more and more realistically visually descriptive of what they
set out to show or determine.
For a list of popular GIS file formats, such as shapefiles, see GIS file formats § Popular GIS file
formats.
Data capture
Example of hardware for mapping (GPS and laser rangefinder) and data collection (rugged computer). The
current trend for geographical information system (GIS) is that accurate mapping and data analysis are
completed while in the field. Depicted hardware (field-map technology) is used mainly for forest inventories,
monitoring and mapping.
Spatial analysis with geographical information system (GIS)
Slope and aspect
The elevation at a point or unit of terrain will have perpendicular tangents (slope) passing
through the point, in an east–west and a north–south direction. These two tangents give two
components, ∂z/∂x and ∂z/∂y, which can then be used to determine the overall direction of
slope and the aspect of the slope. The gradient is defined as a vector quantity with components
equal to the partial derivatives of the surface in the x and y directions.[26]
The calculation of the overall slope S and aspect A over a 3×3 grid, for methods that determine
the east–west and north–south components, uses the following formulas respectively:
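The formulas themselves did not survive in this copy of the text. One standard finite-difference formulation, reconstructed here as an illustration (and not necessarily the exact variant the original showed, since aspect conventions vary), is:

```latex
S = \arctan\sqrt{\left(\frac{\partial z}{\partial x}\right)^{2}
              + \left(\frac{\partial z}{\partial y}\right)^{2}},
\qquad
A = \operatorname{atan2}\!\left(-\frac{\partial z}{\partial x},\;
                                -\frac{\partial z}{\partial y}\right)
```

Here x points east and y points north, so the aspect A is the azimuth of steepest descent measured clockwise from north.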
Zhou and Liu[25] describe another formula for calculating aspect, as follows:
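Zhou and Liu's formula likewise did not survive extraction. Independently of the particular variant, the computation can be sketched numerically with NumPy's finite-difference gradient; the DEM values and the aspect convention (azimuth of steepest descent, clockwise from north) are illustrative assumptions:

```python
import numpy as np

# A tiny synthetic elevation grid (metres); values invented for illustration.
# Rows run north -> south, columns west -> east, with unit cell size.
z = np.array([[10., 10., 10.],
              [11., 11., 11.],
              [12., 12., 12.]])

# np.gradient returns derivatives along axis 0 (rows) and axis 1 (columns).
dz_dy, dz_dx = np.gradient(z)

# Slope: magnitude of the gradient vector, expressed in degrees.
slope = np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

# Aspect: azimuth of steepest descent, clockwise from north. Because rows run
# southward, the downhill direction is (east, north) = (-dz_dx, dz_dy).
aspect = np.degrees(np.arctan2(-dz_dx, dz_dy)) % 360.0

print(slope[1, 1], aspect[1, 1])
```

For this surface, which rises toward the south at 45°, the centre cell has a slope of 45° and an aspect of 0° (downhill faces north).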
Data analysis
It is difficult to relate wetlands maps to rainfall amounts recorded at different points such as
airports, television stations, and schools. A GIS, however, can be used to depict two- and three-
dimensional characteristics of the Earth's surface, subsurface, and atmosphere from information
points. For example, a GIS can quickly generate a map with isopleth or contour lines that indicate
differing amounts of rainfall. Such a map can be thought of as a rainfall contour map. Many
sophisticated methods can estimate the characteristics of surfaces from a limited number of point
measurements. A two-dimensional contour map created from the surface modeling of rainfall
point measurements may be overlaid and analyzed with any other map in a GIS covering the
same area. This GIS-derived map can then provide additional information, such as the viability
of water power potential as a renewable energy source. Similarly, GIS can be used to compare
other renewable energy resources to find the best geographic potential for a region. [27]
Additionally, from a series of three-dimensional points, or digital elevation model, isopleth lines
representing elevation contours can be generated, along with slope analysis, shaded relief, and
other elevation products. Watersheds can be easily defined for any given reach, by computing all
of the areas contiguous and uphill from any given point of interest. Similarly, an
expected thalweg of where surface water would want to travel in intermittent and permanent
streams can be computed from elevation data in the GIS.
Topological modeling
A GIS can recognize and analyze the spatial relationships that exist within digitally stored spatial
data. These topological relationships allow complex spatial modelling and analysis to be
performed. Topological relationships between geometric entities traditionally include adjacency
(what adjoins what), containment (what encloses what), and proximity (how close something is to
something else).
Geometric networks
Geometric networks are linear networks of objects that can be used to represent interconnected
features, and to perform special spatial analysis on them. A geometric network is composed of
edges, which are connected at junction points, similar to graphs in mathematics and computer
science. Just like graphs, networks can have weight and flow assigned to their edges, which can
be used to represent various interconnected features more accurately. Geometric networks are
often used to model road networks and public utility networks, such as electric, gas, and water
networks. Network modeling is also commonly employed in transportation
planning, hydrology modeling, and infrastructure modeling.
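As a sketch of how a weighted geometric network supports analysis, the following fragment runs Dijkstra's shortest-path algorithm over a toy road network; all junction names and edge weights are invented:

```python
import heapq

# A toy road network: edges connect junctions, weights are travel times in
# minutes. Names and weights are invented for illustration.
network = {
    "depot":      [("junction_a", 4), ("junction_b", 2)],
    "junction_a": [("customer", 5)],
    "junction_b": [("junction_a", 1), ("customer", 8)],
    "customer":   [],
}

def shortest_time(graph, start, goal):
    """Dijkstra's algorithm: cheapest total edge weight from start to goal."""
    queue = [(0, start)]
    best = {start: 0}
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return None

print(shortest_time(network, "depot", "customer"))
```

The cheapest route here is depot → junction_b → junction_a → customer, at a total weight of 8; a production GIS network solver adds turn restrictions, one-way edges, and flow constraints on top of this basic idea.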
Hydrological modeling
GIS hydrological models can provide a spatial element that other hydrological models lack, with
the analysis of variables such as slope, aspect and watershed or catchment area.[28] Terrain
analysis is fundamental to hydrology, since water always flows down a slope. [28] As basic terrain
analysis of a digital elevation model (DEM) involves calculation of slope and aspect, DEMs are
very useful for hydrological analysis. Slope and aspect can then be used to determine direction
of surface runoff, and hence flow accumulation for the formation of streams, rivers and lakes.
Areas of divergent flow can also give a clear indication of the boundaries of a catchment. Once a
flow direction and accumulation matrix has been created, queries can be performed that show
contributing or dispersal areas at a certain point.[28] More detail can be added to the model, such
as terrain roughness, vegetation types and soil types, which can influence infiltration and
evapotranspiration rates, and hence surface flow. One of the main uses of
hydrological modeling is in environmental contamination research.
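The flow-direction step described above can be sketched with the common D8 method, in which each cell drains to its steepest-descent neighbour. This is a minimal illustration with invented DEM values, not a full flow-accumulation model:

```python
import numpy as np

# Toy DEM (elevations in metres); invented values sloping toward the SE corner.
dem = np.array([[9., 8., 7.],
                [8., 6., 4.],
                [7., 4., 1.]])

# D8 neighbour offsets (row, column) and their compass labels.
D8 = {(-1, 0): "N", (-1, 1): "NE", (0, 1): "E", (1, 1): "SE",
      (1, 0): "S", (1, -1): "SW", (0, -1): "W", (-1, -1): "NW"}

def d8_direction(dem, r, c):
    """Return the compass direction of steepest descent from cell (r, c)."""
    best_drop, best_dir = 0.0, None
    for (dr, dc), label in D8.items():
        rr, cc = r + dr, c + dc
        if 0 <= rr < dem.shape[0] and 0 <= cc < dem.shape[1]:
            distance = np.hypot(dr, dc)      # diagonal neighbours are farther
            drop = (dem[r, c] - dem[rr, cc]) / distance
            if drop > best_drop:
                best_drop, best_dir = drop, label
    return best_dir  # None for pits and flat cells

print(d8_direction(dem, 1, 1))
```

Repeating this for every cell yields the flow direction matrix from which flow accumulation, stream networks, and catchment boundaries are derived.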
Cartographic modeling
An example of use of layers in a GIS application. In this example, the forest cover layer (light green) is at
the bottom, with the topographic layer over it. Next up is the stream layer, then the boundary layer, then the
road layer. The order is very important in order to properly display the final result. Note that the pond layer
was located just below the stream layer, so that a stream line can be seen overlying one of the ponds.
The term "cartographic modeling" was probably coined by Dana Tomlin in his PhD dissertation
and later used in his book, which has the term in its title. Cartographic modeling refers to a process
where several thematic layers of the same area are produced, processed, and analyzed. Tomlin
used raster layers, but the overlay method (see below) can be used more generally. Operations
on map layers can be combined into algorithms, and eventually into simulation or optimization
models.
Map overlay
The combination of several spatial datasets (points, lines, or polygons) creates a new output
vector dataset, visually similar to stacking several maps of the same region. These overlays are
similar to mathematical Venn diagram overlays. A union overlay combines the geographic
features and attribute tables of both inputs into a single new output. An intersect overlay defines
the area where both inputs overlap and retains a set of attribute fields for each. A symmetric
difference overlay defines an output area that includes the total area of both inputs except for the
overlapping area.
Data extraction is a GIS process similar to vector overlay, though it can be used in either vector
or raster data analysis. Rather than combining the properties and features of both datasets, data
extraction involves using a "clip" or "mask" to extract the features of one data set that fall within
the spatial extent of another dataset.
In raster data analysis, the overlay of datasets is accomplished through a process known as
"local operation on multiple rasters" or "map algebra," through a function that combines the
values of each raster's matrix. This function may weigh some inputs more than others through
use of an "index model" that reflects the influence of various factors upon a geographic
phenomenon.
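A minimal sketch of a local operation on multiple rasters combined through an index model follows; the layer names, cell values, and weights are all invented for illustration:

```python
import numpy as np

# Two small raster layers over the same grid; values are invented.
# slope_risk: 0 (flat) .. 1 (steep); soil_risk: 0 (stable) .. 1 (erodible).
slope_risk = np.array([[0.2, 0.8],
                       [0.5, 1.0]])
soil_risk = np.array([[0.4, 0.4],
                      [0.9, 0.1]])

# A local operation on multiple rasters ("map algebra"): each output cell is a
# function of the corresponding input cells. This index model weights slope
# twice as heavily as soil; the weights are an assumption for illustration.
erosion_index = 2/3 * slope_risk + 1/3 * soil_risk

print(erosion_index)
```

Because the operation is cell-by-cell, the inputs must share the same grid, extent, and coordinate system before being combined.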
Geostatistics
Main article: Geostatistics
Geostatistics is a branch of statistics that deals with field data, spatial data with a continuous
index. It provides methods to model spatial correlation, and predict values at arbitrary locations
(interpolation).
When phenomena are measured, the observation methods dictate the accuracy of any
subsequent analysis. Due to the nature of the data (e.g. traffic patterns in an urban environment;
weather patterns over the Pacific Ocean), a constant or dynamic degree of precision is always
lost in the measurement. This loss of precision is determined from the scale and distribution of
the data collection.
To determine the statistical relevance of the analysis, an average is determined so that points
(gradients) outside of any immediate measurement can be included to determine their predicted
behavior. This is due to the limitations of the applied statistic and data collection methods, and
interpolation is required to predict the behavior of particles, points, and locations that are not
directly measurable.
Hillshade model derived from a Digital Elevation Model of the Valestra area in the northern Apennines
(Italy)
Interpolation is the process by which a surface, usually a raster dataset, is created from data
collected at a number of sample points. There are several forms of interpolation, each of which
treats the data differently, depending on the properties of the data set. In comparing
interpolation methods, the first consideration should be whether or not the source data will
change (exact or approximate). Next is whether the method is subjective (a human
interpretation) or objective. Then there is the nature of the transitions between points: are they
abrupt or gradual? Finally, there is whether a method is global (it uses the entire data set to form
the model) or local, where an algorithm is repeated for a small section of terrain.
Interpolation is justified by the principle of spatial autocorrelation, which recognizes that data
collected at any position will have great similarity to, and influence on, those locations within its
immediate vicinity.
Digital elevation models, triangulated irregular networks, edge-finding algorithms, Thiessen
polygons, Fourier analysis, (weighted) moving averages, inverse distance
weighting, kriging, spline, and trend surface analysis are all mathematical methods to produce
interpolative data.
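Of these, inverse distance weighting is among the simplest to sketch: the value at an unsampled location is a distance-weighted average of the sample points. The gauge locations and values below are invented for illustration:

```python
# Inverse distance weighting (IDW): estimate a value at an unsampled location
# as a distance-weighted average of nearby sample points. Data is invented.

def idw(samples, x, y, power=2):
    """samples: list of (x, y, value). Returns the interpolated value at (x, y)."""
    numerator, denominator = 0.0, 0.0
    for sx, sy, value in samples:
        d2 = (x - sx) ** 2 + (y - sy) ** 2
        if d2 == 0:
            return value                     # exactly on a sample point
        weight = 1.0 / d2 ** (power / 2)     # weight falls off as 1/d**power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

rain_gauges = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0)]
print(idw(rain_gauges, 1.0, 0.0))   # midway between the two gauges
```

IDW is an exact, local, objective method with gradual transitions; kriging differs in deriving its weights from a modelled spatial correlation structure rather than a fixed distance decay.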
Address geocoding
Main article: Geocoding
Geocoding is interpolating spatial locations (X,Y coordinates) from street addresses or any other
spatially referenced data such as ZIP Codes, parcel lots and address locations. A reference
theme is required to geocode individual addresses, such as a road centerline file with address
ranges. The individual address locations have historically been interpolated, or estimated, by
examining address ranges along a road segment. These are usually provided in the form of a
table or database. The software will then place a dot approximately where that address belongs
along the segment of centerline. For example, an address point of 500 will be at the midpoint of a
line segment that starts with address 1 and ends with address 1,000. Geocoding can also be
applied against actual parcel data, typically from municipal tax maps. In this case, the result of
the geocoding will be an actually positioned space as opposed to an interpolated point. This
approach is being increasingly used to provide more precise location information.
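The address-range interpolation described above can be sketched as linear interpolation along a centerline segment. The segment coordinates here are invented, and real geocoders also handle side-of-street offsets and odd/even parity:

```python
# Interpolate an address point along a road centerline segment from the
# segment's address range. Segment endpoints are invented for illustration.

def geocode(address, range_start, range_end, seg_start, seg_end):
    """Place an address along a segment by linear interpolation."""
    fraction = (address - range_start) / (range_end - range_start)
    x = seg_start[0] + fraction * (seg_end[0] - seg_start[0])
    y = seg_start[1] + fraction * (seg_end[1] - seg_start[1])
    return (x, y)

# Address 500 on a block ranged 1..1000 falls almost exactly at the midpoint.
print(geocode(500, 1, 1000, (0.0, 0.0), (100.0, 0.0)))
```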
Reverse geocoding
Reverse geocoding is the process of returning an estimated street address number as it relates
to a given coordinate. For example, a user can click on a road centerline theme (thus providing a
coordinate) and have information returned that reflects the estimated house number. This house
number is interpolated from a range assigned to that road segment. If the user clicks at
the midpoint of a segment that starts with address 1 and ends with 100, the returned value will be
somewhere near 50. Note that reverse geocoding does not return actual addresses, only
estimates of what should be there based on the predetermined range.
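The estimation step can be sketched as the same interpolation run backwards: given a fractional position along the segment, it returns an estimated, not actual, address:

```python
# Reverse geocoding along a segment: given how far along a road centerline a
# clicked point falls, estimate the house number from the address range.

def reverse_geocode(fraction, range_start, range_end):
    """Return the interpolated (estimated) address at `fraction` (0..1)."""
    return range_start + fraction * (range_end - range_start)

# Clicking the midpoint of a segment ranged 1..100 returns a value near 50.
print(reverse_geocode(0.5, 1, 100))
```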
Multi-criteria decision analysis
Coupled with GIS, multi-criteria decision analysis methods support decision-makers in analysing
a set of alternative spatial solutions, such as the most likely ecological habitat for restoration,
against multiple criteria, such as vegetation cover or roads. MCDA uses decision rules to
aggregate the criteria, which allows the alternative solutions to be ranked or prioritised. [29] GIS
MCDA may reduce costs and time involved in identifying potential restoration sites.
Data output and cartography
Cartography is the design and production of maps, or visual representations of spatial data. The
vast majority of modern cartography is done with the help of computers, usually using a GIS,
though quality cartography may also be achieved by importing GIS layers into a design program
for refinement. Most GIS software gives the user substantial control over the appearance of the
data.
Cartographic work serves two major functions:
First, it produces graphics on the screen or on paper that convey the results of analysis to the
people who make decisions about resources. Wall maps and other graphics can be generated,
allowing the viewer to visualize and thereby understand the results of analyses or simulations of
potential events. Web Map Servers facilitate distribution of generated maps through web
browsers using various implementations of web-based application programming interfaces
(AJAX, Java, Flash, etc.).
Second, other database information can be generated for further analysis or use. An example
would be a list of all addresses within one mile (1.6 km) of a toxic spill.
Graphic display techniques
Traditional maps are abstractions of the real world, a sampling of important elements portrayed
on a sheet of paper with symbols to represent physical objects. People who use maps must
interpret these symbols. Topographic maps show the shape of land surface with contour lines or
with shaded relief.
Today, graphic display techniques such as shading based on altitude in a GIS can make
relationships among map elements visible, heightening one's ability to extract and analyze
information. For example, two types of data were combined in a GIS to produce a perspective
view of a portion of San Mateo County, California.
Applications
The implementation of a GIS is often driven by jurisdictional (such as a city), purpose, or
application requirements. Generally, a GIS implementation may be custom-designed for an
organization. Hence, a GIS deployment developed for one application, jurisdiction, enterprise, or
purpose may not necessarily be interoperable or compatible with a GIS developed for some
other application, jurisdiction, enterprise, or purpose.
GIS provides, for every kind of location-based organization, a platform to update geographical
data without spending time visiting the field to update a database manually. GIS, when
integrated with other powerful enterprise solutions such as SAP[31] and the Wolfram
Language,[32] helps create powerful decision support systems at the enterprise level.
Many disciplines can benefit from GIS technology. An active GIS market has resulted in lower
costs and continual improvements in the hardware and software components of GIS, and usage
in the fields of science, government, business, and industry, with applications including real
estate, public health, crime mapping, national defense, sustainable development, natural
resources, climatology,[33][34] landscape architecture, archaeology, regional and community
planning, transportation and logistics. GIS is also diverging into location-based services, which
allows GPS-enabled mobile devices to display their location in relation to fixed objects (nearest
restaurant, gas station, fire hydrant) or mobile objects (friends, children, police car), or to relay
their position back to a central server for display or other processing.
Open Geospatial Consortium standards
Main article: Open Geospatial Consortium
The Open Geospatial Consortium (OGC) is an international industry consortium of
384 companies, government agencies, universities, and individuals participating in a consensus
process to develop publicly available geoprocessing specifications. Open interfaces and
protocols defined by OpenGIS Specifications support interoperable solutions that "geo-enable"
the Web, wireless and location-based services, and mainstream IT, and empower technology
developers to make complex spatial information and services accessible and useful with all kinds
of applications. Open Geospatial Consortium protocols include Web Map Service, and Web
Feature Service.[35]
GIS products are broken down by the OGC into two categories, based on how completely and
accurately the software follows the OGC specifications.
OGC standards help GIS tools communicate.
Semantics
Tools and technologies emerging from the W3C's Data Activity are proving useful for data
integration problems in information systems. Correspondingly, such technologies have been
proposed as a means to facilitate interoperability and data reuse among GIS
applications,[36][37] and also to enable new analysis mechanisms.[38]
Ontologies are a key component of this semantic approach as they allow a formal, machine-
readable specification of the concepts and relationships in a given domain. This in turn allows a
GIS to focus on the intended meaning of data rather than its syntax or structure. For
example, reasoning that a land cover type classified as deciduous needleleaf trees in one
dataset is a specialization or subset of land cover type forest in another more roughly classified
dataset can help a GIS automatically merge the two datasets under the more general land cover
classification. Tentative ontologies have been developed in areas related to GIS applications, for
example the hydrology ontology[39] developed by the Ordnance Survey in the United Kingdom and
the SWEET ontologies[40] developed by NASA's Jet Propulsion Laboratory. Also, simpler
ontologies and semantic metadata standards are being proposed by the W3C Geo Incubator
Group[41] to represent geospatial data on the web. GeoSPARQL is a standard developed by the
Ordnance Survey, United States Geological Survey, Natural Resources Canada,
Australia's Commonwealth Scientific and Industrial Research Organisation and others to support
ontology creation and reasoning using well-understood OGC literals (GML, WKT), topological
relationships (Simple Features, RCC8, DE-9IM), RDF and the SPARQL database query
protocols.
Recent research results in this area can be seen in the International Conference on Geospatial
Semantics[42] and the Terra Cognita – Directions to the Geospatial Semantic Web [43] workshop at
the International Semantic Web Conference.