GIS
In possibly the earliest use of the geographic method, John Snow in 1854 depicted a
cholera outbreak in London using points to represent the locations of individual cases.
His study of the distribution of cholera led to the source of the disease, a contaminated
water pump within the heart of the outbreak.
Original map by Dr. John Snow showing the clusters of cholera cases in the London
epidemic of 1854
While the basic elements of topology and theme existed previously in cartography,
the John Snow map was unique, using cartographic methods to depict clusters of a
geographically dependent phenomenon for the first time.
The early 20th century saw the development of "photo lithography" where maps were
separated into layers. Computer hardware development spurred by nuclear weapon
research would lead to general purpose computer "mapping" applications by the early
1960s. The year 1964 saw the development of the world's first true operational GIS in
Ottawa, Ontario by the federal Department of Energy, Mines, and Resources.
Developed by Roger Tomlinson, it was called the "Canada Geographic Information
System" (CGIS) and was used to store, analyse, and manipulate data collected for the
Canada Land Inventory (CLI)—an initiative to determine the land capability for rural
Canada by mapping information about soils, agriculture, recreation, wildlife,
waterfowl, forestry, and land use at a scale of 1:250,000. A rating classification factor
was also added to permit analysis.
CGIS was the world's first "system" and was an improvement over "mapping"
applications as it provided capabilities for overlay, measurement, and
digitizing/scanning. It supported a national coordinate system that spanned the
continent, coded lines as "arcs" having a true embedded topology, and it stored the
attribute and locational information in separate files. As a result of this, Tomlinson has
become known as the "father of GIS."
CGIS lasted into the 1990s and built the largest digital land resource database in
Canada. It was developed as a mainframe based system in support of federal and
provincial resource planning and management. Its strength was continent-wide
analysis of complex data sets. The CGIS was never available in a commercial form.
Its initial development and success stimulated various commercial mapping
applications, sold by vendors such as ESRI, MapInfo, Intergraph and CARIS, to
incorporate many of the CGIS features, combining the first-generation
approach of separating spatial and attribute information with a second-generation
approach of organizing attribute data into database structures. Industry growth in
the 1980s and 1990s was spurred by the growing use of GIS on Unix workstations and
the personal computer. By the end of the 20th century, the rapid growth in various
systems had been consolidated and standardized on relatively few platforms, and users
were beginning to explore the concept of viewing GIS data over the Internet, which
required data format and transfer standards. More recently, a growing number of free,
open-source GIS packages such as GRASS GIS and Quantum GIS have appeared; they run
on a range of operating systems and can be customised to perform specific tasks.
Modern GIS technologies use digital information, for which various digitized data
creation methods are used. The most common method of data creation is digitization,
where a hardcopy map or survey plan is transferred into a digital medium through the
use of a computer-aided drafting (CAD) program with georeferencing capabilities.
If you could relate information about the rainfall of your state to aerial photographs of
your county, you might be able to tell which wetlands dry up at certain times of the
year. A GIS, which can use information from many different sources in many different
forms, can help with such analyses. The primary requirement for the source data
consists of knowing the locations for the variables. Location may be annotated by x,
y, and z coordinates of longitude, latitude, and elevation, or by other geocode systems
like ZIP Codes or by highway mile markers. Any variable that can be located spatially
can be fed into a GIS. Several computer databases that can be directly entered into a
GIS are being produced by government agencies and non-government
organizations. Different kinds of data in map form can be entered into a
GIS.
A GIS can also convert existing digital information, which may not yet be in map
form, into forms it can recognize and use. For example, digital satellite images
generated through remote sensing can be analyzed to produce a map-like layer of
digital information about vegetative covers. Another fairly developed resource for
naming GIS objects is the Getty Thesaurus of Geographic Names (GTGN), which is a
structured vocabulary containing around 1,000,000 names and other information
about places[1].
GIS data represents real world objects (roads, land use, elevation) with digital data.
Real world objects can be divided into two abstractions: discrete objects (a house) and
continuous fields (rain fall amount or elevation). There are two broad methods used to
store data in a GIS for both abstractions: Raster and Vector.
Raster
Digital Elevation Model (DEM) + Map (image) + Vector data
A raster data type consists of rows and columns of cells, with each cell storing a
single value. Most often, raster data are images (raster images), but besides just color,
the value recorded for each cell may be a discrete value, such as land use, a
continuous value, such as rainfall, or a null value if no data is available. While a raster
cell stores a single value, it can be extended by using raster bands to represent RGB
(red, green, blue) colors, colormaps (a mapping between a thematic code and RGB
value), or an extended attribute table with one row for each unique cell value. The
resolution of the raster dataset is its cell width in ground units. For example, in a
LIDAR raster image, each cell may be a pixel that represents an area of 3 meters by 3
meters. Usually cells represent square areas of the ground, but other shapes can also
be used.
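The raster model described above can be sketched in a few lines of Python. This is only an illustration, assuming a north-up grid with square cells; the names `Raster`, `origin_x`, `origin_y`, `value_at` and `NODATA` are hypothetical and not taken from any GIS package.

```python
from dataclasses import dataclass
from typing import List, Optional

NODATA = None  # hypothetical marker for cells with no data


@dataclass
class Raster:
    origin_x: float   # ground x of the upper-left corner
    origin_y: float   # ground y of the upper-left corner
    cell_size: float  # resolution: cell width in ground units
    cells: List[List[Optional[float]]]  # rows of cell values

    def value_at(self, x: float, y: float) -> Optional[float]:
        """Return the value of the cell containing ground point (x, y)."""
        col = int((x - self.origin_x) // self.cell_size)
        row = int((self.origin_y - y) // self.cell_size)
        if 0 <= row < len(self.cells) and 0 <= col < len(self.cells[row]):
            return self.cells[row][col]
        return NODATA  # the point falls outside the grid


# A 2 x 3 grid of 3 m cells; one cell has no data.
elevation = Raster(origin_x=100.0, origin_y=200.0, cell_size=3.0,
                   cells=[[12.0, 12.5, NODATA],
                          [11.0, 11.5, 12.0]])
```

Calling `elevation.value_at(104.0, 196.0)` looks up the cell in the second row and second column of this example grid.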
Vector
The vector data type uses geometries such as points, lines (series of point coordinates), or
polygons, also called areas (shapes bounded by lines), to represent objects. Examples
include property boundaries for a housing subdivision represented as polygons and
well locations represented as points. Vector features can be made to respect spatial
integrity through the application of topology rules such as 'polygons must not
overlap'. Vector data can also be used to represent continuously varying phenomena.
Contour lines and triangulated irregular networks (TIN) are used to represent
elevation or other continuously changing values. TINs record values at point
locations, which are connected by lines to form an irregular mesh of triangles. The
faces of the triangles represent the terrain surface.
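As a rough illustration of how a TIN face represents the surface, the following Python sketch estimates a value inside a single triangle by linear (barycentric) interpolation between its three vertices. The function name and the vertex layout `(x, y, z)` are assumptions made for this example.

```python
from typing import Tuple

Vertex = Tuple[float, float, float]  # (x, y, z)


def interpolate_on_triangle(a: Vertex, b: Vertex, c: Vertex,
                            x: float, y: float) -> float:
    """Linearly interpolate z at (x, y) on the plane through a, b and c."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        raise ValueError("triangle vertices are collinear")
    # Barycentric weights of (x, y) with respect to the three vertices.
    wa = ((b[1] - c[1]) * (x - c[0]) + (c[0] - b[0]) * (y - c[1])) / det
    wb = ((c[1] - a[1]) * (x - c[0]) + (a[0] - c[0]) * (y - c[1])) / det
    wc = 1.0 - wa - wb
    return wa * a[2] + wb * b[2] + wc * c[2]
```

A full TIN would first have to locate the triangle that contains the query point; that search is omitted here.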
There are advantages and disadvantages to using a raster or vector data model to
represent reality. Raster datasets record a value for all points in the area covered,
which may require more storage space than representing data in a vector format that
can store data only where needed. Raster data also allows easy implementation of
overlay operations, which are more difficult with vector data. Vector data can be
displayed as vector graphics used on traditional maps, whereas raster data will appear
as an image that may have a blocky appearance at object boundaries. Vector data can
be easier to register, scale, and reproject, which can make it simpler to
combine vector layers from different sources.
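The ease of raster overlay comes from the fact that aligned grids can be combined cell by cell. The Python sketch below assumes two grids of identical shape and extent, with `None` standing in for missing data; the layer names and values are invented for illustration.

```python
from typing import Callable, List, Optional

Grid = List[List[Optional[float]]]


def overlay(a: Grid, b: Grid,
            combine: Callable[[float, float], float]) -> Grid:
    """Combine two aligned grids cell by cell; None marks missing data."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("grids must have the same shape")
    return [[None if va is None or vb is None else combine(va, vb)
             for va, vb in zip(ra, rb)]
            for ra, rb in zip(a, b)]


# Hypothetical layers: 1 = wetland, 0 = other; rainfall in millimetres.
wetland = [[1, 0], [1, 1]]
rainfall = [[40.0, 55.0], [None, 10.0]]
# Flag wetland cells that receive less than 20 mm of rain.
dry_wetland = overlay(wetland, rainfall,
                      lambda w, r: 1 if w == 1 and r < 20.0 else 0)
```

The equivalent vector operation would need to intersect polygon boundaries, which is why it is usually considered harder.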
Additional non-spatial data can also be stored besides the spatial data represented by
the coordinates of a vector geometry or the position of a raster cell. In vector data, the
additional data are attributes of the object. For example, a forest inventory polygon
may also have an identifier value and information about tree species. In raster data the
cell value can store attribute information, but it can also be used as an identifier that
can relate to records in another table.
Existing data printed on paper or PET film maps can be digitized or scanned to
produce digital data. A digitizer produces vector data as an operator traces points,
lines, and polygon boundaries from a map. Scanning a map results in raster data that
could be further processed to produce vector data.
Survey data can be directly entered into a GIS from digital data collection systems on
survey instruments. Positions from a Global Positioning System (GPS), another
survey tool, can also be directly entered into a GIS.
Remotely sensed data also play an important role in data collection; they are
gathered by sensors attached to a platform. Sensors include cameras, digital scanners
and LIDAR, while platforms usually consist of aircraft and satellites.
The majority of digital data currently comes from photo interpretation of aerial
photographs. Soft copy workstations are used to digitize features directly from stereo
pairs of digital photographs. These systems allow data to be captured in 2 and 3
dimensions, with elevations measured directly from a stereo pair using principles of
photogrammetry. Currently, analog aerial photos are scanned before being entered
into a soft copy system, but as high quality digital cameras become cheaper this step
will be skipped.
Satellite remote sensing provides another important source of spatial data. Here
satellites use different sensor packages to passively measure the reflectance from parts
of the electromagnetic spectrum or radio waves that were sent out from an active
sensor such as radar. Remote sensing collects raster data that can be further processed
to identify objects and classes of interest, such as land cover.
When data is captured, the user should consider whether it should be captured with
relative accuracy or absolute accuracy, since this choice influences not only
how the information will be interpreted but also the cost of data capture.
In addition to collecting and entering spatial data, attribute data is also entered into a
GIS. For vector data, this includes additional information about the objects
represented in the system.
After data are entered into a GIS, they usually require editing to remove errors, or
further processing. Vector data must be made "topologically correct" before they
can be used for some advanced analysis. For example, in a road network, lines must
connect with nodes at an intersection. Errors such as undershoots and overshoots must
also be removed. For scanned maps, blemishes on the source map may need to be
removed from the resulting raster. For example, a fleck of dirt might connect two lines
that should not be connected.
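One simple way to repair undershoots can be sketched in Python: each end of a line is moved onto the nearest network node if that node lies within a chosen tolerance. The function name and the tolerance value are assumptions for this example, and trimming overshoots would need a separate step that is not shown.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def snap_endpoints(line: List[Point], nodes: List[Point],
                   tolerance: float) -> List[Point]:
    """Move each end of a line onto the nearest node within tolerance."""
    def snap(p: Point) -> Point:
        nearest = min(nodes, key=lambda n: math.dist(p, n), default=None)
        if nearest is not None and math.dist(p, nearest) <= tolerance:
            return nearest
        return p  # no node is close enough; leave the end as it is

    if len(line) < 2:
        return list(line)
    return [snap(line[0]), *line[1:-1], snap(line[-1])]
```

Choosing the tolerance is a judgment call: too small leaves gaps in the network, too large joins lines that should stay apart.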
Data restructuring can be performed by a GIS to convert data into different formats.
For example, a GIS may be used to convert a satellite image map to a vector structure
by generating lines around all cells with the same classification, while determining the
cell spatial relationships, such as adjacency or inclusion.
More advanced data processing can occur with image processing, a technique
developed in the late 1960s by NASA and the private sector to provide contrast
enhancement, false colour rendering and a variety of other techniques including use of
two dimensional Fourier transforms.
Since digital data are collected and stored in various ways, two data sources may
not be entirely compatible. A GIS must therefore be able to convert geographic data
from one structure to another.
A property ownership map and a soils map might show data at different scales. Map
information in a GIS must be manipulated so that it registers, or fits, with information
gathered from other maps. Before the digital data can be analyzed, they may have to
undergo other manipulations—projection and coordinate conversions, for example—
that integrate them into a GIS.
The earth can be represented by various models, each of which may provide a
different set of coordinates (e.g., latitude, longitude, elevation) for any given point on
the earth's surface. The simplest model is to assume the earth is a perfect sphere. As
more measurements of the earth have accumulated, the models of the earth have
become more sophisticated and more accurate. In fact, there are models that apply to
different areas of the earth to provide increased accuracy (e.g., the North American
Datum of 1927, NAD27, works well in North America, but not in Europe).
Since much of the information in a GIS comes from existing maps, a GIS uses the
processing power of the computer to transform digital information, gathered from
sources with different projections and/or different coordinate systems, to a common
projection and coordinate system.
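As a small example of a coordinate conversion, the Python sketch below projects longitude and latitude onto a Mercator plane and back. It assumes the simplest earth model mentioned above, a perfect sphere, with a radius equal to the WGS 84 semi-major axis; converting between real datums involves more than this.

```python
import math

EARTH_RADIUS_M = 6378137.0  # assumed sphere radius (WGS 84 semi-major axis)


def to_spherical_mercator(lon_deg: float, lat_deg: float):
    """Project longitude/latitude in degrees to Mercator x, y in metres."""
    if not -90.0 < lat_deg < 90.0:
        raise ValueError("latitude must lie strictly between -90 and 90")
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def from_spherical_mercator(x: float, y: float):
    """Inverse projection: Mercator x, y in metres back to degrees."""
    lon_deg = math.degrees(x / EARTH_RADIUS_M)
    lat_deg = math.degrees(
        2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon_deg, lat_deg
```

Layers from different sources would each be converted to the same target system in this way before being overlaid.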
Spatial analysis with GIS
Such a map can be thought of as a rainfall contour map. Many sophisticated methods
can estimate the characteristics of surfaces from a limited number of point
measurements. A two-dimensional contour map created from the surface modeling of
rainfall point measurements may be overlaid and analyzed with any other map in a
GIS covering the same area.
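One of the simpler methods for estimating a surface from point measurements is inverse distance weighting (IDW), sketched below in Python. The text does not name a specific method, so IDW is used here only as a representative example; the function name and the default `power` of 2 are assumptions.

```python
import math
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (x, y, measured value)


def idw(samples: List[Sample], x: float, y: float,
        power: float = 2.0) -> float:
    """Estimate the value at (x, y) by inverse distance weighting."""
    if not samples:
        raise ValueError("at least one sample is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # the point lies exactly on a measurement
        w = 1.0 / d ** power  # nearer samples get larger weights
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total
```

Evaluating `idw` on every cell of a grid would produce a raster surface from which rainfall contours could then be drawn.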
In the past years, were there any gas stations or factories operating next to the swamp?
Any within two miles and uphill from the swamp? A GIS can recognize and analyze
the spatial relationships that exist within digitally stored spatial data. These
topological relationships allow complex spatial modelling and analysis to be
performed. Topological relationships between geometric entities traditionally include
adjacency (what adjoins what), containment (what encloses what), and proximity
(how close something is to something else).
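Containment, one of the relationships listed above, can be tested with the classic ray-casting approach. The Python sketch below assumes a simple polygon given as a list of vertices; points lying exactly on the boundary are not treated specially.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def contains(polygon: List[Point], p: Point) -> bool:
    """Ray-casting test: is point p inside the polygon (vertex list)?"""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through p?
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside
```

A query such as "which factories lie inside the swamp's watershed" would apply this test to each factory location in turn.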
Networks
If all the factories near a wetland were accidentally to release chemicals into the river
at the same time, how long would it take for a damaging amount of pollutant to enter
the wetland reserve? A GIS can simulate the routing of materials along a linear
network. Values such as slope, speed limit, or pipe diameter can be incorporated into
network modelling in order to represent the flow of the phenomenon more accurately.
Network modelling is commonly employed in transportation planning, hydrology
modelling, and infrastructure modelling.
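A minimal sketch of routing along a network, using Dijkstra's shortest-path algorithm with travel time (length divided by speed limit) as the edge cost. The road network, node names and speeds below are invented for illustration.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical network: node -> list of (neighbour, length_km, speed_kmh)
Network = Dict[str, List[Tuple[str, float, float]]]


def fastest_travel_time(network: Network, start: str, goal: str) -> float:
    """Dijkstra's algorithm with edge cost = length / speed (hours)."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        time, node = heapq.heappop(queue)
        if node == goal:
            return time
        if time > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, length_km, speed_kmh in network.get(node, []):
            candidate = time + length_km / speed_kmh
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return float("inf")  # the goal cannot be reached


roads: Network = {
    "A": [("B", 10.0, 50.0), ("C", 30.0, 100.0)],
    "B": [("D", 10.0, 50.0)],
    "C": [("D", 5.0, 100.0)],
}
```

In this example the longer route through C is faster than the shorter route through B because of its higher speed limit. Other attributes, such as slope or pipe diameter, could be folded into the edge cost in the same way.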
The term "cartographic modeling" was (probably) coined by Dana Tomlin in his PhD
dissertation and later in his book which has the term in the title. Cartographic
modeling refers to a process where several thematic layers of the same area are
produced, processed, and analyzed. Tomlin used raster layers, but the overlay method
(see below) can be used more generally. Operations on map layers can be combined
into algorithms, and eventually into simulation or optimization models.
Data extraction is a GIS process similar to vector overlay, though it can be used in
either vector or raster data analysis. Rather than combining the properties and features
of both datasets, data extraction involves using a "clip" or "mask" to extract the
features of one dataset that fall within the spatial extent of another dataset.
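In its simplest form, a clip keeps only those features of one dataset that fall inside the extent of another. The Python sketch below assumes point features and a rectangular extent; real clips also handle lines and polygons that cross the boundary, which is not shown. The feature names are hypothetical.

```python
from typing import List, Tuple

Feature = Tuple[str, float, float]          # (name, x, y)
Extent = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def clip_points(features: List[Feature], extent: Extent) -> List[Feature]:
    """Keep only the point features that fall within the extent."""
    min_x, min_y, max_x, max_y = extent
    return [(name, x, y) for name, x, y in features
            if min_x <= x <= max_x and min_y <= y <= max_y]


wells = [("well_1", 2.0, 3.0), ("well_2", 8.0, 1.0), ("well_3", 5.0, 5.0)]
study_area = (0.0, 0.0, 5.0, 5.0)
clipped = clip_points(wells, study_area)
```

Unlike an overlay, the result carries only the attributes of the clipped dataset; the extent contributes nothing but its shape.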
Digital cartography and GIS both encode spatial relationships in structured formal
representations. GIS is used in digital cartography modeling as a (semi)automated
process of making maps, so-called Automated Cartography. In practice, it can be a
subset of a GIS, within which it is equivalent to the stage of visualization, since in
most cases not all of the GIS functionality is used. Cartographic products can be
either in a digital or in a hardcopy format. Powerful analysis techniques with different
data representation can produce high-quality maps within a short time period. The
main problem in Automated Cartography is to use a single set of data to produce
multiple products at a variety of scales, a technique known as Generalization.
Geostatistics
Geostatistics is a way of looking at the statistical properties of spatial data,
including point pattern analysis and the prediction of fields from points. What
distinguishes it from other kinds of statistics is the use of graph theory and matrix
algebra to reduce the number of parameters in the data being analyzed. This is
necessary because it is the second-order properties of the GIS data that need analyzing.
When we measure any phenomenon, our observation methods dictate the accuracy of
any subsequent analysis. Whether our study is concerned with the nature of traffic
patterns in an urban core or with the analysis of weather patterns over the Pacific,
there will always be a variable or a degree of precision that escapes our
measurement; this is determined directly by the scale and distribution of our data
collection, or survey methods. To apply statistical methods to spatial
analysis, an 'average' must be determined so that points, or gradients, outside of any
immediate measurement may be included with their predicted behavior. Limitations
in statistics and data collection mean that it is impossible to measure a
continuum directly without inferential methods of analysis; several forms of
interpolation are used to predict the behavior of particles and locations not
directly measured.
Hillshade model derived from a Digital Elevation Model (DEM) of the Valestra area
in the northern Apennines (Italy)
Spatial Autocorrelation Principle: Data collected at any position will have a greater
similarity to, or influence on, those locations within its immediate vicinity.
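Spatial autocorrelation is often quantified with Moran's I, a standard statistic that the text above does not itself name. The Python sketch below assumes a list of values and a matching matrix of spatial weights (here 1 for neighbouring locations, 0 otherwise); positive results indicate that neighbours tend to be similar, negative results that they tend to differ.

```python
from typing import List


def morans_i(values: List[float], weights: List[List[float]]) -> float:
    """Global Moran's I for values with a spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Sum of weighted cross-products of deviations from the mean.
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * cross / variance_sum


# Four locations in a row; each is a neighbour of the next.
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
clustered = morans_i([1.0, 1.0, 5.0, 5.0], chain)
alternating = morans_i([1.0, 5.0, 1.0, 5.0], chain)
```

The clustered arrangement yields a positive value and the alternating arrangement a negative one, matching the principle that nearby observations tend to resemble one another when autocorrelation is present.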
Geocoding
Various algorithms are used to help with address matching when the spellings of
addresses differ. Address information that a particular entity or organization has data
on, such as the post office, may not entirely match the reference theme. There could
be variations in street name spelling, community name, etc. Consequently, the user
generally has the ability to make matching criteria more stringent, or to relax those
parameters so that more addresses will be mapped. Care must be taken to review the
results so that addresses are not mapped incorrectly because of overzealous
matching parameters.
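The trade-off between strict and relaxed matching can be illustrated with Python's standard `difflib` module. This is only a sketch of the idea: the `normalize` and `match_address` helpers, the sample addresses and the default threshold of 0.85 are assumptions, and production geocoders use considerably more elaborate rules.

```python
from difflib import SequenceMatcher
from typing import List, Optional


def normalize(address: str) -> str:
    """Lower-case an address and strip punctuation and extra spaces."""
    cleaned = address.lower().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def match_address(query: str, reference: List[str],
                  min_score: float = 0.85) -> Optional[str]:
    """Return the best reference address scoring at least min_score."""
    best, best_score = None, 0.0
    for candidate in reference:
        score = SequenceMatcher(None, normalize(query),
                                normalize(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= min_score else None


reference_theme = ["12 Main Street", "48 Oak Avenue"]
```

Raising `min_score` makes the matching more stringent and leaves more addresses unmatched; lowering it maps more addresses at the risk of false matches, which is why the results should be reviewed.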
First, it produces graphics on the screen or on paper that convey the results of analysis
to the people who make decisions about resources. Wall maps and other graphics can
be generated, allowing the viewer to visualize and thereby understand the results of
analyses or simulations of potential events. Web map servers facilitate distribution of
generated maps over the web.
Second, other database information can be generated for further analysis or use, for
instance a list of all addresses within 1 mile of a toxic spill.
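A list of this kind could be produced along the following lines. The Python sketch assumes addresses that have already been geocoded to latitude and longitude, and uses the haversine great-circle formula on a spherical earth; the address labels and coordinates are invented.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_MILES = 3958.8  # approximate mean radius of the Earth

Address = Tuple[str, float, float]  # (label, latitude, longitude) in degrees


def haversine_miles(lat1: float, lon1: float,
                    lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two points on a sphere."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def addresses_within(center: Tuple[float, float], addresses: List[Address],
                     radius_miles: float = 1.0) -> List[str]:
    """Return labels of the addresses within radius_miles of center."""
    lat0, lon0 = center
    return [label for label, lat, lon in addresses
            if haversine_miles(lat0, lon0, lat, lon) <= radius_miles]


spill = (37.50, -122.30)
homes = [("10 Bay Rd", 37.51, -122.30), ("99 Hill St", 37.52, -122.30)]
nearby = addresses_within(spill, homes)
```

The resulting list can be passed on to other database queries, for example to look up the residents who need to be notified.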
Traditional maps are abstractions of the real world, a sampling of important elements
portrayed on a sheet of paper with symbols to represent physical objects. People who
use maps must interpret these symbols. Topographic maps show the shape of land
surface with contour lines; the actual shape of the land can be seen only in the mind's
eye.
Today, graphic display techniques such as shading based on altitude in a GIS can
make relationships among map elements visible, heightening one's ability to extract
and analyze information. For example, two types of data were combined in a GIS to
produce a perspective view of a portion of San Mateo County, California.
• The digital elevation model, consisting of surface elevations recorded on a 30-
meter horizontal grid, shows high elevations as white and low elevations as
black.
• The accompanying Landsat Thematic Mapper image shows a false-color
infrared image looking down at the same area in 30-meter pixels, or picture
elements, for the same coordinate points, pixel by pixel, as the elevation
information.
A GIS was used to register and combine the two images to render the three-
dimensional perspective view looking down the San Andreas Fault, using the
Thematic Mapper image pixels, but shaded using the elevation of the landforms. The
GIS display depends on the viewing point of the observer and time of day of the
display, to properly render the shadows created by the sun's rays at that latitude,
longitude, and time of day.
Spatial ETL tools provide the data processing functionality of traditional Extract,
Transform, Load (ETL) software, but with a primary focus on the ability to manage
spatial data. They provide GIS users with the ability to translate data between
different standards and proprietary formats, whilst geometrically transforming the
data en-route.
GIS software is the main means by which geographic data is accessed,
transferred, transformed, overlaid, processed and displayed. Various software packages
form integral components of this interface to GIS data. There are numerous commercial,
open source and even shareware products that fill these roles. Commercial software is
mostly used in industry, with ESRI being the leader, while government and military
departments often use custom software, open source products such as GRASS, or
more specialized products. The public and small organizations generally use free GIS
readers, rapidly expanding online resources or shareware.
Background
Originally, up to the late 1990s, when GIS data was mostly held on large computers
and used to maintain internal records, software was a stand-alone product. However,
as access to the Internet and networks increased and demand for distributed
geographic data grew, GIS software gradually shifted its focus to the
delivery of data over a network. GIS software is now usually marketed as a
combination of various interoperable applications and APIs.
GIS processing software is used for the task of preparing data for use within a GIS. It
transforms raw or legacy geographic data into a format usable by GIS products.
For example, an aerial photograph may need to be stretched (orthorectified) so that its
pixels align with longitude and latitude gradations (or whatever grid is needed). This
can be distinguished from the transformations done within GIS analysis software by
the fact that these changes are permanent, more complex and more time-consuming. This
specialized, high-end type of software is generally used by people skilled in
photogrammetry and/or the GIS processing aspects of computer science. In addition,
AutoCAD, normally used for draughts of engineering projects, can be configured for
the editing of vector maps, and some of its products have been migrating towards GIS
use. It is especially useful as it has strong support for digitization. Raw geographic
data can be edited in many standard database and spreadsheet applications, and in
some cases a text editor may be used as long as care is taken to format the data properly.
Examples are OrthoEngine and ArcEditor.
Geodatabases
The functionality and capabilities of GIS technology
have grown enormously with the creation of the geodatabase. This can best be
described as the latest data format to be made available in ESRI GIS software. On the
surface the data appears to be the same as it did when shapefiles were the primary file
type. The difference lies behind the scenes. Shapefiles (still functional) use
.shp as one of several sub-files that together make up one shapefile. The .shp portion
of the shapefile is where the spatial component of the data is stored, and the .dbf
portion is where the attribute data is stored.
The personal geodatabase is based on the Microsoft Access format, which enables end
users to apply and execute different operations on the data. The functionality
available for editing vector and raster data is now more alike than before. The
geodatabase enables raster data to be handled much like vector data. Although
vector data still offers more editing options, simple .tif, GeoTIFF, and
MrSID files contained in a geodatabase can be displayed and edited more flexibly than
before. One example of this increased functionality is the ability to set the
transparency in the file properties to a desired percentage. This allows, for example,
simultaneous display of multiple satellite images from which change over time can be
studied.
The geodatabase emerged with the release of ArcGIS 8.X and now includes both
personal and enterprise geodatabases.
GIS analysis software takes GIS data and overlays or otherwise combines it so that
the data can be visually analysed. It can output a detailed map, image or movie used
to communicate an idea or concept with respect to a region of interest. It is usually
used by people trained in cartography or geography, or by GIS professionals, as
this type of application is complex and takes some time to master. The software
transforms raster and vector data, sometimes of differing datums, grid
systems, or reference systems, into one coherent image. It can also analyse changes over
time within a region. This software is central to the professional analysis and
presentation of GIS data. Examples include the ArcGIS family of ESRI GIS
applications (which replaced ESRI's older Arc/INFO), XMap and GRASS.
Statistical
GIS statistical software uses standard database queries to retrieve and analyse
data for decision making. For example, it can be used to determine how many persons
with an income greater than 60,000 live in a given street block. The data is sometimes
referenced with postal/ZIP codes and street locations rather than with geodetic data.
It is used by computer scientists and statisticians with computer science skills, with
the objective of characterizing an area for marketing or governing decisions. A standard
DBMS or specialized GIS statistical software can be used. These are often set up on
servers so that they can be queried with web browsers. Examples are MySQL and ArcSDE.
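The income example can be expressed as an ordinary SQL query. The Python sketch below uses the standard `sqlite3` module with an in-memory database; the `residents` table, its columns and its rows are invented for illustration and do not reflect any particular product's schema.

```python
import sqlite3

# Hypothetical table of residents keyed by street block.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE residents (name TEXT, block_id TEXT, income REAL)")
conn.executemany(
    "INSERT INTO residents VALUES (?, ?, ?)",
    [("Ann", "B1", 72000), ("Raj", "B1", 58000),
     ("Lee", "B1", 91000), ("Eva", "B2", 64000)],
)


def count_high_income(block_id: str, threshold: float = 60000) -> int:
    """Count residents of a block whose income exceeds the threshold."""
    row = conn.execute(
        "SELECT COUNT(*) FROM residents WHERE block_id = ? AND income > ?",
        (block_id, threshold),
    ).fetchone()
    return row[0]
```

Here the street block is identified by a plain key rather than by coordinates, which mirrors the point that such data is often referenced by postal codes or street locations instead of geodetic data.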
Readers
GIS readers are applications, usually free, that are distributed to allow the public to
easily view maps created via a GIS, as well as view GIS-managed data. By definition,
they usually allow very little if any editing of the map or underlying map data.
Readers can be normal standalone applications that need to be installed locally,
though they generally then connect to data servers over the Internet to access the
relevant information. Readers can also be included as an embedded application within
a web page, obviating the need for local installation. Readers are designed to be
relatively simple and easy to use, tending to emphasize global coverage, visible light
raster data and accessible vector data. Google Earth, ArcReader and GeoPDF are
examples.
This is the evolution of the scripts that were common in most early GIS systems.
An application programming interface (API) is a set of subroutines (often organized
along object-oriented lines) designed to perform a specific task. GIS APIs are designed
to manage GIS data for its delivery from a GIS server to a web browser client. They are
accessed with commonly used scripting languages such as VBA or JavaScript. They are
used to build a server system for the delivery of GIS data that is to be made available
over an intranet or publicly over the Internet.
Although they are not strictly GIS applications, some applications take GIS data and
format and transfer it in a scaled-down, limitation-aware manner to PDA and
GPS receiver devices so that it can be used in field applications.
Most requirements that can be set for a GIS can be satisfied with free or open-source
software. Recently an international foundation (OSGeo) was started to support and
build the highest-quality open source geospatial software.
With the broad use of non-proprietary and open data formats such as the shapefile
format for vector data and the GeoTIFF format for raster data, as well as the adoption of
Open Geospatial Consortium (OGC) protocols such as Web Map Service (WMS)
and Web Feature Service (WFS), development of open source software continues to
evolve, especially for web and web service oriented applications.
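To give a sense of what a WMS request looks like, the Python sketch below assembles a GetMap URL using the parameter names of WMS version 1.1.1. The server address `https://example.org/wms` and the layer name are placeholders, and the helper function itself is hypothetical.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layers, bbox, width, height,
                   srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap request URL (no request is sent)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",                            # server default styles
        "SRS": srs,                              # spatial reference system
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Placeholder server and layer name, for illustration only.
url = wms_getmap_url("https://example.org/wms", ["landcover"],
                     (-123.0, 37.0, -122.0, 38.0), 512, 512)
```

Because the request is a plain URL, any HTTP client or web browser can act as a WMS client, which is part of what makes such protocols attractive for web-oriented software.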
Well-known open source GIS software includes GRASS GIS, Quantum GIS,
MapServer, GDAL/OGR, PostGIS, uDig, OpenJUMP, gvSIG, and others.
Many disciplines can benefit from GIS technology. An active GIS market has resulted
in lower costs and continual improvements in the hardware and software components
of GIS. These developments will, in turn, result in a much wider use of the technology
throughout science, government, business, and industry, with applications including
real estate, public health, crime mapping, national defense, sustainable development,
natural resources, transportation and logistics. GIS is also diverging into location-based
services (LBS). LBS allows GPS-enabled mobile devices to display their
location in relation to fixed assets (nearest restaurant, gas station, fire hydrant) or
mobile assets (friends, children, police car), or to relay their position back to a central
server for display or other processing. These services continue to develop with the
increased integration of GPS functionality with increasingly powerful mobile
electronics (cell phones, PDAs, laptops).
GIS products are broken down by the OGC into two categories, based on how
completely and accurately the software follows the OGC specifications.
Google Maps is different from other web map servers (like MapQuest, Yahoo! Maps,
or Rand McNally) because Google Maps exposes an API that enables users to
associate attributes with interactive maps. This is in effect a GIS. However, Google
Maps is largely "point" oriented and, other than showing different point markers,
requires users to click on the markers to get the metadata.
Maps have traditionally been used to explore the Earth and to exploit its resources.
GIS technology, as an expansion of cartographic science, has enhanced the efficiency
and analytic power of traditional mapping. Now, as the scientific community
recognizes the environmental consequences of human activity, GIS technology is
becoming an essential tool in the effort to understand the process of global change.
Various map and satellite information sources can combine in modes that simulate the
interactions of complex natural systems.
Through a function known as visualization, a GIS can be used to produce images - not
just maps, but drawings, animations, and other cartographic products. These images
allow researchers to view their subjects in ways that literally never have been seen
before. The images often are equally helpful in conveying the technical concepts of
GIS study-subjects to non-scientists.
The condition of the Earth's surface, atmosphere, and subsurface can be examined by
feeding satellite data into a GIS. GIS technology gives researchers the ability to
examine the variations in Earth processes over days, months, and years.
GIS technology and the availability of digital data on regional and global scales
enable such analyses. The satellite sensor output used to generate a vegetation graphic
is produced by the Advanced Very High Resolution Radiometer (AVHRR). This
sensor system detects the amounts of energy reflected from the Earth's surface across
various bands of the spectrum for surface areas of about 1 square kilometer. The
satellite sensor produces images of a particular location on the Earth twice a day.
AVHRR is only one of many sensor systems used for Earth surface analysis. More
sensors will follow, generating ever greater amounts of data.
GIS and related technology will help greatly in the management and analysis of these
large volumes of data, allowing for better understanding of terrestrial processes and
better management of human activities to maintain world economic vitality and
environmental quality.