Compiled with assistance from Timothy L. Nyerges, University of Washington
A. INTRODUCTION
B. DATABASE CONTENT AND AN ORGANIZATION'S MISSION
C. FUNDAMENTAL DATABASE ELEMENTS
D. DATABASE DESIGN
REFERENCES
EXAM AND DISCUSSION QUESTIONS
NOTES
This begins a three-unit section covering some basic principles of spatial
databases. As these issues are very fundamental, many of them are introduced
here but dealt with in much greater detail in later units.
UNIT 10 - SPATIAL DATABASES AS MODELS OF REALITY
A. INTRODUCTION
- the real world is too complex for our immediate and direct understanding
- we create "models" of reality that are intended to have some similarity
with selected aspects of the real world
- databases are created from these "models" as a fundamental step in coming
to know the nature and status of that reality
Definition
- a spatial database is a collection of spatially referenced data that acts
as a model of reality
- a database is a model of reality in the sense that the database
represents a selected set or approximation of phenomena
- these selected phenomena are deemed important enough to represent in
digital form
- the digital representation might be for some past, present or future
time period (or contain some combination of several time periods in an
organized fashion)
Standards
- many of the definitions in this Unit have been standardized by the
proposed US National Digital Cartographic Standard (DCDSTF, 1988)
- these standards have been developed to provide a nationally uniform
means for portraying and exchanging digital cartographic data
- these cartographic standards will form part of a larger standard being
developed for the digital representation of all earth science information
B. DATABASE CONTENT AND AN ORGANIZATION'S MISSION
Organization mandates
- organizations have mandates to perform certain tasks that carry out their
missions
- mandates are the reasons they exist as organizations
- organizations have different needs for data depending on their mandates
and the activities required to carry out these mandates
- mandates often help identify and define entities of interest, requiring
a certain view of the world
- what might seem at first glance to be the same data need in two
different organizations can actually be quite different when we look at a
more detailed level
- e.g. wildlife and forestry departments both need information on
vegetation but the detail needed is different
Database contents
Example: Transportation
- highway data from the different points of view of a natural resources
organization and a highway transportation organization
- a natural resource organization might only need logging roads and the
connecting access to state highways
- the transportation organization's main interest is in characterizing
highways used by the public
- the database might also be used to store detailed highway condition
and maintenance information
- we would expect their need for highway data to be more detailed than
would the natural resource organization's
Example: wetlands
- wetlands data from the different points of view of an ecological
organization and a taxing authority
- an ecological organization might define wetlands as a natural resource to
be preserved and restricted from development
- that perspective might require considerable detail for describing the
area's biology and physical resources
- a taxing authority might define a wetland to be a "wasteland" and of
very little value to society
- that description might require only the boundary of the "wasteland" in
the database
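The wetlands example can be sketched in code. This is a minimal illustration, not a prescribed schema: the class name WetlandRecord, the attribute names and all values are hypothetical. It shows two organizations recording the same entity with very different amounts of detail.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Coordinate = Tuple[float, float]


@dataclass
class WetlandRecord:
    """One organization's database record for a wetland (hypothetical)."""
    boundary: List[Coordinate]
    attributes: Dict[str, object] = field(default_factory=dict)


# The same real-world wetland, outlined by the same boundary.
boundary = [(0.0, 0.0), (3.0, 0.0), (3.0, 2.0), (0.0, 2.0)]

# Ecological organization: detailed biological and physical description.
ecological_view = WetlandRecord(
    boundary,
    {"dominant_vegetation": "cattail", "soil_type": "peat", "protected": True},
)

# Taxing authority: only the boundary of the "wasteland" is stored.
taxing_view = WetlandRecord(boundary)

print(len(ecological_view.attributes), len(taxing_view.attributes))
```

Both records describe the same area on the ground. Only the mandate of the organization decides how much of it is represented.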
Database design
- in each organization only certain phenomena are important enough to
collect and represent in a database
- the data collection process involves a sampling of geographic reality,
to determine the status of that reality (whether past, present or future)
- identifying the phenomena and then choosing an appropriate data
representation for them is part of a process called database design
- see Units 11 and 66 for more on database design
C. FUNDAMENTAL DATABASE ELEMENTS
Entity
- an entity is "a phenomenon of interest in reality that is not further
subdivided into phenomena of the same kind"
- e.g. a city could be considered an entity and subdivided into component
parts but these parts would not be called cities, they would be districts,
neighborhoods or the like
- by contrast, a forest could be subdivided into smaller forests, which are
phenomena of the same kind
Object
- an object is "a digital representation of all or part of an entity"
- the method of digital representation of a phenomenon varies according to
scale, purpose and other factors
- e.g. a city could be represented geographically as a point if the area
under consideration were continental in scale
- the same city could be geographically represented as an area if we are
dealing with a geographic database for a state or a county
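The city example can be sketched as follows. This is an illustration under stated assumptions: the class names, the function represent_city and the scale threshold of 1:1,000,000 are all hypothetical, and the point is simply the mean of the boundary vertices rather than a true centroid.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coordinate = Tuple[float, float]


@dataclass
class PointObject:
    """0-D spatial object: a single coordinate pair."""
    entity_name: str
    location: Coordinate


@dataclass
class AreaObject:
    """2-D spatial object: a closed boundary ring."""
    entity_name: str
    boundary: List[Coordinate]


def represent_city(name, boundary, scale_denominator, threshold=1_000_000):
    """Return a point for small-scale work, an area otherwise.

    The threshold is an assumed cut-off, not a standard value.
    """
    if scale_denominator >= threshold:
        # Collapse the boundary to the mean of its vertices (a crude centre).
        xs = [x for x, _ in boundary]
        ys = [y for _, y in boundary]
        return PointObject(name, (sum(xs) / len(xs), sum(ys) / len(ys)))
    return AreaObject(name, list(boundary))


city_boundary = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
continental = represent_city("Example City", city_boundary, 10_000_000)
county = represent_city("Example City", city_boundary, 50_000)
```

The entity (the city) is the same in both calls. Only the object chosen to represent it changes with scale.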
Entity types
- similar phenomena to be stored in a database are identified as entity
types
- an entity type is any grouping of similar phenomena that should eventually
get represented and stored in a uniform way, e.g. roads, rivers, elevations,
vegetation
- provides a convenient conceptual framework for describing phenomena at a
general level
- organizational perspective influences this interpretation to a large
degree
- precise definitions should be generated for each entity type
- helps with identifying overlapping categories of information
- aids in clarifying the content of the database
- the US National Standard for Digital Cartographic Data Volume 2
(DCDSTF 1988) includes a large number of definitions for entity types
handout - Sample entity definitions
- the first step in database development is the selection and definition of
entity types to be included
- this is guided by the organization's mandate and purpose of the database
- this framework can be as important as the actual database because it
guides the development
- the second step of database design is to choose an appropriate method of
spatial representation for each of the entity types
Spatial object type
Object classes
- an object class is the set of objects which represent the set of entities
- e.g. the set of points representing the set of wells
Attributes
- an attribute is a characteristic of an entity selected for representation
- usually non-spatial
- though some may be related to the spatial character of the phenomena
under study
Attribute value
- the actual value of the attribute that has been measured (sampled) and
stored in the database
- an entity type is almost always labeled and known by attributes
- e.g. a road usually has a name and is identified according to its class
- e.g. alley, freeway
- attribute values are often conceptually organized in attribute tables
which list individual entities in the rows and attributes in the columns
- entries in each cell of the table represent the attribute value of a
specific attribute for a specific entity
- note: attribute table is not an official DCDSTF term
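A small attribute table might look like the sketch below. The road identifiers, attribute names and values are invented for illustration, and a dictionary of dictionaries is only one possible way to hold such a table.

```python
# Attribute names form the columns of the table.
columns = ["name", "road_class", "lanes"]

# Each row holds the attribute values for one entity (here, one road).
attribute_table = {
    "road_1": {"name": "Main Street", "road_class": "arterial", "lanes": 4},
    "road_2": {"name": "Mill Lane", "road_class": "alley", "lanes": 1},
    "road_3": {"name": "Route 9", "road_class": "freeway", "lanes": 6},
}


def attribute_value(table, entity_id, attribute):
    """Look up one cell: the value of one attribute for one entity."""
    return table[entity_id][attribute]


def column_values(table, attribute):
    """Return one column: the values of an attribute across all entities."""
    return [row[attribute] for row in table.values()]


print(attribute_value(attribute_table, "road_2", "road_class"))
print(column_values(attribute_table, "lanes"))
```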
Database model
- is a conceptual description of a database, defining the entity types and
their associated attributes
- each entity type is represented by specific spatial objects
- after the database is constructed, the database model is a view of the
database which the system can present to the user
- other views can be presented, but this one is likely useful because it
was important in the conceptual design
- e.g. the system can model the data in vector form but generate a
raster for purposes of display to the user
- need not be related directly to the way the data are actually stored in
the database
- e.g. census zones may be defined as being represented by polygons, but
the program may actually represent the polygon as a series of line
segments
- examples of database models can be grouped by application area
- e.g. transportation applications require different database models than
do natural resource applications
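The census zone example, where the model presents a polygon but the program stores line segments, can be sketched as below. The function names and the coordinates are hypothetical, and real systems use more elaborate storage structures than this.

```python
def polygon_to_segments(ring):
    """Break a polygon ring into consecutive line segments, closing the ring."""
    if len(ring) < 3:
        raise ValueError("a polygon needs at least three vertices")
    return [(ring[i], ring[(i + 1) % len(ring)]) for i in range(len(ring))]


def segments_to_polygon(segments):
    """Rebuild the vertex ring from ordered segments (the view shown to users)."""
    return [start for start, _ in segments]


# Database model view: a census zone is a polygon.
census_zone = [(0, 0), (10, 0), (10, 5), (0, 5)]

# Storage view: the same zone held as a series of line segments.
stored = polygon_to_segments(census_zone)

print(stored)
print(segments_to_polygon(stored) == census_zone)
```

The user keeps working with polygons, as the database model states, regardless of how the segments are held internally.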
Layers
- spatial objects can be grouped into layers, also called overlays,
coverages or themes
- one layer may represent a single entity type or a group of conceptually
related entity types
- e.g. a layer may have only stream segments or may have streams, lakes,
coastline and swamps
- options depend on the system as well as the database model
- some spatial databases have been built by combining all entities into
one layer
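Grouping objects into layers can be sketched as a simple lookup. The object identifiers, the layer names "hydrography" and "transportation", and the function build_layers are assumptions made for this illustration.

```python
from collections import defaultdict

# (entity type, object identifier) pairs; identifiers are made up.
objects = [
    ("stream", "s1"), ("lake", "l1"), ("stream", "s2"),
    ("swamp", "w1"), ("coastline", "c1"), ("road", "r1"),
]

# Hypothetical assignment of entity types to layers: conceptually related
# types share a layer.
layer_of = {
    "stream": "hydrography", "lake": "hydrography",
    "swamp": "hydrography", "coastline": "hydrography",
    "road": "transportation",
}


def build_layers(objs, assignment):
    """Group object identifiers by the layer assigned to their entity type."""
    layers = defaultdict(list)
    for entity_type, object_id in objs:
        layers[assignment[entity_type]].append(object_id)
    return dict(layers)


layers = build_layers(objects, layer_of)

# The other extreme noted above: every entity type in one layer.
single_layer = build_layers(objects, {t: "everything" for t, _ in objects})

print(layers)
print(single_layer)
```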
D. DATABASE DESIGN
Steps in database design
Link to Longley text Fig 8.2 p 179
1. Conceptual
- software and hardware independent
- describes and defines included entities
- identifies how entities will be represented in the database
- i.e. selection of spatial objects - points, lines, areas, raster cells
- requires decisions about how real-world dimensionality and relationships
will be represented
- these can be based on the processing that will be done on these
objects
- e.g. should a building be represented as an area or a point?
- e.g. should highway segments be explicitly linked in the database?
2. Logical
- software specific but hardware independent
- sets out the logical structure of the database elements, determined by
the database management system used by the software
- this is discussed in greater detail in Unit 43
3. Physical
- both hardware and software specific
- requires consideration of how files will be structured for access from
the disk
- covered in Unit 66
Desirable database characteristics
- Database should be:
- contemporaneous - should contain information of the same vintage for all
its measured variables
- as detailed as necessary for the intended applications
- the categories of information and subcategories within them should
contain all of the data needed to analyze or model the behavior of the
resource using conventional methods and models
- positionally accurate
- exactly compatible with other information that may be overlain with it
- internally accurate, portraying the nature of phenomena without error -
requires clear definitions of phenomena that are included
- readily updated on a regular schedule
- accessible to whoever needs it
Issues in database design
- almost all entities of geographic reality have at least 3-dimensional
spatial character, but not all dimensions may be needed
- e.g. highway pavement has a depth which might be important, but is not
as important as the width, which is not as important as the length
- representation should be based on types of manipulations that might be
undertaken
- map-scale of the source document is important in constraining the level of
detail represented in a database
- e.g. individual houses or fields are not visible on a 1:100,000 map but
are evident at 1:10,000
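The effect of map scale can be checked with simple arithmetic. The sketch below assumes that the smallest mark that can be drawn and read on a map is about 0.5 mm, and that a house is about 10 m wide. Both figures are illustrative assumptions, not values from a standard.

```python
def ground_size_m(map_size_mm, scale_denominator):
    """Ground distance in metres covered by a map distance in millimetres."""
    return map_size_mm * scale_denominator / 1000.0


def map_size_mm(ground_m, scale_denominator):
    """Map distance in millimetres occupied by a ground distance in metres."""
    return ground_m * 1000.0 / scale_denominator


MIN_MARK_MM = 0.5    # assumed smallest legible mark on the map
HOUSE_WIDTH_M = 10   # assumed width of a typical house

for denominator in (100_000, 10_000):
    smallest = ground_size_m(MIN_MARK_MM, denominator)
    house = map_size_mm(HOUSE_WIDTH_M, denominator)
    print(f"1:{denominator:,}: smallest feature {smallest} m, "
          f"house drawn {house} mm wide")
```

Under these assumptions the smallest feature that can be shown is 50 m across at 1:100,000 and 5 m across at 1:10,000, so a 10 m house drops out at the smaller scale.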
REFERENCES
Codd, E. F., 1981. "Data Models in Database Management," ACM SIGMOD Record
11(2):112-114.
Explains the nature of data models, their role in constructing
databases.
DCDSTF - Digital Cartographic Data Standards Task Force. 1988. "The proposed
standard for digital cartographic data," The American Cartographer 15(1).
Summary of the major components of the proposed US National Standard.
Robinson, A., R. Sale, J. Morrison, and P. Muehrcke, 1984. The Elements of
Cartography, (5th ed.), John Wiley and Sons, New York.
Useful survey of cartographic terminology and models.
Unwin, D., 1981. Introductory Spatial Analysis, Methuen, London.
A spatial analysis perspective on spatial data models.
Further readings
Abel D., and Mark D.M. 1990 ‘A comparative analysis of some two-dimensional
orderings’. International Journal of Geographical Information Systems
4(1): 21–31.
A classic attempt to show how raster ordering can help compression.
Fisher P., and Unwin D. (eds) 2005 Re-presenting GIS. London: Wiley.
A set of research papers that together provides numerous arguments for full
object orientation in geographical databases.
Gahegan M.N., and Roberts S. 1988 ‘An intelligent, object-oriented
geographical information system’. International Journal of Geographical
Information Systems 2: 101–10.
An attempt to produce a fully object-oriented GIS.
Goodchild M.F., and Grandfield A.W. 1983 ‘Optimizing raster storage: an
examination of four alternatives’. Proceedings, AutoCarto 6, Ottawa,
1: 400–7.
More on compression.
Peucker T.K., and Chrisman N. 1975 ‘Cartographic data structures’. American
Cartographer 2(1): 55–69.
Peuquet D.J. 1984 ‘A conceptual framework and comparison of spatial data
models’. Cartographica 21(4): 66–113.
Two papers that explore the standard arc-node data structure.
Raper J.F. 2000 Multidimensional GIS. London: Taylor & Francis.
An advanced discussion of the need to incorporate additional
dimensions, such as height/depth and time, into geographic data modeling. A case
study of research at Scolt Head in England provides a powerful argument for full
object orientation in database design.
Worboys M.F. 1992 ‘A generic model for planar spatial objects’.
International Journal of Geographical Information Systems 6(5): 353–72.
The classic paper on object types in GIS by a logician/computer scientist.
Worboys M.F., Hearnshaw H.M., and Maguire D.J. 1990 ‘Object-oriented data
modelling for spatial databases’. International Journal of Geographical
Information Systems 4: 369–83.
Another classic paper, one of the earliest calls for object orientation in
GIS.
Zeiler M. 1999 Modeling our World: The ESRI Guide to Geodatabase Design.
Redlands CA: ESRI Press.
An extremely useful survey of the approaches taken by the world-leading GIS
software
EXAM AND DISCUSSION QUESTIONS
1. What makes the concept of a spatial database unique relative to other
types of databases?
2. Distinguish the construct of an entity from a spatial object.
3. Why are organizational mandates important in database design? Give
examples using (a) natural resource data and (b) socio-economic data.
4. What is a database model, and why is it important for designing a
database?
5. Why would a database designer use a chain object rather than a string
object for representation of linear features?
6. List and define an example of a spatial object type from each of the 0-D,
1-D, 2-D and 3-D groups of object types.