Reference Architecture of Distributed DBMSs
This section introduces a reference architecture for a distributed database system. Because
distributed DBMSs are so diverse, it is difficult to define a single architecture that applies
to all of them. It is nevertheless useful to present one possible reference architecture that
addresses data distribution. Data in a distributed system are usually fragmented and replicated.
To account for fragmentation and replication, the reference architecture of a distributed DBMS
consists of the following schemas:
Global conceptual schema – The GCS represents the logical description of the entire database
as if it were not distributed. This level corresponds to the conceptual level of the ANSI–SPARC
architecture of a centralized DBMS and contains the definitions of all entities, the relationships
among entities, and the security and integrity information for the whole database stored at all
sites in a distributed system.
Fragmentation schema and allocation schema – In a distributed database, the data can be split
into a number of non-overlapping portions, called fragments. There are several different ways
to perform this fragmentation. The fragmentation schema describes how the data are to be
logically partitioned in a distributed database. The GCS consists of a set of global relations,
and the mapping between the global relations and the fragments is defined in the fragmentation
schema. This mapping is one-to-many: one global relation may correspond to several fragments,
but each fragment corresponds to exactly one global relation. The allocation schema describes
where the data (fragments) are to be located, taking account of any replication. The type of
mapping defined in the allocation schema determines whether the distributed database is
redundant or non-redundant. In redundant data distribution, the mapping from fragments to sites
is one-to-many (a fragment may be stored at several sites), whereas in non-redundant data
distribution the mapping is one-to-one (each fragment is stored at exactly one site).
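The two mappings described above can be sketched as plain lookup tables. The following Python fragment is only an illustration under stated assumptions: the class name DistributionSchemas, the relation EMPLOYEE, the fragments EMP_F1 and EMP_F2, and the site names are all hypothetical and do not come from any particular DBMS.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DistributionSchemas:
    """Illustrative container; all names here are hypothetical."""
    # Fragmentation schema: global relation -> its fragments (one-to-many).
    fragmentation: Dict[str, List[str]] = field(default_factory=dict)
    # Allocation schema: fragment -> set of sites that store a copy.
    allocation: Dict[str, Set[str]] = field(default_factory=dict)

    def global_relation_of(self, fragment: str) -> str:
        """Return the single global relation a fragment belongs to."""
        owners = [rel for rel, frags in self.fragmentation.items()
                  if fragment in frags]
        if len(owners) != 1:
            raise ValueError(
                f"{fragment} must belong to exactly one global relation")
        return owners[0]

    def is_redundant(self) -> bool:
        """Redundant if any fragment is allocated to more than one site."""
        return any(len(sites) > 1 for sites in self.allocation.values())

    def sites_for(self, relation: str) -> Set[str]:
        """All sites holding at least one fragment of a global relation."""
        site_sets = [self.allocation.get(frag, set())
                     for frag in self.fragmentation[relation]]
        return set().union(*site_sets)


# Assumed example: one global relation split into two fragments,
# with the second fragment replicated at two sites.
schemas = DistributionSchemas(
    fragmentation={"EMPLOYEE": ["EMP_F1", "EMP_F2"]},
    allocation={"EMP_F1": {"site_a"}, "EMP_F2": {"site_b", "site_c"}},
)
```

In this sketch the replicated fragment EMP_F2 makes the allocation one-to-many, so is_redundant() reports a redundant distribution; if every fragment were mapped to a single site, the same check would report a non-redundant one.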
Local schemas – Each local DBMS in a distributed system has its own set of schemas. The local
conceptual and local internal schemas correspond to the equivalent levels of the ANSI–SPARC
architecture. In a distributed database system, the physical data organization is likely to
differ from machine to machine, so each site requires its own internal schema definition,
called the local internal schema. To handle fragmentation and replication, the logical
organization of data at each site is described by a third layer in the architecture, called the
local conceptual schema. The GCS is the union of all local conceptual schemas; thus, the local
conceptual schemas are mappings of the global schema onto each site. This mapping is performed
by local mapping schemas. A local mapping schema maps the fragments in the allocation schema
onto external objects in the local database, and this mapping depends on the type of local
DBMS. Therefore, in a heterogeneous distributed DBMS, there may be different types of local
mappings at different nodes.
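A local mapping schema can be pictured as a per-site table from fragment names to the names of local objects. The sketch below is a minimal illustration only: the class LocalMappingSchema, the DBMS type labels, and the object names are assumptions made for the example, not part of any real system.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class LocalMappingSchema:
    """Hypothetical per-site mapping from fragments to local objects."""
    site: str
    dbms_type: str            # assumed label, e.g. "relational" or "document"
    objects: Dict[str, str]   # fragment name -> local object name

    def resolve(self, fragment: str) -> str:
        """Return the local object that stores the given fragment."""
        try:
            return self.objects[fragment]
        except KeyError:
            raise LookupError(
                f"{fragment} is not allocated to {self.site}") from None


# Assumed heterogeneous setup: each site names and stores its
# fragment in the way its own local DBMS requires.
site_a = LocalMappingSchema("site_a", "relational",
                            {"EMP_F1": "hr.emp_part1"})
site_b = LocalMappingSchema("site_b", "document",
                            {"EMP_F2": "employees_west"})
```

Keeping one mapping object per site reflects the point made above: the global layers refer only to fragment names, while each site is free to translate them in whatever way its local DBMS requires.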