Database System
One way to keep information on a computer is to store it in permanent files. The system
has a number of application programs, each of which is designed to manipulate the data files. These
application programs are written at the request of users in the organization. New
applications are added to the system as the need arises. The system just described is called a
file-based system.
Consider a traditional banking system that uses a file-based system to manage the
organization’s data, as shown in the picture below. As we can see, there are different departments in the
bank, each of which has its own applications that manage and manipulate different data files.
For a banking system, these programs might debit or credit an account, find the balance
of an account, add a new mortgage loan, or generate monthly statements.
Since the files and applications are created by different programmers in
various departments over a long period of time, this approach can lead to several problems:
Data Redundancy
o Inconsistency in data format
o The same information may be kept in several different places (files).
o Data inconsistency, which means that various copies of the same data conflict;
redundancy also wastes storage space and duplicates effort.
Data Isolation
o It is difficult for a new application to retrieve the appropriate data, which might be
stored in various files.
Integrity problems
o Data values must satisfy certain consistency constraints, which are specified in the
application programs.
o It is difficult to add or change programs to enforce new constraints.
Security problems
o There are constraints regarding access privileges.
o Applications are added to the system in an ad hoc manner, so it is difficult to
enforce those constraints.
Concurrent-access irregularities
o Data may be accessed by many applications that have not been coordinated
beforehand, so it is not easy to provide a strategy that lets multiple users
update data simultaneously.
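The redundancy problem listed above can be made concrete with a small sketch. It assumes two hypothetical departmental "files" (plain Python dictionaries here) that each hold a copy of the same customer's address; the names accounts_file, loans_file and the customer data are invented for illustration only.

```python
# Hypothetical illustration: two departments keep their own "files"
# (plain dictionaries here) holding the same customer's address.
accounts_file = {"C001": {"name": "Ann Lee", "address": "12 Oak St"}}
loans_file = {"C001": {"name": "Ann Lee", "address": "12 Oak St"}}


def update_address_in_accounts(customer_id, new_address):
    """The accounts application only knows about its own file."""
    accounts_file[customer_id]["address"] = new_address


update_address_in_accounts("C001", "99 Pine Ave")

# The two copies now conflict: this is data inconsistency.
copies_agree = (
    accounts_file["C001"]["address"] == loans_file["C001"]["address"]
)
print("copies agree:", copies_agree)
```

Because nothing coordinates the two applications, the loans department keeps working with the old address until someone notices the mismatch.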
These difficulties have prompted the development of a new approach to managing large amounts
of organizational information: the database approach. In the following section, we look at the
concepts that have been introduced to overcome the problems mentioned.
Database Approach
Databases and database technology play an important role in most areas where computers
are used, including business, education and medicine. To understand the fundamentals of database
systems, we start by introducing the basic concepts in this area.
Fundamental Concepts
A database is a shared collection of related data used to support the
activities of a particular organization. A database can be viewed as a repository of data
that is defined once and then accessed by various users. A database has the
following properties:
An application program accesses the data stored in the database by sending requests to the DBMS.
The components of a database system
With the database approach, the traditional banking system can be organized as shown in the
following picture.
Database approach for banking system
There are a number of characteristics that distinguish the database approach from the file-based
approach. In this section, we describe some of the most important ones in detail.
Self-Describing Nature of a Database System: A database system contains not only the
database itself but also descriptions of the data structures and constraints (metadata).
This information is used by the DBMS software or by database users when needed. This
separation makes a database system fundamentally different from a traditional file-based system, in
which the data definition is part of the application programs.
Insulation between Program and Data: In a file-based system, the structure of the data
files is defined in the application programs, so if a user wants to change the structure of a
file, all the programs that access that file might need to be changed. In the
database approach, on the other hand, the data structure is stored in the system catalog, not in the programs, so
such program changes are usually not needed.
Support for multiple views of data: A view is a subset of the database that is defined and
dedicated for particular users of the system. Multiple users in the system might have
different views of the system. Each view might contain only the data of interest to a
user or a group of users.
Sharing of data and Multiuser system: A multiuser database system must allow multiple
users to access the database at the same time. As a result, a multiuser DBMS must have
concurrency control strategies to ensure that several users who try to access the same data item
at the same time do so in a manner that keeps the data correct.
Data Independence
o The system data descriptions are separated from the application programs.
o Changes to the data structure are handled by the DBMS and are not embedded in the
program.
Transaction Processing
o The DBMS must include a concurrency control subsystem to ensure that several
users trying to update the same data do so in a controlled manner, so that the result
of the updates is correct.
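The self-describing characteristic listed above can be observed in a small sketch. It assumes SQLite, accessed through Python's standard sqlite3 module, as the DBMS; the table name account and its columns are hypothetical. The point is only that the structure of the data is stored in the database and can be queried like any other data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "acc_no INTEGER PRIMARY KEY, owner TEXT NOT NULL, balance REAL)"
)

# sqlite_master is SQLite's catalog table: it stores the definition
# of every table in the database.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
column_names = [row[1] for row in conn.execute("PRAGMA table_info(account)")]

print(table_names, column_names)
```

An application does not need the table layout compiled into it; it can ask the DBMS for the metadata at run time.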
A data model is a collection of concepts that can be used to describe the structure of
a database. The structure of a database means its data types, relationships and constraints. In
addition, most data models include a set of basic operations for specifying retrievals
and modifications on the database. A data model provides a means to achieve data
abstraction. Data abstraction refers to the hiding of certain details of how the
data are stored and maintained. With several levels of abstraction, the user’s view
of the database is simplified, which leads to an improved understanding of the data.
View level: The highest level of abstraction describes only part of the entire database.
Many users are not concerned with the whole database; they need to access
only a part of it, which is why the view level of abstraction is defined. There can be many views of the
same database.
Logical level: This level describes what data are stored in the whole database.
Physical level: The lowest level of abstraction describes how the data are actually stored.
High-level Conceptual Data models: Provide concepts that are close to the way people
perceive data. A typical example of this type is the entity-relationship
model, which uses concepts such as entities, attributes and relationships. An entity
represents a real-world object such as an employee or a project. An entity has attributes
that represent its properties, such as an employee’s name, address and birthdate. A
relationship represents an association among entities, for example a works-on relationship
between an employee and a project.
Record-based Logical Data models: Provide concepts that can be understood by the user but are not too far from the way data is
stored in the computer. Three well-known data models of this type are the relational data model, the network data model and the hierarchical
data model.
EMPS (ENAME, SALARY)
MANAGERS (ENAME)
DEPTS (DNAME, DEPT#)
SUPPLIERS (SNAME, SADDR)
WORKS_IN (ENAME, DNAME)
MANAGES (ENAME, DNAME)
CARRIES (INAME, DNAME)
PLACED_BY (O#, CNAME)
CUSTOMERS (CNAME, CADDR, BALANCE)
SUPPLIES (SNAME, INAME, PRICES)
INCLUDES (O#, INAME, QUANTITY)
The network model represents data as record types and also represents a limited type of
one-to-many relationship, called a set type.
Physical Data models: Provide concepts that describe how data is actually stored in the
computer.
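As a sketch of the record-based (relational) model described above, three of the relations from the example schema (EMPS, DEPTS and WORKS_IN) can be declared and queried with SQLite through Python's sqlite3 module. The choice of SQLite, the column name dept_no (standing in for DEPT#, which is not a plain SQL identifier) and all row values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emps (ename TEXT PRIMARY KEY, salary REAL);
    CREATE TABLE depts (dname TEXT PRIMARY KEY, dept_no INTEGER);
    CREATE TABLE works_in (
        ename TEXT REFERENCES emps(ename),
        dname TEXT REFERENCES depts(dname)
    );
    INSERT INTO emps VALUES ('Ann', 5200), ('Bob', 4100);
    INSERT INTO depts VALUES ('Toys', 10), ('Shoes', 20);
    INSERT INTO works_in VALUES ('Ann', 'Toys'), ('Bob', 'Shoes');
""")

# The "works in" relationship is itself stored as a relation;
# a join follows it from departments to employees.
toy_staff = conn.execute("""
    SELECT e.ename, e.salary
    FROM emps AS e
    JOIN works_in AS w ON w.ename = e.ename
    WHERE w.dname = 'Toys'
""").fetchall()

print(toy_staff)
```

In the relational model both entities and relationships are represented the same way, as tables of records, which is what keeps the model close to how users think about data without exposing storage details.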
The description of the database, which is designed at an early stage and is not expected to change frequently, is called the
database schema. A database system has several schemas.
Since information can be inserted into or deleted from the database at any time, the database changes over time. At a particular moment, the
collection of information stored in the database is called an instance of the database.
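The difference between schema and instance can be shown in a few lines. This is a minimal sketch that assumes SQLite via Python's sqlite3 module; the account table and the helper functions schema and instance are hypothetical names used only here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance REAL)")


def schema(c):
    """Return the stored table definitions (the schema)."""
    return [
        row[0]
        for row in c.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    ]


def instance(c):
    """Return the rows currently stored (the instance)."""
    return c.execute("SELECT acc_no, balance FROM account ORDER BY acc_no").fetchall()


schema_before, instance_before = schema(conn), instance(conn)

conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.execute("INSERT INTO account VALUES (2, 250.0)")

schema_after, instance_after = schema(conn), instance(conn)

print(schema_before == schema_after)    # the schema did not change
print(instance_before, instance_after)  # the instance did
```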
A three-level architecture for database systems has been proposed to achieve the characteristics of the
database approach. The goal of this architecture is to separate the applications from the physical
database, so that the actual details of how data is organized are hidden from the users.
Three-level Architecture
As we can see from the picture above, there are three levels of schemas in the database architecture:
External level:
At this highest level, there exist a number of views, each of which is defined on a part of the
actual database.
Each view is provided for a user or a group of users, which helps simplify the
interaction between the user and the system.
Conceptual level: The conceptual schema at this level describes the logical structure of the whole
database.
The entire database is described using simple logical concepts such as objects, their
properties and their relationships. Thus the complexity of the implementation details of the data
is hidden from the users.
Internal level: The internal schema at this level describes how the data are actually stored and how
the data are accessed.
Data Independence
Data independence is the ability to modify the schema at one level without affecting the schema
at the next higher level.
Logical data independence is the ability to change the conceptual schema
without requiring changes to the user views or application programs.
Physical data independence is the ability to change the internal schema without
requiring changes to the conceptual schema or application programs.
Physical data independence is generally easier to achieve, since the way the data is physically organized
affects only the performance of the system. Application programs, by contrast,
depend heavily on the logical structure of the data that they access.
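Physical data independence can be illustrated with a small sketch, assuming SQLite via Python's sqlite3 module. Adding an index is treated here as a change at the internal level; the table, index name and rows are hypothetical. The application's query text and its result stay the same before and after the change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (acc_no INTEGER PRIMARY KEY, owner TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [(1, "Ann", 100.0), (2, "Bob", 250.0), (3, "Ann", 75.0)],
)

# The application's query: it mentions only logical names, not storage details.
QUERY = "SELECT acc_no, balance FROM account WHERE owner = ? ORDER BY acc_no"

result_without_index = conn.execute(QUERY, ("Ann",)).fetchall()

# Change at the internal level only: add an access path on the owner column.
conn.execute("CREATE INDEX idx_account_owner ON account(owner)")

result_with_index = conn.execute(QUERY, ("Ann",)).fetchall()

print(result_without_index == result_with_index)
```

Only performance may differ between the two runs; the program itself needed no modification.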
Database Language
Data Definition Language (DDL): This is used to define the conceptual and internal schemas of
a database system.
It is not a procedural language but rather a language for describing the types of entities and
the relationships among them in terms of a particular data model.
Data Manipulation Language (DML): This is used to manipulate the database, which
typically includes retrieval, insertion, deletion and modification of the data.
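The two kinds of language can be seen side by side in SQL, which contains both DDL and DML statements. The sketch below assumes SQLite via Python's sqlite3 module; the loan table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: describe the structure of the data.
conn.execute(
    "CREATE TABLE loan (loan_no INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

# DML: insertion, modification, deletion and retrieval of the data.
conn.execute("INSERT INTO loan VALUES (1, 'Ann', 5000.0)")
conn.execute("INSERT INTO loan VALUES (2, 'Bob', 12000.0)")
conn.execute("UPDATE loan SET amount = amount - 500 WHERE loan_no = 1")
conn.execute("DELETE FROM loan WHERE loan_no = 2")
remaining_loans = conn.execute(
    "SELECT loan_no, customer, amount FROM loan"
).fetchall()

print(remaining_loans)
```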
Database Users
End users
These are people whose jobs require access to the database for querying, updating and generating
reports.
Naïve users use the existing application programs to perform their daily tasks.
Sophisticated users access the database in their own way. This means they
do not use the application programs provided in the system. Instead, they might define
their own applications or describe their needs directly in a query language.
Specialized users maintain personal databases by using ready-made program
packages that provide easy-to-use menus.
Application Programmers
These people implement specific application programs that access the stored data. This kind of user
needs to be familiar with the DBMS to accomplish the task.
Database Administrators
A database administrator is a person or a group of people in the organization responsible for authorizing access to
the database, monitoring its use and managing all the resources needed to support the use of the whole
database system.
Based on data model: The most popular data model in today’s commercial DBMSs is the
relational data model. Most well-known DBMSs, such as Oracle, MS SQL Server, DB2 and
MySQL, support this model. Other, traditional models are the hierarchical data
model and the network data model. In recent years the object-oriented
data model has also become familiar, but it has not had widespread use. Some examples of
object-oriented DBMSs are O2, ObjectStore and Jasmine.
Based on number of users: A single-user database system supports one user
at a time, while a multiuser system supports multiple users concurrently.
Based on the way the database is distributed: A database system is either centralized or distributed.
o Centralized database system: Data in this kind of system is stored at a single site.
o Distributed database system: The actual database and the DBMS software are distributed over
various sites connected by a computer network.
Homogeneous distributed database systems
o Use the same DBMS software at multiple sites
o Data exchange between the various sites can be handled easily
Heterogeneous distributed database systems
o Different sites might use different DBMS software
o Additional software is needed to support data exchange between sites
Organizations employ Database Management Systems (or DBMS) to help them effectively manage
their data and derive relevant information out of it. A DBMS is a technology tool that directly supports
data management. It is a package designed to define, manipulate, and manage data in a database.
Some general functions of a DBMS:
Designed to allow the definition, creation, querying, update, and administration of databases
Define rules to validate the data and relieve users of writing programs for data maintenance
Convert an existing database, or archive a large and growing one
Run business applications, which perform the tasks of managing business processes, interacting
with end-users and other applications, to capture and analyze data
Some well-known DBMSs are Microsoft SQL Server, Microsoft Access, Oracle, SAP, and others.
Components of DBMS
A DBMS has several components, each performing a significant task in the database
management system environment. Below is a list of components within the database and its
environment.
Software
This is the set of programs used to control and manage the overall database. This includes the
DBMS software itself, the Operating System, the network software being used to share the data
among users, and the application programs used to access data in the DBMS.
Hardware
This consists of a set of physical electronic devices such as computers, I/O devices and storage devices.
It provides the interface between computers and real-world systems.
Data
The DBMS exists to collect, store, process and access data, which is the most important component. The
database contains both the actual or operational data and the metadata.
Procedures
These are the documented instructions and rules for designing, running and using the
database; they guide the users who operate and manage it.
Query Processor
The query processor transforms user queries into a series of low-level instructions. It reads the online user’s
query and translates it into an efficient series of operations in a form that can be sent to the run-time
data manager for execution.
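Some DBMSs let the user inspect the strategy the query processor has chosen. As a sketch, assuming SQLite via Python's sqlite3 module, the EXPLAIN QUERY PLAN statement returns a description of the plan instead of running the query. The table and index are hypothetical, and the exact wording of the plan differs between SQLite versions, so no particular output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (acc_no INTEGER PRIMARY KEY, owner TEXT, balance REAL)"
)
conn.execute("CREATE INDEX idx_account_owner ON account(owner)")

# EXPLAIN QUERY PLAN asks SQLite to describe its chosen strategy
# instead of executing the query.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT balance FROM account WHERE owner = ?",
    ("Ann",),
).fetchall()

# The human-readable description is the last column of each plan row.
plan_details = [row[-1] for row in plan_rows]
for detail in plan_details:
    print(detail)
```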
Data Manager
Also called the cache manager, this is responsible for handling the data in the database and for providing
recovery, so that the system can restore the data after a failure.
Database Engine
The core service for storing, processing, and securing data, this provides controlled access and
rapid transaction processing to address the requirements of the most demanding data consuming
applications. It is often used to create relational databases for online transaction processing or online
analytical processing data.
Data Dictionary
This is a reserved space within a database used to store information about the database itself. A
data dictionary is a set of read-only tables and views containing information about the
data used in the enterprise, which ensures that the database representation of the data follows one standard
as defined in the dictionary.
Report Writer
Also referred to as the report generator, it is a program that extracts information from one or more
files and presents the information in a specified format. Most report writers allow the user to select
records that meet certain conditions and to display selected fields in rows and columns, or also
format the data into different charts.
In the database approach, unlike the file-based approach, the data structure is stored in the system
catalogue and not in the programs. Therefore, one change is all that is needed to
change the structure of a file. This insulation between the programs and data is also
called program-data independence.
Support for multiple views of data
A database supports multiple views of data. A view is a subset of the database, which
is defined and dedicated for particular users of the system. Multiple users in the
system might have different views of the system. Each view might contain only the
data of interest to a user or group of users.
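A minimal sketch of multiple views, assuming SQLite via Python's sqlite3 module: one stored customer table is exposed through two views, each showing only the columns its users care about. The view names mailing_view and balance_view and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cname TEXT PRIMARY KEY, caddr TEXT, balance REAL);
    INSERT INTO customer VALUES ('Ann', '12 Oak St', 100.0), ('Bob', '3 Elm Rd', 250.0);

    -- View for a mailing application: names and addresses only, no balances.
    CREATE VIEW mailing_view AS SELECT cname, caddr FROM customer;

    -- View for an accounting application: names and balances only.
    CREATE VIEW balance_view AS SELECT cname, balance FROM customer;
""")

mailing_rows = conn.execute("SELECT * FROM mailing_view ORDER BY cname").fetchall()
balance_rows = conn.execute("SELECT * FROM balance_view ORDER BY cname").fetchall()

print(mailing_rows)
print(balance_rows)
```

Both views are defined over the same stored data, so there is still only one copy of each customer record.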
The design of modern multiuser database systems is a great improvement from those
in the past which restricted usage to one person at a time.
In the database approach, ideally, each data item is stored in only one place in the
database. In some cases, data redundancy still exists to improve system performance,
but such redundancy is controlled by application programming and kept to a minimum
by introducing as little redundancy as possible when designing the database.
Data sharing
The integration of all the data, for an organization, within a database system has many
advantages. First, it allows for data sharing among employees and others who have
access to the system. Second, it gives users the ability to generate more information
from a given amount of data than would be possible without the integration.
There are many types of database constraints. A data type constraint, for example, determines the
sort of data permitted in a field, such as numbers only. A data uniqueness constraint, such as
the primary key, ensures that no duplicates are entered. Constraints can be simple
(field based) or complex (programming).
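The sketch below shows the DBMS itself rejecting data that violates declared constraints. It assumes SQLite via Python's sqlite3 module; the account table is hypothetical. Because SQLite applies column types loosely, a CHECK clause stands in here for the simple field-based rule, alongside a primary key for uniqueness.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        acc_no  INTEGER PRIMARY KEY,                 -- uniqueness
        balance REAL NOT NULL CHECK (balance >= 0)   -- simple field-based rule
    )
""")
conn.execute("INSERT INTO account VALUES (1, 100.0)")

rejected = []
# First row repeats an existing key; second row has a negative balance.
for row in [(1, 50.0), (2, -10.0)]:
    try:
        conn.execute("INSERT INTO account VALUES (?, ?)", row)
    except sqlite3.IntegrityError:
        rejected.append(row)

stored = conn.execute("SELECT acc_no, balance FROM account").fetchall()

print(rejected)
print(stored)
```

The rules live in the database definition, so every application that inserts data is checked in the same way.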
Data independence
Another advantage of a database management system is how it allows for data
independence. In other words, the system data descriptions or data describing data
(metadata) are separated from the application programs. This is possible because
changes to the data structure are handled by the database management system and are
not embedded in the program itself.
Transaction processing
A database management system must include concurrency control subsystems. This
feature ensures that data remains consistent and valid during transaction processing
even if several users update the same information.
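One ingredient of transaction processing is that a group of updates either takes effect as a whole or not at all. The following single-connection sketch, assuming SQLite via Python's sqlite3 module, shows only that all-or-nothing behaviour; it does not simulate several concurrent users. The transfer function and the account data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "acc_no INTEGER PRIMARY KEY, balance REAL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
conn.commit()


def transfer(c, src, dst, amount):
    """Move money between accounts; both updates succeed or neither does."""
    try:
        with c:  # commits on success, rolls back if an exception is raised
            c.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_no = ?",
                (amount, dst),
            )
            c.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_no = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok_small = transfer(conn, 1, 2, 30.0)   # allowed
ok_large = transfer(conn, 1, 2, 500.0)  # would overdraw account 1, so it is undone
balances = conn.execute(
    "SELECT acc_no, balance FROM account ORDER BY acc_no"
).fetchall()

print(ok_small, ok_large, balances)
```

In the failed transfer the credit to account 2 had already been applied when the debit was rejected; the rollback removes it, so the data stays consistent.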
Key Terms
concurrency control strategies: features of a database that allow several users access
to the same data item at the same time
data type: determines the sort of data permitted in a field, for example numbers only
data uniqueness: ensures that no duplicates are entered
database constraint: a restriction that determines what is allowed to be entered or
edited in a table
metadata: defines and describes the data and relationships between tables in the
database
read and write privileges: the ability to both read and modify a file
read-only access: the ability to read a file but not make changes
self-describing: a database system is referred to as self-describing because it not only
contains the database itself, but also metadata which defines and describes the data
and relationships between tables in the database
Functions of DBMS
A DBMS performs several important functions that guarantee the integrity and
consistency of the data in the database. The most important functions of
a database management system are:
The DBMS provides backup and data recovery to ensure data safety and
integrity.
Current DBMS systems provide special utilities that allow the DBA to perform
routine and special backup and restore procedures. Recovery management
deals with the recovery of the database after a failure, such as a bad sector
in the disk or a power failure. Such capability is critical to preserving the
database’s integrity.
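As a rough sketch of a backup and restore procedure, Python's sqlite3 module exposes SQLite's online backup facility as Connection.backup (available from Python 3.7). The example assumes in-memory databases for brevity, where a real backup would normally go to a file on separate storage, and it simulates the failure by deleting the rows. All names and data are hypothetical.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance REAL)")
source.execute("INSERT INTO account VALUES (1, 100.0)")
source.commit()

# Take a backup copy into a second database.
backup_copy = sqlite3.connect(":memory:")
source.backup(backup_copy)

# Simulate a failure that destroys the working data.
source.execute("DELETE FROM account")
source.commit()
rows_after_failure = source.execute("SELECT COUNT(*) FROM account").fetchone()[0]

# Recover by copying the backup back over the damaged database.
backup_copy.backup(source)
recovered = source.execute("SELECT acc_no, balance FROM account").fetchall()

print(rows_after_failure, recovered)
```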
The DBMS provides data access through a query language. A query language
is a non-procedural language, that is, one that lets the user specify what must be
done without having to specify how it is to be done.
Structured Query Language (SQL) is the de facto query language and data
access standard supported by the majority of DBMS vendors.
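The contrast between specifying what and specifying how can be sketched as follows, assuming SQLite via Python's sqlite3 module and a hypothetical account table. The first version states the condition in SQL and leaves the search strategy to the DBMS; the second spells out the same selection step by step in the host language.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance REAL)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [(1, 100.0), (2, 250.0), (3, 75.0)],
)

# Non-procedural: state WHAT is wanted; the DBMS decides how to find it.
declarative = conn.execute(
    "SELECT acc_no FROM account WHERE balance > 90 ORDER BY acc_no"
).fetchall()

# Procedural equivalent: spell out HOW, row by row.
procedural = []
for acc_no, balance in conn.execute("SELECT acc_no, balance FROM account"):
    if balance > 90:
        procedural.append((acc_no,))
procedural.sort()

print(declarative == procedural)
```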