CH 1 FDB 1
CHAPTER 1:
Introduction to Database Systems
Compiled by: Firaol B.
BHU/ComputerScience/F-DatabaseSystem 1
OUTLINE
• Basics of Databases
• Types of Databases and Database Applications
• Basic Definitions
• Typical DBMS Functionality
• Example of a Database (UNIVERSITY)
• Main Characteristics of the Database Approach
• Types of Database Users
• Advantages of Using the Database Approach
• Historical Development of Database Technology
• Extending Database Capabilities
• When Not to Use Databases
Basic Definitions
• Data:
• Known facts that can be recorded and that have an implicit
meaning; data are raw facts.
• There are three data handling methods
• Manual Approach
• Computerized Approaches
• File Based Approach
• Database Approach
Manual Approach
• Data storage and retrieval
follows the primitive and
traditional way of
data/information handling
where cards and paper are
used for the purpose of
keeping records.
• Data is typed on paper
and put in a file cabinet.
• Storage and retrieval are
performed using human
labor.
• Works well if the number of
items to be stored is small
Limitations of manual approach
• Redundancy: multiple copies of the same data within
the organization.
• Data loss: papers may be damaged or impossible to locate.
• Inconsistency: modifications are not reflected in all
copies.
• Difficult to update, retrieve, integrate.
• You have the data but it is difficult to compile the
information
• Cross referencing is difficult
• Limited to small volumes of information.
File based system
• An early attempt to computerize the manual filing
system.
• There were, and still are, several computer applications
with file based processing for the purpose of data
handling.
• In such systems, every application program that
provides service to end users defines and manages its
own data.
• Such systems have a number of programs for each of the
different applications in the organization. This
approach is the decentralized computerized data
handling method.
• A file, in the file-based approach, is a collection of records
that contain logically related data.
Limitations of file based system
• Data redundancy and inconsistency: data is stored in multiple file
formats, resulting in duplication of information in different files
• Difficulty in accessing data
• Need to write a new program to carry out each new task
• Data isolation
• Multiple files and formats
• Integrity problems
• Integrity constraints (e.g., account balance > 0) become “buried”
in program code rather than being stated explicitly
• Hard to add new constraints or change existing ones
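As a hedged illustration of stating an integrity constraint explicitly instead of burying it in program code, the sketch below uses Python's built-in sqlite3 module. The table name account, its columns, and the helper try_set_balance are assumptions made for this example; the CHECK clause mirrors the "account balance > 0" rule mentioned above.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# The rule "balance > 0" is declared once, in the schema,
# rather than repeated inside every application program.
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance > 0))"
)
conn.execute("INSERT INTO account (id, balance) VALUES (1, 100)")
conn.commit()


def try_set_balance(conn, account_id, new_balance):
    """Return True if the update satisfies the declared constraint."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = ? WHERE id = ?",
                (new_balance, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The DBMS itself rejected the change.
        return False


accepted = try_set_balance(conn, 1, 40)    # satisfies balance > 0
rejected = try_set_balance(conn, 1, -10)   # violates balance > 0
final_balance = conn.execute(
    "SELECT balance FROM account WHERE id = 1"
).fetchone()[0]
```

Changing the rule later means changing one line of the table definition, not hunting through every program that touches the data.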
Cont’d
• Atomicity of updates
• Failures may leave database in an inconsistent state with partial updates
carried out
• Example: Transfer of funds from one account to another should either
complete or not happen at all
• Concurrent access by multiple users
• Concurrent access needed for performance
• Uncontrolled concurrent accesses can lead to inconsistencies
• Ex: Two people reading a balance (say 100) and updating it by withdrawing
money (say 50 each) at the same time
• Security problems
• Hard to provide user access to some, but not all, data
• These difficulties, among others, prompted both the initial
development of database systems and the transition of file-based
applications to database systems, back in the 1960s and 1970s.
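The atomicity and concurrency problems above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the account table, the names A and B, and the helper functions are invented for the example, and the lost-update part is a plain in-memory simulation of two interleaved withdrawals, not real multi-user access.

```python
import sqlite3


def make_bank():
    """Create a tiny in-memory bank with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst: both updates happen, or neither does."""
    try:
        with conn:  # one transaction: commit on success, rollback on error
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK, the credit above is undone too.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


def lost_update_demo():
    """Simulate two uncontrolled withdrawals of 50 from a balance of 100."""
    balance = 100
    read_1 = balance        # first person reads 100
    read_2 = balance        # second person also reads 100
    balance = read_1 - 50   # first person writes 50
    balance = read_2 - 50   # second person overwrites with 50
    return balance          # one withdrawal has been lost


conn = make_bank()
ok = transfer(conn, "A", "B", 30)       # succeeds
failed = transfer(conn, "A", "B", 150)  # would overdraw A, so it is rolled back
```

After the failed transfer, neither account shows a partial update. The simulated withdrawals, by contrast, leave a balance of 50 even though 100 was withdrawn, which is the kind of inconsistency that concurrency control in a DBMS is meant to prevent.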
Shared file approach
• An approach to solving the problem of each application having its
own set of files is to share files between different applications.
Cont’d
• The introduction of shared files solves the problem of
duplication and inconsistent data across different
versions of the same file held by different departments,
but other problems may emerge, including:
• File incompatibility
• Difficult to control access
• Physical data dependence
• Difficult to implement concurrency
Database Approach
• Based on
• The database
• Database Management System (DBMS)
Cont’d
• Database
• a collection of related data with the following implicit properties:
• A database represents some aspect of the real world, sometimes called the
miniworld (Some part of the real world about which data is stored in a
database. For example, student grades and transcripts at a university.) or
the universe of discourse (UoD). Changes to the miniworld are reflected in
the database.
• A database is a logically coherent collection of data with some inherent
meaning. A random assortment of data cannot correctly be referred to as a
database.
• A database is designed, built, and populated with data for a specific
purpose. It has an intended group of users and some preconceived
applications in which these users are interested.
• a highly organized, interrelated, and structured set of data about
a particular enterprise.
• Controlled by a database management system (DBMS)
Cont’d
• Database Management System (DBMS) :
• is a computerized system that enables users to create and maintain a database.
• The DBMS is a general-purpose software system that facilitates the processes of
defining, constructing, manipulating, and sharing databases among various users
and applications.
• Defining a database involves specifying the data types, structures, and
constraints of the data to be stored in the database. The database definition or
descriptive information is also stored by the DBMS in the form of a database
catalog or dictionary; it is called meta-data.
• Constructing the database is the process of storing the data on some
storage medium that is controlled by the DBMS.
• Manipulating a database includes functions such as querying the database to
retrieve specific data, updating the database to reflect changes in the
miniworld, and generating reports from the data.
• Sharing a database allows multiple users and programs to access the database
simultaneously.
• Other important functions provided by the DBMS include protecting the
database and maintaining it over a long period of time.
• Database System:
• The DBMS software together with the data itself. Sometimes, the applications
are also included.
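To make the terms defining, constructing, and manipulating concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in DBMS. The student table, its columns, and the sample rows are assumptions made for this example. The last query reads SQLite's own catalog table, sqlite_master, to show that the definition is itself stored as meta-data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Defining: specify data types, structure, and constraints.
conn.execute(
    "CREATE TABLE student ("
    " student_number INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " class_year INTEGER CHECK (class_year BETWEEN 1 AND 5))"
)

# Constructing: store the initial data under DBMS control.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(17, "Smith", 1), (8, "Brown", 2)],
)
conn.commit()

# Manipulating: query the data ...
names = [row[0] for row in conn.execute("SELECT name FROM student ORDER BY name")]

# ... and update it to reflect a change in the miniworld.
conn.execute("UPDATE student SET class_year = 2 WHERE student_number = 17")
conn.commit()
second_years = conn.execute(
    "SELECT COUNT(*) FROM student WHERE class_year = 2"
).fetchone()[0]

# Meta-data: the catalog records which tables exist.
catalog_tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]
```

Sharing and protection are not shown here, because an in-memory database has a single user.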
Cont’d
• DBMS contains information about a particular enterprise
• Collection of interrelated data
• Set of programs to access the data
• An environment that is both convenient and efficient to use
• Database systems are used to manage collections of data
that are:
• Highly valuable
• Relatively large
• Accessed by multiple users and applications, often at the
same time.
• A modern database system is a complex software system
whose task is to manage a large, complex collection of data.
• Databases touch all aspects of our lives
Simplified Database System Environment
Database Applications Examples
•Traditional applications:
• Numeric and textual databases
•More recent applications:
• Multimedia databases
• Geographic Information Systems (GIS)
• Biological and genome databases
• Data warehouses
• Mobile databases
• Real-time and active databases
Cont’d
• Enterprise Information
• Sales: customers, products, purchases
• Accounting: payments, receipts, assets
• Human Resources: Information about employees, salaries, payroll
taxes.
• Manufacturing:
• management of production, inventory, orders, supply chain.
• Banking and finance
• customer information, accounts, loans, and banking transactions.
• Credit card transactions
• Finance: sales and purchases of financial instruments (e.g., stocks
and bonds); storing real-time market data
• Universities:
• registration, grades
Cont’d
• Airlines: reservations, schedules
• Telecommunication: records of calls, texts, and data
usage, generating monthly bills, maintaining balances on
prepaid calling cards
• Web-based services
• Online retailers: order tracking, customized
recommendations
• Online advertisements
• Document databases
• Navigation systems: maintaining the locations of
various places of interest along with the exact routes of
roads, train systems, buses, etc.
Recent Developments (1)
• Social networks started capturing a lot of information
about people and about communications among people
(posts, tweets, photos, videos) in systems such as:
- Facebook
- Twitter
- LinkedIn
• All of the above constitutes data
• Search engines (Google, Bing, Yahoo) collect their own
repository of web pages for searching purposes
Recent Developments (2)
• New technologies are emerging from the so-called non-SQL,
non-database software vendors to manage vast
amounts of data generated on the web:
• Big data storage systems involving large clusters of distributed
computers
• NOSQL (Non-SQL, Not Only SQL) systems
• A large amount of data now resides on the “cloud” which
means it is in huge data centers using thousands of
machines.
What is “big data”?
• “Big data are high-volume, high-velocity, and/or high-variety
information assets that require new forms of
processing to enable enhanced decision making, insight
discovery and process optimization” (Gartner 2012)
• The three Vs are volume, velocity, and variety. Other Vs have been proposed:
• Veracity: refers to the trustworthiness of the data
• Value: will data lead to the discovery of a critical causal effect?
• Bottom line: Any data that exceeds our current capability
of processing can be regarded as “big”
• Complicated (intelligent) analysis of data may make a small data
set “appear” to be “big”
Impact of Databases and Database Technology
•Businesses: Banking, Insurance, Retail,
Transportation, Healthcare, Manufacturing
•Service industries: Financial, Real-estate, Legal,
Electronic Commerce, Small businesses
•Education : Resources for content and Delivery
•More recently: Social Networks, Environmental
and Scientific Applications, Medicine and Genetics
•Personalized applications: based on smart mobile
devices
A simplified architecture for a database system
What a DBMS Facilitates
•Define a particular database in terms of its data
types, structures, and constraints
•Construct or load the initial database contents on
a secondary storage medium
•Manipulating the database:
• Retrieval: Querying, generating reports
• Modification: Insertions, deletions and updates to its
content
• Accessing the database through Web applications
•Processing and sharing by a set of concurrent
users and application programs – yet, keeping all
data valid and consistent
Other DBMS Functionalities
•DBMS may additionally provide:
• Protection or security measures to prevent
unauthorized access
• “Active” processing to take internal actions on data
• Presentation and visualization of data
• Maintenance of the database and associated
programs over the lifetime of the database application
Application Programs and DBMS
•Applications interact with a database by
generating
- Queries: that access different parts of data and
formulate the result of a request
- Transactions: that may read some data and
“update” certain values or generate new data
and store that in the database
Example of a Database
(with a Conceptual Data Model)
•Mini-world for the example:
• Part of a UNIVERSITY environment
•Some mini-world entities:
• STUDENTs
• COURSEs
• SECTIONs (of COURSEs)
• (Academic) DEPARTMENTs
• INSTRUCTORs
Example of a Database
(with a Conceptual Data Model) (continued)
• Some mini-world relationships:
• SECTIONs are of specific COURSEs
• STUDENTs take SECTIONs
• COURSEs have prerequisite COURSEs
• INSTRUCTORs teach SECTIONs
• COURSEs are offered by DEPARTMENTs
• STUDENTs major in DEPARTMENTs
• Note: The above entities and relationships are typically
expressed in a conceptual data model, such as the
entity-relationship (ER) data model or the UML class model
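One possible way to carry the UNIVERSITY entities and relationships above into tables is sketched below with Python's built-in sqlite3 module. This is an illustration, not the schema used in the course: every table name, column name, and sample row is an assumption, and other designs are equally valid.

```python
import sqlite3

SCHEMA = """
CREATE TABLE department (dept_code TEXT PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY, name TEXT NOT NULL,
    major TEXT REFERENCES department(dept_code));      -- STUDENTs major in DEPARTMENTs
CREATE TABLE instructor (instructor_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course (
    course_no TEXT PRIMARY KEY, title TEXT NOT NULL,
    dept_code TEXT REFERENCES department(dept_code));  -- COURSEs are offered by DEPARTMENTs
CREATE TABLE prerequisite (                            -- COURSEs have prerequisite COURSEs
    course_no TEXT REFERENCES course(course_no),
    prereq_no TEXT REFERENCES course(course_no),
    PRIMARY KEY (course_no, prereq_no));
CREATE TABLE section (
    section_id INTEGER PRIMARY KEY,
    course_no TEXT NOT NULL REFERENCES course(course_no),        -- SECTIONs are of COURSEs
    instructor_id INTEGER REFERENCES instructor(instructor_id)); -- INSTRUCTORs teach SECTIONs
CREATE TABLE enrollment (                              -- STUDENTs take SECTIONs
    student_id INTEGER REFERENCES student(student_id),
    section_id INTEGER REFERENCES section(section_id),
    PRIMARY KEY (student_id, section_id));
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks references only when enabled
conn.executescript(SCHEMA)

# Made-up sample data for the sketch.
conn.executescript("""
INSERT INTO department VALUES ('CS', 'Computer Science');
INSERT INTO student VALUES (1, 'Abebe', 'CS');
INSERT INTO instructor VALUES (10, 'Lemma');
INSERT INTO course VALUES ('CS101', 'Intro to Computer Science', 'CS');
INSERT INTO course VALUES ('CS201', 'Data Structures', 'CS');
INSERT INTO prerequisite VALUES ('CS201', 'CS101');
INSERT INTO section VALUES (100, 'CS201', 10);
INSERT INTO enrollment VALUES (1, 100);
""")

# Follow the relationships: who takes which course, taught by whom?
rows = conn.execute("""
SELECT s.name, c.title, i.name
FROM enrollment e
JOIN student s ON s.student_id = e.student_id
JOIN section sec ON sec.section_id = e.section_id
JOIN course c ON c.course_no = sec.course_no
JOIN instructor i ON i.instructor_id = sec.instructor_id
""").fetchall()


def section_for_unknown_course_is_rejected(conn):
    """A SECTION must be of an existing COURSE."""
    try:
        with conn:
            conn.execute("INSERT INTO section VALUES (101, 'NOPE', 10)")
        return False
    except sqlite3.IntegrityError:
        return True
```

Each relationship becomes either a foreign-key column or a separate linking table, which is why the join in the query simply walks from enrollment to the other tables.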
Example of a Simple Database
The relational model
Main Characteristics of the Database
Approach
•Self-describing nature of a database system:
• A DBMS catalog stores the description of a particular
database (e.g. data structures, types, and constraints)
• The description is called meta-data.
• This allows the DBMS software to work with different
database applications.
•Insulation between programs and data:
• Called program-data independence.
• Allows changing data structures and storage
organization without having to change the DBMS
access programs
• E.g., ADTs
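The two characteristics above can be illustrated with a small sketch, assuming SQLite (through Python's sqlite3 module) as the DBMS. The course table, the credit_hours column, and the helpers describe and course_titles are invented for the example. The function describe reads column names from the catalog rather than hard-coding them, and course_titles keeps working after the table structure changes.

```python
import sqlite3


def describe(conn, table):
    """Read column names and declared types from the DBMS catalog.

    The table name is interpolated directly, so pass trusted names only.
    """
    return [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({table})")]


def course_titles(conn):
    """An 'application program' that names only the data it needs."""
    return [r[0] for r in conn.execute("SELECT title FROM course ORDER BY course_no")]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (course_no TEXT PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO course VALUES ('CS101', 'Intro')")

before = course_titles(conn)

# The structure changes, but the program above is not modified.
conn.execute("ALTER TABLE course ADD COLUMN credit_hours INTEGER DEFAULT 3")

after = course_titles(conn)
columns = describe(conn, "course")
```

In a file-based program that reads fixed-position records, adding a field would typically force a change to every program that reads the file.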
Main Characteristics of the Database
Approach (continued)
• Data abstraction:
• A data model is used to hide storage details and present the
users with a conceptual view of the database.
• Programs refer to the data model constructs rather than data
storage details
• Support of multiple views of the data:
• Each user may see a different view of the database, which
describes only the data of interest to that user.
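Multiple views can be sketched with SQL views, again using Python's sqlite3 module. The grade_report table, the two view names, and the sample rows are assumptions made for this illustration: one view hides the phone number from a transcript user, and the other presents only enrollment counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grade_report (
    student_name TEXT, course_no TEXT, grade TEXT, home_phone TEXT);
INSERT INTO grade_report VALUES ('Abebe', 'CS101', 'A', '555-0100');
INSERT INTO grade_report VALUES ('Sara',  'CS101', 'B', '555-0101');

-- View for a user interested in transcripts: no phone numbers.
CREATE VIEW transcript AS
    SELECT student_name, course_no, grade FROM grade_report;

-- View for a user interested only in how many students take each course.
CREATE VIEW course_roster AS
    SELECT course_no, COUNT(*) AS enrolled FROM grade_report GROUP BY course_no;
""")

# Column names visible through the transcript view.
transcript_columns = [
    d[0] for d in conn.execute("SELECT * FROM transcript").description
]
roster = conn.execute("SELECT * FROM course_roster").fetchall()
```

Both views are derived from the same stored data, so a change to grade_report is visible through each of them without copying anything.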
Main Characteristics of the Database
Approach (continued)
•Sharing of data and multi-user transaction
processing:
• Allowing a set of concurrent users to retrieve from and
to update the database.
• Concurrency control within the DBMS guarantees that
each transaction is correctly executed or aborted
• Recovery subsystem ensures each completed
transaction has its effect permanently recorded in the
database
• OLTP (Online Transaction Processing) is a major part of
database applications; allows hundreds of concurrent
transactions to execute per second.
Database Users
•Users may be divided into
• Those who actually use and control the database
content, and those who design, develop and maintain
database applications (called “Actors on the Scene”),
and
• Those who design and develop the DBMS software
and related tools, and the computer systems
operators (called “Workers Behind the Scene”).
Database Users – Actors on the Scene
•Actors on the scene
•Database administrators
• Responsible for authorizing access to the database, for
coordinating and monitoring its use, acquiring software and
hardware resources, controlling its use and monitoring
efficiency of operations.
•Database designers
• Responsible for defining the content, the structure, the
constraints, and the functions or transactions against the
database. They must communicate with the end-users and
understand their needs.
Database End Users
•Actors on the scene (continued)
• End-users: They use the data for queries, reports and
some of them update the database content. End-users
can be categorized into:
• Casual: access database occasionally when needed
• Naïve or parametric: they make up a large section of the end-user
population.
• They use previously well-defined functions in the form of “canned
transactions” against the database.
• Users of mobile apps mostly fall in this category
• Bank-tellers or reservation clerks are parametric users who do this activity
for an entire shift of operations.
• Social media users post and read information from websites
Database End Users (continued)
• Sophisticated:
• These include business analysts, scientists, engineers, and others
thoroughly familiar with the system capabilities.
• Many use tools in the form of software packages that work
closely with the stored database.
• Stand-alone:
• Mostly maintain personal databases using ready-to-use
packaged applications.
• An example is the user of a tax program that creates its own
internal database.
• Another example is a user that maintains a database of
personal photos and videos.
Database Users – Actors on the
Scene (continued)
• System analysts and application developers
• System analysts: They understand the user requirements of
naïve and sophisticated users and design applications including
canned transactions to meet those requirements.
• Application programmers: Implement the specifications
developed by analysts and test and debug them before
deployment.
• Business analysts: There is an increasing need for such people
who can analyze vast amounts of business data and real-time
data (“Big Data”) for better decision making related to planning,
advertising, marketing etc.
Database Users – Workers behind the Scene
• System designers and implementors: Design and implement
DBMS packages in the form of modules and interfaces
and test and debug them. The DBMS must interface with
applications, language compilers, operating system
components, etc.
• Tool developers: Design and implement software systems
called tools for modeling and designing databases,
performance monitoring, prototyping, test data
generation, user interface creation, simulation, etc. These
tools facilitate building applications and allow the database
to be used effectively.
• Operators and maintenance personnel: They manage the
actual running and maintenance of the database system
hardware and software environment.
Advantages of Using the Database Approach
• Controlling redundancy in data storage and in
development and maintenance efforts.
• Sharing of data among multiple users.
• Restricting unauthorized access to data. Only the DBA
staff uses privileged commands and facilities.
• Providing persistent storage for program objects
• E.g., Object-oriented DBMSs make program objects persistent.
• Providing storage structures (e.g. indexes) for efficient
query processing.
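As a rough sketch of how a storage structure such as an index changes query processing, the example below asks SQLite for its query plan before and after an index is created. The student table, the index name idx_student_major, and the generated rows are assumptions for the illustration. The exact wording of the plan text differs between SQLite versions, so the code only checks whether the index is mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, major TEXT)"
)
# Made-up rows, alternating between two majors.
conn.executemany(
    "INSERT INTO student (name, major) VALUES (?, ?)",
    [(f"s{i}", "CS" if i % 2 else "EE") for i in range(1000)],
)
conn.commit()


def plan(conn, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT name FROM student WHERE major = 'CS'"

plan_before = plan(conn, query)   # no index on major: a full table scan
conn.execute("CREATE INDEX idx_student_major ON student (major)")
plan_after = plan(conn, query)    # the optimizer can now search the index

uses_index_before = "idx_student_major" in plan_before
uses_index_after = "idx_student_major" in plan_after
```

The query text is identical in both runs; only the physical storage structures changed, and the DBMS chose the access path on its own.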
Advantages of Using the Database
Approach (continued)
• Providing optimization of queries for efficient processing
• Providing backup and recovery services
• Providing multiple interfaces to different classes of users
• Representing complex relationships among data
• Enforcing integrity constraints on the database
• Drawing inferences and actions from the stored data
using deductive and active rules and triggers
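The last point, active rules and triggers, can be sketched as follows with SQLite through Python's sqlite3 module. The inventory and reorder_log tables, the threshold of 10, and the trigger name low_stock are all assumptions chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (item TEXT PRIMARY KEY, quantity INTEGER NOT NULL);
CREATE TABLE reorder_log (item TEXT, quantity INTEGER);

-- Active rule: when stock drops below 10, the DBMS itself records a reorder.
CREATE TRIGGER low_stock AFTER UPDATE OF quantity ON inventory
WHEN NEW.quantity < 10
BEGIN
    INSERT INTO reorder_log VALUES (NEW.item, NEW.quantity);
END;

INSERT INTO inventory VALUES ('paper', 50);
""")

# The application only updates quantities; it never writes to reorder_log.
conn.execute("UPDATE inventory SET quantity = 30 WHERE item = 'paper'")
conn.execute("UPDATE inventory SET quantity = 5 WHERE item = 'paper'")
conn.commit()

log = conn.execute("SELECT item, quantity FROM reorder_log").fetchall()
```

Only the second update crosses the threshold, so the log holds a single entry that no application program wrote directly.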
Additional Implications of Using the
Database Approach
• Potential for enforcing standards:
• Standards refer to data item names, display formats, screens,
report structures, meta-data (description of data), Web page
layouts, etc.
• Reduced application development time:
• Incremental time to add each new application is reduced.
Additional Implications of Using the
Database Approach (continued)
• Flexibility to change data structures:
• Database structure may evolve as new requirements are defined.
• Availability of current information:
• Extremely important for on-line transaction systems such as
shopping, airline, hotel, car reservations.
• Economies of scale:
• Wasteful overlap of resources and personnel can be avoided by
consolidating data and applications across departments.
When not to use a DBMS
•Main inhibitors (costs) of using a DBMS:
• High initial investment and possible need for additional
hardware
• Overhead for providing generality, security,
concurrency control, recovery, and integrity functions
•When a DBMS may be unnecessary:
• If the database and applications are simple, well
defined, and not expected to change
• If access to data by multiple users is not required
•When a DBMS may be infeasible
• In embedded systems where a general-purpose DBMS
may not fit in available storage
When not to use a DBMS (continued)
•When no DBMS may suffice:
• If there are stringent real-time requirements
that may not be met because of DBMS
overhead (e.g., telephone switching systems)
• If the database system is not able to handle the
complexity of data because of modeling limitations
(e.g., in complex genome and protein databases)
• If the database users need special operations not
supported by the DBMS (e.g., GIS and location-based
services).