DBMS-Week1
Lecture # 1-2
Instructor
Rida Ayesha
Lecture # 1-2 Agenda:
● Instructor’s Introduction
● Student’s Introduction (Class Activity 0)
● Discussion of class & assessment rules, course outline, marks distribution
● Introduction to course
o What is Data? What is Information?
2
Instructor’s Introduction
3
Student’s Introduction-Class Activity
4
Class Activity 0-Student’s Introduction
1. What’s one thing you’re excited about in this course (even if you don’t know
much about it yet)?
2. What’s one app or website you use daily that you think relies heavily on
databases?
5
Class Rules
• The use of cell phones (texting, calling, watching muted videos) is not allowed
during lectures.
• Cell phones must be switched to silent mode and kept out of sight during class.
6
Assessment Rules
• All assignment deadlines are HARD deadlines. No late submissions will be accepted
after the due date and time under any circumstances.
• The course covers everything in your book as well as your class discussion
(including any guest lectures); DO NOT rely on course slides only.
• No questions will be answered during the exam/quiz assessment; always state any
assumptions you have along with your answer(s).
• Cheating, plagiarism and other forms of academic fraud/dishonesty are taken very
seriously and will result in marks deduction/UMC.
7
Assessment Grading Scheme (tentative)
8
Course Introduction
Introduction:
The goal of the course is to present an introduction to database management systems,
with an emphasis on how to organize, maintain, and retrieve information from a DBMS
efficiently and effectively.
Text/Reference Material:
• Database Systems: A Practical Approach to Design, Implementation, and
Management, 6th Edition by Thomas Connolly and Carolyn Begg
• Database System Concepts, 6th Edition by Avi Silberschatz, Henry F. Korth and S.
Sudarshan.
9
Introduction-Data
Data?
• The term data is defined as “facts and figures”.
• Data may consist of numbers, characters, symbols, pictures, videos, audios or any
other forms.
Example:
• Raw facts about students: name, class, age, etc.
10
Introduction-Information
Information?
• Information is the organized and processed form of data.
Examples:
• The marks of a student in different subjects are data.
• The total marks are information.
• The average mark is also information.
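The marks example can be sketched in a few lines of Python. This is only an illustration: the subject names and mark values are made up, and the point is that the total and the average are produced by processing the raw marks.

```python
# Hypothetical marks for one student; subjects and values are made up.
marks = {"Maths": 78, "Physics": 85, "English": 92}

# Data: the raw marks themselves.
# Information: values obtained by processing that data.
total = sum(marks.values())
average = total / len(marks)

print(f"Total marks: {total}")
print(f"Average mark: {average:.1f}")
```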
11
Introduction-Data vs Information
Data: unprocessed raw facts
Information: processed form of data
13
Introduction-Database (cont.)
⮚ Database:
• A database (abbreviated DB) is a collection of related data organized in a way that
the data can be easily accessed, managed and updated. A database has one sole
purpose: storing data.
For example, a database can answer questions such as:
• 'What are the phone numbers and addresses of the five nearest post offices?'
• 'Do we have any books in our library that deal with health food?'
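As a rough sketch of how the library question could be answered, the following uses Python's built-in sqlite3 module with an in-memory database. The books table, its columns, and the sample titles are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical library catalogue held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, subject TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [
        ("Eating Well", "health food"),
        ("Intro to Databases", "computing"),
        ("Whole Grain Kitchen", "health food"),
    ],
)

# "Do we have any books in our library that deal with health food?"
health_food_titles = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM books WHERE subject = ? ORDER BY title",
        ("health food",),
    )
]
print(health_food_titles)
conn.close()
```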
14
Introduction-Database Application Examples
• Computerized library systems
• Flight reservation systems
1. Enterprise Information
• Sales: customers, products, purchases
• Accounting: payments, receipts, assets
• Human Resources: Information about employees, salaries, payroll taxes.
2. Manufacturing: management of production, inventory, orders, supply chain.
15
Introduction-Database Application Examples (cont.)
16
Introduction-Database Management System (DBMS)
Examples of DBMSs:
• Oracle, IBM DB2, Microsoft SQL Server, Vertica, Teradata
• Open source: MySQL (Sun/Oracle), PostgreSQL, CouchDB
• Open source library: SQLite
17
Data vs Information
Database vs DBMS
18
Introduction-Metadata
19
Introduction-Metadata (cont.)
Example:
Every time you take a photo with a modern camera, a set of metadata is gathered
and saved with it:
• date and time
• filename
• camera settings
• geolocation
20
Introduction-Metadata (cont.)
Example:
Each book carries standard metadata on its covers and inside. This includes:
• a title
• author name
• publisher and copyright details
• description on the back cover
• table of contents
• index
• page numbers
21
Introduction-Metadata (cont.)
Example:
A spreadsheet contains a few metadata fields:
• tab names
• column names
• user comments
22
Class Activity 1
• Data
• Information
• Metadata
23
Importance of the Databases
• Computer applications are divided into commercial and scientific (or engineering)
ones.
• These applications mainly store data in computer storage, then access it and
present it to users in different formats (also termed data processing). Examples
include banking, shopping, production, utilities billing, and customer services.
24
Data Management/Record Keeping Techniques
The two common record keeping techniques are manual record keeping and
computerized record keeping. Computerized record keeping covers:
• File-based systems
• Database systems
25
File-based Systems
In a file-based system, data is stored in discrete files, and a collection of such files
is stored on a computer.
Files of archived data were called tables because they looked like tables used in
traditional file keeping.
Rows in the table were called records and columns were called fields. An example of
the file-based system is illustrated in the following table:
26
Case Study-Traditional and File Processing System
• Each office maintains its own set of files for its day-to-day operations.
• Some of the files used in the system are Student’s File, Faculty File, Course File etc.
27
Manual System - Disadvantages
High data volume
Not reliable
Inefficient
Duplication of data
Inconsistency
• The manual filing system breaks down when we have to cross-reference or process
the information in the files. For example, a typical real estate agent’s office might
have a separate file for each property for sale or rent, each potential buyer and
renter, and each member of staff.
28
Example of File System
• In our own home, we probably have some sort of filing system, which contains
receipts, guarantees, invoices, bank statements, and such like.
• When we need to look something up, we go to the filing system and search
through the system starting from the first entry until we find what we want.
• Alternatively, we may have an indexing system that helps to locate what we want
more quickly.
For example, we may have divisions in the filing system or separate folders for
different types of item that are in some way logically related.
29
File Processing System
• File-based systems were an early attempt to computerize the manual filing system.
• Each application will have its own set of Private Files designed to meet the needs
of a particular department.
• Before the advent of database systems, computer-readable data was usually kept
in files stored on magnetic tape or disk.
30
File Processing System (cont.)
31
File Processing System (cont.)
32
Examples of File Processing System
33
Disadvantages of File-based Systems
1. Data Redundancy
2. Data Inconsistency
3. Data Dependence
4. Poor Enforcement of Standards
6. Inflexible
9. Atomicity
10. Maintenance
34
Disadvantages of File-based Systems (cont.)
1. Redundancy of Data
• Redundancy means duplication of data.
• Duplicated data requires more storage.
35
Disadvantages of File-based Systems (cont.)
2. Data Inconsistencies or Data Anomalies
• Redundancy leads to data inconsistency or data anomalies.
• A data anomaly occurs if an operation (update, insert, delete) has not yet been
performed against all occurrences of the data. Consequently, the same data stored in
different places will disagree with each other.
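A minimal sketch of such an anomaly, assuming two hypothetical departmental files (modelled here as Python dictionaries) that each hold a private copy of the same student record. The names, keys, and phone numbers are invented for illustration.

```python
# Two departments keep their own private copy of the same student record.
registrar_file = {"S001": {"name": "Ali", "phone": "555-0100"}}
library_file = {"S001": {"name": "Ali", "phone": "555-0100"}}

# The update is applied to only one of the copies.
registrar_file["S001"]["phone"] = "555-0199"


def find_inconsistencies(file_a, file_b):
    """Return the ids whose records differ between the two files."""
    return [key for key in file_a if key in file_b and file_a[key] != file_b[key]]


conflicts = find_inconsistencies(registrar_file, library_file)
print(conflicts)
```

Because the phone number was changed in only one place, the two copies now disagree about the same student.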
36
Disadvantages of File-based Systems (cont.)
4. Data Dependence
• The definition of data is embedded in the application programs, rather than being
stored separately and independently.
• The applications are constrained to work only with the given file description. Any
change in the file structure or data requires changes to all the applications using
that file. Such applications are called Data Dependent Applications.
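Data dependence can be illustrated with a small sketch. The fixed-width record layout below (4 characters for the id, 10 for the name, 2 for the age) is an assumption made for this example; the point is that the layout is hard-coded inside the program, so a change to the file structure breaks it.

```python
# Hypothetical fixed-width student record: 4-char id, 10-char name, 2-char age.
OLD_RECORD = "S001" + "Ali".ljust(10) + "19"


def read_student(record):
    # The file layout is hard-coded in the program (data dependence).
    return {
        "id": record[0:4],
        "name": record[4:14].strip(),
        "age": int(record[14:16]),
    }


student = read_student(OLD_RECORD)
print(student)

# The name field is widened to 15 characters; the unchanged program now
# reads blanks where it expects the age and fails.
NEW_RECORD = "S001" + "Ali".ljust(15) + "19"
try:
    read_student(NEW_RECORD)
    broke = False
except ValueError:
    broke = True
print("Program broke after layout change:", broke)
```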
37
Disadvantages of File-based Systems (cont.)
5. Limited Data Sharing
• As each application has its own private files, there is little opportunity to share
data with other applications.
6. Inflexible
• A traditional file system can deliver routine scheduled reports after extensive
programming efforts, but it cannot deliver ad-hoc reports or respond to
unanticipated information requirements in a timely fashion.
9. Atomicity
• Any system may fail at any time, and when it does, the data should be left in a
consistent state.
For example, suppose you are buying a railway ticket online and your internet
connection drops in the middle of the payment. Either the payment went through and
your ticket is booked, or it did not and you are charged nothing. That is a consistent
state: the transaction has either happened completely or not at all.
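The ticket example can be sketched with a database transaction. This is a minimal illustration using Python's sqlite3 module; the accounts and tickets tables, the amounts, and the simulated failure are all assumptions. A file-based system would have to implement this all-or-nothing behaviour by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (owner TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("CREATE TABLE tickets (passenger TEXT)")
conn.execute("INSERT INTO accounts VALUES ('passenger', 100)")
conn.commit()


def buy_ticket(connection, price, fail_midway):
    """Charge the passenger and book the ticket as one all-or-nothing unit."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.execute(
                "UPDATE accounts SET balance = balance - ? WHERE owner = 'passenger'",
                (price,),
            )
            if fail_midway:
                raise ConnectionError("simulated network drop")
            connection.execute("INSERT INTO tickets VALUES ('passenger')")
        return True
    except ConnectionError:
        return False


booked = buy_ticket(conn, 40, fail_midway=True)
balance = conn.execute("SELECT balance FROM accounts").fetchone()[0]
ticket_count = conn.execute("SELECT COUNT(*) FROM tickets").fetchone()[0]
print(booked, balance, ticket_count)
```

The failure happens after the charge but before the booking, yet the rollback restores the balance, so the passenger is neither charged nor booked.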
10. Maintenance
• Besides the above, the maintenance of the File Based System is complex and there
is no provision for security. Recovery is non-existent or inadequate.
39
Integrated Database Environment
• An Integrated Database Environment (IDE) has a single large repository of data,
called the database, where the data definition is separated from application programs.
• The organization-wide requirements are analysed as a whole, and there is no longer
a concept of MY FILE or Private Files.
• A database can handle any kind of record, such as text, numbers, images, dates,
and sounds.
• The database is not owned by a single department; it is owned by the whole
organization and is managed by a single person called the Database Administrator
(DBA).
40
Advantages of Database Systems
• DBMS manages data resources like an operating system manages hardware
resources
41
Advantages of Database Systems
1. Reduced or Controlled Data Redundancy
3. Enforcement of Standards
4. Reduced Program Maintenance
5. Data Sharing
10. Data Independence
• Separate data files are integrated into a single logical structure to reduce
redundancy.
• If a data item appears only once, any change to its value needs to be performed
only once, and the database will always be in a consistent (correct) state.
43
Advantages of Database Systems (cont.)
3. Enforcement of Standards
• This is possible because the database is designed to meet the organization-wide
requirements.
• The standards can be name of data items and their format, data codes,
documentation standards, operation standards, security policies etc.
44
Advantages of Database Systems (cont.)
5. Data Sharing
• Sharing means that the same data source is used by multiple applications.
• Data is centralized and hence can be shared not only by the existing applications
but also new applications can be developed to operate against the same data.
• Applications that access the same data simultaneously are called Online
Transaction Processing (OLTP) applications, and the DBMS uses a concurrency control
mechanism to ensure that multiple users can access and update data correctly.
45
Advantages of Database Systems (cont.)
6. Data Integrity (Improved Data Quality)
• Data integrity refers to the correctness of data.
• Integrity constraints or rules ensure that the data stored in the database is clean
and accurate.
• A DBMS also provides strong security measures against threats to the data. Some of
them are:
• Password checks
• User defined procedures
• Defining user privileges
• Audit trail system
• Data encryption
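To make the integrity constraints mentioned above concrete, here is a minimal sketch using Python's sqlite3 module. The students table and the rule that marks must lie between 0 and 100 are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE students (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        marks INTEGER CHECK (marks BETWEEN 0 AND 100)
    )
    """
)
conn.execute("INSERT INTO students (name, marks) VALUES ('Ali', 85)")

# The DBMS itself rejects a value that violates the rule.
try:
    conn.execute("INSERT INTO students (name, marks) VALUES ('Sara', 150)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
print(rejected, row_count)
```

The rule is declared once with the data definition, so every application that writes to the table is held to it.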
46
Advantages of Database Systems (cont.)
8. Improved Accessibility & Responsiveness
• In a file processing system, accessing data is quite difficult because it is
procedural: you must know the detailed procedural steps, that is, HOW to do it.
• On the other hand, accessing data is a lot easier in a DBMS with the help of a
non-procedural language, SQL. You only need to know simple commands, in other
words, WHAT to do.
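The HOW versus WHAT contrast can be sketched as follows. The student names, marks, and the threshold of 80 are illustrative assumptions; the first half spells out each step of the search, while the SQL query only states the desired result.

```python
import sqlite3

students = [("Ali", 85), ("Sara", 62), ("Omar", 91)]

# Procedural (HOW): spell out every step of the search ourselves.
high_scorers_loop = []
for name, marks in students:
    if marks > 80:
        high_scorers_loop.append(name)
high_scorers_loop.sort()

# Non-procedural (WHAT): state the result we want and let the DBMS find it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, marks INTEGER)")
conn.executemany("INSERT INTO students VALUES (?, ?)", students)
high_scorers_sql = [
    row[0]
    for row in conn.execute("SELECT name FROM students WHERE marks > 80 ORDER BY name")
]
print(high_scorers_loop, high_scorers_sql)
```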
47
Advantages of Database Systems (cont.)
10. Data Independence
• The separation of data descriptions from the applications using the data is called
data independence.
• It allows database systems to change and evolve without changing the application
programs.
• Different data arrangements will need different algorithms even for the same
operation.
48
Advantages of Database Systems (cont.)
10. Data Independence (cont.)
• Clearly, data dependence is not desired because of the following reasons:
b) Users should not have to deal directly with the physical database storage
details.
c) Different applications may need different physical structures for the efficiency
of their operations.
d) The DBA should have the freedom to change the storage structure, the access
technique, or both in response to changing requirements without greatly disturbing
the existing applications.
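As a small sketch of point d), the example below changes the access technique by adding an index while the application query stays exactly the same. It uses Python's sqlite3 module; the table, the index name, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, "Ali", "Lahore"), (2, "Sara", "Karachi"), (3, "Omar", "Lahore")],
)


def students_in(city):
    # Application code: it names the data it wants, not how it is stored.
    rows = conn.execute(
        "SELECT name FROM students WHERE city = ? ORDER BY name", (city,)
    )
    return [row[0] for row in rows]


before = students_in("Lahore")
# The DBA changes the access technique by adding an index.
conn.execute("CREATE INDEX idx_students_city ON students (city)")
after = students_in("Lahore")
print(before == after, after)
```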
49
Disadvantages of Database Systems
50
Contrasting Database & File Systems
51
Contrasting Database & File Systems (cont.)
52
Contrasting Database & File Systems (cont.)
53
Introduction to Database Environment
54
Components of Database Environment
1. Hardware
2. Software
3. Data
4. Procedures
5. People/User Groups
55
Components of Database Environment (cont.)
1. Hardware
• The DBMS and the applications require hardware to run. The hardware can range
from a single personal computer to a single mainframe or a network of computers.
• The particular hardware depends on the organization’s requirements and the DBMS
used.
• Some DBMSs run only on particular hardware or operating systems, while others
run on a wide variety of hardware and operating systems.
• A DBMS requires a minimum amount of main memory and disk space to run, but
this minimum configuration may not necessarily give acceptable performance.
56
Components of Database Environment (cont.)
2. Software
• The software component comprises the DBMS software itself and the application
programs, together with the operating system, including network software if the
DBMS is being used over a network.
• The target DBMS may have its own fourth-generation tools that allow rapid
development of applications through the provision of nonprocedural query
languages, report generators, form generators, graphics generators, and
application generators.
3. Data
• In the figure (slide # 55), we observe that the data acts as a bridge between the
machine components and the human components.
• The database contains both the operational data and the metadata, the “data about
data.”
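The difference between operational data and metadata can be seen in a small sketch with Python's sqlite3 module. The students table and its single row are made up; the PRAGMA table_info statement is SQLite's way of reporting the column descriptions it keeps about a table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("INSERT INTO students (name, age) VALUES ('Ali', 19)")

# Operational data: the rows stored in the table.
rows = conn.execute("SELECT id, name, age FROM students").fetchall()

# Metadata: the description of the table, kept by SQLite itself.
# Each table_info row is (cid, name, type, notnull, dflt_value, pk).
columns = [
    (info[1], info[2]) for info in conn.execute("PRAGMA table_info(students)")
]
print(rows)
print(columns)
```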
4. Procedures
• Procedures refer to the instructions and rules that govern the design and use of the
database.
58
Components of Database Environment (cont.)
4. Procedures (cont.)
• The users of the system and the staff who manage the database require
documented procedures on how to use or run the system.
5. People
• The final component is the people involved with the system.
59
Types of Database Users/Roles
• Database users are the people who interact with the database and benefit from
it.
• Users are differentiated by the way they expect to interact with the system.
1. Data and Database Administrators
The database and the DBMS are corporate resources that must be managed like any
other resource. Data and database administration are the roles generally associated
with the management and control of a DBMS and its data.
• The Data Administrator (DA) is responsible for the management of the data
resource, including database planning; development and maintenance of
standards, policies and procedures; and conceptual/logical database design.
• The DA consults with and advises senior managers, ensuring that the direction
of database development will ultimately support corporate objectives.
60
Types of Database Users/Roles (cont.)
• The Database Administrator (DBA) is responsible for the physical realization of
the database, including physical database design and implementation,
security and integrity control, maintenance of the operational system, and
ensuring satisfactory performance of the applications for users.
• The role of the DBA is more technically oriented than the role of the DA,
requiring detailed knowledge of the target DBMS and the system environment.
2. Database Designers
• Database designers are the users who design the structure of the database, which
includes tables, indexes, views, constraints, triggers, and stored procedures.
The designer decides what data must be stored and how the data items are to be
related.
61
Types of Database Users/Roles (cont.)
3. Application Programmers/Developers
• Once the database has been implemented, the application programs that
provide the required functionality for the end-users must be implemented.
• Each program contains statements that request the DBMS to perform some
operation on the database, which includes retrieving data, inserting,
updating, and deleting data.
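A minimal sketch of such an application program, using Python's sqlite3 module. The courses table and its rows are hypothetical; each statement asks the DBMS to perform one of the operations listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses (code TEXT PRIMARY KEY, title TEXT)")

# Inserting
conn.execute("INSERT INTO courses VALUES (?, ?)", ("CS101", "Databases"))
conn.execute("INSERT INTO courses VALUES (?, ?)", ("CS102", "Networks"))
# Updating
conn.execute(
    "UPDATE courses SET title = ? WHERE code = ?", ("Database Systems", "CS101")
)
# Deleting
conn.execute("DELETE FROM courses WHERE code = ?", ("CS102",))
# Retrieving
remaining = conn.execute("SELECT code, title FROM courses").fetchall()
print(remaining)
```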
62
Types of Database Users/Roles (cont.)
4. End Users
The end-users are the “clients” of the database, which has been designed and
implemented and is being maintained to serve their information needs. End-users can
be classified according to the way they use the system:
• Naïve users
o They are typically unaware of the DBMS. They access the database through
specially written application programs that attempt to make the operations as
simple as possible.
o For example, the checkout assistant at the local supermarket uses a bar code
reader to find out the price of the item. However, there is an application
program present that reads the bar code, looks up the price of the item in the
database, reduces the database field containing the number of such items in
stock, and displays the price on the till.
63
Types of Database Users/Roles (cont.)
4. End Users (cont.)
• Sophisticated users
o At the other end of the spectrum, the sophisticated end-user is familiar with
the structure of the database and the facilities offered by the DBMS.
o Some sophisticated end-users may even write application programs for their
own use.
64
Database Design: The Paradigm Shift
(History of Database Systems)
65
History of Database Systems
1. Early Data Management (1960s)
• File Systems: Early data storage relied on flat file systems, where data was stored in
individual files without structured relationships.
• E.F. Codd's Relational Model (1970): Proposed a way to manage data using tables
(relations), leading to the development of SQL.
• First Relational Database Management Systems (RDBMS): IBM’s System R and Oracle
in the late 1970s.
66
History of Database Systems (cont.)
4. SQL and Commercialization (1980s)
• SQL Standardization: SQL became the standard query language for RDBMS,
enhancing data manipulation capabilities.
67
History of Database Systems (cont.)
6. NewSQL and Cloud Databases (2010s)
• NewSQL: Aimed to provide the scalability of NoSQL with the ACID guarantees of
traditional RDBMS (e.g., VoltDB).
• Multi-Model Databases: Support for multiple data models (e.g., document, graph)
within a single database.
68