Unit-1-2

The document provides an overview of Database Management Systems (DBMS), covering their necessity, architecture, and various types, including relational and NoSQL databases. It discusses the advantages of DBMS over traditional file processing systems, such as reduced redundancy, improved data integrity, and enhanced security. Additionally, it outlines the structure of a DBMS, including components like the query processor, storage manager, and the three levels of database architecture: internal, conceptual, and external.

Uploaded by

aimlpublic27

Database Management Systems

UNIT-I
Issues in File Processing System, Need for DBMS, Basic terminologies
of Database, Database system Architecture, Various Data models, ER
diagram basics and extensions, Case study : Construction of Database
design using Entity Relationship diagram for an application such as
University Database, Banking System, Information System
File-based Definition
• A file processing system is a collection of programs that store and manage files on a computer's hard disk.
A file processing system has more data redundancy than a DBMS and offers less flexibility in accessing
data.
• Each program defines and manages its own data.
Types Of File Processing System:
• Relative-record-number processing: Data is accessed using a relative position (record
number) rather than a key. It’s efficient for direct access.
• Consecutive processing: Data is processed in a fixed, sequential order, one record after
another, without skipping.
• Sequential-by-key processing: Records are processed in a sequential order based on
their key values, typically sorted.
• Random-by-key processing: Accessing records randomly based on the key, allowing
immediate retrieval of any record without following a sequence.
• Sequential-within-limits processing: Data is processed in sequence, but only within
specified range limits, useful for subset processing.
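The access styles above can be sketched with a small fixed-length record file. This is a minimal illustration, not a real file processing system: the record layout (`RECORD`), the helper names, and the sample rows are all assumptions made up for the example, and an in-memory `io.BytesIO` buffer stands in for a file on disk.

```python
import io
import struct

# Hypothetical fixed-length record layout: 4-byte integer key + 20-byte name.
RECORD = struct.Struct("<i20s")

def write_records(f, rows):
    """Write (key, name) pairs as fixed-length records, one after another."""
    for key, name in rows:
        f.write(RECORD.pack(key, name.encode("ascii")))

def read_by_record_number(f, n):
    """Relative-record-number access: jump straight to record n (0-based)."""
    f.seek(n * RECORD.size)
    key, raw = RECORD.unpack(f.read(RECORD.size))
    return key, raw.rstrip(b"\x00").decode("ascii")

def read_consecutive(f):
    """Consecutive processing: visit every record in stored order."""
    f.seek(0)
    while chunk := f.read(RECORD.size):
        key, raw = RECORD.unpack(chunk)
        yield key, raw.rstrip(b"\x00").decode("ascii")

def read_within_limits(f, low, high):
    """Sequential-within-limits: keep only records whose key is in [low, high]."""
    return [(k, n) for k, n in read_consecutive(f) if low <= k <= high]

data_file = io.BytesIO()  # stands in for a file on disk
write_records(data_file, [(101, "Asha"), (102, "Ravi"), (103, "Meena")])

third = read_by_record_number(data_file, 2)
in_range = read_within_limits(data_file, 101, 102)
```

Because every record has the same size, the byte offset of record n is simply n times the record size, which is what makes direct access by record number cheap.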
Disadvantages of file processing:
Data Redundancy
Data Inconsistency
Lack of Data Integration
Difficulty in Access
Poor Data Security
No Support for Multi-User Access
Lack of Backup and Recovery
Limited Query Capabilities
Inefficient Data Handling
Difficulty in Maintaining Relationships
No Standardization
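As a rough illustration of the first two disadvantages, the sketch below models two programs that each keep their own copy of the same customer record. The file names and the sample customer are hypothetical; plain dictionaries stand in for the separate files.

```python
# Each "program" keeps its own private copy of the same customer data (redundancy).
billing_file = {"C001": {"name": "Asha", "city": "Pune"}}
shipping_file = {"C001": {"name": "Asha", "city": "Pune"}}

# The billing program records a change of address...
billing_file["C001"]["city"] = "Mumbai"

# ...but nothing propagates the change, so the two copies now disagree (inconsistency).
inconsistent = [
    cid for cid in billing_file
    if billing_file[cid] != shipping_file.get(cid)
]
```

With a single shared, integrated store there would be only one copy of the address to update.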
Database management system (DBMS):

A Database Management System (DBMS) is software designed to store, retrieve, define, and manage data in a
database.

DBMS examples include:
• MySQL
• SQL Server
• Oracle
• dBASE
• FoxPro
History of Database Systems
• First Generation
• Data Models: Hierarchical Model and Network Model
• Hierarchical Model: Organizes data in a tree-like structure, where each parent can
have multiple children, but each child has only one parent.
• Example: Information Management System (IMS)
• Network Model: Organizes data in a graph structure, allowing many-to-many
relationships.
• Example: CODASYL (Conference on Data Systems Languages) and DBTG (Database Task
Group)
• Limitations:
• Complex programming for simple queries.
• Minimal data independence (changes in the database required changes in the
program).
• No strong theoretical foundation, making it hard to develop further.
• Second Generation
• Data Model: Relational Model
• Introduced by E. F. Codd in 1970.
• Based on organizing data into tables (relations) with rows (records) and
columns (attributes). Queries could be written in a high-level language like
SQL.
• Examples of relational databases: DB2 (by IBM), Oracle.
• Limitations:
• Limited data modeling capabilities (e.g., it couldn't naturally handle complex
data like multimedia or hierarchical relationships).
• Third Generation
• Data Models:
• Object-Relational DBMS: Combines the features of object-oriented
programming (like inheritance and classes) with relational databases to
handle complex data types.
• Object-Oriented DBMS: Fully integrates object-oriented programming
concepts to store objects directly in the database.
• Suitable for advanced applications like CAD (Computer-Aided Design),
multimedia, and scientific computing.
Types of DBMS
• Relational DBMS (RDBMS):
• Organizes data into tables (relations) with rows and columns.
• Tables are linked using keys (primary and foreign keys).
• Uses Structured Query Language (SQL) for data manipulation.
• Example: MySQL, PostgreSQL, Oracle Database, Microsoft SQL Server.
• Hierarchical DBMS:
• Organizes data in a tree-like structure, where each record (node) has a parent/child
relationship.
• Each parent can have multiple children, but each child has only one parent.
• Example: IBM IMS.
• Network DBMS:
• Uses a graph-like structure with records connected through links (or pointers).
• Each record can have multiple parent and child records.
• Example: IDMS.
• Object-Oriented Database Management System (OODBMS):
• Stores data in objects, as in object-oriented programming.
• Supports complex data types, inheritance, and polymorphism.
• Examples: ObjectDB, db4o.
• NoSQL DBMS:
• Non-relational; designed for unstructured, semi-structured, or highly variable data.
• Typically does not require a fixed schema and is optimized for scalability and performance.
• Example: MongoDB, Cassandra.
Database

• Definition
• A collection of self-describing and integrated data files
• Self-describing: It contains metadata (data about the data), which defines the
structure, relationships, and constraints of the stored data.
• Integrated: Data is centralized, ensuring consistency and eliminating
redundancy.
• System catalog: a collection of system tables maintained by the database
management system (DBMS).
• It acts as a repository for metadata, storing details like table schemas, relationships,
indexes, users, and access permissions.
• It ensures the DBMS can manage and interpret data correctly.
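To make "self-describing" concrete, the sketch below asks a database for its own metadata. It uses SQLite through Python's standard `sqlite3` module purely as a convenient example; other DBMSs expose their catalogs differently. The `students` table and the index name are assumptions invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_students_name ON students (name)")

# sqlite_master is SQLite's catalog table: one row per table, index, view, or trigger.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# PRAGMA table_info returns column-level metadata (name, declared type, constraints).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(students)")]
```

The catalog is itself queried with ordinary SQL, which is the sense in which the system tables act as a repository for metadata.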
• Metadata
• Metadata is data that describes other data, often called "data about
data." Examples include the data types of fields, constraints, and relationships
between tables.
• Stored in the data dictionary, metadata helps the DBMS understand how to
handle and query the data.
• Overhead data refers to auxiliary information maintained by the
DBMS to ensure reliability and accountability.
• Logs: Used for recovery in case of system crashes or failures.
• Audit trails: Track user actions to ensure security and compliance.
Data abstraction
• Data abstraction is the process of hiding the details of how data is stored, managed, and
maintained from end-users. It provides a simplified view of the database and
separates the logical structure of data from its physical storage.
• Goal: To shield users from the complexities of physical data storage and
management.
• Levels of Abstraction:
• Physical Level: How the data is stored physically on the disk (e.g., blocks, files).
• Logical Level: How the data is structured (e.g., tables, columns, and relationships).
• View Level: How the data appears to end-users (e.g., reports, dashboards).
• Abstraction makes it easier for developers, administrators, and end-users to
interact with databases without worrying about underlying complexities.
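The three levels can be seen side by side in a small sketch. SQLite (via Python's `sqlite3`) is used only as an example engine, and the `employees` table, the `staff_directory` view, and the sample rows are hypothetical.

```python
import sqlite3

# Physical level: the engine decides how rows are laid out in pages; we never see it.
conn = sqlite3.connect(":memory:")

# Logical level: the table definition that designers and programmers work with.
conn.execute("""
    CREATE TABLE employees (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        dept   TEXT NOT NULL,
        salary REAL NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", 52000.0), (2, "Ravi", "Sales", 48000.0), (3, "Meena", "HR", 50000.0)],
)

# View level: a directory view that hides the salary column from most users.
conn.execute("CREATE VIEW staff_directory AS SELECT name, dept FROM employees")

directory = conn.execute("SELECT name, dept FROM staff_directory ORDER BY name").fetchall()
```

Users of `staff_directory` see only names and departments, and none of the code above needs to know how the data is stored on disk.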
Database Management System Facility
• Structured Query Language (SQL)
• Data definition language (DDL) – enables users to define the database schema
• Data manipulation language (DML) – supports querying and modifying data
• Security system – supports user authentication and authorization
• Integrity system – implements constraints to ensure the accuracy, consistency, and
reliability of data
• Concurrency control system – manages simultaneous access by multiple users
• Backup & recovery system – provides periodic data backups and recovery
mechanisms after failures
• View mechanism
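A short sketch of three of these facilities (DDL, DML, and the integrity system) is given below, using SQLite through Python's `sqlite3` module as an example. The `results` table and its 0 to 100 marks rule are assumptions chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema, including an integrity constraint on marks.
conn.execute("""
    CREATE TABLE results (
        student_id INTEGER PRIMARY KEY,
        marks      INTEGER NOT NULL CHECK (marks BETWEEN 0 AND 100)
    )
""")

# DML: insert, update, and query rows.
conn.execute("INSERT INTO results VALUES (1, 72)")
conn.execute("UPDATE results SET marks = 75 WHERE student_id = 1")
marks = conn.execute("SELECT marks FROM results WHERE student_id = 1").fetchone()[0]

# Integrity system: the DBMS rejects a row that violates the CHECK constraint.
try:
    conn.execute("INSERT INTO results VALUES (2, 140)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The constraint is declared once in the schema, so every program that writes to the table is held to the same rule.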
DBMS Environment
• Hardware
• Client-server architecture
• Software
• dbms, os, network, application
• Data
• Schema, subschema, table, attribute
• People
• Data administrator & database administrator
• Database designer: logical & physical
• Application programmer
• End-user: naive & sophisticated
• Procedure
• Start, stop, log on, log off, back up, recovery
DBMS Environment
1. Hardware
• Client-Server Architecture: The physical infrastructure that hosts the DBMS, which includes:
• Servers to host the database.
• Client systems for user interaction.
• Network components for communication.
2. Software
• DBMS Software: The database management system itself.
• Operating System (OS): The system software supporting the DBMS.
• Network Software: Facilitates communication in distributed systems.
• Application Software: Interfaces for interacting with the database (e.g., applications, front-end
tools).
• Data
• Schema: The overall structure of the database, including logical design.
• Subschema: Specific views or parts of the schema for user roles or applications.
• Tables: Collections of rows and columns storing data.
• Attributes: Columns in a table representing data fields.
• People
• Data Administrator: Defines data policies and ensures data integrity.
• Database Administrator (DBA): Manages the DBMS, including maintenance, security, and performance.
• Database Designer:
• Logical Designer: Focuses on the conceptual and logical structure.
• Physical Designer: Optimizes the storage and retrieval mechanisms.
• Application Programmer: Develops applications that interact with the database.
• End-Users:
• Naive Users: Use predefined queries and interfaces.
• Sophisticated Users: Write their own queries and interact directly with the DBMS.
• Procedure
• Start: Initialize the DBMS and its components.
• Stop: Properly shut down the DBMS.
• Log On: Authenticate and access the system.
• Log Off: Securely exit the system.
• Back Up: Create copies of the database for disaster recovery.
• Recovery: Restore the database to a consistent state after a failure.
Advantages of DBMS
• Control redundancy
• Consistency
• Integrity: accuracy, consistency, and reliability
• Security
• Concurrency control
• Backup & recovery
• Data standard
• More information
• Data sharing & conflict control
• Productivity & accessibility
• Economy of scale
• Maintenance
Need for DBMS
A Database Management System is system software for easy, efficient, and reliable data processing and
management. It can be used for:
• Creation of a database.
• Retrieval of information from the database.
• Updating the database.
• Managing a database.
• Multiple User Interface
1. Data Organization and Management:
One of the primary needs for a DBMS is data organization and management. DBMSs allow data to be stored
in a structured manner, which helps in easier retrieval and analysis. A well-designed database schema
enables faster access to information, reducing the time required to find relevant data. A DBMS also
provides features like indexing and searching, which make it easier to locate specific data within the
database. This allows organizations to manage their data more efficiently and effectively.
2. Data Security and Privacy:
DBMSs provide a security framework that helps protect the confidentiality, integrity, and availability of data.
They offer authentication and authorization features that control access to the database. Many DBMSs also provide
encryption capabilities to protect sensitive data from unauthorized access. These features can help organizations
store and manage their data in compliance with data privacy regulations such as the GDPR, although compliance
also depends on how the system is configured and used.
3. Data Integrity and Consistency:
Data integrity and consistency are crucial for any database. DBMSs provide mechanisms that ensure the accuracy
and consistency of data. These mechanisms include constraints, triggers, and stored procedures that enforce data
integrity rules. DBMSs also provide features like transactions that ensure that data changes are atomic, consistent,
isolated, and durable (ACID).
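Atomicity in particular is easy to demonstrate. The sketch below uses SQLite via Python's `sqlite3` module; the `accounts` table, the `transfer` helper, and the balances are hypothetical. The `with conn:` block commits if the body succeeds and rolls back if an exception escapes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        acc_no  TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE acc_no = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE acc_no = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "A", "B", 30)       # succeeds: A=70, B=80
failed = transfer(conn, "A", "B", 500)  # would overdraw A, so the credit to B is undone too
balances = dict(conn.execute("SELECT acc_no, balance FROM accounts"))
```

In the failed transfer the credit to B has already been applied when the debit violates the constraint; the rollback removes it, so the database never shows money created from nothing.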
4. Concurrent Data Access:
A DBMS provides a concurrent access mechanism that allows multiple users to access the same data
simultaneously. This is especially important for organizations that require real-time data access. DBMSs use
concurrency-control mechanisms such as locking so that multiple users can access the same data without causing
conflicts or data corruption.
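The idea behind locking can be sketched with ordinary threads. This is only an in-process analogy, not how a DBMS lock manager is implemented: the `seats_available` dictionary, the train name, and the `book_seat` function are assumptions, and a `threading.Lock` plays the role of a lock on one row.

```python
import threading

seats_available = {"train_101": 100}
seat_lock = threading.Lock()  # plays the role of the DBMS lock on the row

def book_seat(train, times):
    """Book `times` seats, one read-modify-write step at a time."""
    for _ in range(times):
        with seat_lock:  # only one "transaction" may read-modify-write at a time
            current = seats_available[train]
            seats_available[train] = current - 1

threads = [threading.Thread(target=book_seat, args=("train_101", 10)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

remaining = seats_available["train_101"]
```

Without the lock, two threads could read the same `current` value and both write back the same result, losing one booking; the lock forces the updates to happen one after another.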
5. Data Analysis and Reporting:
Many DBMSs provide or integrate with tools for data analysis and reporting. These tools allow organizations to extract useful
insights from their data, enabling better decision-making. Depending on the product, they may support techniques such as
OLAP, data mining, and machine learning, as well as data visualization and reporting features
that present data in an understandable way.
6. Scalability and Flexibility:
DBMSs provide scalability and flexibility, enabling organizations to handle increasing amounts of data.
DBMSs can be scaled horizontally by adding more servers or vertically by increasing the capacity of
existing servers. This makes it easier for organizations to handle large amounts of data without
compromising performance. Moreover, DBMSs provide flexibility in terms of data modeling, enabling
organizations to adapt their databases to changing business requirements.
7. Cost-Effectiveness:
DBMSs are cost-effective compared to traditional file-based systems. They reduce storage costs by
eliminating redundancy and optimizing data storage. They also reduce development costs by providing
tools for database design, maintenance, and administration. Moreover, DBMSs reduce operational costs
by automating routine tasks and providing self-tuning capabilities.
Structure of Database Management System

• A Database Management System (DBMS) is software that allows access to data stored in a database and provides an
easy and effective method of –
• Defining the information.
• Storing the information.
• Manipulating the information.
• Protecting the information from system crashes or data theft.
• Differentiating access permissions for different users.

• The database system is divided into three components:
• Query Processor
• Storage Manager
• Disk Storage
1. Query Processor: It translates the requests (queries) received from the end user via an application program into
instructions that the DBMS can execute, and then executes them.
The Query Processor contains the following components –
• DML Compiler: It translates DML statements into low-level instructions that the query evaluation engine can
execute.
• DDL Interpreter: It processes DDL statements into a set of tables containing metadata (data about
data).
• Embedded DML Pre-compiler: It converts DML statements embedded in an application program into
procedure calls.
• Query Optimizer: It chooses an efficient way to evaluate a query from among the possible execution plans; the
resulting instructions are then run by the query evaluation engine.
2. Storage Manager: The Storage Manager is a program that provides an interface between the data stored in the
database and the queries received. It is also known as the Database Control System. It maintains the consistency
and integrity of the database by applying constraints and executing DCL statements. It is responsible
for updating, storing, deleting, and retrieving data in the database.
It contains the following components –
• Authorization Manager: It enforces role-based access control, i.e., it checks whether a particular user is
privileged to perform the requested operation.
• Integrity Manager: It checks the integrity constraints when the database is modified.
• Transaction Manager: It controls concurrent access by scheduling the operations of the transactions it
receives, so that the database remains in a consistent state before and after
the execution of a transaction.
• File Manager: It manages the file space and the data structures used to represent information in the
database.
• Buffer Manager: It manages the buffer (cache) in main memory and the transfer of data between secondary
storage and main memory.
3. Disk Storage: It contains the following components –
• Data Files: They store the data itself.
• Data Dictionary: It contains information about the structure of every database object. It is the repository
that holds the metadata.
• Indices: They provide faster retrieval of data items.
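The interplay between the query optimizer and the indices can be observed directly in some systems. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` statement through Python's `sqlite3` module; the `students` table, the index name, and the `plan_for` helper are assumptions for the example, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT, major TEXT)")
conn.executemany(
    "INSERT INTO students (name, major) VALUES (?, ?)",
    [(f"student{i}", "CS" if i % 2 else "EE") for i in range(1000)],
)

def plan_for(conn, sql, params=()):
    """Return the human-readable steps of the plan SQLite's optimizer chose."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # the last column holds the description

query = "SELECT student_id FROM students WHERE name = ?"
plan_before = plan_for(conn, query, ("student42",))  # no index yet: full table scan

conn.execute("CREATE INDEX idx_students_name ON students (name)")
plan_after = plan_for(conn, query, ("student42",))   # optimizer now searches via the index
```

The query text is identical in both cases; only the plan chosen by the optimizer changes once an index exists.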
The architecture of a Database Management System (DBMS) can also be described in terms of three levels: the
Internal Level, the Conceptual Level, and the External Level.
1. Internal Level: This level represents the physical storage of data in the database. It is responsible for storing
and retrieving data from the storage devices, such as hard drives or solid-state drives. It deals with low-level
implementation details such as data compression, indexing, and storage allocation.
2. Conceptual Level: This level represents the logical view of the database. It deals with the overall
organization of data in the database and the relationships among data items. It defines the data schema, which
includes tables, attributes, and their relationships. The conceptual level is independent of any specific DBMS
and can be implemented using different DBMSs.
3. External Level: This level represents the user’s view of the database. It deals with how users access the
data in the database. It allows users to view data in a way that makes sense to them, without worrying about
the underlying implementation details. The external level provides a set of views or interfaces to the database,
which are tailored to meet the needs of specific user groups.
• The three levels are connected through schema mappings that translate requests and results from one level to
another. Because of these mappings, a change at one level can often be absorbed by updating the mapping rather
than by changing the levels above it.

• In addition to these three levels, a database system relies on a Database Administrator (DBA), who is
responsible for managing the database system. The DBA handles tasks such as database design,
security management, backup and recovery, and performance tuning.
1. Physical Data Independence:
Physical Data Independence is defined as the ability to make changes in the structure of the lowest level of
the Database Management System (DBMS) without affecting the higher-level schemas. Hence, a
modification at the Physical level should not result in any changes at the Logical or View levels.

• There are 3 levels in the schema architecture of a DBMS: Physical level, Logical level, and View level (arranged
from the lowest to the highest level).
How is Physical Data Independence achieved?
• Physical Data Independence is achieved by modifying the physical layer to logical layer mapping (PL-LL
mapping). We must ensure that the modification we have made is localized.
2. Logical Data Independence:
• Logical Data Independence is defined as the ability to make changes in the structure of the middle level of
the Database Management System (DBMS) without affecting the highest-level schema or application
programs. Hence, a modification at the logical level should not result in any changes at the view level or in
application programs.
Example –
Changes at the middle level (logical level) include adding new attributes to a relation, deleting existing attributes
of the relation, etc. Ideally, we would not want to change any applications or programs that do not
use the modified attribute.
How is Logical Data Independence achieved?
Logical Data Independence is achieved by modifying the view layer to logical layer mapping (VL-LL mapping).
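Both kinds of independence can be sketched with one small experiment: an application that reads through a view keeps getting the same answer while the levels beneath it change. SQLite (via Python's `sqlite3`) is used as an example engine; the `students` table, the `student_names` view, the index, and the added `email` column are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT, major TEXT)")
conn.executemany("INSERT INTO students VALUES (?, ?, ?)", [(1, "Asha", "CS"), (2, "Ravi", "EE")])

# View level: the only thing the application queries.
conn.execute("CREATE VIEW student_names AS SELECT student_id, name FROM students")

def application_query(conn):
    """The unchanged application code, written against the view only."""
    sql = "SELECT student_id, name FROM student_names ORDER BY student_id"
    return conn.execute(sql).fetchall()

baseline = application_query(conn)

# Physical-level change: add an index. Access paths change, results do not.
conn.execute("CREATE INDEX idx_students_major ON students (major)")
after_physical_change = application_query(conn)

# Logical-level change: add an attribute. The view still exposes the same columns.
conn.execute("ALTER TABLE students ADD COLUMN email TEXT")
after_logical_change = application_query(conn)
```

Here the view definition is the view-to-logical mapping; had the change removed or renamed a column the view uses, the mapping (the view definition) would be updated instead of the application.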
Basic terminologies of Database
• Entity − An entity is a specific real-world thing or idea that we wish to represent and keep data about. For
instance, students, professors, courses, and departments might all be considered entities in a university database.
• Attribute − An attribute represents a particular property or characteristic of an entity. It specifies the information
about the entity that we wish to store. A student entity, for instance, may include attributes like a student ID,
name, date of birth, and major.
• Key − A key is an attribute or set of attributes that uniquely identifies an entity instance. Keys are necessary for data
integrity and efficient data retrieval. To ensure that each student has a distinct identifier,
the student ID, for instance, may act as the primary key of the student entity.
• Table − A table is the core structure a relational database system uses to organize data into rows and columns. Each
table is made up of columns (attributes) and rows (records), and it typically represents a single entity type. For instance, a table
called "Students" may have columns for student data such as student ID, name, and major.
• Primary Key − A primary key is a column or group of columns that uniquely identifies each row in a table. It
guarantees that each row in the table can be identified individually. The student
ID column, for instance, may serve as the primary key of the "Students" table.
• Foreign Key − A column or group of columns in one table that refers to the primary key of another table is
known as a foreign key. This creates a connection between the two tables. For instance, to link students with the
courses they are registered for, a student ID column in an enrollment table can reference the primary key of
the "Students" table.
• Relational Database − A relational database is a kind of database system that arranges information into tables
and uses keys to create relationships between those tables. It provides a systematic and effective method of
managing data by adhering to the fundamentals of the relational model. Popular relational database systems
include PostgreSQL, Oracle, and MySQL.
• Query − Requesting data or information from a database is known as a query. It enables users to obtain,
manipulate, and manage data and is described using a query language like SQL. For instance, a query may
return a list of every student registered for a certain course.
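The terms defined so far (table, primary key, foreign key, query) can be tied together in one sketch. It uses SQLite through Python's `sqlite3` module; the `enrollments` linking table, the column names, and the sample rows are assumptions made for the example rather than part of the definitions above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE students (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        major      TEXT
    );
    CREATE TABLE courses (
        course_id TEXT PRIMARY KEY,
        title     TEXT NOT NULL
    );
    -- Hypothetical linking table: one row per (student, course) registration.
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL REFERENCES students (student_id),
        course_id  TEXT    NOT NULL REFERENCES courses (course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO students VALUES (1, 'Asha', 'CS'), (2, 'Ravi', 'EE');
    INSERT INTO courses VALUES ('DB101', 'Database Systems');
    INSERT INTO enrollments VALUES (1, 'DB101'), (2, 'DB101');
""")

# Query: every student registered for a given course.
registered = conn.execute(
    """
    SELECT s.name
    FROM students AS s
    JOIN enrollments AS e ON e.student_id = s.student_id
    WHERE e.course_id = ?
    ORDER BY s.name
    """,
    ("DB101",),
).fetchall()

# The foreign key rejects an enrollment for a student who does not exist.
try:
    conn.execute("INSERT INTO enrollments VALUES (99, 'DB101')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The composite primary key on `enrollments` also shows that a primary key may be a group of columns rather than a single one.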
• Index − An index is a data structure associated with a database table that speeds up data retrieval. Based on
the indexed column(s), it enables quick access to specific rows. In the "Students" table, for instance, an
index on the student ID column would speed up searches for students by their ID.
• Normalization − Normalization is the process of organizing data in a database to reduce duplication and
strengthen data integrity. It entails breaking down tables and creating relationships between them. Normal
forms are the sets of rules that the normalization process follows. For example, normalization can involve
dividing a single table containing student and course data into distinct "Students" and "Courses" tables.
• ACID − Atomicity, Consistency, Isolation, and Durability, or simply ACID, are properties that guarantee the
reliability and integrity of database transactions. Atomicity ensures that a transaction is treated as a single unit of work
that either completes entirely or has no effect. Consistency guarantees that a transaction moves the database from one
valid state to another. Isolation prevents concurrent transactions from interfering with one another. Durability
ensures that changes made by a committed transaction are permanent and survive later system failures.
• Data Warehouse − A data warehouse is a large, integrated, and unified collection of an organization's information
from numerous sources. It is intended to be used for decision-making, reporting, and analysis. For the purposes of
business intelligence and data analytics, data warehouses often store historical and aggregated data.
• Data Mining − Finding patterns, trends, and insights from huge databases is referred to as data mining. In order to
extract useful knowledge and information, statistical and machine-learning approaches are applied. To find hidden
patterns and provide predictions or suggestions based on the data, data mining techniques are utilized.
• Backup and Recovery − Backup and recovery procedures are crucial for guaranteeing data availability and guarding
against data loss. In order to offer a restoration point in the event of a system failure or data corruption, backup
entails making copies of the database at regular intervals. Recovery entails using the backup copies to restore the
database to a consistent state.
For instance, a database administrator may schedule daily backups of a customer database so that client data can be
recovered after hardware failures or accidental deletions.
• Replication − Replication is the process of creating and maintaining copies of a database or specific parts of a database on
several servers. It increases fault tolerance, scalability, and data availability. Changes made to one replica are
propagated to the others either synchronously (before the change is acknowledged) or asynchronously (shortly afterwards).
For instance, in a distributed e-commerce system, product information may be replicated over several servers to make sure that
users can easily access product details no matter where they are.
• Data Dictionary − A data dictionary, sometimes referred to as a metadata repository, is a central repository for details on
the objects and the schema of a database. It includes metadata such as table and column names, data types, constraints, and
relationships between tables. The DBMS uses the data dictionary to validate queries, enforce data integrity, and provide details
on the database structure.
The "Employees" table, for instance, may be described in the data dictionary together with the names and data types of its
columns, such as "Employee ID," "First Name," "Last Name," and "Salary."
• Database Schema − A database schema defines the logical organization and structure of a database. It describes the tables,
columns, data types, constraints, and relationships between the tables. The schema provides a blueprint for building and running
the database. A database schema for an online shop, for instance, would have tables like "Books," "Authors," and
"Orders," each with specific fields, data types, and relationships.
Database system Architecture

• A Database Architecture is a representation of DBMS design. It helps to design, develop, implement, and
maintain the database management system. A DBMS architecture allows dividing the database system into
individual components that can be independently modified, changed, replaced, and altered. It also helps to
understand the components of a database.
• A Database stores critical information and helps access data quickly and securely. Therefore, selecting the
correct Architecture of DBMS helps in easy and efficient data management.
Types of DBMS Architecture
DBMS architectures are commonly classified by the number of tiers:
• One Tier Architecture (Single Tier Architecture): e.g., billing systems and commission systems
• Two Tier Architecture: e.g., railway reservation system
• Three Tier Architecture
• N-tier Architecture
1-tier architecture:
• One-tier architecture involves putting all of the required components for a
software application or technology on a single server or platform.
• Simple Architecture: 1-Tier Architecture is the simplest architecture to
set up, as only a single machine is required to maintain it.
• Cost-Effective: No additional hardware is required for implementing 1-Tier
Architecture, which makes it cost-effective.
• Easy to Implement: 1-Tier Architecture can be easily deployed, and hence
it is mostly used in small projects.
• Basically, a one-tier architecture keeps all of the elements of an application, including the interface, middleware, and
back-end data, in one place.

2-tier architecture:
• The two-tier architecture is based on the client-server model. Communication takes place directly
between client and server; there is no intermediate layer between them.

• APIs like ODBC and JDBC are used for this interaction. The server side is
responsible for providing query processing and transaction management
functionalities. On the client side, the user interfaces and application
programs are run. The application on the client side establishes a
connection with the server side to communicate with the DBMS.
• An advantage of this type is that it is easier to maintain and understand,
and it is compatible with existing systems. However, this model gives
poor performance when there are a large number of users.
Advantages of 2-Tier Architecture
• Easy to Access: 2-Tier Architecture gives clients direct access
to the database, which makes retrieval fast.
• Scalable: For a moderate number of users, the system can be scaled by adding
clients or upgrading hardware.
• Low Cost: 2-Tier Architecture is cheaper than 3-Tier
Architecture and Multi-Tier Architecture.
• Easy Deployment: 2-Tier Architecture is easier to
deploy than 3-Tier Architecture.
• Simple: 2-Tier Architecture is easily understandable as
well as simple because of only two components.
3-tier architecture:
• A 3-tier architecture separates its tiers from each other
based on the complexity of the users and how they use
the data present in the database. It is the most widely
used architecture to design a DBMS.
This architecture is used in different ways by different applications, including web applications and distributed
applications. It is particularly strong in distributed systems.
• Database (Data) Tier − At this tier, the database resides along with its query processing languages. We also have the
relations that define the data and their constraints at this level.
• Application (Middle) Tier − At this tier reside the application server and the programs that access the database. For a
user, this application tier presents an abstracted view of the database. End-users are unaware of any existence of the
database beyond the application. At the other end, the database tier is not aware of any other user beyond the
application tier. Hence, the application layer sits in the middle and acts as a mediator between the end-user and the
database.
• User (Presentation) Tier − End-users operate on this tier and they know nothing about any existence of the database
beyond this layer. At this layer, multiple views of the database can be provided by the application. All views are
generated by applications that reside in the application tier.
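The separation of the three tiers can be sketched in a few lines. In a real deployment the tiers usually run as separate processes or machines; here they are plain Python objects in one process so that the boundaries are easy to see. The class and function names (`StudentStore`, `StudentService`, `render_student`) and the sample data are hypothetical.

```python
import sqlite3

# --- Database (data) tier: owns the connection and the SQL. ---
class StudentStore:
    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT)")
        self._conn.executemany("INSERT INTO students VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])

    def fetch_name(self, student_id):
        row = self._conn.execute(
            "SELECT name FROM students WHERE student_id = ?", (student_id,)
        ).fetchone()
        return row[0] if row else None

# --- Application (middle) tier: business rules, no SQL and no formatting. ---
class StudentService:
    def __init__(self, store):
        self._store = store

    def lookup(self, student_id):
        if student_id <= 0:
            raise ValueError("student_id must be positive")
        return self._store.fetch_name(student_id)

# --- User (presentation) tier: talks only to the application tier. ---
def render_student(service, student_id):
    name = service.lookup(student_id)
    if name is None:
        return f"Student {student_id}: not found"
    return f"Student {student_id}: {name}"

service = StudentService(StudentStore())
found = render_student(service, 1)
missing = render_student(service, 7)
```

The presentation function never sees a connection or a SQL string, and the store never sees how results are displayed, which mirrors the mediator role of the application tier described above.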
n-tier architecture:
An n-tier architecture divides an application into multiple tiers. Most commonly there are three:
1. the logic tier,
2. the presentation tier, and
3. the data tier.
• It is the physical separation of the different parts of the application as opposed to the usually
conceptual or logical separation of the elements in the model-view-controller (MVC)
framework.
• Another difference from the MVC framework is that n-tier layers are connected linearly,
meaning all communication must go through the middle layer, which is the logic tier.
• In MVC, there is no actual middle layer because the interaction is triangular; the control
layer has access to both the view and model layers and the model also accesses the view; the
controller also creates a model based on the requirements and pushes this to the view.
• However, they are not mutually exclusive, as the MVC framework can be used in
conjunction with the n-tier architecture, with the n-tier being the overall architecture used and
MVC used as the framework for the presentation tier.
Data Models
• A data model is a way of modeling the data description, data semantics, and consistency constraints
of the data. It provides the conceptual tools for describing the design of a database at each level
of data abstraction. The following four data models are used for understanding the
structure of a database:
1) Relational Data Model: This model organizes data in rows and columns within tables. Thus, a
relational model uses tables to represent both data and the relationships among them. Tables are also called relations. This
model was first described by Edgar F. Codd in 1969. The relational data model is the most widely used model and is
primarily used by commercial data processing applications.
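As a small illustration of data held in rows and columns, the following sketch builds one relation with Python's built-in `sqlite3` module and queries it. The table name, column names, and sample rows are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A relation (table): each column is an attribute, each row is a tuple.
conn.execute("CREATE TABLE STUDENT (STUD_NO INTEGER, SNAME TEXT, ADDRESS TEXT)")
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?)",
    [(1, "Shyam", "Delhi"), (2, "Rakesh", "Kolkata"), (3, "Suraj", "Delhi")],
)

# Select the rows whose ADDRESS column equals 'Delhi'.
cursor = conn.execute(
    "SELECT * FROM STUDENT WHERE ADDRESS = ? ORDER BY STUD_NO", ("Delhi",)
)
columns = [d[0] for d in cursor.description]
delhi_rows = cursor.fetchall()
print(columns)
print(delhi_rows)
```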

2) Entity-Relationship Data Model: An ER model is a logical representation of data as objects and the relationships
among them. These objects are known as entities, and a relationship is an association among these entities. This model
was designed by Peter Chen and published in a 1976 paper. It is widely used in database design. A set of attributes
describes each entity. For example, student_name and student_id describe the 'student' entity. A set of entities of the
same type is known as an 'entity set', and a set of relationships of the same type is known as a 'relationship set'.
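The terms entity, attribute, entity set, and relationship set can be made concrete with a short sketch. The `Course` entity, the `enrolls` relationship, and all sample values are hypothetical additions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Student:            # entity type, described by its attributes
    student_id: int
    student_name: str

@dataclass(frozen=True)
class Course:             # hypothetical second entity type
    course_id: str
    title: str

# Entity sets: collections of entities of the same type.
students = {Student(1, "Shyam"), Student(2, "Rakesh")}
courses = {Course("C001", "Databases")}

# Relationship set: each element associates one student with one course.
enrolls = {(1, "C001"), (2, "C001")}

# Every relationship instance must refer to existing entities.
student_ids = {s.student_id for s in students}
course_ids = {c.course_id for c in courses}
consistent = all(s in student_ids and c in course_ids for s, c in enrolls)
print(consistent)
```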

3) Object-based Data Model: An extension of the ER model with notions of functions, encapsulation, and object
identity. This model supports a rich type system that includes structured and collection types. In the 1980s,
various database systems following the object-oriented approach were developed. Here, objects are data items that
carry their own properties.
4) Semi-structured Data Model: This type of data model differs from the other three
data models explained above. The semi-structured data model allows data specifications
in which individual data items of the same type may have different sets of attributes.
The Extensible Markup Language, also known as XML, is widely used for representing
semi-structured data. Although XML was initially designed for adding markup
information to text documents, it gained importance because of its application in the
exchange of data.
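To see what "items of the same type with different attribute sets" looks like in practice, the sketch below parses a small XML document with Python's standard `xml.etree.ElementTree` module. The document content is a made-up example.

```python
import xml.etree.ElementTree as ET

# Two <student> items of the same type, but with different attribute sets:
# the first has a phone, the second has an email instead.
doc = """
<students>
  <student><name>Shyam</name><phone>123456789</phone></student>
  <student><name>Rakesh</name><email>rakesh@example.com</email></student>
</students>
""".strip()

root = ET.fromstring(doc)
attribute_sets = [sorted(child.tag for child in student)
                  for student in root.findall("student")]
print(attribute_sets)
```

A relational table would need a fixed column list for both students; here each item simply carries the fields it has.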
ER (Entity Relationship) Diagram in DBMS

• ER model stands for an Entity-Relationship model. It is a high-level data model. This model is used to define the
data elements and relationship for a specified system.
• It develops a conceptual design for the database. It also gives a simple, easy-to-design view of the data.
• In ER modeling, the database structure is portrayed as a diagram called an entity-relationship diagram.
• For example, suppose we design a school database. In this database, the student will be an entity with
attributes like address, name, id, age, etc. The address can be another entity with attributes like city, street
name, pin code, etc., and there will be a relationship between them.
Component of ER Diagram
1. Entity:
• An entity may be any object, class, person or place. In the ER diagram, an entity is represented as a rectangle.
• Consider an organization as an example: manager, product, employee, department, etc. can each be taken as an entity.

a. Weak Entity

• An entity that depends on another entity is called a weak entity. The weak entity doesn't contain any key attribute
of its own. The weak entity is represented by a double rectangle.
Strong Entity Type:
• It is an entity that has its own existence and is independent.
• The entity relationship diagram represents a strong entity type with the help of a single rectangle.
• In the example ERD of a strong entity type, "Customer" is the entity type with attributes such as ID, Name, Gender, and
Phone Number. Customer is a strong entity type as it has a unique ID for each customer.
2. Attribute
• The attribute is used to describe the property of an entity. An ellipse is used to represent an attribute.
• For example, id, age, contact number, name, etc. can be attributes of a student.
a. Key Attribute
• The key attribute is used to represent the main characteristics of an entity. It represents a primary key. The key
attribute is represented by an ellipse with the text underlined.
b. Composite Attribute
• An attribute that is composed of many other attributes is known as a composite attribute. The composite attribute is
represented by an ellipse, and the ellipses of its component attributes are connected to that ellipse.
c. Multivalued Attribute
• An attribute can have more than one value. Such attributes are known as multivalued attributes. A double oval is
used to represent a multivalued attribute.
• For example, a student can have more than one phone number.
d. Derived Attribute
• An attribute that can be derived from another attribute is known as a derived attribute. It is represented by a
dashed ellipse.
• For example, a person's age changes over time and can be derived from another attribute such as date of birth.
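The four attribute kinds can be illustrated together in one small sketch. The class layout, field names, and the `age_on` helper are assumptions made for illustration; only the attribute kinds themselves come from the text above.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Address:                       # composite attribute: made of sub-attributes
    street: str
    city: str
    pin_code: str

@dataclass
class Student:
    student_id: int                  # key attribute
    name: str
    date_of_birth: date
    address: Address                 # composite attribute
    phone_numbers: list[str] = field(default_factory=list)  # multivalued attribute

    def age_on(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)

s = Student(1, "Shyam", date(2000, 6, 15),
            Address("MG Road", "Delhi", "110001"),
            ["123456789", "987654321"])
print(s.age_on(date(2024, 6, 14)), s.age_on(date(2024, 6, 15)))
```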
3. Relationship
A relationship is used to describe the relation between entities. Diamond or rhombus is used to represent the
relationship.

Types of relationship are as follows:


a. One-to-One Relationship
• When each instance of one entity is associated with at most one instance of the other entity, and vice versa, it is
known as a one-to-one relationship.
• For example, a female can marry one male, and a male can marry one female.
b. One-to-many relationship
• When one instance of the entity on the left can be associated with more than one instance of the entity on the right,
while each instance on the right is associated with only one on the left, it is known as a one-to-many relationship.
• For example, a scientist can invent many inventions, but each invention is made by only one specific scientist.

c. Many-to-one relationship
• When more than one instance of the entity on the left can be associated with the same instance of the entity on the
right, while each instance on the left is associated with only one on the right, it is known as a many-to-one relationship.
• For example, a student enrolls in only one course, but a course can have many students.
d. Many-to-many relationship
• When more than one instance of the entity on the left can be associated with more than one instance of the entity on
the right, it is known as a many-to-many relationship.
• For example, an employee can be assigned to many projects, and a project can have many employees.
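The four cardinalities can be told apart by checking whether a left-side or right-side instance appears in more than one pair. The function below is a rough sketch under that assumption; the name `cardinality` and the sample pairs are hypothetical. Note that cardinality is a design constraint, so sample data can only show which type the data is consistent with.

```python
from collections import Counter

def cardinality(pairs: set[tuple[str, str]]) -> str:
    """Classify a relationship set given as (left, right) pairs."""
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    # A left instance paired with several right instances means "one ... to many".
    left_repeats = any(c > 1 for c in left_counts.values())
    # A right instance paired with several left instances means "many ... to one".
    right_repeats = any(c > 1 for c in right_counts.values())
    if left_repeats and right_repeats:
        return "many-to-many"
    if left_repeats:
        return "one-to-many"
    if right_repeats:
        return "many-to-one"
    return "one-to-one"

invents = {("scientist_1", "invention_a"), ("scientist_1", "invention_b")}
enrolls = {("student_1", "course_x"), ("student_2", "course_x")}
works_on = {("emp_1", "proj_1"), ("emp_1", "proj_2"), ("emp_2", "proj_1")}
print(cardinality(invents), cardinality(enrolls), cardinality(works_on))
```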
Different Types of Database Keys
• Candidate Key
• Primary Key
• Super Key
• Alternate Key
• Foreign Key
• Composite Key
Candidate Key
A minimal set of attributes that can uniquely identify a tuple is known as a candidate key. For example,
STUD_NO in the STUDENT relation.
• It is a minimal super key: a super key from which no attribute can be removed without losing uniqueness.
• It must contain unique values.
• A candidate key that is not chosen as the primary key may contain NULL values in some systems (for example, a
UNIQUE column in SQL); the primary key never may.
• Every table must have at least one candidate key.
• A table can have multiple candidate keys but only one primary key.
Example:
STUD_NO is the candidate key for relation STUDENT.

STUD_NO   SNAME    ADDRESS   PHONE
1         Shyam    Delhi     123456789
2         Rakesh   Kolkata   223365796
3         Suraj    Delhi     175468965

The candidate key can be simple (having only one attribute) or composite as well.
Example:
{STUD_NO, COURSE_NO} is a composite candidate key for relation STUDENT_COURSE.
Table STUDENT_COURSE

STUD_NO   TEACHER_NO   COURSE_NO
1         001          C001
2         056          C005
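The idea of a "minimal set of attributes that uniquely identifies a tuple" can be checked mechanically against sample rows. The sketch below is an illustration only; the helper names `is_unique` and `minimal_unique_sets` are hypothetical. Uniqueness in a sample is necessary but not sufficient: SNAME happens to be unique in these three rows, yet a designer would not declare it a candidate key because two students can share a name.

```python
from itertools import combinations

def is_unique(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def minimal_unique_sets(rows):
    """Attribute sets that are unique in `rows` and contain no smaller unique set."""
    attrs = list(rows[0])
    found = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            if any(set(k) <= set(combo) for k in found):
                continue  # contains a smaller unique set, so it is not minimal
            if is_unique(rows, combo):
                found.append(combo)
    return found

student_rows = [
    {"STUD_NO": 1, "SNAME": "Shyam", "ADDRESS": "Delhi", "PHONE": "123456789"},
    {"STUD_NO": 2, "SNAME": "Rakesh", "ADDRESS": "Kolkata", "PHONE": "223365796"},
    {"STUD_NO": 3, "SNAME": "Suraj", "ADDRESS": "Delhi", "PHONE": "175468965"},
]
print(minimal_unique_sets(student_rows))
```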
Primary Key
There can be more than one candidate key in a relation, out of which one is chosen as the primary key. For
example, STUD_NO and PHONE are both candidate keys for relation STUDENT, but STUD_NO can
be chosen as the primary key (only one out of many candidate keys).
• It is a unique key.
• Each value identifies exactly one tuple (a record).
• It has no duplicate values; every value is unique.
• It cannot be NULL.
• A primary key need not be a single column; more than one column can together form the primary key of a
table.
• Example: STUDENT table -> Student(STUD_NO, SNAME, ADDRESS, PHONE), where STUD_NO is the primary key.
Table STUDENT

STUD_NO   SNAME    ADDRESS   PHONE
1         Shyam    Delhi     123456789
2         Rakesh   Kolkata   223365796
3         Suraj    Delhi     175468965

Super Key
A set of attributes that can uniquely identify a tuple is known as a super key. For example, STUD_NO,
(STUD_NO, SNAME), etc. A super key is a set of one or more attributes that together identify the rows of a table.
Adding zero or more attributes to a candidate key generates a super key.
A candidate key is a super key, but the converse is not true.
Attributes added beyond the candidate key may contain NULL values, since the candidate key alone already
guarantees uniqueness.
Example:
Consider the table shown above.
(STUD_NO, PHONE) is a super key.
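The rule "adding zero or more attributes to the candidate key generates a super key" can be enumerated directly. This is a minimal sketch; the function name `super_keys` is hypothetical.

```python
from itertools import combinations

def super_keys(candidate_key, all_attrs):
    """All super keys obtained by adding zero or more attributes to a candidate key."""
    extras = [a for a in all_attrs if a not in candidate_key]
    result = []
    for size in range(len(extras) + 1):
        for added in combinations(extras, size):
            result.append(tuple(candidate_key) + added)
    return result

attrs = ["STUD_NO", "SNAME", "ADDRESS", "PHONE"]
keys = super_keys(("STUD_NO",), attrs)
for k in keys:
    print(k)
```

With one candidate-key attribute and three other attributes there are 2^3 = 8 such super keys, of which only the first, (STUD_NO), is minimal.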
Alternate Key
A candidate key other than the primary key is called an alternate key.

• All candidate keys that are not chosen as the primary key are called alternate keys.
• It is also called a secondary key.
• Like every candidate key, it must identify each record uniquely, so its values are not repeated.
• SNAME and ADDRESS are not alternate keys here: ADDRESS repeats (Delhi appears twice), and names are not guaranteed
to be unique.
Example:
Consider the table shown above.
STUD_NO and PHONE are both candidate keys for relation STUDENT. If STUD_NO is chosen as the primary key,
PHONE will be an alternate key.
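In SQL, an alternate key is commonly declared with a UNIQUE constraint next to the PRIMARY KEY. The sketch below uses Python's built-in `sqlite3` module to show both constraints rejecting duplicates; the helper `try_insert` is hypothetical, and the behavior shown is that of SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE STUDENT (
        STUD_NO INTEGER PRIMARY KEY,   -- chosen primary key
        SNAME   TEXT NOT NULL,
        ADDRESS TEXT,
        PHONE   TEXT UNIQUE            -- alternate key
    )
""")
conn.execute("INSERT INTO STUDENT VALUES (1, 'Shyam', 'Delhi', '123456789')")

def try_insert(row):
    """Return 'ok' if the row is accepted, 'rejected' on a key violation."""
    try:
        conn.execute("INSERT INTO STUDENT VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert((1, "Rakesh", "Kolkata", "223365796")),  # duplicate STUD_NO
    try_insert((2, "Rakesh", "Kolkata", "123456789")),  # duplicate PHONE
    try_insert((2, "Rakesh", "Kolkata", "223365796")),  # both keys unique
]
print(results)
```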
Foreign Key
• If an attribute can only take values that are present as values of some other attribute, it is
a foreign key to the attribute to which it refers. The relation being referenced is called the referenced
relation and the corresponding attribute is called the referenced attribute. The relation that refers to the
referenced relation is called the referencing relation and the corresponding attribute is called the referencing
attribute. The referenced attribute of the referenced relation should be its primary key.
• It is an attribute that acts as a primary key in one table and as a non-key (secondary) attribute in another table.
• It links two or more relations (tables).
• Foreign keys act as a cross-reference between the tables.
• For example, DNO is a primary key in the DEPT table and a non-key attribute in EMP.
Example:
• Refer to Table STUDENT shown above.
• STUD_NO in STUDENT_COURSE is a foreign key to STUD_NO in the STUDENT relation.
Table STUDENT_COURSE

STUD_NO   TEACHER_NO   COURSE_NO
1         005          C001
2         056          C005
• A foreign key can be NULL and may contain duplicate values, i.e. it need not follow the uniqueness
constraint.
• For example, STUD_NO in the STUDENT_COURSE relation need not be unique: the same student number can appear in
several tuples, one for each course the student takes. However, STUD_NO in the STUDENT relation is a primary key,
so it must always be unique and cannot be null.
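The foreign-key rules above (duplicates allowed, NULL allowed, dangling references rejected) can be observed with `sqlite3`. This is a sketch under the assumption that SQLite is the engine; SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`. The sample rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE STUDENT (STUD_NO INTEGER PRIMARY KEY, SNAME TEXT)")
conn.execute("""
    CREATE TABLE STUDENT_COURSE (
        STUD_NO   INTEGER REFERENCES STUDENT(STUD_NO),  -- referencing attribute
        COURSE_NO TEXT
    )
""")
conn.execute("INSERT INTO STUDENT VALUES (1, 'Shyam')")

conn.execute("INSERT INTO STUDENT_COURSE VALUES (1, 'C001')")
conn.execute("INSERT INTO STUDENT_COURSE VALUES (1, 'C005')")     # duplicate STUD_NO is fine
conn.execute("INSERT INTO STUDENT_COURSE VALUES (NULL, 'C001')")  # NULL is fine

try:
    # Student 99 does not exist in STUDENT, so this reference is dangling.
    conn.execute("INSERT INTO STUDENT_COURSE VALUES (99, 'C001')")
    dangling_accepted = True
except sqlite3.IntegrityError:
    dangling_accepted = False

row_count = conn.execute("SELECT COUNT(*) FROM STUDENT_COURSE").fetchone()[0]
print(dangling_accepted, row_count)
```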
Composite Key: Sometimes, a table might not have a single column/attribute that uniquely identifies all the records of the
table. To uniquely identify the rows of such a table, a combination of two or more columns/attributes can be used. A poorly
chosen combination can still give duplicate values in rare cases, so we need to choose a set of attributes that is guaranteed
to identify rows uniquely.
• A composite key can serve as the primary key when no single column qualifies.
• Two or more attributes are used together to make a composite key.
• Different combinations of attributes may differ in how reliably they identify rows uniquely.
• Example: FULLNAME + DOB can be combined to access the details of a student (two students could still share both,
which is the rare duplicate case noted above).
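A composite key can be declared in SQL by listing several columns in one PRIMARY KEY clause. The sketch below uses the composite candidate key {STUD_NO, COURSE_NO} of STUDENT_COURSE mentioned earlier; the inserted rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE STUDENT_COURSE (
        STUD_NO    INTEGER NOT NULL,
        COURSE_NO  TEXT    NOT NULL,
        TEACHER_NO TEXT,
        PRIMARY KEY (STUD_NO, COURSE_NO)   -- composite key: the pair must be unique
    )
""")

def try_insert(row):
    """Return 'ok' if the row is accepted, 'rejected' on a key violation."""
    try:
        conn.execute("INSERT INTO STUDENT_COURSE VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert((1, "C001", "001")),
    try_insert((1, "C005", "056")),  # same student, different course
    try_insert((2, "C001", "001")),  # same course, different student
    try_insert((1, "C001", "056")),  # same (student, course) pair again
]
print(results)
```

Neither column is unique on its own; only the combination is.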
Extended Entity-Relationship (EE-R) Model

• EER is a high-level data model that incorporates extensions to the original ER model. Enhanced ER diagrams are
high-level models that represent the requirements and complexities of a complex database.
• In addition to ER model concepts, EER includes:
• Subclasses and superclasses.
• Specialization and generalization.
• Category or union type.
• Aggregation.

Subclasses and Superclass

• A superclass is an entity that can be divided into further subtypes.
• For example, consider a Shape superclass.
• The superclass Shape has subgroups: Triangle, Square and Circle.
• Subclasses are groups of entities with some unique attributes. A subclass inherits the properties and attributes of its
superclass.
Specialization and Generalization
• Generalization is the process of combining several entities that share common attributes or properties into a single,
more general entity.
• It is a bottom-up process. For example, suppose we have three sub-entities: Car, Truck and Motorcycle. These three
entities can be generalized into one superclass named Vehicle.
• Specialization is the process of identifying subsets of an entity that share some distinguishing characteristic. It is a
top-down approach in which one entity is broken down into lower-level entities.
• In the above example, the Vehicle entity can be specialized into Car, Truck or Motorcycle.
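Superclass/subclass structure maps naturally onto class inheritance. The sketch below uses the Vehicle example; all attribute names (`registration_no`, `seats`, `payload_tonnes`, `engine_cc`) are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Vehicle:                  # superclass: attributes shared by every subclass
    registration_no: str

@dataclass
class Car(Vehicle):             # subclass: inherits registration_no, adds its own
    seats: int

@dataclass
class Truck(Vehicle):
    payload_tonnes: float

@dataclass
class Motorcycle(Vehicle):
    engine_cc: int

fleet = [Car("V-001", 5), Truck("V-002", 12.5), Motorcycle("V-003", 150)]

# Every subclass entity is also a Vehicle and carries the inherited attribute.
registrations = [v.registration_no for v in fleet if isinstance(v, Vehicle)]
print(registrations)
```

Reading the hierarchy upward (Car, Truck, Motorcycle to Vehicle) corresponds to generalization; reading it downward corresponds to specialization.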
Category or Union
• A category (union type) is a relationship of one subclass with more than
one superclass.
• For example, Owner is a subset of the union of two superclasses: Vehicle and
House.
Aggregation: Aggregation is a high-level abstraction in the Enhanced Entity-Relationship
(EER) model that treats a relationship as an entity itself. It is used when we need to
represent a relationship that involves another relationship.
• Sometimes, relationships themselves have attributes or need
to participate in other relationships. Since traditional ER
models do not allow relationships to be directly linked to
other relationships, aggregation helps solve this limitation.
Aggregation
Example: Aggregation in a University Database
Entities:
STUDENT (Student_ID, Name)
COURSE (Course_ID, Title)
INSTRUCTOR (Instructor_ID, Name)
Relationships:
TEACHES: Links INSTRUCTOR and COURSE
ENROLLS: Links STUDENT and COURSE
Diagram:
INSTRUCTOR ---- TEACHES ---- COURSE
                   |
              (Aggregation)
                   |
           TEACHING (Semester)
                   |
              DEPARTMENT
Problem:
What if we want to store additional information about
the TEACHES relationship, such as the Semester when a
course is taught, and relate each teaching assignment to another entity?
A descriptive attribute such as Semester can be attached to
the TEACHES relationship, but the basic ER model does not
let us directly link another entity
(e.g., DEPARTMENT) to a relationship.
Solution: Use Aggregation
We create an aggregated entity (TEACHING) that represents
the TEACHES relationship and allows it to participate in
another relationship.
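One way to picture the aggregated TEACHING entity is as a value that stands for a single TEACHES instance and can itself take part in another relationship. The sketch below is an assumption-laden illustration: the class name `Teaches`, the relationship name `oversees`, and all sample values are hypothetical, since the source only shows DEPARTMENT linked to TEACHING.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Teaches:
    """One TEACHES relationship instance, treated as an entity (TEACHING)."""
    instructor_id: int
    course_id: str
    semester: str      # attribute carried by the aggregate

# Hypothetical sample data.
teaching = Teaches(instructor_id=10, course_id="C001", semester="2024-Spring")

# A second relationship whose participant is the aggregate itself,
# linking each TEACHING instance to a DEPARTMENT.
oversees = {teaching: "Computer Science"}

# Because the aggregate is a hashable value, an equal instance finds the same entry.
department = oversees[Teaches(10, "C001", "2024-Spring")]
print(department)
```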
• Constraints – There are two types of constraints on the "sub-class" relationship.
• Total or Partial – A sub-classing relationship is total if every super-class entity must be associated with some
sub-class entity; otherwise it is partial. The sub-classing "job-type-based employee category" is partial: not
every employee is necessarily a secretary, engineer, or technician, i.e. the union of these three types is a proper
subset of all employees. The sub-classing "Salaried Employee and Hourly Employee", by contrast, is total: the union of
the entities in the sub-classes equals the full employee set, i.e. every employee must be one of them.
• Overlapped or Disjoint – If an entity from a super-class can occur in multiple sub-class sets, the sub-classing is
overlapped; otherwise it is disjoint. Both examples, job-type-based and salaried/hourly employee
sub-classing, are disjoint.
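Both constraints reduce to simple set checks: total means the sub-classes together cover the super-class, and disjoint means no entity occurs in two sub-classes. The function below is a minimal sketch that assumes every sub-class is a subset of the super-class; the name `classify` and the employee identifiers are hypothetical.

```python
def classify(superclass: set, subclasses: list[set]) -> tuple[str, str]:
    """Return (completeness, disjointness) for a sub-classing of `superclass`."""
    covered = set().union(*subclasses)
    completeness = "total" if covered == superclass else "partial"
    # If no entity is shared, the sub-class sizes add up to the size of the union.
    total_members = sum(len(s) for s in subclasses)
    disjointness = "disjoint" if total_members == len(covered) else "overlapping"
    return completeness, disjointness

employees = {"e1", "e2", "e3", "e4"}

# Job-type sub-classing: secretary, engineer, technician (e4 is none of these).
job_types = [{"e1"}, {"e2"}, {"e3"}]
# Pay-type sub-classing: salaried, hourly.
pay_types = [{"e1", "e2"}, {"e3", "e4"}]

print(classify(employees, job_types))
print(classify(employees, pay_types))
```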
Case study :
Construction of Database design using Entity Relationship diagram for
an application such as
• University Database,
• Banking System,
• Information System
Details of the University Management System ER Diagram
The table below describes the overall function of the University Management System. It gives a complete
overview of the information about the university management project.

Name:       ER Diagram of the University Management System.
Abstract:   The ER diagram of the University Management System shows the relationships between various entities.
            We can also call it the blueprint of the University Management System.
Diagram:    The ER diagram is also called an Entity Relationship diagram.
Tool used:  The ER diagram provides a set of symbols, known as the diagramming tool.
Users:      The users of the ER diagram are university admins, applications, software, and websites.
