
DBMS-UNIT-1

The document provides an overview of Database Management Systems (DBMS), covering their characteristics, advantages, and disadvantages compared to traditional file systems. It discusses key concepts such as data independence, the entity-relationship model, and the three-tier architecture for data abstraction. Additionally, it highlights various applications of databases in sectors like banking, telecommunications, and education, while addressing issues like data redundancy, integrity, and security.

Uploaded by

kdurgaprasadyt
Copyright
© All Rights Reserved

DATABASE MANAGEMENT SYSTEMS

UNIT-I
Introduction: Database system, Characteristics (Database Vs File System), Database Users,
Advantages of Database systems, Database applications. Brief introduction of different Data Models;
Concepts of Schema, Instance and data independence; Three tier schema architecture for data
independence; Database system structure, environment, Centralized and Client Server architecture
for the database.
Entity Relationship Model: Introduction, Representation of entities, attributes, entity set, relationship,
relationship set, constraints, sub classes, super class, inheritance, specialization, generalization using
ER Diagrams.
==============================================================

Introduction
Data
✓ Data is a collection of known facts and figures that can be recorded and from which useful information
is derived.
✓ "Data" is the plural form; "datum" is the singular.
Types of Data
✓ Text or numeric values: name, address, phone number
✓ Images
✓ Videos
✓ Speech
Database
✓ A database is a collection of related data.
Database Management System (DBMS): a collection of programs (software) that enables users to create and
maintain a database.
The DBMS is a general-purpose software system that facilitates the processes of defining, constructing,
manipulating, and sharing databases among various users and applications.
Definition: specifying the data types, the organization of the data, and the constraints to which the data
must conform.
Construction: storing the data on some medium (e.g., magnetic disk) that is controlled by the DBMS.
Manipulation: querying, updating, and generating reports from the data.
Sharing: allowing multiple users and programs to access the database simultaneously.
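The first three functions can be sketched with Python's built-in sqlite3 module. This is only an illustration: the student table, its columns, and the sample rows are assumptions and do not come from these notes.

```python
import sqlite3

# In-memory database; table, columns and rows are hypothetical.
conn = sqlite3.connect(":memory:")

# Definition: declare data types and constraints.
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " phone TEXT)"
)

# Construction: store the data under the control of the DBMS.
conn.executemany(
    "INSERT INTO student (roll_no, name, phone) VALUES (?, ?, ?)",
    [(1, "Asha", "555-0101"), (2, "Ravi", "555-0102")],
)
conn.commit()

# Manipulation: update a record, then query the table.
conn.execute("UPDATE student SET phone = ? WHERE roll_no = ?", ("555-0199", 2))
rows = conn.execute(
    "SELECT roll_no, name, phone FROM student ORDER BY roll_no"
).fetchall()
print(rows)
```

Sharing is not shown here, because an in-memory SQLite database is private to a single connection.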
Examples of Databases
✓ Traditional databases: most of the information that is stored and accessed is either textual or
numeric.
✓ Multimedia databases: store images, audio clips, and video streams digitally.
✓ Geographic information systems (GIS): store and analyze maps, weather data, and satellite images.
✓ Data warehouses and online analytical processing (OLAP) systems: used in many companies to
extract and analyze useful business information from very large databases to support decision making.
DATABASE SYSTEM
✓ The database and the DBMS software together are called a database system.
An application program accesses the database by sending queries or requests for data to the DBMS.
A query typically causes some data to be retrieved.
A transaction may cause some data to be read and some data to be written into the database.

DATABASE SYSTEM APPLICATIONS:


Databases are widely used. Here are some representative applications:
• Banking: For customer information, accounts, loans, and banking transactions.
• Airlines: For reservations and schedule information. Airlines were among the first to use databases in a
geographically distributed manner—terminals situated around the world accessed the central database system
through phone lines and other data networks.
• Universities: For student information, course registrations, and grades.
• Credit card transactions: For purchases on credit cards and generation of monthly statements.
• Telecommunication: For keeping records of calls made, generating monthly bills, maintaining balances
on prepaid calling cards, and storing information about the communication networks.
• Finance: For storing information about holdings, sales, and purchases of financial instruments such as
stocks and bonds.
• Sales: For customer, product, and purchase information.
• Manufacturing: For management of supply chain and for tracking production of items in factories,
inventories of items in warehouses/stores, and orders for items.
• Human resources: For information about employees, salaries, payroll taxes and benefits, and for
generation of paychecks.

Purpose of Database System:


A number of characteristics distinguish the database approach from the much older approach of programming
with files.
In traditional file processing, each user defines and implements the files needed for a specific software
application as part of programming the application. In file systems, each application is free to name data
elements independently.
In the database approach, a single repository maintains data that is defined once and then accessed by
various users. The names or labels of data are defined once, and used repeatedly by queries, transactions, and
applications.
The main characteristics of the database approach versus the file-processing approach are the following:
✓ Self-describing nature of a database system
✓ Insulation between programs and data, and data abstraction
✓ Support of multiple views of the data
✓ Sharing of data and multiuser transaction processing
1. Self-Describing Nature of a Database System
A fundamental characteristic of the database approach is that the database system contains not only the
database itself but also a complete definition or description of the database structure and constraints.
This definition is stored in the DBMS catalog, which contains information such as the structure of each
file, the type and storage format of each data item, and various constraints on the data.
The information stored in the catalog is called meta-data, and it describes the structure of the primary
database (Figure 1.1).
The DBMS software must work equally well with any number of database applications—for example,
a university database, a banking database, or a company database—as long as the database definition is stored
in the catalog.
In traditional file processing, data definition is typically part of the application programs themselves.
Hence, these programs are constrained to work with only one specific database, whose structure is declared in
the application programs.
Whereas file-processing software can access only specific databases, DBMS software can access diverse
databases by extracting the database definitions from the catalog and using these definitions.
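As a rough illustration of a catalog, SQLite keeps its schema descriptions in a table named sqlite_master and exposes column details through PRAGMA table_info. The account table below is a hypothetical example; other DBMSs organize their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance REAL NOT NULL)"
)

# Meta-data: the definition of the table is itself stored in the database.
catalog = conn.execute("SELECT type, name, sql FROM sqlite_master").fetchall()

# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")]

for entry in catalog:
    print(entry)
print(columns)
```

A generic tool can read these definitions at run time, which is why the same DBMS software can serve a university, a banking, or a company database.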
2. Insulation between Programs and Data, and Data Abstraction
In traditional file processing, the structure of data files is embedded in the application programs, so any
changes to the structure of a file may require changing all programs that access that file.
By contrast, DBMS access programs do not require such changes in most cases. The structure of data
files is stored in the DBMS catalog separately from the access programs. We call this property program-data
independence.
In some types of database systems, such as object-oriented and object-relational systems, users can
define operations on data as part of the database definitions. An operation (also called a function or method) is
specified in two parts.
The interface (or signature) of an operation includes the operation name and the data types of its
arguments (or parameters).
The implementation (or method) of the operation is specified separately and can be changed without
affecting the interface.
The characteristic that allows program-data independence and program-operation independence is called
data abstraction.
A DBMS provides users with a conceptual representation of data that does not include many of the
details of how the data is stored or how the operations are implemented.
A data model is a type of data abstraction that is used to provide this conceptual representation. The data
model uses logical concepts, such as objects, their properties, and their interrelationships, that may be easier for
most users to understand than computer storage concepts. Hence, the data model hides storage and
implementation details that are not of interest to most database users.
3. Support of Multiple Views of the Data
A database typically has many users, each of whom may require a different perspective or view of the
database.
A view may be a subset of the database or it may contain virtual data that is derived from the database
files but is not explicitly stored.
A multiuser DBMS whose users have a variety of distinct applications must provide facilities for
defining multiple views.
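Both kinds of view can be sketched with SQLite through Python's sqlite3 module. The employee table, the view names, and the data are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", 50000.0), (2, "Ravi", "Sales", 70000.0), (3, "Meena", "HR", 60000.0)],
)

# View 1: a subset of the database (salary is hidden from this user group).
conn.execute(
    "CREATE VIEW employee_directory AS SELECT emp_id, name, dept FROM employee"
)

# View 2: virtual data that is derived on demand and not stored explicitly.
conn.execute(
    "CREATE VIEW dept_payroll AS "
    "SELECT dept, SUM(salary) AS total_salary FROM employee GROUP BY dept"
)

directory = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id").fetchall()
payroll = conn.execute("SELECT dept, total_salary FROM dept_payroll ORDER BY dept").fetchall()
print(directory)
print(payroll)
```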
4. Sharing of Data and Multiuser Transaction Processing
A multiuser DBMS, as its name implies, must allow multiple users to access the database at the same
time. This is essential if data for multiple applications is to be integrated and maintained in a single database.
The DBMS must include concurrency control software to ensure that several users trying to update the same
data do so in a controlled manner so that the result of the updates is correct.
For example, when several reservation agents try to assign a seat on an airline flight, the DBMS should
ensure that each seat can be accessed by only one agent at a time for assignment to a passenger. These types of
applications are generally called online transaction processing (OLTP) applications.
A fundamental role of multiuser DBMS software is to ensure that concurrent transactions operate
correctly and efficiently.
A transaction is an executing program or process that includes one or more database accesses, such as
reading or updating of database records. Each transaction is supposed to execute a logically correct database
access if executed in its entirety without interference from other transactions.
The DBMS must enforce several transaction properties, known as the ACID properties: Atomicity,
Consistency, Isolation, and Durability.
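Atomicity can be illustrated with a small funds-transfer sketch in Python's sqlite3 module. The account table, the transfer function, and the fail_midway flag are hypothetical; the flag only simulates a failure between the two updates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()

def transfer(conn, src, dst, amount, fail_midway=False):
    # "with conn" commits if the block succeeds and rolls back on an exception.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE acc_no = ?", (amount, src)
        )
        if fail_midway:
            raise RuntimeError("simulated failure between the two updates")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE acc_no = ?", (amount, dst)
        )

def balances(conn):
    return dict(conn.execute("SELECT acc_no, balance FROM account"))

try:
    transfer(conn, 1, 2, 200, fail_midway=True)
except RuntimeError:
    pass
after_failure = balances(conn)   # the debit was rolled back

transfer(conn, 1, 2, 200)
after_success = balances(conn)   # both updates were applied together
print(after_failure, after_success)
```

Either both updates take effect or neither does, so the total amount of money is the same before and after each call.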

Difference between File System and DBMS:

Basis | File System | DBMS
------|-------------|-----
Structure | The file system is software that manages and organizes the files in a storage medium within a computer. | A DBMS is software for managing the database.
Data Redundancy | Redundant data can be present in a file system. | Redundancy is controlled and kept to a minimum.
Backup and Recovery | It does not provide backup and recovery of data if it is lost. | It provides backup and recovery facilities for restoring data after a failure.
Query Processing | There is no efficient query processing in the file system. | Efficient query processing is available in a DBMS.
Consistency | There is less data consistency in the file system. | There is more data consistency because of the process of normalization.
Complexity | It is less complex than a DBMS. | It is more complex to handle than a file system.
Security Constraints | File systems provide less security than a DBMS. | A DBMS has more security mechanisms than a file system.
Cost | It is less expensive than a DBMS. | It has a comparatively higher cost than a file system.
Data Independence | There is no data independence. | Data independence exists.
User Access | Typically only one user can safely access the data at a time. | Multiple users can access the data at the same time.
Meaning | The user has to write procedures for managing the data. | The user is not required to write such procedures.
Sharing | Data is distributed across many files, so it is not easy to share. | Because of its centralized nature, sharing data is easy.
Data Abstraction | It exposes the details of storage and representation of data. | It hides the internal details of the database.
Integrity Constraints | Integrity constraints are difficult to implement. | Integrity constraints are easy to implement.
Example | File processing in COBOL, C++ | Oracle, SQL Server
Disadvantages of File-Based Processing Systems:
1. Data redundancy and inconsistency.
Since different programmers create the files and application programs over a long period, the various files
are likely to have different formats and the programs may be written in several programming languages.
Moreover, the same information may be duplicated in several places (files). In addition, it may lead to data
inconsistency; that is, the various copies of the same data may no longer agree.
2. Difficulty in accessing data.
Suppose that one of the bank officers needs to find out the names of all customers who live within a particular
postal-code area. The officer asks the data-processing department to generate such a list. Because the designers
of the original system did not anticipate this request, there is no application program on hand to meet it. The
bank officer now has two choices: either obtain the list of all customers and extract the needed information
manually, or ask a system programmer to write the necessary application program. Both alternatives are
obviously unsatisfactory.
Conventional file-processing environments do not allow needed data to be retrieved in a convenient
and efficient manner. More responsive data-retrieval systems are required for general use.
3. Data isolation.
Because data are scattered in various files, and files may be in different formats, writing new application
programs to retrieve the appropriate data is difficult.
4. Integrity problems.
The data values stored in the database must satisfy certain types of consistency constraints. For example,
the balance of a bank account may never fall below a prescribed amount (say, $25). Developers enforce these
constraints in the system by adding appropriate code in the various application programs. However, when new
constraints are added, it is difficult to change the programs to enforce them.
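By contrast, a DBMS lets such a rule be declared once with the data instead of being repeated in every program. A minimal sketch using a CHECK constraint in SQLite follows; the account table is hypothetical, and the limit of 25 simply mirrors the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The rule "balance may never fall below 25" is stated once, in the schema.
conn.execute(
    "CREATE TABLE account ("
    " acc_no INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 25))"
)
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()

try:
    # Would leave a balance of 10, so the DBMS refuses the update.
    conn.execute("UPDATE account SET balance = balance - 90 WHERE acc_no = 1")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

balance = conn.execute("SELECT balance FROM account WHERE acc_no = 1").fetchone()[0]
print(rejected, balance)
```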
5. Atomicity problems.
A computer system, like any other mechanical or electrical device, is subject to failure. In many applications,
it is crucial that, if a failure occurs, the data be restored to the consistent state that existed prior to the failure.
An operation such as a funds transfer must be atomic: it must happen in its entirety or not at all. It
is difficult to ensure atomicity in a conventional file-processing system.
6. Concurrent-access anomalies.
For the sake of overall performance of the system and faster response, many systems allow multiple users
to update the data simultaneously. In such an environment, interaction of concurrent updates may result in
inconsistent data.
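A lost update, one common anomaly of this kind, can be traced by hand. The sketch below replays a hypothetical interleaving of two withdrawals in plain Python; the amounts are illustrative, and no real concurrency is involved.

```python
# Uncontrolled interleaving: both programs read before either one writes.
balance = 500
read_a = balance          # program A reads 500
read_b = balance          # program B reads 500
balance = read_a - 50     # A writes 450
balance = read_b - 100    # B writes 400 and overwrites A's update
lost_update_result = balance

# Serial execution: B starts only after A has finished.
balance = 500
balance = balance - 50    # A
balance = balance - 100   # B
serial_result = balance

print(lost_update_result, serial_result)
```

The interleaved run leaves 400 in the account although 150 was withdrawn in total; the serial run gives the correct 350.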
7. Security problems.
Not every user of the database system should be able to access all the data. For example, in a banking system,
payroll personnel need to see only that part of the database that has information about the various bank
employees. They do not need access to information about customer accounts.

ADVANTAGES OF DATABASE SYSTEMS:


A database consolidates an organization's operational data and places it under central control. The
advantages of centralized control of data are:
1. Controlling Data Redundancy
In the database approach, ideally each data item is stored in only one place in the database.
However, in some cases redundancy is still kept to improve system performance, but such redundancy is
controlled and kept to a minimum.
2. Avoiding Inconsistency
When the same data is duplicated and a change made at one site is not propagated to the other
site, inconsistency arises: the two entries for the same data no longer agree. If the
redundancy is removed, the chance of having inconsistent data is removed as well.
3. Data Sharing
The integration of all the data in an organization makes it possible to produce more information from
a given amount of data.
4. Enforcing Integrity Constraints
DBMSs should provide capabilities to define and enforce constraints such as data types and
uniqueness of data.
5. Restricting Unauthorised Access
Not all users of the system have the same access privileges.
DBMSs should provide a security subsystem to create and control the user accounts.
6. Data Independence
The system data descriptions are separated from the application programs. Changes to the data structure are
handled by the DBMS and are not embedded in the programs.
7. Transaction Processing
The DBMS must include a concurrency control subsystem to ensure that several users trying to update the
same data do so in a controlled manner so that the result of the updates is correct.
8. Providing Multiple Views of Data
A view may be a subset of the database. Various users may have different views of the database itself. Users
may not need to be aware of how and where the data they refer to is stored.
9. Providing Backup and Recovery Facilities
If the computer system fails in the middle of a complex update program, the recovery subsystem is
responsible for making sure that the database is restored to the state it was in before the program started
executing.

DISADVANTAGES OF DBMS:
Complexity: A DBMS meets many requirements and solves many data-management problems, but all this
functionality makes it extremely complex software. Developers, designers, DBAs, and end users must
understand it well to use it properly. Misuse of this complex system may cause loss of data or database
failure.
Size: Because of its functionality, a DBMS is large software that requires a lot of disk space and memory to
run efficiently. The database also grows as more data is added to it.
Cost of DBMS: A DBMS requires a high initial investment in hardware, software, and trained staff.
Additional hardware costs: A DBMS requires disk storage for the data, and sometimes extra storage must be
purchased. A dedicated machine may also be needed for better database performance.
These machines and storage space add to the hardware cost.
Performance: A traditional file system can perform very well for a small organization with simple needs.
For such small-scale use, a general-purpose DBMS may be slower because of the overhead of its additional
functionality.
Higher impact of a failure: In a DBMS, all the data is stored in a single database, so a failure affects every
user and application that depends on it. Any accidental failure of a component may cause loss of valuable
data. This is a serious concern for large firms.
View of data:
3-SCHEMA ARCHITECTURE FOR DATA INDEPENDENCE / 3-LEVEL
ARCHITECTURE/ DATA ABSTRACTION:
A database system is a collection of interrelated files and a set of programs that allow users to access
and modify these files. A major purpose of a database system is to provide users with an abstract view of the
data. That is, the system hides certain details of how the data are stored and maintained.
For the system to be usable, it must retrieve data efficiently. Since many database-systems users are not
computer trained, developers hide the complexity from users through several levels of abstraction, to simplify
users’ interactions with the system:
• Physical level (Internal level): The lowest level of abstraction describes how the data are actually stored. The
physical level describes complex low-level data structures in detail. The internal schema at the internal level
describes physical storage structures and access paths (e.g., indexes).

• Logical level (Conceptual level): The next-higher level of abstraction describes what data are stored in the
database, and what relationships exist among those data. The logical level thus describes the entire database in
terms of a small number of relatively simple structures. Although implementation of the simple structures at the
logical level may involve complex physical-level structures, the user of the logical level does not need to be
aware of this complexity. Database administrators, who must decide what information to keep in the database,
use the logical level of abstraction. The conceptual schema at the conceptual level describes the structure and
constraints of the whole database for a community of users.

• View level (External Level): The highest level of abstraction describes only part of the entire database. Even
though the logical level uses simpler structures, complexity remains because of the variety of information stored
in a large database. Many users of the database system do not need all this information; instead, they need to
access only a part of the database. The view level of abstraction exists to simplify their interaction with the
system. The system may provide many views for the same database. External schemas at the external level
describe the various user views.
The processes of transforming requests and results between levels are called mappings.
DATA INDEPENDENCE:
The three-schema architecture can be used to further explain the concept of data independence.
✓ Data independence is the capacity to change the schema at one level of a database system without having
to change the schema at the next higher level.
✓ There are two types of data independence:
1. Logical data independence is the capacity to change the conceptual schema without having to change
external schemas or application programs.
2. Physical data independence is the capacity to change the internal schema without having to change the
conceptual schema. Hence the external schemas need not be changed either.

Fig: Data Independence and the ANSI-SPARC Three-Level
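Both kinds of independence can be approximated in SQLite. In this sketch, which uses assumed table, view, and index names, an application query written against a view keeps returning the same result after an index is added (a physical-level change) and after a new column is added to the base table (a conceptual-level change).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Delhi")],
)
# External schema: the application only sees this view.
conn.execute("CREATE VIEW customer_names AS SELECT cust_id, name FROM customer")

app_query = "SELECT name FROM customer_names ORDER BY cust_id"
before = conn.execute(app_query).fetchall()

# Physical-level change: add an access path. The query text is untouched.
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")
after_index = conn.execute(app_query).fetchall()

# Conceptual-level change: add an attribute. The view and query are untouched.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")
after_column = conn.execute(app_query).fetchall()

print(before, after_index, after_column)
```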

INSTANCES AND SCHEMAS:


Databases change over time as information is inserted and deleted.
Schema: The overall design of the database is called the database schema. A database schema corresponds to
the variable declarations (along with associated type definitions) in a program.
Instance: The collection of information stored in the database at a particular moment is called an instance of
the database. The values of the variables in a program at a point in time correspond to an instance of a database
schema.
Database systems have several schemas, partitioned according to the levels of abstraction.
Physical schema: describes the database design at the physical level.
Logical schema: describes the database design at the logical level.
Subschemas: A database may also have several schemas at the view level, sometimes called subschemas, that
describe different views of the database.
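The difference between schema and instance can be observed directly. In this sketch the course table and its single row are hypothetical: the stored schema text stays the same while the instance changes with an insert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)")

def schema(conn):
    # The design of the table, as recorded by SQLite.
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'course'"
    ).fetchone()[0]

def instance(conn):
    # The data stored in the table at this moment.
    return conn.execute("SELECT code, title FROM course ORDER BY code").fetchall()

schema_t0, instance_t0 = schema(conn), instance(conn)
conn.execute("INSERT INTO course VALUES ('CS101', 'DBMS')")
schema_t1, instance_t1 = schema(conn), instance(conn)

print(schema_t0 == schema_t1)      # the schema did not change
print(instance_t0, instance_t1)    # the instance did
```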

Database Model
Data model: a collection of conceptual tools for describing data, data relationships, data semantics, and
consistency constraints.
The data models are used to describe the design of the database at the physical, logical and view levels.
A data model visually represents the nature of the data, the business rules governing the data, and how the
data will be organized in the database.
1. Relational Model
The relational model uses a collection of tables to represent both data and the relationships among those
data. Each table has multiple columns, and each column has a unique name. The following table presents a
sample relational database comprising the details of bank customers.
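A small hypothetical bank database in this spirit can be built with Python's sqlite3 module. The table names, column names, and rows below are assumptions for illustration. Note that the relationship between customers and accounts is itself stored as a table (depositor).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id TEXT PRIMARY KEY, customer_name TEXT, city TEXT)"
)
conn.execute("CREATE TABLE account (account_no TEXT PRIMARY KEY, balance INTEGER)")
# A relationship is represented as a table as well.
conn.execute(
    "CREATE TABLE depositor ("
    " customer_id TEXT REFERENCES customer(customer_id),"
    " account_no TEXT REFERENCES account(account_no))"
)

conn.executemany("INSERT INTO customer VALUES (?, ?, ?)",
                 [("C1", "Asha", "Pune"), ("C2", "Ravi", "Delhi")])
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("A-101", 500), ("A-102", 900)])
conn.executemany("INSERT INTO depositor VALUES (?, ?)",
                 [("C1", "A-101"), ("C1", "A-102"), ("C2", "A-102")])

holdings = conn.execute(
    "SELECT c.customer_name, a.account_no, a.balance "
    "FROM customer c "
    "JOIN depositor d ON d.customer_id = c.customer_id "
    "JOIN account a ON a.account_no = d.account_no "
    "ORDER BY c.customer_name, a.account_no"
).fetchall()
print(holdings)
```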
2. The Entity-Relationship Model
The entity-relationship (E-R) data model is based on a perception of the real world as a collection
of basic objects, called entities, and of relationships among these objects. An entity is a “thing” or “object” in
the real world that is distinguishable from other objects. For example, each person is an entity, and bank accounts
can be considered as entities. A sample E-R diagram is shown below.
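The same concepts can also be sketched in code. In this illustrative Python fragment, the class names, attributes, and the depositor relationship are assumptions: each dataclass instance is an entity, a set of instances is an entity set, and a set of Depositor objects is a relationship set.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Customer:            # entity type with two attributes
    customer_id: str
    name: str

@dataclass(frozen=True)
class Account:             # entity type
    account_no: str
    balance: int

@dataclass(frozen=True)
class Depositor:           # relationship between one customer and one account
    customer: Customer
    account: Account

asha, ravi = Customer("C1", "Asha"), Customer("C2", "Ravi")
a101, a102 = Account("A-101", 500), Account("A-102", 900)

customers = {asha, ravi}   # entity set
accounts = {a101, a102}    # entity set
depositor = {Depositor(asha, a101), Depositor(asha, a102), Depositor(ravi, a102)}

def accounts_of(customer):
    """Account numbers related to the given customer."""
    return sorted(d.account.account_no for d in depositor if d.customer == customer)

print(accounts_of(asha))
```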

3. The object-oriented data model


The object-oriented model can be seen as extending the E-R model with notions of encapsulation, methods
(functions), and object identity.
The object-oriented model is based on a collection of objects.
An object contains values stored in instance variables within the object.
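These notions map naturally onto a class in an object-oriented language. The Account class below is a hypothetical Python sketch of encapsulation, methods, and object identity, not the notation of any particular object-oriented DBMS.

```python
class Account:
    def __init__(self, account_no, balance):
        # Values are stored in instance variables within the object.
        self.account_no = account_no
        self._balance = balance      # meant to be changed only through methods

    def deposit(self, amount):
        # Method: the only intended way to modify the balance.
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    def balance(self):
        return self._balance

a = Account("A-101", 500)
b = Account("A-101", 500)   # same values, but a different object

a.deposit(100)
distinct_objects = a is not b
print(distinct_objects, a.balance(), b.balance())
```

The two objects start with equal values but have separate identities, so the deposit into one does not affect the other.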

4. Hierarchical Model
A hierarchical database is a kind of DBMS that links records together in a tree data structure such that each
record type has only one owner. Ex: An order is owned by only one customer.
Hierarchical structures were widely used in the first mainframe DBMSs.
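The single-owner rule can be sketched as a tree in which every record keeps exactly one owner reference. The Record class and the customer, order, and order-line names are illustrative assumptions.

```python
class Record:
    def __init__(self, name, owner=None):
        self.name = name
        self.owner = owner          # at most one owner; None only for the root
        self.children = []
        if owner is not None:
            owner.children.append(self)

def path_to_root(record):
    """Follow the single owner link upwards until the root is reached."""
    path = []
    while record is not None:
        path.append(record.name)
        record = record.owner
    return path

customer = Record("customer-1")
order = Record("order-1", owner=customer)      # an order has one customer
line = Record("order-line-1", owner=order)     # an order line has one order

print(path_to_root(line))
print([child.name for child in customer.children])
```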
5. Network Model
The network model is based on directed graph theory. It replaces the hierarchical tree with
a graph, thus allowing more general connections among the nodes.
The main difference between the network model and the hierarchical model is the ability to handle many-to-many
relationships; in other words, the network model allows a record to have more than one parent.
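A directed graph with more than one parent per record can be sketched with two adjacency maps. The supplier and part names are hypothetical.

```python
from collections import defaultdict

# Directed edges (parent, child): a supplier supplies a part.
edges = [
    ("Supplier-A", "Part-1"),
    ("Supplier-B", "Part-1"),   # Part-1 has a second parent
    ("Supplier-A", "Part-2"),
]

children = defaultdict(list)
parents = defaultdict(list)
for parent, child in edges:
    children[parent].append(child)
    parents[child].append(parent)

print(parents["Part-1"])        # more than one parent: impossible in a tree
print(children["Supplier-A"])   # one parent, many children
```

Together the two maps express a many-to-many relationship: a supplier can supply many parts, and a part can come from many suppliers.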

DATABASE SYSTEM STRUCTURE / THE DATABASE SYSTEM ENVIRONMENT:
The figure below illustrates, in a simplified form, the typical DBMS components. The figure is divided into two
parts. The top part of the figure refers to the various users of the database environment and their interfaces.
The lower part shows the internals of the DBMS responsible for storage of data and processing of transactions.
Let us consider the top part of the figure first. It shows interfaces for the DBA staff, casual users who work
with interactive interfaces to formulate queries, application programmers who create programs using some host
programming languages, and parametric users who do data entry work by supplying parameters to predefined
transactions. The DBA staff works on defining the database and tuning it by making changes to its definition
using the DDL and other privileged commands.
The DDL compiler processes
schema definitions, specified in the DDL, and stores descriptions of the schemas (meta-data) in the DBMS
catalog. The catalog includes information such as the names and sizes of files, names and data types of data
items, storage details of each file, mapping information among schemas, and constraints. In addition, the catalog
stores many other types of information that are needed by the DBMS modules, which can then look up the
catalog information as needed.
Casual users and persons with occasional need for information from the database interact using some
form of interface, which we call the interactive query interface. Their queries are parsed and validated by a
query compiler that compiles them into an internal form. This internal query is subjected to query
optimization. Among other things,
the query optimizer is concerned with the rearrangement and possible reordering of operations, elimination of
redundancies, and use of correct algorithms and indexes during execution. It consults the system catalog for
statistical and other physical information about the stored data and generates executable code that performs the
necessary operations for the query and makes calls on the runtime processor.
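SQLite exposes the plan chosen by its optimizer through EXPLAIN QUERY PLAN, which gives a rough feel for this step. In the sketch below, the customer table and the index name are assumptions, and the exact wording of the plan text depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT, postal_code TEXT)"
)
query = "SELECT name FROM customer WHERE postal_code = ?"

def plan(conn):
    # The last column of each row is the human-readable plan step.
    rows = conn.execute("EXPLAIN QUERY PLAN " + query, ("500001",)).fetchall()
    return [row[-1] for row in rows]

plan_without_index = plan(conn)   # typically a full scan of the table

conn.execute("CREATE INDEX idx_customer_postal ON customer (postal_code)")
plan_with_index = plan(conn)      # the optimizer can now pick the index

print(plan_without_index)
print(plan_with_index)
```

The query text is identical in both runs; only the access path chosen by the optimizer changes.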
Application programmers write programs in host languages such as Java, C, or C++ that are submitted
to a precompiler. The precompiler extracts DML commands from an application program written in a host
programming language. These commands are sent to the DML compiler for compilation into object code for
database access. The rest of the program is sent to the host language compiler. The object codes for the DML
commands and the rest of the program are linked, forming a canned transaction whose executable code includes
calls to the runtime database processor. Canned transactions are executed repeatedly by parametric users, who
simply supply the parameters to the transactions. Each execution is considered to be a separate transaction. An
example is a bank withdrawal transaction where the account number and the amount may be supplied as
parameters.
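A canned transaction of this kind can be sketched as a function whose code is fixed and whose inputs are the only thing the parametric user supplies. The withdraw function, the account table, and the account number are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO account VALUES (1001, 500)")
conn.commit()

def withdraw(conn, acc_no, amount):
    """Canned withdrawal: each call runs as a separate transaction."""
    with conn:  # commit on success, roll back on error
        row = conn.execute(
            "SELECT balance FROM account WHERE acc_no = ?", (acc_no,)
        ).fetchone()
        if row is None:
            raise ValueError("unknown account")
        if row[0] < amount:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE acc_no = ?",
            (amount, acc_no),
        )
        return row[0] - amount

first = withdraw(conn, 1001, 200)    # parameters supplied by the user
second = withdraw(conn, 1001, 50)    # same program, new parameters
print(first, second)
```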
In the lower part of the figure, the runtime database processor executes (1) the privileged commands, (2)
the executable query plans, and (3) the canned transactions with runtime parameters. It works with the system
catalog and may update it with statistics. It also works with the stored data manager, which in turn uses basic
operating system services for carrying out low-level input/output (read/write) operations between the disk and
main memory. The runtime database processor handles other aspects of data transfer, such as management of
buffers in the main memory. Some DBMSs have their own buffer management module while others depend on
the OS for buffer management. Concurrency control and backup and recovery systems are shown
as a separate module in this figure. They are integrated into the working of the runtime database
processor for purposes of transaction management.

Database Users
In large organizations, many people are involved in the design, use, and maintenance of a large database
with hundreds of users.
People whose jobs involve the day-to-day use of a large database are called the actors on the scene.
People who work to maintain the database system environment but who are not actively interested
in the database contents as part of their daily job are called workers behind the scene.
Actors on the Scene
1. Database Administrators
Administering the resources (the database, the DBMS, and related software) is the responsibility of the database
administrator (DBA).
The DBA is responsible for:
✓ Authorizing access to the database
✓ Coordinating and monitoring its use
✓ Acquiring software and hardware resources as needed
The DBA is accountable for problems such as security breaches and poor system response time.
2. Database Designers
Database designers are responsible for identifying the data to be stored in the database and for choosing
appropriate structures to represent and store this data.
It is the responsibility of database designers to communicate with all prospective database users in order
to understand their requirements and to create a design that meets these requirements.
In many cases, the designers are on the staff of the DBA and may be assigned other staff responsibilities
after the database design is completed. Database designers typically interact with each potential group of users
and develop views of the database that meet the data and processing requirements of these groups.
Each view is then analyzed and integrated with the views of other user groups. The final database design
must be capable of supporting the requirements of all user groups.
3. End Users
End users are the people whose jobs require access to the database for querying, updating, and generating
reports; the database primarily exists for their use. There are several categories of end users:
i. Casual end users occasionally access the database, but they may need different information each time.
They use a sophisticated database query language to specify their requests and are typically middle or
high-level managers or other occasional browsers.
ii. Naive or parametric end users make up a sizable portion of database end users. Their main job function
revolves around constantly querying and updating the database, using standard types of queries and
updates—called canned transactions—that have been carefully programmed and tested. The tasks that
such users perform are varied:
✓ Bank tellers check account balances and post withdrawals and deposits.
✓ Reservation agents for airlines, hotels, and car rental companies check availability for a given
request and make reservations.
✓ Employees at receiving stations for shipping companies enter package identifications via bar
codes and descriptive information through buttons to update a central database of received and
in-transit packages.
iii. Sophisticated end users include engineers, scientists, business analysts, and others who thoroughly
familiarize themselves with the facilities of the DBMS in order to implement their own applications to
meet their complex requirements.
iv. Standalone users maintain personal databases by using ready-made program packages that provide
easy-to-use menu-based or graphics-based interfaces. An example is the user of a tax package that stores
a variety of personal financial data for tax purposes.
A typical DBMS provides multiple facilities to access a database. Naive end users need to learn very little
about the facilities provided by the DBMS; they simply have to understand the user interfaces of the standard
transactions designed and implemented for their use. Casual users learn only a few facilities that they may use
repeatedly. Sophisticated users try to learn most of the DBMS facilities in order to achieve their complex
requirements. Standalone users typically become very proficient in using a specific software package.
4. System Analysts and Application Programmers (Software Engineers)
System analysts determine the requirements of end users, especially naive and parametric end users, and
develop specifications for standard canned transactions that meet these requirements. Application programmers
implement these specifications as programs; then they test, debug, document, and maintain these canned
transactions. Such analysts and programmers—commonly referred to as software developers or software
engineers—should be familiar with the full range of capabilities provided by the DBMS to accomplish their
tasks.

Workers behind the Scene


These people typically do not use the database for their own purposes.
1. DBMS system designers and implementers:
Design and implement the DBMS modules (for implementing the catalog, query language, interface
processors, data access, concurrency control, recovery, and security) and interfaces as a software package.
2. Tool developers:
Tools are optional packages that are often purchased separately.
They include packages for database design, performance monitoring, natural language or graphical interfaces,
prototyping, simulation, and test data generation.
3. Operators and maintenance personnel:
System administration personnel who are responsible for the actual running and maintenance of the
hardware and software environment for the database system.

Entity Relationship Model


Introduction:
✓ The entity-relationship model (or ER model) is a way of graphically representing the logical relationships of
entities (or objects) in order to create a database.
✓ The ER model was first proposed by Peter Pin-Shan Chen of Massachusetts Institute of Technology (MIT) in
the 1970s.
✓ In ER modelling, the structure for a database is portrayed as a diagram, called an entity-relationship diagram (or
ER diagram).
✓ In the ER model, the main concepts are entity, attribute, and relationship.
Entity, Entity set:
✓ An entity represents some "thing" (in the miniworld) that is of interest to us, i.e., about which we want to
maintain some data.
✓ An entity is a real world object having physical or logical existence. Examples of entities having physical
existence include: student, house, person, automobile etc.
✓ Entities having logical existence include: Company, job, academic course, business transaction, loan account,
subject etc.
✓ An entity type is a collection of entities having the same set of attributes. If s1, s2, and s3 are three students having
the same set of attributes, then they belong to one entity set (say, student).
✓ An entity set is a set of entities of the same type that share the same properties, or attributes. For example, the
set of all persons who are customers at a given bank, or the set of all students of a particular class.
✓ In an E-R diagram, an entity set is represented using a rectangular box.
Attributes:
✓ An entity is represented by a set of attributes.
✓ The properties of an entity are represented in terms of attributes.
✓ Each entity has a value for each of its attributes. For instance, a particular customer entity
may have the value 321-12-3123 for customer-id, the value Jones for customer name and so on.
✓ The customer-id attribute is used to uniquely identify customers, since there may be more than one customer
with the same name, street, and city.
✓ For each attribute, there is a set of permitted values, called the domain, or value set, of that attribute. E.g.: The
domain of attribute customer-name might be the set of all text strings of a certain length.
✓ An attribute is represented using ellipse/oval shape.
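
The ideas of entity, entity set, attribute, and domain can be sketched in Python. This is only an illustration, not part of the ER notation: the class name Customer, the 30-character limit on customer_name, the second customer-id, and the street and city values are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed domain for customer_name: text strings of at most 30 characters.
MAX_NAME_LENGTH = 30


@dataclass(frozen=True)
class Customer:
    """One entity of the entity set 'customer'."""
    customer_id: str      # attribute used to identify the customer
    customer_name: str
    street: str
    city: str

    def __post_init__(self):
        # Reject values that fall outside the assumed domain of customer_name.
        if len(self.customer_name) > MAX_NAME_LENGTH:
            raise ValueError("customer_name is outside its domain")


# An entity set is a collection of entities of the same type.
customers = {
    Customer("321-12-3123", "Jones", "Main", "Harrison"),
    Customer("019-28-3746", "Jones", "Main", "Harrison"),  # illustrative values
}

# Both customers share name, street, and city; customer_id tells them apart.
ids = {c.customer_id for c in customers}
```

Each Customer object holds one value per attribute, and the set customers plays the role of the entity set.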

Attributes, as used in the E-R model, are of different types:


Simple and composite attributes:
✓ A simple attribute cannot be subdivided. For example, the attributes roll no, gender etc. are simple attributes.
Simple attributes are represented using an ellipse.
✓ A composite attribute is an attribute that can be further subdivided. For example, the attribute ADDRESS can be
subdivided into street, city, state, and zip code. The attribute name can be divided into first-name, middle-name,
and last-name. Composite attributes are represented using a number of ellipses connected to a single ellipse.
Fig. Composite attribute

Single Valued and Multivalued attribute:


✓ A single-valued attribute can have only a single value. For example, a person can have only one 'date of birth',
one Aadhaar number etc. Single-valued attributes are represented using an ellipse.
✓ Multivalued attributes can have multiple values. For instance, a person may have multiple phone numbers,
multiple degrees, multiple addresses etc. Multivalued attributes are represented using a double ellipse.
Fig. Multivalued attribute
Derived and stored Attributes
✓ The value for the derived attribute is derived from the stored attribute.
✓ For example 'Date of birth' of a person is a stored attribute. The value for the attribute 'AGE' can be derived by
subtracting the 'Date of Birth'(DOB) from the current date.
✓ Another example of derived attribute is experience of an employee which can be calculated from date of
joining.
✓ Stored attribute: an attribute that supplies a value to the related derived attribute.
Example: Date of Birth, Date of Joining.
✓ Stored attributes are represented using an ellipse, whereas derived attributes are represented using a dotted ellipse.
Fig. Derived attribute
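
The derivation of AGE from the stored attribute Date of Birth can be sketched as follows. The function name age_on is an assumption made for this example; it counts completed years on a given date.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Derive AGE (in completed years) from the stored attribute DOB."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years
```

Because AGE can always be recomputed this way, only the date of birth needs to be stored.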

Complex attributes:
An attribute which is multivalued as well as composite is called a complex attribute. For example, the
Address of a student is composite (it contains city, street, door no, pin code) as well as multivalued (it can be
a present address or a permanent address).
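
As a rough sketch, the attribute types above map onto ordinary data structures: a composite attribute becomes a nested record, a multivalued attribute becomes a list, and a complex attribute becomes a list of nested records. The class names and all sample values below are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Address:
    """Composite attribute: subdivided into component attributes."""
    door_no: str
    street: str
    city: str
    pin_code: str


@dataclass
class Student:
    roll_no: int                                             # simple, single-valued
    phone_numbers: List[str] = field(default_factory=list)   # multivalued
    addresses: List[Address] = field(default_factory=list)   # complex attribute


s = Student(roll_no=1)
s.phone_numbers += ["555-0100", "555-0101"]
s.addresses.append(Address("12", "Park Street", "Hyderabad", "500001"))  # present address
s.addresses.append(Address("7", "Lake Road", "Vijayawada", "520001"))    # permanent address
```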
Descriptive attributes:
The attributes of any relationship set are known as descriptive attributes. Descriptive attributes are used
to record information about the relationship.
Relationship, relationship set
✓ A relationship is an association among several entities. It connects different entities through a meaningful
relation.
✓ A relationship set is a set of relationships of the same type.
✓ In E-R diagram, a relationship set is represented using Diamond symbol (Rhombus symbol).
✓ Degree of a relationship set: the total number of entity sets that participate in a relationship set is known as
the degree of that relationship set.

The different types of relationship sets are unary, binary, and ternary relationship sets:

• Unary relationship = degree 1
• Binary relationship = degree 2
• Ternary relationship = degree 3
• n-ary relationship = degree n

Unary relationship set:


When both participants in the relationship belong to the same entity set, we call it a unary relationship.
For Example:
1. Subjects may be prerequisites (basics) for other subjects.
2. One employee manages other employees.

Binary relationship set:


A relationship set in which exactly two entity sets are involved is known as a binary
relationship set. The following fig. is an example of a binary relationship set. Here, only two entity sets
are involved: Employees and Departments.

Fig. binary relationship set


Ternary relationship set:
A relationship set in which exactly three entity sets are involved is known as a ternary relationship set. The following
fig. is an example of a ternary relationship set.

Fig. ternary relationship set

Additional Features of ER Model:


Mapping cardinalities/ Cardinality Ratio:
✓ Mapping cardinalities specify the number of entities to which another entity can be associated via a
relationship set.
✓ Mapping cardinalities are most useful in describing binary relationship sets.
✓ For a binary relationship set R between entity sets A and B, the mapping cardinality must be one of the
following:
•One to one. An entity in A is associated with at most one entity in B, and an entity in B is associated with at
most one entity in A. (See Figure a.)
•One to many. An entity in A is associated with any number (zero or more) of entities in B. An entity in B,
however, can be associated with at most one entity in A. (See Figure b.)

Fig. One to One relationship (1:1): DEPARTMENT, MANAGED BY, MANAGER

•Many to one. An entity in A is associated with at most one entity in B. An entity in B, however, can be
associated with any number (zero or more) of entities in A. (See Figure a.)
•Many to many. An entity in A is associated with any number (zero or more) of entities in B, and an entity in
B is associated with any number (zero or more) of entities in A. (See Figure b.)

Fig. Many to One (M:1)

Fig. Many to Many (M:N): SUBJECT, TAUGHT BY, TEACHER
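
The four mapping cardinalities can be checked against sample data. The sketch below takes one instance of a binary relationship set as (a, b) pairs and reports the most restrictive cardinality that instance is consistent with. The function name and the sample identifiers are assumptions; a real cardinality constraint is a rule about all possible instances, not about one sample.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a binary relationship instance given as (a, b) pairs."""
    pairs = set(pairs)
    a_counts = Counter(a for a, _ in pairs)   # number of B entities per A entity
    b_counts = Counter(b for _, b in pairs)   # number of A entities per B entity
    a_single = all(n <= 1 for n in a_counts.values())
    b_single = all(n <= 1 for n in b_counts.values())
    if a_single and b_single:
        return "one to one"
    if b_single:
        return "one to many"     # each B entity has at most one A entity
    if a_single:
        return "many to one"     # each A entity has at most one B entity
    return "many to many"


# Hypothetical instance: each department is managed by one manager and vice versa.
managed_by = [("D1", "M1"), ("D2", "M2")]
# Hypothetical instance: subjects and teachers associated freely.
taught_by = [("S1", "T1"), ("S1", "T2"), ("S2", "T1")]
```

Here mapping_cardinality(managed_by) gives "one to one" and mapping_cardinality(taught_by) gives "many to many".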

List of symbols used in the E-R diagrams:


Key Constraints for Ternary Relationships:
In the following figure, we show a ternary relationship with key constraints. Each employee works in at
most one department, and at a single location. Notice that each department can be associated with several
employees and locations, and each location can be associated with several departments and employees;
however, each employee is associated with a single department and location.
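
A minimal sketch of this key constraint, assuming an instance of Works_In is held as a set of (employee, department, location) triples: the constraint holds when no employee appears in more than one triple. The function name and the sample identifiers are hypothetical.

```python
def satisfies_key_constraint(works_in):
    """Return True if every employee appears in at most one triple."""
    triples = set(works_in)                       # a relationship set has no duplicates
    employees = [emp for emp, _dept, _loc in triples]
    return len(employees) == len(set(employees))


# Hypothetical instance: (employee, department, location) triples.
valid = {("E1", "D1", "Delhi"), ("E2", "D1", "Delhi"), ("E3", "D2", "Mumbai")}
invalid = valid | {("E1", "D2", "Mumbai")}        # E1 would now work in two departments
```

Departments and locations may repeat freely across triples; only the employee is restricted.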

An instance of the Works In relationship set is shown in following Figure.


Participation Constraints: participation constraints are of two types.
i. Total participation: If every entity in the entity set participates in the relationship set, the participation is
known as total participation. In the Works_In relationship set, it is natural to expect that each employee works in at least one
department and that each department has at least one employee. This means that the participation of both
Employees and Departments in Works_In is total. If the participation of an entity set in a relationship set is
total, the two are connected by a thick line (double line).

Fig. Total participation from employees to works_in and departments to works_in

ii. Partial participation: If only some of the entities in the entity set participate in the relationship set, the
participation is known as partial participation. For example, not every customer borrows a loan, but every loan must be
borrowed by at least one customer. Hence the participation of loan is total, whereas the participation of customer is
partial, in the relationship set borrower.

Fig. partial participation from customer to borrower and total participation from loan to borrower
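
The difference between total and partial participation can be checked on sample data. In this sketch the function name, the position argument, and the customer and loan identifiers are assumptions made for illustration.

```python
def participation(entity_set, relationship_set, position):
    """Return 'total' if every entity appears in the relationship set, else 'partial'.

    position is the index of the entity set inside each relationship tuple.
    """
    participating = {t[position] for t in relationship_set}
    return "total" if set(entity_set) <= participating else "partial"


# Hypothetical data for the borrower relationship set: (customer, loan) pairs.
customers = {"C1", "C2", "C3"}
loans = {"L1", "L2"}
borrower = {("C1", "L1"), ("C2", "L1"), ("C2", "L2")}
```

Customer C3 has no loan, so customer participates partially, while every loan appears in borrower, so loan participates totally.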

The following ER diagram shows both the Manages and Works_In relationship sets and all the given constraints.
The thick lines indicate total participation and the arrow indicates a key constraint. The key constraint is that
every department has at most one manager.

Weak Entities
✓ An entity set that does not have sufficient attributes to form a primary key is called a weak entity set.
✓ An entity set without a primary key is meaningless; hence, to make it meaningful, it should be associated with
a strong entity set, called the identifying entity set.
✓ The relationship set through which the weak entity set is associated with the identifying entity set is called the
identifying relationship set.
✓ Participation of the weak entity set is always total in the identifying relationship set.
✓ A weak entity set is represented using a double rectangle, and the identifying relationship set is represented using
a double diamond symbol.
✓ The attribute of the weak entity set that, together with the primary key of the strong entity set, identifies its
entities is called the discriminator or partial key. It is represented with a dotted line inside the ellipse.
✓ In the following diagram, one employee may have many dependents. The attribute pname of the Dependents entity
set cannot be treated as a primary key because more than one dependent can have the same name. Hence,
Dependents is a weak entity set. It must be associated with a strong entity set. In this case Employees is the
identifying entity set and Policy is the identifying relationship set. The attribute pname is the discriminator or
partial key, represented with a dotted line inside the ellipse.

Fig. Dependents is a weak entity set
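
One way to picture the weak entity set: a dependent is identified by the primary key of its employee combined with the discriminator pname. In this sketch the attribute names employee_id and age, and all sample values, are assumptions; only pname comes from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependent:
    employee_id: str   # primary key of the identifying entity (assumed name)
    pname: str         # discriminator / partial key
    age: int           # hypothetical extra attribute

    def key(self):
        # pname alone is not unique; the owner's key makes the combination unique.
        return (self.employee_id, self.pname)


d1 = Dependent("E1", "Ravi", 8)
d2 = Dependent("E2", "Ravi", 5)   # same name, different employee
```

Both dependents are called Ravi, yet d1.key() and d2.key() differ because the employees differ.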


Class Hierarchies
Sometimes it is natural to classify the entities in an entity set into subclasses. For example, we might want to talk
about an Hourly Employee entity set and a Contract Employee entity set to distinguish the basis on which they
are paid. We might have attributes hours worked and hourly wage defined for Hourly Employees and an attribute
contract-id defined for Contract Employees. The attributes for the entity set Employees are inherited by the entity
set Hourly Employees, and we say that Hourly Employees ISA (read "is a") Employees, i.e., an hourly
employee IS An employee. The following figure illustrates the class hierarchy.

Fig. class hierarchy
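
The ISA hierarchy resembles class inheritance in a programming language, which the following sketch uses as an analogy. The attributes emp_id and name on Employee and the sample values are assumptions; hours worked, hourly wage, and contract-id come from the text above.

```python
from dataclasses import dataclass, fields


@dataclass
class Employee:
    emp_id: str          # hypothetical attribute of Employees
    name: str            # hypothetical attribute of Employees


@dataclass
class HourlyEmployee(Employee):      # Hourly Employees ISA Employees
    hours_worked: float
    hourly_wage: float


@dataclass
class ContractEmployee(Employee):    # Contract Employees ISA Employees
    contract_id: str


h = HourlyEmployee("E1", "Asha", 40.0, 12.5)

# Inherited attributes come first, followed by the subclass's own attributes.
attribute_names = [f.name for f in fields(HourlyEmployee)]
```

Every HourlyEmployee object is also an Employee, so isinstance(h, Employee) is true, mirroring the statement that each hourly employee is an employee.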

Generalization and Specialization:


Employees is specialized into subclasses. Conversely, Hourly_Employees and Contract_Employees are generalized by
Employees.
Going up in this structure is called generalization, where entities are clubbed together to represent a more
generalized view. In generalization, a number of entities are brought together into one generalized entity based
on their similar characteristics. For example, pigeon, house sparrow, crow, and dove can all be generalized as
Birds.
Specialization is the opposite of generalization. In specialization, a group
of entities is divided into sub-groups based on their characteristics. Take a group Person, for example. A person
has a name, date of birth, gender etc. These properties are common to all persons. But in a company,
a person can be identified as an employee, employer, customer, or vendor based on the role they play in the company.

Inheritance
All of the above features of the ER model can be used to create classes of objects in object-oriented programming.
This makes it easier for programmers to concentrate on what they are programming. Details of entities are
generally hidden from the user; this process is known as abstraction. One of the important features of generalization
and specialization is inheritance, that is, the attributes of higher-level entities are inherited by the lower-level
entities. For example, attributes of a person like name, age, and gender can be inherited by lower-level entities
like student and teacher. Here, person is the superclass, and student and teacher are the subclasses. The attributes
of the superclass are inherited by the subclasses.
