Unit 1dbms
UNIT-I
Introduction: Concept & Overview of DBMS, Data Models: Network, Hierarchical and Relational
Model, Levels of abstraction, Administrator, Database Users, Three Schema architecture of DBMS,
Application. Entity-Relationship Model: Entities, Attributes and Entity Sets, Relation and
Relationship sets, Mapping Constraints, Keys, Entity-Relationship Diagram, Weak Entity Sets,
Extended E-R features.
What is database?
A database is a collection of related data. By data, we mean known facts that can be recorded and that
have implicit meaning. Ex. the names, telephone numbers and addresses of all the people you know. A
database can be of any size and of varying complexity. For example, the list of names and addresses
referred to earlier may consist of only a few hundred records, each with a simple structure. On the other
hand, the card catalog of a large library may contain half a million cards stored under different
categories—by primary author’s last name, by subject, by book title—with each category organized in
alphabetic order.
Databases are widely used. Here are some representative applications:
– Banking: all transactions
– Airlines: reservations, schedules
– Universities: registration, grades
– Sales: customers, products, purchases
– Manufacturing: production, inventory, orders, supply chain
– Human resources: employee records, salaries, tax deductions
• Databases touch all aspects of our lives
Foundation Data Concept
A hierarchy of several levels of data has been devised that differentiates between different groupings,
or elements, of data. Data are logically organized into:
Character
It is the most basic logical data element. It consists of a single alphabetic, numeric, or other symbol.
Field
It consists of a grouping of characters. A data field represents an attribute (a characteristic or
quality) of some entity (object, person, place, or event).
Record
The related fields of data are grouped to form a record. Thus, a record represents a
collection of attributes that describe an entity.
File
A group of related records is known as a data file, or table. Files are
frequently classified by the application for which they are primarily used, such as a payroll file or an
inventory file, or by the type of data they contain, such as a document file or a graphical image file. Files
are also classified by their permanence, for example, a master file versus a transaction file. A
transaction file would contain records of all transactions occurring during a period, whereas a master
file contains all the permanent records. A history file is an obsolete transaction or master file
retained for backup purposes or for long-term historical storage, called archival storage.
Database
It is an integrated collection of logically related records or objects. A database consolidates records
previously stored in separate files into a common pool of data records that provides data for many
applications. The data stored in a database is independent of the application programs using it and of the
type of secondary storage devices on which it is stored.
Advantages of the Database Approach
1. Redundancy can be reduced:- In a traditional file system every user group maintains its own
files for handling its data-processing applications, so much of the data is stored more than once:
once in the files of each user group. This redundancy in storing the same data multiple times leads to
several problems. First, a single logical update (such as entering data on a new student) must be
performed multiple times: once for each file where student data is recorded. This
leads to duplication of effort. Second, storage space is wasted when the same data is stored
repeatedly, and this problem may be serious for large databases.
2. Inconsistency can be avoided:- Because of redundancy, inconsistency may be introduced in the file
system when an update is applied to some of the files but not to others. Even if an update
(such as adding a new student) is applied to all the appropriate files, the data concerning the
student may still be inconsistent since the updates are applied independently by each user group.
For example, one user group may enter a student's birthdate erroneously as JAN-19-1974,
whereas the other user groups may enter the correct value of JAN-29-1974. In the database
approach the data is stored in one place, so this kind of inconsistency is avoided.
3. Sharing of data:- The database belongs to the entire organization and can be shared by all
authorized users.
4. Improved data integrity:-
Database integrity provides the validity and consistency of stored data. Integrity is usually
expressed in terms of constraints, which are consistency rules that the database is not permitted to
violate.
5. Improved security and authorization:-
The database approach protects the data from unauthorized users. Protection may take the
form of user names and passwords that identify the type of user and their access rights for operations
including retrieval, insertion, updating and deletion.
6. Enforcement of standards:-
The integration of the database enforces the necessary standards including data formats, naming
conventions, documentation standards, update procedures and access rules.
7. Economy of scale:-
Cost savings can be obtained by combining all organization's operational data into one database
with applications to work on one source of data.
8. Increased concurrency:-
A database can manage concurrent data access effectively. It ensures that users do not interfere
with one another in a way that would cause loss of information or loss of integrity.
9. Improved backing and recovery services:-
A modern database management system provides facilities to minimize the amount of processing that
can be lost following a failure by using the transaction approach.
10. Program data independence can be provided:- The independence between the programs and
the data is known as program-data independence (or simply data independence). It is an important
characteristic of DBMS as it allows changing the structure of the database without making any
changes in the application programs that are using the database.
11. Data retrieval becomes efficient:- Because all the data is stored in one place in the database approach,
data access can cross departmental boundaries. Data retrieval is therefore more efficient in
the database approach than in a file system.
12. Transactional problems can be removed:- The DBMS ensures that a transaction is either executed
correctly in full or aborted completely.
13. Providing Backup and Recovery :-
A DBMS must provide facilities for recovering from hardware or software failures. The backup
and recovery subsystem of the DBMS is responsible for recovery. For example, if the computer
system fails in the middle of a complex update program, the recovery subsystem is responsible for
making sure that the database is restored to the state it was in before the program started executing.
Alternatively, the recovery subsystem could ensure that the program is resumed from the point at
which it was interrupted so that its full effect is recorded in the database.
14. Integrity & quality can be maintained:- Database integrity provides the validity and
consistency of stored data. Integrity is usually expressed in terms of constraints, which are
consistency rules that the database is not permitted to violate. For example, the balance of a bank
account may never fall below a prescribed amount (say, $25). Developers can enforce these constraints
by adding appropriate code in the various application programs, or by declaring them to the DBMS so
that they are checked in one place.
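The minimum-balance rule can be sketched as a declared constraint. This is only an illustration, assuming SQLite through Python's standard sqlite3 module; the table name account, its columns and the helper withdraw are hypothetical and not part of these notes.

```python
import sqlite3

MIN_BALANCE = 25  # the prescribed minimum used in the text

conn = sqlite3.connect(":memory:")
conn.execute(
    f"""
    CREATE TABLE account (
        account_no INTEGER PRIMARY KEY,
        balance    REAL NOT NULL CHECK (balance >= {MIN_BALANCE})
    )
    """
)
conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.commit()


def withdraw(conn, account_no, amount):
    """Return True if the withdrawal was stored, False if the rule blocked it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_no = ?",
                (amount, account_no),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the new balance.
        return False


ok_small = withdraw(conn, 1, 50)  # 100 -> 50, allowed
ok_large = withdraw(conn, 1, 40)  # 50 -> 10 would violate the rule
balance = conn.execute(
    "SELECT balance FROM account WHERE account_no = 1"
).fetchone()[0]
```

Because the rule lives in the schema, every program that updates the table is subject to it without repeating the check.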
Disadvantages of the Database Approach
Complexity
A database management system is an extremely complex piece of software. All parties must be familiar
with its functionality in order to take full advantage of it. Therefore, training for the administrators, designers
and users is required.
Size
The database management system consumes a substantial amount of main memory as well as a large
amount of disk space in order to run efficiently.
Cost of DBMS
A multi-user database management system may be very expensive. Even after the installation, there is a
high recurrent annual maintenance cost on the software.
Cost of conversion
When moving from a file-based system to a database system, the company incurs additional
expenses for hardware acquisition and training.
Performance
As the database approach is to cater for many applications rather than exclusively for a particular one,
some applications may not run as fast as before.
Higher impact of a failure
The database approach increases the vulnerability of the system because of centralization. As all users
and applications rely on the availability of the database, the failure of any component can bring operations to
a halt and seriously affect service to customers.
DATABASE SYSTEM
A database system is a computer-based record-keeping system whose overall purpose is to record and
maintain information that is relevant to the organization and necessary for making decisions. With the
growth of databases, these systems are used in various real-world applications such as
• Banking systems and ATMs.
• Stock Trading Systems.
• Flight Reservation Systems.
• Computerized Library Systems.
• Super Market Product Inventory System.
• Credit Card/Credit Limit Check System.
Databases can range from those of a single user with a desktop computer to those on mainframe
computers with thousands of users.
COMPONENTS OF DATABASE SYSTEM
A database system is composed of four components:
1) Data 2) Hardware 3) Software 4) Users
which coordinate with each other to form an effective database system.
1. Data - It is a very important component of the database system. Most organizations
generate, store and process large amounts of data. The data acts as a bridge between the machine parts, i.e.
hardware and software, and the users, who access it directly or through application
programs.
Data may be of different types.
• User Data - It consists of table(s) of data called Relation(s), where the columns are called fields or
attributes and the rows are called records. A Relation must be structured properly.
• Metadata - A description of the structure of the database is known as Metadata. It basically means
"data about data". System tables store the Metadata, which includes:
- Number of Tables and Table Names
- Number of fields and field Names
- Primary Key Fields
• Application Metadata - It stores the structure and format of queries, reports and other application
components.
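The metadata items listed above (table names, field names, primary key fields) can be read back from a DBMS's own system tables. The sketch below assumes SQLite via Python's sqlite3 module, where the system table is sqlite_master and column details come from PRAGMA table_info. The employee and department tables and the helper describe are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empno INTEGER PRIMARY KEY, ename TEXT, salary REAL)")
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")

# Table names come from SQLite's system table, sqlite_master.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]


def describe(conn, table):
    """Return (field names, primary-key field names) for one table."""
    # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
    info = conn.execute(f"PRAGMA table_info({table})").fetchall()
    fields = [col[1] for col in info]
    pk_fields = [col[1] for col in info if col[5] > 0]
    return fields, pk_fields


employee_fields, employee_pk = describe(conn, "employee")
```

Other DBMSs expose the same kind of information under different names, so the queries here are specific to SQLite.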
The various types of users which can access the database are:-
• Database Administrators (DBA)
• Database Designers
• End Users
• Application Programmers
DATABASE USERS
Users may be divided into those who actually use and control the content (called “Actors on the
Scene”) and those who enable the database to be developed and the DBMS software to be designed and
implemented (called “Workers Behind the Scene”).
Actors on the scene
Database Designers: responsible for defining the content, the structure, the constraints, and the functions or
transactions against the database. They must communicate with the end users and understand their
needs.
End-users: End users are those persons who interact with the application directly. They are responsible
to insert, delete and update data in the database. They get information from the system as and when
required.
There are several categories of end users:
(1) Casual end users occasionally access the database, but they may need different information each
time. They use a sophisticated database query language to specify their requests and are typically
middle- or high-level managers or other occasional browsers.
(2) Naive or parametric end users make up a sizable portion of database end users. Their main job
function revolves around constantly querying and updating the database, using standard types of
queries and updates, called canned transactions, that have been carefully programmed and tested. The
tasks that such users perform are varied:
i) Bank tellers check account balances and post withdrawals and deposits.
ii) Reservation clerks for airlines, hotels, and car rental companies check availability for a given request
and make reservations.
iii) Clerks at receiving stations for courier mail enter package identifications via bar codes and
descriptive information through buttons to update a central database of received and in-transit
packages.
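A canned transaction can be pictured as a pre-written, parameterized operation: the SQL text is fixed and tested in advance, and the teller supplies only the values. The sketch below is illustrative, assuming SQLite via Python's sqlite3 module; the function names check_balance and post_deposit and the account table are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_no INTEGER PRIMARY KEY, balance REAL NOT NULL)")
conn.execute("INSERT INTO account VALUES (101, 500.0)")
conn.commit()


def check_balance(conn, account_no):
    """Canned query: the teller supplies only the account number."""
    row = conn.execute(
        "SELECT balance FROM account WHERE account_no = ?", (account_no,)
    ).fetchone()
    return None if row is None else row[0]


def post_deposit(conn, account_no, amount):
    """Canned update: the SQL is fixed, only the parameters vary."""
    if amount <= 0:
        raise ValueError("deposit must be positive")
    with conn:  # one transaction per deposit
        cur = conn.execute(
            "UPDATE account SET balance = balance + ? WHERE account_no = ?",
            (amount, account_no),
        )
    return cur.rowcount == 1  # False if no such account exists


deposited = post_deposit(conn, 101, 250.0)
new_balance = check_balance(conn, 101)
```

The naive user never writes a query; a form or menu option simply calls one of these routines with the entered values.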
(3) Sophisticated end users include engineers, scientists, business analysts, and others who
thoroughly familiarize themselves with the facilities of the DBMS so as to implement their applications
to meet their complex requirements.
(4) Stand-alone users maintain personal databases by using ready-made program packages that
provide easy-to-use menu-based or graphics-based interfaces. An example is the user of a tax package
that stores a variety of personal financial data for tax purposes.
Application programmers: An application programmer is the person responsible for
implementing the required database functionality for the end user. The application programmer works
according to the specification provided by the system analyst. These programmers must have knowledge
of programming languages such as C, C++, Java or SQL, since application programs are written in
these languages.
External/View level
The highest level of abstraction where only those parts of the entire database are included which are
of concern to a user. Despite the use of simpler structures at the logical level, some complexity remains,
because of the large size of the database. Many users of the database system will not be concerned with
all this information. Instead, such users need to access only a part of the database. So that their
interaction with the system is simplified, the view level of abstraction is defined. The system may
provide many views for the same database.
Databases change over time as information is inserted and deleted. The collection of information
stored in the database at a particular moment is called an instance of the database. The overall design of
the database is called the database schema. Schemas are changed infrequently, if at all.
Database systems have several schemas, partitioned according to the levels of abstraction that we
discussed. At the lowest level is the physical schema; at the intermediate level is the logical schema and
at the highest level is a subschema.
The features of this view are:
• The external or user view is at the highest level of database architecture.
• Here only one portion of the database is given to the user.
• One portion may have many views.
• Many users and programs can each use the part of the database that interests them.
• By creating separate views of the database, we can maintain security.
• Only limited access (read only, write only etc.) can be provided in this view.
For example: The head of the accounts department is interested only in accounts and not in library
information, while the library department is interested only in books, staff and students. All such data
(students, books, accounts, staff etc.) is present in one place, and every department can use it as
needed.
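The department example can be sketched with SQL views, each exposing only part of one shared table. This is a minimal illustration, assuming SQLite via Python's sqlite3 module; the student table, its columns and the view names accounts_view and library_view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (
        roll_no    INTEGER PRIMARY KEY,
        name       TEXT,
        fees_paid  REAL,
        books_held INTEGER
    );
    INSERT INTO student VALUES (15, 'Amrita', 1500, 2);

    -- External view for the accounts department: no library data.
    CREATE VIEW accounts_view AS
        SELECT roll_no, name, fees_paid FROM student;

    -- External view for the library: no fee data.
    CREATE VIEW library_view AS
        SELECT roll_no, name, books_held FROM student;
    """
)


def columns_of(conn, relation):
    """Column names visible through a table or view."""
    cur = conn.execute(f"SELECT * FROM {relation}")
    return [d[0] for d in cur.description]


accounts_columns = columns_of(conn, "accounts_view")
library_columns = columns_of(conn, "library_view")
```

Both views read the same stored row, yet neither department sees the other's columns, which is how separate views support security.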
Conceptual/Logical level
Database administrators, who must decide what information is to be kept in the database, use this
level of abstraction. One conceptual view represents the entire database. There is only one conceptual
view per database.
The description of data at this level is in a format independent of its physical representation. It also
includes features that specify the checks needed to retain data consistency and integrity.
The features are:
• The conceptual or logical view describes the structure of the data for all users.
• Only the DBA can define it.
• It is the global view seen by many users.
• It is the middle level of the three-level architecture.
• It is defined by specifying the name, type and length of each data item. The CREATE TABLE
command of Oracle defines this view.
• It is independent of all hardware and software.
Internal/Physical level
The lowest level of abstraction describes how the data are actually stored in the database; what data are
stored, and what relationships exist among those data, is described at the logical level. The entire
database is thus described in terms of a small number
of relatively simple structures. Although implementation of the simple structures at the logical level may
involve complex physical-level structures, the user of the logical level does not need to be aware of this
complexity.
The features are:
• It describes the actual or physical storage of data.
• It stores the data on hardware so that it can be stored and accessed
in optimal time.
• It is the third (lowest) level in the three-level architecture.
• It deals with concepts like:
• B-tree and hashing techniques for storage of data.
• Primary keys, secondary keys, pointers and sequences for data search.
• Data compression techniques.
• It is represented as
FILE EMP [
    INDEX ON EMPNO
    FIELD = {
        (EMPNO: BYTE(4),
         ENAME: BYTE(25))
    }
]
Mapping in DBMS Architecture
We know that the three view levels are described by means of three schemas. These schemas are stored in
the data dictionary. In a DBMS, each user refers only to his or her own external schema. Hence, the DBMS
must transform a request on a specified external schema into a request against the conceptual schema, and
then into a request against the internal schema to store and retrieve data to and from the database. The
process of converting a request (from the external level) and its result between view levels is called mapping.
The mapping defines the correspondence between the three view levels. The mapping description is also
stored in the data dictionary. The DBMS is responsible for mapping between these three types of schemas.
There are two mappings (external/conceptual and conceptual/internal), and they provide two types of data independence.
1. Physical data independence:- is the ability to modify the physical schema without
causing application programs to be rewritten. It means we can change the physical storage/level without
affecting the conceptual or external view of the data. The changes are absorbed by the mapping
techniques. Modifications at the physical level are occasionally necessary to improve performance.
Alterations in the internal schema might include:
* Using new storage devices.
* Using different data structures.
* Switching from one access method to another.
* Using different file organizations or storage structures.
* Modifying indexes.
An example of physical data independence: changing the storage device used to store the database
data will not affect the conceptual or external schemas / layers.
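One of the alterations listed above, modifying indexes, can be used to sketch physical data independence: the query text stays the same while the access path underneath it changes. The example assumes SQLite via Python's sqlite3 module; the emp table, the sample names and the index name emp_empno_idx are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, ename TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(1, "Asha"), (2, "Ravi"), (3, "Meena")],
)

QUERY = "SELECT ename FROM emp WHERE empno = ?"  # stands in for the application program


def plan(conn):
    """Text of the access path SQLite chooses for QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (2,)).fetchall()
    return " ".join(str(r[-1]) for r in rows)  # last column holds the description


result_before = conn.execute(QUERY, (2,)).fetchall()
plan_before = plan(conn)

# Physical-level change: add an index, i.e. a new access method.
conn.execute("CREATE INDEX emp_empno_idx ON emp (empno)")

result_after = conn.execute(QUERY, (2,)).fetchall()
plan_after = plan(conn)
```

The result is identical before and after; only the plan text changes to mention the new index, so the program did not have to be rewritten.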
2. Logical data independence:- is the ability to modify the logical schema without causing
application programs to be rewritten. Modifications at the logical level are necessary whenever the
logical structure of the database is altered (for example, when money-market accounts are added to
a banking system).
Logical data independence means that if we add some new columns to a table, or remove some columns
that they do not use, the existing user views and programs should not change. For
example: consider two users A & B. Both are selecting empno and ename. If user B adds a new
column salary to the table, this does not affect the external view of user A, although the underlying schema
of the database has changed for both users A & B. User A can now also print the salary if it is made part of A's view.
[Figure: User A's External View]
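The two-user example can be sketched as follows, assuming SQLite via Python's sqlite3 module. The view name user_a_view and the sample row are hypothetical; the columns empno, ename and salary follow the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT)")
conn.execute("INSERT INTO emp VALUES (1, 'Asha')")

# User A's external view: only empno and ename.
conn.execute("CREATE VIEW user_a_view AS SELECT empno, ename FROM emp")
USER_A_QUERY = "SELECT empno, ename FROM user_a_view"

rows_before = conn.execute(USER_A_QUERY).fetchall()

# User B changes the logical schema by adding a salary column.
conn.execute("ALTER TABLE emp ADD COLUMN salary REAL")
conn.execute("UPDATE emp SET salary = 30000 WHERE empno = 1")

rows_after = conn.execute(USER_A_QUERY).fetchall()  # same program, unchanged
salary_rows = conn.execute("SELECT empno, salary FROM emp").fetchall()
```

User A's query returns the same rows before and after the change, while the new column is available to anyone who asks for it.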
DATABASE LANGUAGES
Data Definition Language (DDL)
The Data Definition Language (DDL) is used to create and destroy databases and database objects.
These commands will primarily be used by database administrators during the setup and removal
phases of a database project. Examples are:
o CREATE - create objects in the database
o ALTER - alter the structure of the database
o DROP - delete objects from the database
o TRUNCATE - remove all records from a table, and also release all the space allocated for those
records
o COMMENT - add comments to the data dictionary
o RENAME - rename an object
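A few of these commands can be tried in a small sketch, assuming SQLite via Python's sqlite3 module. SQLite has no TRUNCATE or COMMENT statement, so only CREATE, ALTER, RENAME and DROP appear, and renaming is written in SQLite's form ALTER TABLE ... RENAME TO. The table names and the helper tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def tables(conn):
    """Names of the tables currently defined in the database."""
    return [
        r[0]
        for r in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]


conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT)")  # CREATE
conn.execute("ALTER TABLE emp ADD COLUMN salary REAL")                    # ALTER
conn.execute("ALTER TABLE emp RENAME TO employee")                        # RENAME
after_rename = tables(conn)

columns = [c[1] for c in conn.execute("PRAGMA table_info(employee)")]

conn.execute("DROP TABLE employee")                                       # DROP
after_drop = tables(conn)
```

Exact DDL syntax varies between DBMSs, so the statements above should be read as one dialect's version of the commands in the list.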
Data Control Language (DCL): It is used to create roles and permissions, and to
control access to the database by securing it. Some examples:
o GRANT - give users access privileges to the database
o REVOKE - withdraw access privileges given with the GRANT command
Transaction Control Language (TCL): It is used to manage the different transactions occurring within a
database. Some examples are:
o COMMIT - save work done
o SAVEPOINT - identify a point in a transaction to which you can later roll back
o ROLLBACK - restore the database to its state as of the last COMMIT
o SET TRANSACTION - change transaction options such as the isolation level and which rollback
segment to use
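COMMIT, SAVEPOINT and ROLLBACK can be sketched as below, assuming SQLite via Python's sqlite3 module. SQLite has no SET TRANSACTION statement, so it is left out. The account table, the savepoint name after_deposit and the helper balance are hypothetical.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# BEGIN / COMMIT / ROLLBACK statements below are issued exactly as written.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (account_no INTEGER PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO account VALUES (1, 100.0)")


def balance(conn):
    """Current balance of account 1."""
    return conn.execute(
        "SELECT balance FROM account WHERE account_no = 1"
    ).fetchone()[0]


conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance + 50 WHERE account_no = 1")
conn.execute("SAVEPOINT after_deposit")    # SAVEPOINT: mark a point to return to
conn.execute("UPDATE account SET balance = balance - 500 WHERE account_no = 1")
conn.execute("ROLLBACK TO after_deposit")  # undo only the withdrawal
conn.execute("COMMIT")                     # COMMIT: the deposit is saved
after_commit = balance(conn)

conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 0 WHERE account_no = 1")
conn.execute("ROLLBACK")                   # ROLLBACK: back to the last COMMIT
after_rollback = balance(conn)
```

After the first transaction the balance reflects only the deposit, and the second transaction leaves no trace because it was rolled back.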
Data Model
• Data Model: A set of concepts to describe the structure of a database, and certain constraints
that the database should obey.
Categories of data models:
• High-level, semantic data models: Provide concepts that are close to the way many users
perceive data. (Also called entity-based or object-based data models.)
• Physical (low-level, internal) data models: Provide concepts that describe details of how
data is stored in the computer.
• Implementation (representational) data models: Provide concepts that fall between the
above two, balancing user views with some computer storage details.
Hierarchical Model
• The Hierarchical Model is based on a tree structure. A hierarchical database consists of a collection of
records that are connected to each other by links. The root node is a dummy node or an empty
node.
• ADVANTAGES:
• Hierarchical Model is simple to construct and operate on
• Corresponds to a number of natural hierarchically organized domains - e.g., assemblies
in manufacturing, personnel organization in companies
• DISADVANTAGES:
• It cannot represent all the relationships found in the real world.
• Maintaining the database is a very difficult task.
• Little scope for "query optimization"
• Wastage of storage space
• Inconsistency during updating of the database, because deleting a parent node
forcibly deletes its child nodes.
• Not flexible.
• Commercially available hierarchical database systems:
• IBM's Information Management System
• MRI's System 2000
• IMS Informatics Mark IV
• Time-shared Data Management System of SDC
Network Model
ADVANTAGES
The network model can handle the one-to-many and many-to-many relationships.
In network database terminology, a relationship is a set. Each set comprises two types of
records: an owner record and a member record. In a network model an application can access
an owner record and all the member records within a set.
Data Independence
The network model draws a clear line of demarcation between programs and the complex
physical storage details. The application programs work independently of the data. Any changes
made in the data characteristics do not affect the application program.
• DISADVANTAGES
Making structural modifications to the database is very difficult in the network database model
as the data access method is navigational. Any changes made to the database structure require
the application programs to be modified before they can access data. Though the network model
achieves data independence, it still fails to achieve structural independence.
Relational Model
• The relational model uses the basic concept of a relation or table. The columns or fields in the
table identify the attributes such as name, age, and so on. A tuple or row contains all the data of a
single instance of the table such as a person named Doug. In the relational model, every tuple
must have a unique identification or key based on the data. In this figure, a social security
account number (SSAN) is the key that uniquely identifies each tuple in the relation.
ADVANTAGES:-
1. Ease of use: The representation of information as tables consisting of rows and columns is quite
natural, and therefore even first-time users find it attractive.
2. Flexibility: Different tables from which information has to be linked and extracted can be
easily manipulated by operators such as project and join to give information in the form in
which it is desired.
3. Precision: The use of relational algebra and relational calculus in the manipulation of the
relations between the tables ensures that there is no ambiguity.
4. Security: Security control and authorization can also be implemented more easily by moving
sensitive attributes in a given table into a separate relation with its own authorization controls.
5. Structural Independence
6. Easy to Design
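The project and join operators mentioned under Flexibility can be sketched in SQL, assuming SQLite via Python's sqlite3 module. The employee and department tables and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER, salary REAL
    );
    INSERT INTO department VALUES (10, 'Accounts'), (20, 'Library');
    INSERT INTO employee VALUES
        (1, 'Asha', 10, 30000), (2, 'Ravi', 20, 25000), (3, 'Meena', 10, 32000);
    """
)

# Project: keep only some columns of one relation.
projection = conn.execute(
    "SELECT name, salary FROM employee ORDER BY id"
).fetchall()

# Join: combine rows of two relations that agree on department_id,
# then project the two columns of interest.
joined = conn.execute(
    """
    SELECT e.name, d.department_name
    FROM employee AS e
    JOIN department AS d ON d.department_id = e.department_id
    ORDER BY e.id
    """
).fetchall()
```

The department name is stored once and linked to each employee at query time, rather than being repeated in every employee row.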
Disadvantage:-
• A major constraint, and therefore disadvantage, in the use of a relational database system is
machine performance. If the number of tables between which relationships are to be established is
large and the tables themselves are voluminous, the performance in responding to queries
degrades.
• It needs more powerful computing hardware and data storage devices, which increases the cost and hardware
overhead.
• A key is an attribute (also known as a column or field) or a combination of attributes that is used to
identify records. Sometimes we have to retrieve data from more than one table; in those cases
we need to join tables with the help of keys. The purpose of the key is to bind data together across
tables without repeating all of the data in every table.
The various types of key, with examples in SQL terms, are described below. (For the examples, suppose we have
an Employee table with attributes 'ID', 'Name', 'Address', 'Department_ID', 'Salary'.)
1. Primary Key
A key is a single attribute, or a combination of two or more attributes, of an entity set that is used to
identify one or more instances of the set. The attribute Roll # uniquely identifies an instance of the
entity set STUDENT. On the basis of Roll No. 15 it tells us about student Amrita, who has address 101, Kashmir Avenue and phone no.
112746 and has paid fees of 1500. The value 15 is unique and it gives a unique
identification of the student. So here Roll No is a unique attribute, and such a unique identifier is called the
Primary Key. Primary key values cannot be duplicated.
The primary key is important since it is the sole identifier for the tuples in a relation. Any tuple in
a database may be identified by specifying the relation name, the primary key and its value. Also, for a tuple to
exist in a relation, it must be identifiable and therefore it must have a primary key.
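The STUDENT example can be sketched as follows, assuming SQLite via Python's sqlite3 module. The column names, the second student's values and the helper try_insert are hypothetical; the row for Roll No. 15 follows the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,   -- the primary key from the text
        name    TEXT,
        address TEXT,
        fees    REAL
    )
    """
)
conn.execute("INSERT INTO student VALUES (15, 'Amrita', '101, Kashmir Avenue', 1500)")


def try_insert(conn, row):
    """Return True if the row was stored, False if a key rule rejected it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_accepted = try_insert(conn, (15, "Someone Else", "Elsewhere", 0))
new_accepted = try_insert(conn, (16, "Someone Else", "Elsewhere", 0))

# The key value alone is enough to find exactly one tuple.
found = conn.execute("SELECT name FROM student WHERE roll_no = ?", (15,)).fetchall()
```

A second row with Roll No. 15 is refused, while a row with a new roll number is accepted.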
2. Secondary Key
Attributes that are not even a Super Key but can still be used to identify records (not
uniquely) are known as Secondary Keys. Examples of Secondary Keys are Name, Address, Salary,
Department_ID etc., as they can identify records but might not be unique.
3. Super Key
If we add additional attributes to a primary key, the resulting combination would still uniquely
identify an instance of the entity set. Such keys are called super keys. A primary key is therefore a
minimum super key. For example, if ID is the primary key of the Employee table, then the combination
(ID, Name) still identifies every employee uniquely and is a super key, although Name adds nothing to the
identification. Super keys are seldom used directly in a small database, but the concept matters because
candidate keys and primary keys are defined in terms of it.
4. Candidate Key
• Candidate Key: It can be defined as a minimal Super Key or irreducible Super Key. In other
words, it is an attribute or a combination of attributes that identifies the record uniquely, while none of its proper
subsets can identify the records uniquely.
E.g. of Candidate Key
1 Code
2 Name, Address
For the above table we have only two Candidate Keys (i.e. irreducible Super Keys) that identify the
records of the table uniquely. Code can identify a record uniquely, and similarly the combination
of Name and Address can identify a record uniquely, but neither Name nor Address alone can be used to
identify records uniquely, as we might have two employees with the same name or
two employees from the same house.
5. Alternate Key – Alternate Key can be any of the Candidate Keys except for the Primary Key.
E.g. of Alternate Key is combination of “Name, Address” as it is the only other Candidate Key which is
not a Primary Key.
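The difference between super keys and candidate keys can be checked mechanically on sample data. The sketch below uses hypothetical rows with the attributes Code, Name and Address from the example. Uniqueness in one sample only shows which attribute sets are consistent with being keys; whether they really are keys is a design decision about all possible data.

```python
from itertools import combinations

# Hypothetical sample rows; attribute names follow the example in the text.
ATTRIBUTES = ("Code", "Name", "Address")
ROWS = [
    {"Code": 1, "Name": "Asha", "Address": "12 Park Road"},
    {"Code": 2, "Name": "Asha", "Address": "7 Lake View"},
    {"Code": 3, "Name": "Ravi", "Address": "12 Park Road"},
]


def is_super_key(attrs, rows):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def super_keys(rows):
    """Every attribute combination that identifies the rows uniquely."""
    return [
        set(c)
        for n in range(1, len(ATTRIBUTES) + 1)
        for c in combinations(ATTRIBUTES, n)
        if is_super_key(c, rows)
    ]


def candidate_keys(rows):
    """Super keys with no proper subset that is also a super key."""
    sks = super_keys(rows)
    return [k for k in sks if not any(other < k for other in sks)]


all_super_keys = super_keys(ROWS)
minimal_keys = candidate_keys(ROWS)
```

On this sample there are five super keys but only two minimal ones, Code and (Name, Address). If Code is chosen as the primary key, (Name, Address) is the alternate key.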
6. Foreign Key
Foreign Key: A foreign key is an attribute or combination of attributes in one base table that points to
a candidate key (generally the primary key) of another table. The purpose of the foreign key is to
ensure referential integrity of the data, i.e. only values that are supposed to appear in the database are
permitted.
E.g. of Foreign Key: Suppose we have another table, the Department table, with attributes
"Department_ID", "Department_Name", "Manager_ID", "Location_ID" and with Department_ID as its
Primary Key. The Department_ID attribute of the Employee table (the dependent or child table) can be
defined as a Foreign Key, as it references the Department_ID attribute of the Department table
(the referenced or parent table). A Foreign Key value must match an existing value in the parent table or
be NULL.
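The rule that a foreign key value must match a parent row or be NULL can be sketched as below, assuming SQLite via Python's sqlite3 module. SQLite checks foreign keys only after PRAGMA foreign_keys = ON. The tables are cut down to a few columns, and the sample rows and the helper try_insert are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE department (
        department_id   INTEGER PRIMARY KEY,
        department_name TEXT
    );
    CREATE TABLE employee (
        id            INTEGER PRIMARY KEY,
        name          TEXT,
        department_id INTEGER REFERENCES department (department_id)
    );
    INSERT INTO department VALUES (10, 'Accounts');
    """
)


def try_insert(conn, row):
    """Return True if the employee row was stored, False if it was rejected."""
    try:
        with conn:
            conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


matching = try_insert(conn, (1, "Asha", 10))   # parent row exists
null_fk = try_insert(conn, (2, "Ravi", None))  # NULL is permitted
dangling = try_insert(conn, (3, "Meena", 99))  # there is no department 99
```

Only the third insert fails, because it refers to a department that does not exist in the parent table.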
Constraints
• Constraints within a database are rules which control values allowed in columns and also
enforce the integrity between columns and tables
Entity integrity
• The entity integrity constraint states that no primary key value can be null. This is because the
primary key value is used to identify individual tuples in a relation. Having a null value for the
primary key implies that we cannot identify some tuples. It also specifies that there may not
be any duplicate entries in the primary key column.
Referential integrity
• The referential integrity constraint is specified between two relations and is used to maintain the
consistency among tuples in the two relations. Informally, the referential integrity constraint
states that a tuple in one relation that refers to another relation must refer to an existing tuple in
that relation. It is a rule that maintains consistency among the rows of the two relations.
Domain Constraints:
• A Domain constraint deals with one or more columns. It is important to ensure that a particular
column or a set of columns meets particular criteria. When you insert or update a row, the
constraint is applied without respect to any other row in the table. The focus is on the data that
is in the column. These kinds of constraints will resurface when we deal with Check
constraints, Default constraints and rules and defaults.
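Column-level rules of this kind can be sketched with NOT NULL, DEFAULT and CHECK, assuming SQLite via Python's sqlite3 module. The employee table, the grade values and the helper accepts are hypothetical. Each rule looks only at the row being inserted or updated, never at other rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,                                -- a value is required
        salary REAL NOT NULL DEFAULT 0 CHECK (salary >= 0),  -- default plus range rule
        grade  TEXT DEFAULT 'C' CHECK (grade IN ('A', 'B', 'C'))
    )
    """
)


def accepts(conn, sql, params=()):
    """True if the statement satisfies every column constraint."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


ok_defaults = accepts(conn, "INSERT INTO employee (id, name) VALUES (1, 'Asha')")
bad_salary = accepts(conn, "INSERT INTO employee (id, name, salary) VALUES (2, 'Ravi', -5)")
bad_grade = accepts(conn, "UPDATE employee SET grade = 'Z' WHERE id = 1")
no_name = accepts(conn, "INSERT INTO employee (id, name) VALUES (3, NULL)")

stored = conn.execute("SELECT salary, grade FROM employee WHERE id = 1").fetchone()
```

The first insert succeeds and picks up both defaults; the other three statements are rejected by the CHECK and NOT NULL rules.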
A constraint is a property assigned to a column or a set of columns in a table that prevents
certain types of inconsistent data values from being placed in the column(s). Constraints are
used to enforce data integrity. This ensures the accuracy and reliability of the data in the
database. (In general usage, a constraint is the threat or use of force to prevent, restrict, or dictate the action or
thought of others.) There are 7 types of constraints, and they are grouped into 4 groups.
[Table: constraint TYPES and GROUP]
The Entity-Relationship Model (E-R Model) is a high-level conceptual data model developed
by Chen in 1976 to facilitate database design. Conceptual modeling is an important phase in designing
a successful database. A conceptual data model is a set of concepts that describe the structure of a
database and the associated retrieval and update transactions on the database. A high-level model is
chosen so that all the technical aspects are also covered.
The E-R data model grew out of the exercise of using commercially available DBMSs to model
the database. The E-R model is a generalization of the earlier commercial models such as the
Hierarchical and the Network Model. It also allows the representation of the various constraints as well
as their relationships.
To sum up, the Entity-Relationship (E-R) Model is based on a view of the real world that
consists of a set of objects called entities, and relationships among entity sets, which are basically groups
of similar objects. A relationship between entity sets is represented by a named E-R relationship and
is of 1:1, 1:N or M:N type, which tells the mapping from one entity set to another.
The E-R model is shown diagrammatically using Entity-Relationship (E-R) diagrams, which
represent the elements of the conceptual model and show the meanings of, and the relationships between,
those elements independent of any particular DBMS and implementation details.
1. The E-R diagram used for representing the E-R Model can be easily converted into relations (tables) in
the Relational Model.
2. The E-R Model is used by the database developer for good database design, so that the
data model can be used in various DBMSs.
3. It is helpful as a problem decomposition tool, as it shows the entities and the relationships between
those entities.
4. It is inherently an iterative process. On later modifications, new entities can be inserted into the
model.
5. It is simple and easy to understand for various types of users and designers because specific
standards are used for its representation.
Entity
An entity is an object that exists and is distinguishable from other objects. It may be an object with
physical existence, such as a lecturer, student or car, or an object with conceptual or logical existence, such as a
course, job or position.
Entity Type:- a collection of similar entities
• A set of entities that have the same attributes is called an entity type. Each entity type in the
database is described by a name and a list of attributes. For example, EMPLOYEE is an
entity type that has Name, Age and Salary attributes.
• The individual entities of a particular entity type are grouped into a collection or entity set,
which is also called the extension of the entity type.
An entity is a thing in the real world. It may be an object with a physical existence or an object
with a conceptual existence. A set of such entities having the same attributes is an entity type, and the
collection of individual entities of an entity type is an entity set.
An entity type is like "fruit", which is a class. We have never seen a "fruit" as such, though we have
seen instances of fruit such as apple, banana and mango. Hence:
fruit = entity type = EMPLOYEE
apple = entity = e1 or e2 or e3
entity set = bucket of apple, banana, mango etc. = {e1, e2, ...}
• STRONG ENTITY SETS
An entity set that contains a key attribute is called a strong entity set or regular entity type. For
example, the STUDENT entity has a key attribute Roll No which uniquely identifies it, hence
it is a strong entity set.
• WEAK ENTITY SETS
An entity set may not have sufficient attributes to form a primary key. Entity types that do not
contain any key attributes, and hence cannot be identified independently, are called weak entity
sets.
• A weak entity can be identified uniquely only by considering some of its attributes in
conjunction with the primary key attribute of another entity, which is called the identifying
owner entity.
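One common way to realize a weak entity set in a relational database is a composite primary key that includes the owner's key. The sketch below assumes a hypothetical EMPLOYEE owner and a DEPENDENT weak entity; neither the names nor the columns come from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    -- Strong entity set: has its own key attribute.
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL
    );
    -- Weak entity set: dep_name alone is not unique, so the key is
    -- the owner's key plus the partial key dep_name.
    CREATE TABLE dependent (
        emp_id   INTEGER NOT NULL REFERENCES employee(emp_id),
        dep_name TEXT    NOT NULL,
        PRIMARY KEY (emp_id, dep_name)
    );
    """
)

conn.execute("INSERT INTO employee VALUES (1, 'Meera'), (2, 'John')")
# The same dependent name may appear under different owners.
conn.execute("INSERT INTO dependent VALUES (1, 'Anu'), (2, 'Anu')")

# A dependent without an existing owner cannot be stored.
try:
    conn.execute("INSERT INTO dependent VALUES (99, 'Anu')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

dependents = conn.execute(
    "SELECT emp_id, dep_name FROM dependent ORDER BY emp_id"
).fetchall()
print(orphan_rejected, dependents)
```

Here emp_id plays the role of the identifying owner's primary key, and dep_name is the attribute of the weak entity that completes the identification.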
Attributes
Entities are represented by means of their properties, called attributes. All attributes have values. For
example, a student entity may have name, class, and age as attributes.
There exists a domain or range of values that can be assigned to attributes. For example, a student's
name cannot be a numeric value. It has to be alphabetic. A student's age cannot be negative, etc.
Types of Attributes
Simple attribute − Simple attributes hold atomic values, which cannot be divided further. For
example, a student's RollNumber is an atomic value of 7 digits.
Composite attribute − Composite attributes are made of more than one simple attribute. For
example, a student's complete name may have FirstName and LastName.
Derived attribute − Derived attributes are attributes that do not exist in the physical
database; their values are derived from other attributes present in the database. For example,
average_salary in a department should not be saved directly in the database; instead, it can be
derived. As another example, Age can be derived from BirthDate.
Single-value attribute − Single-value attributes contain a single value. For example,
Social_Security_Number.
Multi-value attribute − Multi-value attributes may contain more than one value. For example,
a person can have more than one Phone Number, Email Address, etc.
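The attribute types listed above can be sketched with Python dataclasses. The class and field names below are illustrative assumptions, chosen to mirror the examples in the text (RollNumber, FirstName, LastName, BirthDate, Phone Number).

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Name:
    """Composite attribute: made of two simple attributes."""
    first_name: str
    last_name: str


@dataclass
class Student:
    roll_number: int                 # simple, single-valued attribute
    name: Name                       # composite attribute
    birth_date: date                 # stored attribute
    phone_numbers: List[str] = field(default_factory=list)  # multi-valued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from birth_date, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.birth_date.month,
            self.birth_date.day,
        )
        return today.year - self.birth_date.year - (0 if had_birthday else 1)


s = Student(1234567, Name("Asha", "Rao"), date(2005, 6, 15), ["555-0101", "555-0102"])
print(s.age(date(2024, 6, 14)))  # the day before a birthday
print(s.age(date(2024, 6, 15)))  # on the birthday
```

Keeping age as a method rather than a stored field reflects the point that a derived attribute is recomputed from other data instead of being saved.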
Relationship
The association among entities is called a relationship. For example, an employee works_at a
department and a student enrolls in a course. Here, Works_at and Enrolls are called relationships.
Relationship Set
A set of relationships of similar type is called a relationship set. Like entities, a relationship too can
have attributes. These attributes are called descriptive attributes.
Degree of Relationship
The number of participating entities in a relationship defines the degree of the relationship.
Binary = degree 2
Ternary = degree 3
N-ary = degree n
Mapping Cardinalities
Mapping cardinality defines the number of entities in one entity set that can be associated with
entities of the other set via a relationship set.
One-to-one − One entity from entity set A can be associated with at most one entity of entity
set B and vice versa.
One-to-many − One entity from entity set A can be associated with more than one entity of
entity set B; however, an entity from entity set B can be associated with at most one entity of entity set A.
Many-to-one − An entity from entity set A can be associated with at most one
entity of entity set B; however, an entity from entity set B can be associated with more than one
entity from entity set A.
Many-to-many − One entity from A can be associated with more than one entity from B and
vice versa.
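To make the four cases concrete, the following sketch classifies a relationship set given as (a, b) pairs. The function name and sample data are hypothetical. Note that it only describes the pairs it is given, whereas a mapping cardinality in a design is a rule about all allowed pairs.

```python
from collections import defaultdict


def mapping_cardinality(pairs):
    """Classify observed (a, b) pairs as '1:1', '1:N', 'N:1' or 'M:N'."""
    b_per_a = defaultdict(set)  # entities of B linked to each entity of A
    a_per_b = defaultdict(set)  # entities of A linked to each entity of B
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)

    a_has_many = any(len(bs) > 1 for bs in b_per_a.values())
    b_has_many = any(len(a_s) > 1 for a_s in a_per_b.values())

    if a_has_many and b_has_many:
        return "M:N"
    if a_has_many:
        return "1:N"
    if b_has_many:
        return "N:1"
    return "1:1"


print(mapping_cardinality([("dept1", "emp1"), ("dept1", "emp2")]))
print(mapping_cardinality([("emp1", "dept1"), ("emp2", "dept1")]))
print(mapping_cardinality([("s1", "c1"), ("s1", "c2"), ("s2", "c1")]))
print(mapping_cardinality([("p1", "passport1"), ("p2", "passport2")]))
```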
Let us now learn how the ER Model is represented by means of an ER diagram. Any object, for
example, entities, attributes of an entity, relationship sets, and attributes of relationship sets, can be
represented with the help of an ER diagram.
Entity
An entity is a person, place, thing or event for which data is collected and maintained.
For example, a library system may contain data about different entities like BOOK and MEMBER. A
college system may include entities like STUDENT, TEACHER and CLASS.
Entities are represented by means of rectangles. Rectangles are named with the entity set they represent.
Relationships
Relationships are represented by a diamond-shaped box. The name of the relationship is written inside the
diamond. All the entities (rectangles) participating in a relationship are connected to it by a line.
Binary Relationship and Cardinality
A relationship in which two entities participate is called a binary relationship. Cardinality is the
number of instances of an entity that can be associated with the relationship.
One-to-one − When only one instance of each entity is associated with the relationship, it is
marked as '1:1'. The following image reflects that only one instance of each entity should be
associated with the relationship. It depicts a one-to-one relationship.
One-to-many − When one instance of one entity and more than one instance of the other entity are
associated with a relationship, it is
marked as '1:N'. The following image reflects that only one instance of the entity on the left and
more than one instance of the entity on the right can be associated with the relationship. It
depicts a one-to-many relationship.
Many-to-one − When more than one instance of one entity and only one instance of the other entity are
associated with the relationship, it is
marked as 'N:1'. The following image reflects that more than one instance of the entity on the left
and only one instance of the entity on the right can be associated with the relationship. It depicts a
many-to-one relationship.
Many-to-many − The following image reflects that more than one instance of the entity on the
left and more than one instance of the entity on the right can be associated with the relationship.
It depicts a many-to-many relationship.
Participation Constraints
Total participation − Each instance of an entity is involved in the relationship. Total
participation is represented by double lines.
Partial participation − Not all instances of an entity are involved in the relationship. Partial
participation is represented by single lines.
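In a relational schema, total participation on the "many" side of a relationship is often approximated with a NOT NULL foreign key. The sketch below is one such approximation, using a hypothetical Works_at relationship in which every employee must belong to a department while a department may have no employees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        -- NOT NULL: every employee takes part in Works_at (total participation)
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    );
    """
)

conn.execute("INSERT INTO department VALUES (10, 'Sales'), (20, 'Research')")
conn.execute("INSERT INTO employee VALUES (1, 'Meera', 10)")

# An employee with no department violates total participation.
try:
    conn.execute("INSERT INTO employee (emp_id, name) VALUES (2, 'John')")
    unassigned_rejected = False
except sqlite3.IntegrityError:
    unassigned_rejected = True

# Departments may exist without employees (partial participation).
empty_departments = conn.execute(
    "SELECT name FROM department "
    "WHERE dept_id NOT IN (SELECT dept_id FROM employee)"
).fetchall()
print(unassigned_rejected, empty_departments)
```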
Extended (Enhanced) ER Model:
• The ER modeling concepts are sufficient for representing traditional database applications. More
complex database applications, such as telecommunications, CAD/CAM and GIS, have
more complex requirements than traditional applications. Since the late 1970s, database designers
have tried to design a more accurate ER model that reflects the data properties and constraints
more precisely. The extended (enhanced) ER model therefore has some additional features compared with the normal
ER model. It uses the concepts of Specialization, Generalization and Aggregation.
Generalization, Specialization and Aggregation in DBMS are abstraction mechanisms used to model
information. Abstraction is the mechanism used to hide the superfluous details of a set of objects.
For example, vehicle is an abstraction that includes the types car, jeep and bus.
Specialization may be seen as the reverse process of Generalization. Specialization is the abstraction
process of introducing new characteristics to an existing class of objects to create one or more new
classes of objects.
In specialization, a group of entities is divided into sub-groups based on their characteristics. Take the
group 'Person' for example. A person has a name, date of birth, gender, etc. These properties are
common to all persons. But in a company, persons can be identified as employee,
employer, customer, or vendor, based on what role they play in the company.
Similarly, in a school database, persons can be specialized as teacher, student, or staff member, based on what
role they play in the school as entities.
Generalization
• Generalization is just the reverse of Specialization. Generalization is the process of defining a
generalized entity type from the given entity types. In generalization, a number of entities are
brought together into one generalized entity based on their similar characteristics. For example,
pigeon, house sparrow, crow and dove can all be generalized as Birds.
For example, consider the two entities CAR and TRUCK. Because both have some common attributes, they
can be combined into a super entity called VEHICLE. So generalization is the process of identifying the common
features (attributes) of two or more entities and generalizing them into a super entity.
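The CAR and TRUCK example can be sketched as a set intersection: the attributes shared by both entity types move up to VEHICLE, and the rest stay with the subtypes. The attribute names below are made up for illustration.

```python
# Hypothetical attribute lists for the two entity types.
car = {"reg_no", "price", "max_speed", "passengers"}
truck = {"reg_no", "price", "axles", "tonnage"}

# Generalization: the common attributes form the super entity VEHICLE.
vehicle = car & truck

# What remains specific to each subtype after the common part is factored out.
car_only = car - vehicle
truck_only = truck - vehicle

print(sorted(vehicle))
print(sorted(car_only))
print(sorted(truck_only))
```

Read in the other direction (starting from VEHICLE and adding the subtype-specific attributes), the same picture describes specialization.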
Aggregation:
• Aggregation is the process in which a relationship between two entities is treated as a single entity. Here the
relationship between Center and Course acts as an entity in a relationship with Visitor.
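One possible relational rendering of the Center, Course and Visitor example is sketched below. The relationship names offers and enquiry, and all column names, are assumptions for this illustration. The point is that enquiry refers to an (offers) pair as a whole, not to Center and Course separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE center  (center_id  INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE visitor (visitor_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);

    -- Relationship between Center and Course.
    CREATE TABLE offers (
        center_id INTEGER NOT NULL REFERENCES center(center_id),
        course_id INTEGER NOT NULL REFERENCES course(course_id),
        PRIMARY KEY (center_id, course_id)
    );

    -- Aggregation: Visitor relates to an (offers) pair treated as one entity.
    CREATE TABLE enquiry (
        visitor_id INTEGER NOT NULL REFERENCES visitor(visitor_id),
        center_id  INTEGER NOT NULL,
        course_id  INTEGER NOT NULL,
        PRIMARY KEY (visitor_id, center_id, course_id),
        FOREIGN KEY (center_id, course_id) REFERENCES offers(center_id, course_id)
    );
    """
)

conn.execute("INSERT INTO center VALUES (1, 'City Center')")
conn.execute("INSERT INTO course VALUES (100, 'DBMS'), (200, 'Networks')")
conn.execute("INSERT INTO visitor VALUES (7, 'Kiran')")
conn.execute("INSERT INTO offers VALUES (1, 100)")

# Enquiring about a course that the center offers is allowed.
conn.execute("INSERT INTO enquiry VALUES (7, 1, 100)")

# Enquiring about a pair that is not in offers is rejected.
try:
    conn.execute("INSERT INTO enquiry VALUES (7, 1, 200)")
    bad_enquiry_rejected = False
except sqlite3.IntegrityError:
    bad_enquiry_rejected = True

enquiries = conn.execute("SELECT visitor_id, center_id, course_id FROM enquiry").fetchall()
print(bad_enquiry_rejected, enquiries)
```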
Inheritance
We use all the above features of the ER Model in order to create classes of objects in object-oriented
programming. The details of entities are generally hidden from the user; this process is known
as abstraction.
For example, the attributes of a Person class such as name, age, and gender can be inherited by lower-level
entities such as Student or Teacher.
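The Person example can be written directly as a small class hierarchy. This is only a sketch: the extra fields roll_number and subject are assumptions added so that each lower-level entity has something of its own.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int
    gender: str


@dataclass
class Student(Person):
    roll_number: int  # hypothetical attribute specific to Student


@dataclass
class Teacher(Person):
    subject: str      # hypothetical attribute specific to Teacher


# Inherited attributes come first, followed by the subtype's own attributes.
student_attrs = [f.name for f in fields(Student)]
teacher_attrs = [f.name for f in fields(Teacher)]
print(student_attrs)
print(teacher_attrs)

t = Teacher(name="Latha", age=41, gender="F", subject="DBMS")
print(isinstance(t, Person))
```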