DBMS Material Module-1 & 2


MODULE I: Introduction:

What is a Database Management System?


Introduction
What is data?
Data are known facts or figures that have implicit meaning. Data can also be defined as the
representation of facts, concepts, or instructions in a formal manner that is suitable for
understanding and processing. Data can be represented using alphabets (A-Z, a-z), digits (0-9),
and special characters (+, -, #, $, etc.).
Eg: 25, “ajit”, etc.
Information: Information is processed data on which decisions and actions are based.
It can be defined as data that has been organized and classified so that it provides meaningful values.
Eg: “The age of Ravi is 25”
Database:
A database is an organized collection of related data of an organization, stored in a formatted
way and shared by multiple users.
Database Management System (DBMS):
A database management system consists of a collection of related data together with a set of
programs for defining, creating, maintaining, and manipulating a database.
Advantage of DBMS over File Processing System:
File System: A file management system allows access to only a single file or table at a time.
In a file system, data is stored directly in a set of files. It contains flat files that have
no relation to other files (when only one table is stored in a single file, that file is known as
a flat file).
DBMS: A Database Management System (DBMS) is application software that allows users
to efficiently define, create, maintain, and share databases. Defining a database involves
specifying the data types, structures, and constraints of the data to be stored in the database.
Creating a database involves storing the data on some storage medium that is controlled by the
DBMS. Maintaining a database involves updating the database whenever required so that it evolves
to reflect changes in the mini world, and generating reports on those changes. Sharing a database
involves allowing multiple users to access the database. The DBMS also serves as an interface
between the database and end users or application programs. It provides controlled access to the
data and ensures that the data is consistent and correct by defining rules on it.
An application program accesses the database by sending queries or requests for data to the
DBMS. A query causes some data to be retrieved from the database.
Advantages of DBMS over File system:
• Data redundancy and inconsistency: Redundancy is the repetition of data, i.e. each data item
may have more than a single copy. The file system cannot control the redundancy of data,
because each user defines and maintains the files needed for a specific application. Two users
may be maintaining the same data in separate files for different applications, so changes made
by one user are not reflected in the files used by the other, which leads to inconsistency of
data. A DBMS, in contrast, controls redundancy by maintaining a single repository of data that
is defined once and accessed by many users. As there is little or no redundancy, the data
remains consistent.
• Data sharing: The file system does not allow sharing of data, or sharing is too complex.
In a DBMS, data can be shared easily because the system is centralized.
• Data concurrency: Concurrent access to data means that more than one user is accessing the
same data at the same time. Anomalies occur when changes made by one user are lost because of
changes made by another user. The file system does not provide any procedure to prevent these
anomalies, whereas a DBMS provides a locking system that prevents them.
• Data searching: For every search operation performed on the file system, a different
application program has to be written. A DBMS provides built-in searching operations, so the
user only has to write a small query to retrieve data from the database.
• Data integrity: Some constraints may need to be applied to the data before it is inserted into
the database. The file system does not provide any procedure to check these constraints
automatically, whereas a DBMS maintains data integrity by enforcing user-defined constraints
on the data itself.
• System crashes: Systems may crash for various reasons. In a file system, data lost in a crash
generally cannot be recovered. A DBMS has a recovery manager that restores the data, which is
another advantage over file systems.
• Data security: A file system typically offers only a password mechanism to protect the data,
and a password alone cannot be guaranteed to stay secret. A DBMS has specialized features,
such as access control, that help protect its data.
• Backup: A DBMS provides a backup subsystem to restore the data if required.
• Interfaces: A DBMS provides multiple user interfaces, such as graphical user interfaces and
application program interfaces.
• Easy maintenance: A DBMS is easily maintainable due to its centralized nature.
DBMS technology is continuously evolving. It is a powerful tool for data storage and
protection. In the coming years, we may see AI-based DBMSs, for example to retrieve data
from databases of earlier eras.
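The data-integrity point in the list above can be sketched in a few lines. This is only an illustration, assuming Python's built-in sqlite3 module as a stand-in for a full DBMS; the person table and its columns are hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical table for illustration; SQLite stands in for a full DBMS here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    " name TEXT NOT NULL,"
    " age INTEGER CHECK (age >= 0)"  # user-defined integrity constraint
    ")"
)

# A row that satisfies the constraints is accepted.
conn.execute("INSERT INTO person (name, age) VALUES ('Ravi', 25)")

# A row that violates the CHECK constraint is rejected by the DBMS itself,
# without any checking code in the application.
try:
    conn.execute("INSERT INTO person (name, age) VALUES ('Ajit', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(rejected, row_count)
```

With a plain file, the same check would have to be repeated in every program that writes the file.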

Applications of DBMS:
Database management systems are used in many different fields. The following are a few
applications that rely on a DBMS –
1. Railway Reservation System –
In a railway reservation system, the database stores records of ticket bookings and the arrival
and departure status of trains. If a train is running late, passengers learn about it through
updates to the database.
2. Library Management System –
A library holds a large number of books, so it is difficult to keep a record of all of them in a
register or notebook. A DBMS is therefore used to maintain the information related to each book,
such as its name, author, issue date, and availability.
3. Banking –
A DBMS is used to store the transaction information of customers in the database.
4. Education Sector –
Many schools and colleges now conduct examinations online and manage all examination data
through a DBMS. In addition, student registration details, grades, courses, fees, attendance,
results, and so on are all stored in the database.
5. Credit Card Transactions –
A DBMS is used to record purchases made on credit cards and to generate monthly statements.
6. Social Media Sites –
We all use social media sites to connect with friends and to share our views with the world.
Every day, many people sign up for accounts on social media sites such as Pinterest, Facebook,
Twitter, and Google Plus. A DBMS stores all the user information in a database, which is what
allows users to connect with one another.
7. Telecommunications –
No telecommunications company can operate without a DBMS. It is essential for storing call
details and monthly postpaid bills in the database.
8. Finance –
A DBMS is used to store information about sales, holdings, and purchases of financial
instruments such as stocks and bonds.
9. Online Shopping –
Online shopping has become a major trend. Many people prefer to shop from home through online
shopping websites (for example, Amazon, Flipkart, Snapdeal) rather than spend time visiting a
shop. Products are listed and sold with the help of a DBMS, and invoices, payments, and purchase
information are also handled through it.
10. Human Resource Management –
Large firms and organizations have many workers or employees. They store information about
employees' salaries, tax, and work with the help of a DBMS.
11. Manufacturing –
Manufacturing companies make many kinds of products and sell them regularly. A DBMS is used to
keep information about their products, such as bills, purchases, quantities, and supply chain
management.
12. Airline Reservation System –
This system is similar to the railway reservation system. It also uses a DBMS to store records
of flight departures, arrivals, and delay status.
Purpose of database system:
The purpose of a database system is to make the database user-friendly and its operations easy:
users can easily insert, update, and delete data. More fundamentally, the main purpose is to
give more control over the data.
Database systems are designed to address the following problems and needs:
• Data redundancy and inconsistency,
• Difficulty in accessing data,
• Data isolation,
• Atomicity of updates,
• Concurrent access,
• Security problems, and
• Support for multiple views of data.

Avoid data redundancy and inconsistency:
A database system avoids keeping multiple copies of the same data by maintaining the data in a
single repository. This also helps keep the database consistent.
Difficulty in accessing data:
A database system makes data easy to access: data can be retrieved from the database through
different queries.
Data isolation:
In a file system, data is scattered across various files, possibly in different formats, which
makes it hard to write programs that retrieve the appropriate data. A database system keeps the
data together in one database.
Atomicity of updates:
A failure such as a power cut in the middle of an update could leave the database with partial
changes. A database system ensures that an update either completes entirely or has no effect,
which prevents this kind of data loss.
Concurrent access:
Multiple users can access the database at the same time.
Security problems:
A database system restricts access to the data, so the data is less vulnerable.
Supports multiple views of data:
A database system can present different views of the data to different users according to their
needs. Only database administrators can have a complete view of the database; end users should
not be given the view that developers have.
Views of data:
The view of data in a DBMS describes how the data is visualized at each level of data abstraction.
Data abstraction allows developers to keep complex data structures away from the users. The
developers achieve this by hiding the complex data structures behind levels of abstraction.
Data Abstraction
Data abstraction is the hiding of complex data structures in order to simplify the user’s interface
to the system.
Data abstraction is achieved through a three-schema architecture, which abstracts the
database at three levels.
Three-Schema Architecture:
The three-schema architecture defines the view of data at three levels:
1. Physical level (internal level)
2. Logical level (conceptual level)
3. View level (external level)
1. Physical Level/ Internal Level
The physical, or internal, level schema describes how the data is stored on the hardware and how
it can be accessed. The physical level is the lowest level of data abstraction and deals with
complex data structures. Only the database administrator operates at this level.
2. Logical Level/ Conceptual Level
This is the level above the physical level. Here the data is described in terms of entity sets,
entities and their data types, the relationships among the entity sets, the user operations
performed to retrieve or modify the data, and certain constraints on the data. Adding constraints
to the view of data also adds security, because users are restricted from accessing particular
parts of the database.
Developers and database administrators operate at the logical or conceptual level.
3. View Level/ User level/ External level
This is the highest level of data abstraction and exhibits only a part of the whole database: the
data in which the user is interested. The view level can describe many views of the same data.
Here, the user retrieves information from the database using different applications.
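The relationship between the logical level and the view level can be sketched with an SQL view. This is a minimal illustration, assuming Python's built-in sqlite3 module; the employee table, its columns, and the employee_public view are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full table definition (hypothetical columns).
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ravi", 12000), (2, "Ajit", 9000)],
)

# View level: an external view that exposes only part of the data (no salary).
conn.execute("CREATE VIEW employee_public AS SELECT emp_id, emp_name FROM employee")

cursor = conn.execute("SELECT * FROM employee_public ORDER BY emp_id")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

A user who is given only employee_public sees names and ids, while the salary column stays hidden at the logical level.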
Data Independence:
Data independence defines the extent to which the data schema can be changed at one level without
modifying the data schema at the next higher level. Data independence is classified into two types,
logical and physical.
Logical Data Independence:
Logical data independence describes the degree to which the logical or conceptual schema can be
changed without modifying the external schema. Why would the data schema need to change at the
logical or conceptual level?
Changes to the data schema at the logical level are made either to enlarge or to reduce the database,
by adding or deleting entities or entity sets, or by changing the constraints on the data.
Physical Data Independence:
Physical data independence defines the extent to which the data schema can be changed at the
physical or internal level without modifying the data schema at the logical and view levels.
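Both kinds of data independence can be observed in a small experiment. The sketch below assumes Python's built-in sqlite3 module; the student table, the student_names view, and the index name are hypothetical. It changes the logical schema (a new column) and the physical schema (a new index) and checks that the external view returns the same rows each time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, grade TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ravi", "A"), (2, "Ajit", "B")],
)
# External schema: a view used by some application.
conn.execute("CREATE VIEW student_names AS SELECT roll_no, name FROM student")


def read_view():
    """Return what an application using the external view would see."""
    return conn.execute(
        "SELECT roll_no, name FROM student_names ORDER BY roll_no"
    ).fetchall()


before = read_view()

# Logical-level change: the conceptual schema gains a column.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_logical_change = read_view()

# Physical-level change: a new access path (index) is added.
conn.execute("CREATE INDEX idx_student_name ON student (name)")
after_physical_change = read_view()

print(before == after_logical_change == after_physical_change)
```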
Instances and Schemas
What is an instance?
An instance is the information stored in the database at a particular point in time.
What is a schema?
The overall design of the database, which changes rarely if at all, is called the database schema.
Database Languages:
Data Definition Language (DDL) is a set of special commands that allows us to define and modify
the structure and the metadata of the database. These commands can be used to create, modify, and
delete database structures such as schemas, tables, indexes, etc.
Since DDL commands can alter the structure of the whole database, and every change made
by a DDL command is auto-committed (the change is saved permanently in the database), these
commands are normally not used by an end user (someone who accesses the database via an
application).
Some of the DDL commands are:
CREATE:
It is used to create the database or its schema objects.
MySQL Syntax -
To create a new database:
 CREATE DATABASE database_name;
To create a new table:
 CREATE TABLE table_name (
 column_1 DATATYPE,
 column_2 DATATYPE,
 column_n DATATYPE );

DROP:
It is used to delete the database or its schema objects.
MySQL Syntax -
To delete an object (general form):
 DROP object_type object_name;
To delete an existing table:
 DROP TABLE table_name;
To delete the whole database:
 DROP DATABASE database_name;

ALTER:
It is used to modify the structure of the database objects.
MySQL Syntax -
To add new column(s) in a table:
 ALTER TABLE table_name ADD (
  column_1 DATATYPE,
  column_2 DATATYPE,
  column_n DATATYPE );

To change the datatype of a column in a table:
 ALTER TABLE table_name
 MODIFY column_name DATATYPE;
To remove a column from a table:
 ALTER TABLE table_name
 DROP COLUMN column_name;
TRUNCATE:
It is used to remove the whole content of the table along with the deallocation of the space occupied
by the data, without affecting the table's structure.
MySQL Syntax -
To remove data present inside a table:
 TRUNCATE TABLE table_name;
NOTE - We can also use the DROP command to delete the complete table, but the DROP
command will remove the structure along with the contents of the table. Hence, if you want to
remove the data present inside a table but not the table itself, you can use TRUNCATE instead of
DROP.
RENAME:
It is used to change the name of an existing table or a database object.
MySQL Syntax -
To rename a table:
 RENAME TABLE old_table_name TO new_table_name;
To rename a column of a table:
 ALTER TABLE table_name
 RENAME COLUMN old_column_name TO new_column_name;
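
The DDL commands above can be tried end to end in a small script. This is a sketch only: it assumes Python's built-in sqlite3 module rather than MySQL, so the rename uses SQLite's ALTER TABLE ... RENAME TO form instead of MySQL's RENAME TABLE, and TRUNCATE and MODIFY are left out because SQLite does not provide them. The book table and the two helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names():
    """List the tables currently defined in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def column_names(table):
    """List the column names of a table (second field of PRAGMA table_info)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# CREATE: define a new table.
conn.execute("CREATE TABLE book (book_id INTEGER, title TEXT)")

# ALTER: add a column to the existing table.
conn.execute("ALTER TABLE book ADD COLUMN author TEXT")
columns_after_alter = column_names("book")

# RENAME: change the table name (SQLite syntax, see the note above).
conn.execute("ALTER TABLE book RENAME TO library_book")
tables_after_rename = table_names()

# DROP: remove the table together with its structure.
conn.execute("DROP TABLE library_book")
tables_after_drop = table_names()

print(columns_after_alter, tables_after_rename, tables_after_drop)
```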

Data Manipulation Language (DML)
Data Manipulation Language (DML) is a set of special commands that allows us to access and
manipulate data stored in existing schema objects. These commands are used to perform
operations such as insertion, deletion, updating, and retrieval of data from the database.
These commands deal with user requests, as they are responsible for all types of data
modification. The DML commands that deal with the retrieval of data are known as Data Query
Language.
NOTE: DML commands are not auto-committed in the way DDL commands are, i.e., changes made with
these commands inside a transaction can be rolled back until the transaction is committed. (MySQL
enables autocommit by default, so an explicit transaction is needed there to roll changes back.)
Some of the DML commands are:
INSERT:
It is used to insert new rows into the table.
MySQL Syntax -
To insert values according to the table structure:
 INSERT INTO table_name
 VALUES (value1, value2, value3);
To insert values based on the columns:
 INSERT INTO table_name (column1, column2, column3)
 VALUES (value1, value2, value3);
Note: While using the INSERT command, make sure that the datatype of each inserted value matches
the datatype of its column; a mismatch may lead to errors or unexpected results in the database.
UPDATE:
It is used to update existing column(s)/value(s) within a table.
MySQL Syntax -
To update the columns of a table based on a condition (general UPDATE statement):
 UPDATE table_name
 SET column_1 = value1,
  column_2 = value2,
  column_3 = value3
 [WHERE condition];

Note: Here, the SET clause assigns new values to the listed columns, and the WHERE clause
selects the rows of the table for which the columns are updated.
DELETE:
It is used to delete existing records from a table, i.e., it is used to remove one or more rows from a
table.
MySQL Syntax -
To delete rows from a table based on a condition:
 DELETE FROM table_name [WHERE condition];

Note: The DELETE statement only removes the data from the table, whereas the TRUNCATE
statement also frees the space occupied by the data. Hence, TRUNCATE is more efficient
for removing all the data from a table.
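The DML commands, and the fact that their effects can be rolled back inside a transaction, can be sketched as follows. The example assumes Python's built-in sqlite3 module in place of MySQL; the account table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (acc_no INTEGER PRIMARY KEY, holder TEXT, balance INTEGER)"
)

# INSERT: add rows, then make them permanent with a commit.
conn.execute("INSERT INTO account (acc_no, holder, balance) VALUES (1, 'Ravi', 500)")
conn.execute("INSERT INTO account (acc_no, holder, balance) VALUES (2, 'Ajit', 800)")
conn.commit()

# UPDATE and DELETE: modify data inside a new, still uncommitted transaction.
conn.execute("UPDATE account SET balance = balance + 100 WHERE acc_no = 1")
conn.execute("DELETE FROM account WHERE acc_no = 2")
rows_before_rollback = conn.execute(
    "SELECT acc_no, balance FROM account ORDER BY acc_no"
).fetchall()

# ROLLBACK: the uncommitted UPDATE and DELETE are undone.
conn.rollback()
rows_after_rollback = conn.execute(
    "SELECT acc_no, balance FROM account ORDER BY acc_no"
).fetchall()

print(rows_before_rollback, rows_after_rollback)
```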
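MERGE:
It is a combination of the INSERT, UPDATE, and DELETE statements.
It is used to merge data from a source table or query result with a target table based on a specified
condition.
Syntax -
MERGE is part of standard SQL and is available in systems such as Oracle and SQL Server. MySQL
does not provide a MERGE statement; INSERT ... ON DUPLICATE KEY UPDATE covers a similar case there.
To merge a source table into a target table:
 MERGE INTO target_table_name
 USING source_table_name
 ON <condition>
 WHEN MATCHED THEN
  UPDATE <statements>
 WHEN NOT MATCHED THEN
  INSERT <statements>;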
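
The matched / not-matched logic of MERGE can be sketched with two ordinary statements. This is an emulation written for illustration, not a MERGE implementation: it assumes Python's built-in sqlite3 module, and the target and source tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, price INTEGER)")
conn.execute("CREATE TABLE source (id INTEGER PRIMARY KEY, price INTEGER)")
conn.executemany("INSERT INTO target VALUES (?, ?)", [(1, 10), (2, 20)])
conn.executemany("INSERT INTO source VALUES (?, ?)", [(2, 25), (3, 30)])

# WHEN MATCHED THEN UPDATE: rows whose id exists in both tables take the source price.
conn.execute(
    "UPDATE target "
    "SET price = (SELECT s.price FROM source AS s WHERE s.id = target.id) "
    "WHERE id IN (SELECT id FROM source)"
)

# WHEN NOT MATCHED THEN INSERT: source rows missing from the target are added.
conn.execute(
    "INSERT INTO target (id, price) "
    "SELECT s.id, s.price FROM source AS s "
    "WHERE s.id NOT IN (SELECT id FROM target)"
)
conn.commit()

merged_rows = conn.execute("SELECT id, price FROM target ORDER BY id").fetchall()
print(merged_rows)
```

Row 1 is untouched, row 2 is updated from the source, and row 3 is inserted.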

Introduction to Database design:
Database design can be defined as the collection of tasks or processes that support the design,
development, implementation, and maintenance of an enterprise data management system. A properly
designed database reduces maintenance cost, improves data consistency, and makes cost-effective
use of disk storage space. The designer should follow the constraints of the application and decide
how the data elements correlate and what kind of data must be stored.
1. A database design provides the blueprint for how data is going to be stored in a system. The
design of a database strongly affects the overall performance of any application built on it.
2. The design principles defined for a database give a clear idea of the behavior of an application and of how
requests are processed.
3. A proper database design meets all the requirements of its users.
4. Lastly, the processing time of an application is greatly reduced if the constraints for designing a highly efficient
database are properly implemented.

Requirement Analysis:
First, the basic requirements of the project, on which the design of the database will be based, have to be
planned. The stages are as follows:
Planning - This stage is concerned with planning the entire DDLC (Database Development Life Cycle). Strategic
considerations are taken into account before proceeding.
System definition - This stage defines the boundaries and scope of the database system after planning.
Database Designing - The next step involves designing the database according to the user requirements
and splitting the design into separate models, so that no single aspect carries a heavy load or heavy dependencies.
This is a model-centric approach in which the logical and physical models play a crucial role.
Logical Model - This stage is primarily concerned with developing a model based on the proposed requirements.
The entire model is designed on paper, without any implementation and without DBMS-specific considerations.
Physical Model - The physical model is concerned with the practices and implementation of the logical model.

Implementation:
The last step covers implementing the database and checking that its behavior matches the requirements. This
is ensured by continuous integration testing of the database with different data sets and by converting the data
into a machine-understandable form. Data manipulation is the main focus of this step: queries are run to check
whether the application has been designed satisfactorily.
Data conversion and loading - This stage imports and converts data from the old system to the new one.
Testing - This stage is concerned with identifying errors in the newly implemented system. Testing is a crucial
step because it checks the database directly against the requirement specifications.
Database Design Process:
Database Design:
Database design is the organization of data according to a database model. Properly designed databases are easy
to maintain and improve data consistency.
The database design process can be divided into six steps. The ER model (Entity Relationship model) is most
relevant to the first three steps.
1. Requirement analysis
2. Conceptual database design
3. Logical database design
4. Schema refinement
5. Physical database design
6. Application and security design
1. Requirement analysis
It is necessary to understand what data needs to be stored in the database, what applications must be
built, and which operations are used most frequently by the system.
Requirement analysis is an informal process that requires proper communication with user groups.
There are several methods for organizing and presenting the information gathered in this step. Some automated
tools can also be used for this purpose.
2. Conceptual database design
The information gathered is used to develop a high-level description of the data to be stored in the database.
This is the step in which the E-R model, i.e. the Entity Relationship model, is built.
The goal of this design is to create a simple description of the data that matches the requirements of the users.
3. Logical database design
This is the step in which the ER model is converted into a relational database schema, sometimes called the logical
schema in the relational data model.
4. Schema refinement
In this step, the relational database schema is analyzed to identify potential problems and to refine it.
Schema refinement can be done by normalizing and restructuring the relations.

5. Physical database design
The design of the database is refined further.
This step may simply involve building indexes on tables and clustering tables, or it may involve redesigning parts
of the database schema obtained from the earlier design steps.
6. Application and security design
Design methodologies such as UML (Unified Modeling Language) can be used to address the complete software
design around the database.
The role of each entity in every process must be reflected in the application task.
For each role, there must be provision for allowing access to some parts of the database and prohibiting access to others.
Thus some access rules must be enforced on the application (which is accessing the database) to protect
the security features.
Data storage and querying:
A Database Management System is a collection of interrelated data together with a set of programs that
manage, control, and access that data. Through a DBMS, users can manage the data in a database
efficiently, which increases accessibility and productivity. Examples include an employee record
system, or a telephone book in which all the different contacts are saved in a single place.
A database system is therefore a software system that holds an organized collection of structured
information stored in a computer system. It enables the user to define, create, maintain, and control
access to the database.
Uses of DBMS:
• Increasing productivity through real-time data.
• Reducing data redundancy and inconsistency.
• Enhancing data integrity.
• Retrieving data.
• Data security.
• Data indexing.
The database system is further divided into two components:
• Data Storage Manager
• Query Manager
Data Storage Manager:
The Data Storage Manager, also known as the “Database Control System”, is a program that
provides an interface between the data stored in the database and the queries received. It helps
maintain the integrity and consistency of the database by applying constraints.
The Storage Manager is in charge of interactions with the File Manager: raw data is stored on disk
with the help of the file system, and the Storage Manager translates DML statements into low-level
file-system commands. Its components are:
• Authorization and Integrity Manager: ensures that the integrity constraints are satisfied and
checks the authority of users to access information.
• Transaction Manager: ensures that the database remains in a consistent state even after
system failures.
• File Manager: manages the allocation of space on disk storage.
• Buffer Manager: fetches data from disk storage into main memory.

Query Processing in DBMS:
Query processing is the set of activities involved in extracting data from the database. Fetching
data from the database takes several steps:
1. Parsing and translation
2. Optimization
3. Evaluation
Query processing works in the following way:
Parsing and Translation:
A user writes a query in a high-level database language such as SQL. SQL is well suited to humans,
but it is not well suited to the system's internal representation of a query. Before a query can be
processed, the system must therefore translate it into an internal form that can be used at the
physical level of the file system; relational algebra is well suited for this internal representation.
Only after this translation do the query-optimizing transformations and the actual evaluation of the
query take place.
The translation process is similar to the work performed by the parser of a compiler. When a user
executes a query, the parser checks the syntax of the query and verifies that the relation names and
attribute names appearing in it exist in the database. The parser then builds a tree of the query,
known as the 'parse tree', and translates it into a relational algebra expression. During this step,
every use of a view in the query is replaced by the expression that defines the view.

The working of query processing can be understood through the following example.

Suppose a user wants to fetch the names of the employees whose salary is greater than 10000.
In SQL, the query is:
select emp_name from Employee where salary > 10000;
To make the system understand the user query, it is translated into relational algebra. This query
can be written in relational algebra in more than one equivalent way, for example:
π_emp_name (σ_salary>10000 (Employee))
π_emp_name (σ_salary>10000 (π_emp_name,salary (Employee)))
After translating the query, the system can execute each relational algebra operation using one of
several different algorithms. This is how query processing begins its work.
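The selection (σ) and projection (π) operators used in this translation can be sketched as plain functions over a list of records. This is an illustration only: the employee rows, the function names select and project, and the two plans are hypothetical, and a real DBMS works on stored relations rather than Python lists.

```python
# Hypothetical Employee relation: each tuple is a dict of attribute -> value.
employee = [
    {"emp_name": "Ravi", "salary": 12000},
    {"emp_name": "Ajit", "salary": 9000},
    {"emp_name": "Mira", "salary": 15000},
]


def select(relation, predicate):
    """Selection (sigma): keep only the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, attributes):
    """Projection (pi): keep only the named attributes and drop duplicate tuples."""
    result = []
    for t in relation:
        reduced = {a: t[a] for a in attributes}
        if reduced not in result:
            result.append(reduced)
    return result


def high_salary(t):
    return t["salary"] > 10000


# Plan A: select first, then project the name.
plan_a = project(select(employee, high_salary), ["emp_name"])

# Plan B: project the needed attributes early, then select, then project the name.
plan_b = project(
    select(project(employee, ["emp_name", "salary"]), high_salary),
    ["emp_name"],
)

print(plan_a == plan_b, plan_a)
```

Both plans return the same tuples, which is what allows the system to choose whichever equivalent expression is cheaper to evaluate.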
Evaluation
In addition to translating the query into relational algebra, the system must annotate the translated
expression with instructions that specify how each operation is to be evaluated.
After translating the user query, the system then executes a query evaluation plan.
Query Evaluation Plan
• In order to fully evaluate a query, the system needs to construct a query evaluation plan.
• The annotations in the evaluation plan may name the algorithm to be used for a specific operation
or the particular index to use.
• A relational algebra operation annotated in this way is referred to as an evaluation primitive. The
evaluation primitives carry the instructions needed for the evaluation of the operation.
• Thus, a query evaluation plan defines a sequence of primitive operations used for evaluating a query.
The query evaluation plan is also referred to as the query execution plan.
• A query execution engine is responsible for generating the output of the given query. It takes the
query execution plan, executes it, and returns the output for the user query.
Optimization
• The cost of query evaluation can vary for different types of queries. Because the system is
responsible for constructing the evaluation plan, the user does not need to write the query
efficiently.
• Usually, a database system generates an efficient query evaluation plan that minimizes cost.
This task is performed by the database system and is known as Query Optimization.
• To optimize a query, the query optimizer needs an estimated cost for each operation, because
the overall cost depends on the memory allocated to the operations, their execution costs,
and so on.
Finally, after selecting an evaluation plan, the system evaluates the query and produces the output of
the query.
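The effect of the optimizer choosing a plan can be observed directly in many systems. The sketch below assumes Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement; the employee table, the index name, and the plan_details helper are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept TEXT)"
)
conn.executemany(
    "INSERT INTO employee (emp_name, dept) VALUES (?, ?)",
    [("Ravi", "HR"), ("Ajit", "IT"), ("Mira", "IT")],
)

query = "SELECT emp_name FROM employee WHERE dept = 'IT'"


def plan_details(sql):
    """Return the plan description (last column of each EXPLAIN QUERY PLAN row)."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


# Without an index, the only available plan is a scan of the whole table.
plan_without_index = plan_details(query)
rows_without_index = sorted(conn.execute(query).fetchall())

# After an index is created, the optimizer can pick an index search instead.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")
plan_with_index = plan_details(query)
rows_with_index = sorted(conn.execute(query).fetchall())

print(plan_without_index)
print(plan_with_index)
```

The query text is identical in both runs; only the evaluation plan chosen by the system changes, and the result stays the same.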
Transaction Management
A transaction is a set of operations that performs a single logical unit of work; it bundles all the
instructions of one logical operation. A transaction usually means that the data in the database has
changed. One of the major uses of a DBMS is to protect the user’s data from system failures, which it
does by ensuring that all data is restored to a consistent state when the computer is restarted after a crash.
A transaction is any one execution of a user program in a DBMS. One of its important properties
is that it contains a finite number of steps. Executing the same program multiple
times generates multiple transactions.
Example: Consider the following sequence of operations performed to withdraw cash
from an ATM.
Steps for ATM Transaction
1. Transaction Start.
2. Insert your ATM card.
3. Select a language for your transaction.
4. Select the Savings Account option.
5. Enter the amount you want to withdraw.
6. Enter your secret pin.
7. Wait for some time for processing.
8. Collect your Cash.
9. Transaction Completed.
A transaction can include the following basic database access operations:
• Read/Access data (R): reads a database item from disk (where the database stores its data) into a
memory variable.
• Write/Change data (W): writes the data item from the memory variable back to the database.
• Commit: a transaction control command that permanently saves the changes made in a
transaction.
Example: Transfer of ₹50 from Account A to Account B. Initially A = ₹500 and B = ₹800. This data is
brought from the hard disk into RAM.
R(A) -- 500 // Read from RAM.
A = A - 50 // Deduct ₹50 from A.
W(A) -- 450 // Updated in RAM.
R(B) -- 800 // Read from RAM.
B = B + 50 // Add ₹50 to B's account.
W(B) -- 850 // Updated in RAM.
commit // The data in RAM is written back to the hard disk.

Stages of Transaction
Note: The updated value of Account A = ₹450 and Account B = ₹850.
All instructions before the commit belong to the partially committed state, and their results are held
in RAM. When the commit is executed, the changes are accepted in full and stored on the hard disk.
If the transaction fails anywhere before the commit, its changes are undone and it has to start again
from the beginning. It cannot continue from the point of failure. This is known as Roll Back.
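The transfer above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name account, its columns and the helper function transfer are assumptions, not part of the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 800)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction (hypothetical helper)."""
    try:
        # R(A), A = A - amount, W(A)
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        # R(B), B = B + amount, W(B)
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.commit()  # commit: both writes become permanent together
    except sqlite3.Error:
        conn.rollback()  # Roll Back: undo any partial change
        raise


transfer(conn, "A", "B", 50)
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # {'A': 450, 'B': 850}
```

The two updates sit between an implicit transaction start and the commit, so a failure in either one leaves both balances unchanged.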
Desirable Properties of Transaction (ACID Properties)
For a transaction to be performed in DBMS, it must possess several properties often called ACID
properties.
 A – Atomicity
 C – Consistency
 I – Isolation
 D – Durability
Transaction States
Transactions can be implemented using SQL queries and Servers. In the below-given diagram, you
can see how transaction states work.

Transaction States
A transaction has four properties. They are used to maintain consistency in the database before
and after the transaction.
Properties of a Transaction:
1. Atomicity
2. Consistency
3. Isolation
4. Durability
Atomicity:
 It states that all operations of the transaction take place as a single unit. If they cannot, the
transaction is aborted.
 There is no midway, i.e., the transaction cannot occur partially. Each transaction is treated as one
unit and either runs to completion or is not executed at all.
 Atomicity involves the following two operations:
 Abort: If a transaction aborts, none of the changes it made are visible.
 Commit: If a transaction commits, all the changes it made are visible.
Consistency:
 The integrity constraints are maintained so that the database is consistent before and after the
transaction.
 The execution of a transaction leaves the database in either its prior stable state or a new stable
state.
 The consistency property states that every transaction sees a consistent database instance.
 A transaction transforms the database from one consistent state to another consistent state.
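A small sketch of atomicity and consistency working together, again using Python's sqlite3 module. The account table and the rule that a balance may not go negative are assumptions chosen for illustration. The second update violates the integrity constraint, so the whole transaction is rolled back and the database stays in its prior consistent state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Integrity constraint (assumed for this example): a balance may never be negative.
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 800)])
conn.commit()

try:
    # Credit B first, then try to debit more than A holds.
    conn.execute("UPDATE account SET balance = balance + 900 WHERE name = 'B'")
    conn.execute("UPDATE account SET balance = balance - 900 WHERE name = 'A'")
    conn.commit()
except sqlite3.IntegrityError:
    # The debit violates the CHECK constraint, so the credit is undone as well.
    conn.rollback()

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # {'A': 500, 'B': 800}
```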
Isolation:
 It states that the data used during the execution of one transaction cannot be used by a second
transaction until the first one is completed.
 In isolation, if the transaction T1 is being executed and is using the data item X, then that data
item can't be accessed by any other transaction T2 until the transaction T1 ends.
 The concurrency control subsystem of the DBMS enforces the isolation property.
Durability:
 The durability property indicates the permanence of the database's consistent state. It states that
the changes made by a committed transaction are permanent.
 They cannot be lost through the erroneous operation of a faulty transaction or through a system
failure. When a transaction is completed, the database reaches a state known as the consistent state.
That consistent state cannot be lost, even in the event of a system failure.
 The recovery subsystem of the DBMS is responsible for the durability property.
DBMS Architecture
 The design of a DBMS depends on its architecture. The basic client/server architecture is used to
deal with a large number of PCs, web servers, database servers and other components that are connected
by networks.
 The client/server architecture consists of many PCs and workstations that are connected via the
network.
 DBMS architecture depends on how users are connected to the database to get their requests done.

Types of DBMS Architecture


Database architecture can be seen as single-tier or multi-tier. Logically, however, multi-tier database
architecture is of two types: 2-tier architecture and 3-tier architecture.
1-Tier Architecture:
 In this architecture, the database is directly available to the user. It means the user works
directly on the DBMS.
 Any changes made here are applied directly to the database itself. It doesn't provide a handy tool
for end users.
 The 1-Tier architecture is used for the development of local applications, where programmers can
communicate directly with the database for a quick response.
2-Tier Architecture:
 The 2-Tier architecture is the same as the basic client-server architecture. In the two-tier
architecture, applications on the client end can communicate directly with the database at the server
side. For this interaction, APIs such as ODBC and JDBC are used.
 The user interfaces and application programs run on the client side.
 The server side is responsible for providing functionality such as query processing and transaction
management.
 To communicate with the DBMS, the client-side application establishes a connection with the server
side.

Fig: 2-tier Architecture


3-Tier Architecture:
 The 3-Tier architecture contains another layer between the client and the server. In this
architecture, the client can't communicate directly with the server.
 The application on the client end interacts with an application server, which in turn communicates
with the database system.
 The end user has no idea about the existence of the database beyond the application server. The
database also has no idea about any user beyond the application.
 The 3-Tier architecture is used for large web applications.

Fig: 3-tier Architecture
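The layering can be sketched in a few lines of Python. This is a toy, single-process illustration only: in a real deployment each tier would run on its own machine and communicate over the network. The class names DataTier and ApplicationTier, the function client and the student table are hypothetical.

```python
import sqlite3


class DataTier:
    """Database server tier: the only layer that talks to the DBMS."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)"
        )
        self.conn.execute("INSERT INTO student VALUES (1, 'Ravi')")
        self.conn.commit()

    def find_name(self, roll_no):
        row = self.conn.execute(
            "SELECT name FROM student WHERE roll_no = ?", (roll_no,)
        ).fetchone()
        return row[0] if row else None


class ApplicationTier:
    """Application server tier: validates requests, then calls the data tier."""

    def __init__(self, data_tier):
        self._data = data_tier

    def student_name(self, roll_no):
        if not isinstance(roll_no, int) or roll_no <= 0:
            raise ValueError("roll_no must be a positive integer")
        return self._data.find_name(roll_no) or "not found"


def client(app, roll_no):
    """Client tier: sees only the application server, never the database."""
    return f"Student {roll_no}: {app.student_name(roll_no)}"


app = ApplicationTier(DataTier())
message = client(app, 1)
print(message)  # Student 1: Ravi
```

Because the client only ever calls the application tier, invalid requests are rejected before they reach the database.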


Advantages of 3-Tier Architecture
 Enhanced scalability: Scalability is enhanced by the distributed deployment of application
servers. Individual connections no longer need to be made between each client and the server.
 Data Integrity: 3-Tier Architecture maintains data integrity. Since there is a middle layer
between the client and the server, data corruption can be avoided or removed.
 Security: 3-Tier Architecture improves security. This type of model prevents direct interaction
of the client with the server, thereby reducing unauthorized access to data.
Data mining and Information Retrieval:
Information Retrieval:
Information retrieval deals with the retrieval of information from a large number of text-based
documents. Some features of database systems are usually absent from information retrieval systems,
because the two handle different kinds of data. Examples of information retrieval systems include −
 Online Library catalogue system
 Online Document Management Systems
 Web Search Systems etc.

Note − The main problem in an information retrieval system is to locate relevant documents in a
document collection based on a user's query. This kind of user's query consists of some keywords
describing an information need.
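A minimal sketch of keyword-based retrieval is shown below. It builds an inverted index (keyword to document ids) over a tiny made-up document collection and returns the documents that contain every keyword of the query. Real information retrieval systems also rank results by relevance, which this sketch does not attempt.

```python
from collections import defaultdict

# A tiny, made-up document collection (illustrative data only).
documents = {
    1: "database systems store structured data",
    2: "web search systems retrieve text documents",
    3: "library catalogue of database textbooks",
}

# Inverted index: keyword -> set of ids of the documents that contain it.
index = defaultdict(set)
for doc_id, text in documents.items():
    for word in text.lower().split():
        index[word].add(doc_id)


def search(query):
    """Return the ids of documents that contain every keyword in the query."""
    keywords = query.lower().split()
    if not keywords:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in keywords))
    return sorted(matches)


print(search("database"))        # [1, 3]
print(search("search systems"))  # [2]
```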
Data Mining:
There is a huge amount of data available in the Information Industry. This data is of no use until it
is converted into useful information. It is necessary to analyze this huge amount of data and extract
useful information from it.
Extraction of information is not the only process we need to perform; data mining also involves
other processes such as Data Cleaning, Data Integration, Data Transformation, Data Mining, Pattern
Evaluation and Data Presentation. Once all these processes are over, we would be able to use this
information in many applications such as Fraud Detection, Market Analysis, Production Control,
Science Exploration, etc.
What is Data Mining...?
Data Mining is defined as extracting information from huge sets of data. In other words, we can say
that data mining is the procedure of mining knowledge from data. The information or knowledge
so extracted can be used for any of the following applications −
 Market Analysis

 Fraud Detection

 Customer Retention

 Production Control

 Science Exploration

Data Mining Applications:


Data mining is highly useful in the following domains −
 Market Analysis and Management

 Corporate Analysis & Risk Management

 Fraud Detection

Apart from these, data mining can also be used in the areas of production control, customer retention,
science exploration, sports, astrology, and Internet Web Surf-Aid
Market Analysis and Management:
Listed below are the various fields of the market where data mining is used −
 Customer Profiling − Data mining helps determine what kind of people buy what kind of
products.
 Identifying Customer Requirements − Data mining helps in identifying the best products for
different customers. It uses prediction to find the factors that may attract new customers.
 Cross Market Analysis − Data mining finds associations/correlations between product sales.
 Target Marketing − Data mining helps to find clusters of model customers who share the same
characteristics such as interests, spending habits, income, etc.
 Determining Customer Purchasing Patterns − Data mining helps in determining customer
purchasing patterns.
 Providing Summary Information − Data mining provides various multidimensional summary
reports.
Corporate Analysis and Risk Management:


Data mining is used in the following fields of the corporate sector −
 Finance Planning and Asset Evaluation − It involves cash flow analysis and prediction, and
contingent claim analysis to evaluate assets.
 Resource Planning − It involves summarizing and comparing resources and spending.
 Competition − It involves monitoring competitors and market directions.
Fraud Detection:
Data mining is also used in the fields of credit card services and telecommunication to detect fraud.
For fraudulent telephone calls, it helps to find the destination of the call, the duration of the call,
the time of day or week, etc. It also analyzes patterns that deviate from expected norms.
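One simple way to find values that deviate from expected norms is to flag anything that lies far from the mean, measured in standard deviations. The sketch below does this for a made-up list of call durations. The data, the function name flag_outliers and the threshold of 2 standard deviations are all assumptions for illustration, and real fraud detection uses far richer models.

```python
from statistics import mean, stdev

# Made-up call durations in minutes for one subscriber (illustrative data).
call_minutes = [3, 4, 2, 5, 3, 4, 3, 58, 4, 2]


def flag_outliers(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


suspicious = flag_outliers(call_minutes)
print(suspicious)  # [58]
```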

Specialty Databases:
A collection of focused information on one or more specific fields of study is referred to as Specialty
Databases. The information is stored in such a way that the user can locate and retrieve it quickly and
easily.
Why Use Specialty Databases:
Guaranteed authoritative information: A search of a specialty database returns articles that are
accurate and reliable.
Provide full-text access: Full-text links are available to anyone within the search results.
There are various types of Specialty Databases:
RDBMS: An RDBMS is a Relational Database Management System based on the relational model of
data. It follows a table structure and is simple to use and easy to understand. It supports Structured
Query Language (SQL). An RDBMS is used for traditional application tasks such as data administration
and data processing. MS SQL Server, MySQL, SQLite and MariaDB are examples of RDBMSs.
OODBMS: An Object-Oriented Database Management System adds DBMS functionality to a programming
language. Integration with host language: seamless integration with C++/Smalltalk. Query language:
query processing is relatively disorganized.
ORDBMS: An ORDBMS is an Object-Relational Database Management System based on both the
relational and the object-oriented database models. It adds new data types to an RDBMS. Integration
with host language: integration happens only through embedded SQL in a host language. Query
language: SQL-based standards are available.
Object-Based Data Models: As the name suggests, an Object-Based Data Model is built on
object-oriented programming, which associates methods (procedures) with objects that can benefit
from class hierarchies. Objects are levels of abstraction that include properties and actions. This
type of data model focuses on how to express data. The data is divided into units, each of which has
some defining properties. The object-oriented data model also supports a rich type system, including
structured and collection types. Examples of Object-Based Data Models:
 ER (Entity Relationship) Data Model
 Semantic Data Model
 Functional Data Model
Semi-Structured Data Models: These data models were planned as an evolution of the relational data
model. A semi-structured model is a database model in which there is no separation between the data
and the schema. It allows data to be represented with a flexible structure. In this data model, items
can have different numbers of attributes, and one item may contain items with different structures.
The data values and the schema components coexist. Some characteristics of Semi-Structured Data
Models:
 One can change the schema easily.
 It gives a workable format to exchange data between different types of databases.
 The data transfer format can be portable.
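JSON is one widely used semi-structured format, and the sketch below uses Python's json module to illustrate the point. The two records are made up: each item carries its own attribute names, the items have different numbers of attributes, and one of them contains a nested item.

```python
import json

# Two made-up records. The attribute names travel with the data,
# so no separate schema has to be declared first.
raw = """[
  {"name": "Ravi", "age": 25},
  {"name": "Ajit", "emails": ["ajit@example.com"], "address": {"city": "Pune"}}
]"""

items = json.loads(raw)

# Each item may expose a different set of attributes.
attribute_sets = [sorted(item) for item in items]
print(attribute_sets)  # [['age', 'name'], ['address', 'emails', 'name']]

# Adding a new attribute to one item needs no schema change.
items[0]["phone"] = "000-0000"
```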
Some of the most important Specialty Databases are as follows:
PubMed: It has over 29 million references. Its sources include MEDLINE (the National
Library of Medicine's journal citation database), PubMed Central (PMC, a free archive of biomedical
and life sciences articles), and many more. As the sources indicate, this database is ideal for:
 School of Medicine
 School of Health Professions
 Graduate School of Biomedical Sciences
Embase: It has over 29 million records. Its sources include Emtree, international biomedical
literature, etc. As the sources indicate, this database is ideal for:
 Graduate School of Biomedical Sciences
 School of Pharmacy
 School of Medicine
Scopus: It has 49 million records, including sources from the life sciences, physical/health sciences
and social sciences.
CINAHL: It stands for Cumulative Index to Nursing and Allied Health Literature and is pronounced
“sin-all”. It includes sources from nursing, etc.
Applications of Specialty Databases:
 It compares RDBMS, OODBMS, and ORDBMS.
 ORDBMS is a relational DBMS that has its own certain extensions.
Database Users and DBA:
Database users are categorized based on their interaction with the database. There are seven types
of database users in a DBMS.
1. Database Administrator (DBA):
A Database Administrator (DBA) is a person or team who defines the schema and also controls the three
levels of the database.
The DBA creates a new account ID and password for any user who needs to access the database.
The DBA is also responsible for providing security to the database and allows only authorized users
to access or modify it.
 The DBA also monitors backup and recovery and provides technical support.
 The DBA has a DBA account in the DBMS, which is called a system or superuser account.
 The DBA repairs damage caused by hardware and/or software failures.
2. Naive / Parametric End Users:
Parametric End Users are unsophisticated users who don't have any DBMS knowledge but frequently
use database applications in their daily life to get the desired results.
For example, users of a railway ticket booking system are naive users. A clerk in a bank is a naive
user because he or she has no DBMS knowledge but still uses the database to perform the given
task.
3. System Analyst:
A System Analyst is a user who analyzes the requirements of parametric end users and checks whether
all the requirements of end users are satisfied.
4. Sophisticated Users:
Sophisticated users can be engineers, scientists or business analysts who are familiar with the database.
They can develop their own database applications according to their requirements. They don't write
program code, but they interact with the database by writing SQL queries directly through the query
processor.
5. Data Base Designers:
Data Base Designers are the users who design the structure of the database, which includes tables,
indexes, views, constraints, triggers and stored procedures. They control what data must be stored
and how the data items are to be related.
6. Application Programmers:
Application Programmers are the back-end programmers who write the code for the application programs.
They are computer professionals. These programs could be written in programming languages
such as Visual Basic, Developer, C, FORTRAN, COBOL etc.
7. Casual Users / Temporary Users:
Casual Users are the users who access the database only occasionally, but each time they access
it they require new information, for example, middle or higher level managers.
History of database systems:
Information processing drives the growth of computers, as it has from the earliest days of commercial
computers. In fact, automation of data processing tasks predates computers. Punched cards, invented
by Herman Hollerith, were used at the very beginning of the twentieth century to record U.S. census
data, and mechanical systems were used to process the cards and tabulate results. Punched cards were
later widely used as a means of entering data into computers. Techniques for data storage and
processing have evolved over the years:
• 1950s and early 1960s:
Magnetic tapes were developed for data storage. Data processing tasks such as payroll were automated,
with data stored on tapes. Processing of data consisted of reading data from one or more tapes and
writing data to a new tape. Data could also be input from punched card decks, and output to printers.
For example, salary raises were processed by entering the raises on punched cards and reading the
punched card deck in synchronization with a tape containing the master salary details. The records had
to be in the same sorted order. The salary raises would be added to the salary read from the master tape,
and written to a new tape; the new tape would become the new master tape.
Tapes (and card decks) could be read only sequentially, and data sizes were much larger than main
memory; thus, data processing programs were forced to process data in a particular order, by reading
and merging data from tapes and card decks.
• Late 1960s and 1970s:
Widespread use of hard disks in the late 1960s changed the scenario for data processing greatly, since
hard disks allowed direct access to data. The position of data on disk was immaterial, since any location
on disk could be accessed in just tens of milliseconds. Data were thus freed from the tyranny of
sequentiality. With disks, network and hierarchical databases could be created that allowed data
structures such as lists and trees to be stored on disk. Programmers could construct and manipulate
these data structures. A landmark paper by Codd [1970] defined the relational model and nonprocedural
ways of querying data in the relational model, and relational databases were born. The simplicity of
the relational model and the possibility of hiding implementation details completely from the
programmer were enticing indeed. Codd later won the prestigious Association of Computing
Machinery Turing Award for his work.
• 1980s:
Although academically interesting, the relational model was not used in practice initially, because of
its perceived performance disadvantages; relational databases could not match the performance of
existing network and hierarchical databases. That changed with System R, a groundbreaking project
at IBM Research that developed techniques for the construction of an efficient relational database
system. Excellent overviews of System R are provided by Astrahan et al. [1976] and Chamberlin et al.
[1981]. The fully functional System R prototype led to IBM’s first relational database product,
SQL/DS. At the same time, the Ingres system was being developed at the University of California at
Berkeley. It led to a commercial product of the same name. Initial commercial relational database
systems, such as IBM DB2, Oracle, Ingres, and DEC Rdb, played a major role in advancing techniques
for efficient processing of declarative queries. By the early 1980s, relational databases had
become competitive with network and hierarchical database systems even in the area of performance.
Relational databases were so easy to use that they eventually replaced network and hierarchical
databases; programmers using such databases were forced to deal with many low-level implementation
details, and had to code their queries in a procedural fashion. Most importantly, they had to keep
efficiency in mind when designing their programs, which involved a lot of effort. In contrast, in a
relational database, almost all these low-level tasks are carried out automatically by the database,
leaving the programmer free to work at a logical level. Since attaining dominance in the
1980s, the relational model has reigned supreme among data models. The 1980s also saw much
research on parallel and distributed databases, as well as initial work on object-oriented databases.
• Early 1990s:
The SQL language was designed primarily for decision support applications, which are query-
intensive, yet the mainstay of databases in the 1980s was transaction-processing applications, which
are update-intensive. Decision support and querying re-emerged as a major application area for
databases. Tools for analyzing large amounts of data saw large growths in usage. Many database
vendors introduced parallel database products in this period. Database vendors also began to add
object-relational support to their databases.
• 1990s:
The major event of the 1990s was the explosive growth of the World Wide Web. Databases were
deployed much more extensively than ever before. Database systems now had to support very high
transaction-processing rates, as well as very high reliability and 24 × 7 availability (availability 24
hours a day, 7 days a week, meaning no downtime for scheduled maintenance activities). Database
systems also had to support Web interfaces to data.
• 2000s:
The first half of the 2000s saw the emerging of XML and the associated query language XQuery as a
new database technology. Although XML is widely used for data exchange, as well as for storing
certain complex data types, relational databases still form the core of a vast majority of large-scale
database applications. In this time period we have also witnessed the growth in “autonomic-
computing/auto-admin” techniques for minimizing system administration effort. This period also saw
a significant growth in use of open-source database systems, particularly PostgreSQL and MySQL.
The latter part of the decade has seen growth in specialized databases for data analysis, in particular
column-stores, which in effect store each column of a table as a separate array, and highly parallel
database systems designed for analysis of very large data sets. Several novel distributed data-storage
systems have been built to handle the data management requirements of very large Web sites such as
Amazon, Facebook, Google, Microsoft and Yahoo!, and some of these are now offered as Web services
that can be used by application developers. There has also been substantial work on management and
analysis of streaming data, such as stock-market ticker data or computer
network monitoring data. Data-mining techniques are now widely deployed; example applications
include Web-based product-recommendation systems and automatic placement of relevant
advertisements on Web pages.
Entity Relation Model:
Entity relationship (ER) models are based on real-world entities and their relationships. It is easy
for developers to understand the system by simply looking at the ER diagram. ER models are
normally represented by ER diagrams.
Components:
An ER diagram basically has three components:
Entities − An entity is a real-world thing, which can be a person, place, or even a concept. For
Example: Department, Admin, Courses, Teachers, Students, Building, etc. are some of the entities of a
School Management System.
Attributes − An attribute is a real-world property of an entity. For Example: The
entity employee has properties like employee id, salary, age, etc.
Relationship − A relationship tells how two entities are related. For Example: Employee works for a
department.
An entity has real-world properties called attributes, and each attribute is defined by a set of
values called its domain.
Example 1
In a university,
 A student is an entity,
 University is the database,
 Name, age and sex are the attributes.
 The relationships among entities define the logical association between entities.
Example 2
Given below is another example of ER:
In the above example,
Entities − Employee and Department.
Attributes −
 Employee − Name, id, Age, Salary
 Department − Dept_id, Dept_name
The two entities are connected using the relationship. Here, each employee works for a department.
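One possible way to turn this ER example into relational tables is sketched below with Python's sqlite3 module. Each entity becomes a table, each attribute a column, and the "works for" relationship becomes a foreign key from employee to department. The column types and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

# Entity Department with attributes Dept_id and Dept_name.
conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL)"
)
# Entity Employee with attributes id, Name, Age, Salary.
# The "works for" relationship is stored as the dept_id foreign key.
conn.execute(
    """CREATE TABLE employee (
           id      INTEGER PRIMARY KEY,
           name    TEXT NOT NULL,
           age     INTEGER,
           salary  REAL,
           dept_id INTEGER NOT NULL REFERENCES department(dept_id)
       )"""
)

# Sample rows (made-up values).
conn.execute("INSERT INTO department VALUES (10, 'Chemistry')")
conn.execute("INSERT INTO employee VALUES (1, 'Tom', 30, 50000, 10)")

rows = conn.execute(
    "SELECT e.name, d.dept_name "
    "FROM employee e JOIN department d ON e.dept_id = d.dept_id"
).fetchall()
print(rows)  # [('Tom', 'Chemistry')]
```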
Features of ER
The features of the ER Model are as follows −
 Graphical Representation for Better Understanding − It is easy and simple to understand, so it can
be used by developers to communicate with stakeholders.
 ER Diagram − ER diagrams are used as a visual tool for representing the model.
 Database Design − This model helps the database designers to build the database.
Advantages
The advantages of ER are as follows −
 The ER model is easy to build.
 This model is widely used by database designers for communicating their ideas.
 This model can easily be converted to any other model, such as the network model, hierarchical model etc.
 It is integrated with the dominant relational model.
Disadvantages
The disadvantages of ER are as follows −
 There is no industry standard for developing an ER model.
 Information might be lost or hidden in the ER model.
 There is no Data Manipulation Language (DML).
 There is limited relationship representation.
ER diagram
Why use ER Diagrams?
Here are the prime reasons for using ER diagrams:
 They help you to define terms related to entity relationship modeling
 They provide a preview of how all your tables should connect and what fields are going to be on each table
 They help to describe entities, attributes and relationships
 ER diagrams are translatable into relational tables, which allows you to build databases quickly
 ER diagrams can be used by database designers as a blueprint for implementing data in specific
software applications
 The database designer gains a better understanding of the information to be contained in the database
with the help of an ER diagram
 An ER diagram allows you to communicate the logical structure of the database to users
Facts about ER Diagram Model:
Some facts about the ER diagram model:
 The ER model allows you to draw a database design
 It is an easy-to-use graphical tool for modeling data
 It is widely used in database design
 It is a graphical representation of the logical structure of a database
 It helps you to identify the entities which exist in a system and the relationships between those
entities
ER Diagrams Symbols & Notations:
Entity relationship diagrams use three basic symbols, the rectangle, the oval and the diamond, to
represent entities, attributes and relationships. Some sub-elements are based on these main elements.
An ER diagram is a visual representation of data that describes how data items are related to each
other using different ERD symbols and notations.
Following are the main components and their symbols in ER Diagrams:
 Rectangles: represent entity types
 Ellipses: represent attributes
 Diamonds: represent relationship types
 Lines: link attributes to entity types and entity types to relationship types
 Primary key: attributes are underlined
 Double Ellipses: represent multi-valued attributes
ER Diagram Symbols

Components of the ER Diagram:


This model is based on three basic concepts:
 Entities
 Attributes
 Relationships

ER Diagram Examples
For example, in a University database, we might have entities for Students, Courses, and Lecturers. Students entity
can have attributes like Roll no, Name, and Dept ID. They might have relationships with Courses and Lecturers.

WHAT IS ENTITY?
An entity is a real-world thing, either living or non-living, that can be recognized as distinct. It is
anything in the enterprise that is to be represented in our database. It may be a physical thing or simply
a fact about the enterprise or an event that happens in the real world.
An entity can be a place, person, object, event or concept about which data is stored in the database.
Every entity must have attributes and a unique key. The attributes are what represent that entity.
Examples of entities:
 Person: Employee, Student, Patient
 Place: Store, Building
 Object: Machine, Product, Car
 Event: Sale, Registration, Renewal
 Concept: Account, Course
Notation of an Entity: an entity is drawn as a rectangle labelled with its name, for example Student.
Entity set:
An entity set is a group of similar entities. It may contain entities whose attributes share similar
values. Entities are represented by their properties, which are also called attributes. Each attribute
has its own value. For example, a student entity may have name, age and class as attributes.

Example of Entities:
A university may have some departments. All these departments employ various lecturers and offer
several programs.
Some courses make up each program. Students register in a particular program and enroll in various
courses. A lecturer from the specific department takes each course, and each lecturer teaches a
different group of students.
Relationship
Relationship is nothing but an association among two or more entities. E.g., Tom works in the Chemistry
department.
Entities take part in relationships. We can often identify relationships with verbs or verb phrases.

For example:
 You are attending this lecture
 I am giving the lecture
Just like entities, we can classify relationships according to relationship types:
 A student attends a lecture
 A lecturer is giving a lecture.
Weak Entities
A weak entity is a type of entity that doesn't have a key attribute of its own. It can be identified
uniquely only by considering the primary key of another entity. For that, a weak entity set must
participate in a relationship with the entity set that identifies it.
In the above ER Diagram example, “Trans No” is a discriminator within a group of transactions
in an ATM.
Let's learn more about a weak entity by comparing it with a strong entity:
Strong Entity Set | Weak Entity Set
A strong entity set always has a primary key. | It does not have enough attributes to build a primary key.
It is represented by a rectangle symbol. | It is represented by a double rectangle symbol.
It contains a primary key represented by the underline symbol. | It contains a partial key which is represented by a dashed underline symbol.
The member of a strong entity set is called a dominant entity set. | The member of a weak entity set is called a subordinate entity set.
The primary key is one of its attributes, which helps to identify its members. | Its key is a combination of the primary key of the strong entity set and the partial key of the weak entity set.
In the ER diagram, the relationship between two strong entity sets is shown by using a diamond symbol. | The relationship between one strong and one weak entity set is shown by using the double diamond symbol.
The connecting line of the strong entity set with the relationship is single. | The line connecting the weak entity set to the identifying relationship is double.
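In relational terms, a weak entity set is commonly stored with a composite primary key made of the owner's primary key plus its own partial key. The sketch below shows this for ATM transactions using Python's sqlite3 module. The table and column names (atm, atm_transaction, atm_id, trans_no, amount) and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Strong entity set: has its own primary key.
conn.execute("CREATE TABLE atm (atm_id INTEGER PRIMARY KEY)")

# Weak entity set: trans_no alone is only a partial key (discriminator).
# It becomes unique when combined with the owner's primary key atm_id.
conn.execute(
    """CREATE TABLE atm_transaction (
           atm_id   INTEGER NOT NULL REFERENCES atm(atm_id) ON DELETE CASCADE,
           trans_no INTEGER NOT NULL,
           amount   INTEGER,
           PRIMARY KEY (atm_id, trans_no)
       )"""
)

conn.executemany("INSERT INTO atm VALUES (?)", [(1,), (2,)])
# The same trans_no may appear under different ATMs.
conn.executemany(
    "INSERT INTO atm_transaction VALUES (?, ?, ?)", [(1, 1, 500), (2, 1, 200)]
)

# Repeating a trans_no within the same ATM violates the composite key.
try:
    conn.execute("INSERT INTO atm_transaction VALUES (1, 1, 900)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

count = conn.execute("SELECT COUNT(*) FROM atm_transaction").fetchone()[0]
print(duplicate_rejected, count)  # True 2
```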
Attributes
It is a single-valued property of either an entity-type or a relationship-type. For example, a lecture
might have attributes: time, date, duration, place, etc. An attribute in ER Diagram examples, is
represented by an Ellipse.

Types of Attributes and their description:
Simple attribute: Simple attributes can’t be divided any further. For example, a student’s contact
number. A simple attribute is also called an atomic attribute.
Composite attribute: A composite attribute can be broken down into parts. For example, a
student’s full name may be further divided into first name, second name, and last name.
Derived attribute: This type of attribute is not stored in the physical database. Instead, its value
is derived from other attributes present in the database. For example, age should not be stored
directly; it should be derived from the DOB of that employee.
Multivalued attribute: Multivalued attributes can have more than one value. For example, a
student can have more than one mobile number, email address, etc.
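As a rough illustration of how these attribute types might be carried into tables, the sketch below uses Python's built-in sqlite3 module. The table names, column names, and sample values (student, student_phone, first_name, dob, and so on) are assumptions made for this example only.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,   -- simple (atomic) attribute
    first_name TEXT,                  -- composite attribute "full name",
    last_name  TEXT,                  --   split into its parts
    dob        TEXT                   -- stored; age is derived from it
);
-- multivalued attribute: one row per phone number (hypothetical table)
CREATE TABLE student_phone (
    student_id INTEGER REFERENCES student(student_id),
    phone      TEXT,
    PRIMARY KEY (student_id, phone)
);
INSERT INTO student VALUES (1, 'Ravi', 'Kumar', '2000-06-15');
INSERT INTO student_phone VALUES (1, '111'), (1, '222');
""")

def age_on(dob_text, today):
    """Derived attribute: compute age from the stored date of birth."""
    dob = date.fromisoformat(dob_text)
    return today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))

dob = conn.execute("SELECT dob FROM student WHERE student_id = 1").fetchone()[0]
age = age_on(dob, date(2025, 1, 1))
phones = [r[0] for r in conn.execute(
    "SELECT phone FROM student_phone WHERE student_id = 1 ORDER BY phone")]
print(age, phones)
```

Storing the date of birth and computing the age on demand keeps the derived value from going stale, which is the reason the notes advise against storing age directly.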

Cardinality
Cardinality defines how many entities of one entity set can be associated with entities of another entity set
through a relationship. The different types of cardinality are:
 One-to-One Relationships
 One-to-Many Relationships
 Many-to-One Relationships
 Many-to-Many Relationships

1. One-to-one:

One entity from entity set X can be associated with at most one entity of entity set Y and vice versa.
Example: Each student is issued one ID card, and each ID card belongs to exactly one student.

2. One-to-many:

One entity from entity set X can be associated with multiple entities of entity set Y, but an entity
from entity set Y can be associated with at most one entity of entity set X.
For example, one class consists of multiple students.

3. Many-to-one:

More than one entity from entity set X can be associated with at most one entity of entity set Y. However,
an entity from entity set Y may or may not be associated with more than one entity from entity set X.
For example, many students belong to the same class.

4. Many-to-many:

One entity from X can be associated with more than one entity from Y and vice versa.
For example, a student can be associated with multiple faculty members, and a faculty
member can be associated with multiple students.
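One common way to carry these cardinalities into tables is sketched below, using Python's built-in sqlite3 module. It is only an illustration: the table and column names are hypothetical. A one-to-many (or many-to-one) relationship becomes a foreign key column on the "many" side, and a many-to-many relationship becomes a separate junction table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One-to-many: each student row points to at most one class.
CREATE TABLE class   (class_id INTEGER PRIMARY KEY);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    class_id   INTEGER REFERENCES class(class_id)
);
-- Many-to-many: a junction table pairs students with faculty members.
CREATE TABLE faculty (faculty_id INTEGER PRIMARY KEY);
CREATE TABLE student_faculty (
    student_id INTEGER REFERENCES student(student_id),
    faculty_id INTEGER REFERENCES faculty(faculty_id),
    PRIMARY KEY (student_id, faculty_id)
);
INSERT INTO class VALUES (10);
INSERT INTO student VALUES (1, 10), (2, 10);
INSERT INTO faculty VALUES (100), (200);
INSERT INTO student_faculty VALUES (1, 100), (1, 200), (2, 100);
""")

students_in_class = conn.execute(
    "SELECT COUNT(*) FROM student WHERE class_id = 10").fetchone()[0]
faculty_of_student_1 = conn.execute(
    "SELECT COUNT(*) FROM student_faculty WHERE student_id = 1").fetchone()[0]
students_of_faculty_100 = conn.execute(
    "SELECT COUNT(*) FROM student_faculty WHERE faculty_id = 100").fetchone()[0]
print(students_in_class, faculty_of_student_1, students_of_faculty_100)
```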

How to Create an Entity Relationship Diagram (ERD)


The steps to create an ER diagram are: entity identification, relationship identification,
cardinality identification, attribute identification, and creation of the ERD.

Let’s study them with an Entity Relationship Diagram example:
In a university, a Student enrolls in Courses. A student must be assigned to at least one Course.
Each course is taught by a single Professor. To maintain instruction quality, a Professor can deliver only
one course.
Step 1) Entity Identification
We have three entities:
 Student
 Course
 Professor

Step 2) Relationship Identification
We have the following two relationships
 The student is assigned a course
 Professor delivers a course

Step 3) Cardinality Identification


From the problem statement we know that:
 A student can be assigned multiple courses
 A Professor can deliver only one course

Step 4) Identify Attributes


You need to study the files, forms, reports, and data currently maintained by the organization to identify
attributes. You can also conduct interviews with various stakeholders to identify entities. Initially, it’s
important to identify the attributes without mapping them to a particular entity.
Once you have a list of attributes, you need to map them to the identified entities. Ensure that each attribute
is paired with exactly one entity. If you think an attribute should belong to more than one entity,
use a modifier to make it unique.
Once the mapping is done, identify the primary keys. If a unique key is not readily available, create one.

Entity      Primary Key   Attribute
Student     Student_ID    Student Name
Professor   Employee_ID   Professor Name
Course      Course_ID     Course Name

For Course Entity, attributes could be Duration, Credits, Assignments, etc. For the sake of
ease we have considered just one attribute.
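The entities, keys, and cardinalities identified so far could be turned into tables roughly as follows. This is a sketch using Python's built-in sqlite3 module, not the notes' own schema: the enrollment table, the lower-case column names, and the sample rows are assumptions. The UNIQUE constraint on course.employee_id expresses "a Professor can deliver only one course"; the rule that a student must have at least one course is not enforced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE student   (student_id  INTEGER PRIMARY KEY, student_name   TEXT);
CREATE TABLE professor (employee_id INTEGER PRIMARY KEY, professor_name TEXT);
CREATE TABLE course (
    course_id   INTEGER PRIMARY KEY,
    course_name TEXT,
    -- each course has a single professor; UNIQUE: a professor delivers only one course
    employee_id INTEGER NOT NULL UNIQUE REFERENCES professor(employee_id)
);
-- A student can be assigned multiple courses (hypothetical junction table).
CREATE TABLE enrollment (
    student_id INTEGER REFERENCES student(student_id),
    course_id  INTEGER REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student   VALUES (1, 'S1');
INSERT INTO professor VALUES (7, 'P1');
""")

conn.execute("INSERT INTO course VALUES (101, 'C1', 7)")
try:
    conn.execute("INSERT INTO course VALUES (102, 'C2', 7)")  # same professor again
    second_course_allowed = True
except sqlite3.IntegrityError:
    second_course_allowed = False
print(second_course_allowed)
```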
Step 5) Create the ERD
[Figure: a more modern representation of the Entity Relationship Diagram example]

Additional Features Of ER Model


Generalization
Generalization is the process of extracting common properties from a set of entities and
creating a generalized entity from it. It is a bottom-up approach in which two or more entities
can be generalized to a higher-level entity if they have some attributes in common.
For Example, STUDENT and FACULTY can be generalized to a higher-level entity called
PERSON as shown in Figure 1. In this case, common attributes like P_NAME, and P_ADD
become part of a higher entity (PERSON), and specialized attributes like S_FEE become
part of a specialized entity (STUDENT).

Specialization
In specialization, an entity is divided into sub-entities based on its characteristics. It is a top-
down approach where the higher-level entity is specialized into two or more lower-level
entities.
For Example, an EMPLOYEE entity in an Employee management system can be specialized
into DEVELOPER, TESTER, etc. as shown in Figure 2. In this case, common attributes like
E_NAME, E_SAL, etc. become part of a higher entity (EMPLOYEE), and specialized
attributes like TES_TYPE become part of a specialized entity (TESTER).
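One possible table layout for the EMPLOYEE/TESTER example is sketched below with Python's built-in sqlite3 module. The key column E_ID and the sample values are hypothetical, since the notes name only E_NAME, E_SAL, and TES_TYPE. Other layouts, such as a single table with a type column, are equally valid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Higher-level entity: attributes shared by every employee.
CREATE TABLE EMPLOYEE (
    E_ID   INTEGER PRIMARY KEY,   -- hypothetical key, not named in the notes
    E_NAME TEXT,
    E_SAL  INTEGER
);
-- Specialized entity: reuses the employee's key and adds its own attribute.
CREATE TABLE TESTER (
    E_ID     INTEGER PRIMARY KEY REFERENCES EMPLOYEE(E_ID),
    TES_TYPE TEXT
);
INSERT INTO EMPLOYEE VALUES (1, 'A', 1000);
INSERT INTO TESTER   VALUES (1, 'manual');
""")

# A tester's full record combines the common and the specialized attributes.
row = conn.execute("""
    SELECT e.E_NAME, e.E_SAL, t.TES_TYPE
    FROM TESTER t JOIN EMPLOYEE e ON e.E_ID = t.E_ID
""").fetchone()
print(row)
```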

Aggregation
A basic ER diagram cannot represent a relationship between an entity and another
relationship, which may be required in some scenarios. In those cases, a relationship with its
corresponding entities is aggregated into a higher-level entity. Aggregation is an abstraction
through which we can represent relationships as higher-level entity sets.
For Example, an Employee working on a project may require some machinery. So, REQUIRE
relationship is needed between the relationship WORKS_FOR and entity MACHINERY.
Using aggregation, WORKS_FOR relationship with its entities EMPLOYEE and PROJECT
is aggregated into a single entity and relationship REQUIRE is created between the
aggregated entity and MACHINERY.

[Figure: the WORKS_FOR relationship between EMPLOYEE and PROJECT, aggregated and connected to MACHINERY through REQUIRE]

Representing Aggregation Via Schema


To represent aggregation, create a schema containing the following:
 the primary key of the aggregated relationship
 the primary key of the associated entity set
 descriptive attributes, if any
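Applied to the REQUIRE example, the rule above might look like the following sketch, written with Python's built-in sqlite3 module. The key columns E_ID, P_ID, and M_ID are hypothetical names, and no descriptive attribute is included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE EMPLOYEE  (E_ID INTEGER PRIMARY KEY);
CREATE TABLE PROJECT   (P_ID INTEGER PRIMARY KEY);
CREATE TABLE MACHINERY (M_ID INTEGER PRIMARY KEY);
-- The relationship to be aggregated.
CREATE TABLE WORKS_FOR (
    E_ID INTEGER REFERENCES EMPLOYEE(E_ID),
    P_ID INTEGER REFERENCES PROJECT(P_ID),
    PRIMARY KEY (E_ID, P_ID)
);
-- REQUIRE = key of the aggregated relationship + key of MACHINERY.
CREATE TABLE REQUIRE (
    E_ID INTEGER,
    P_ID INTEGER,
    M_ID INTEGER REFERENCES MACHINERY(M_ID),
    PRIMARY KEY (E_ID, P_ID, M_ID),
    FOREIGN KEY (E_ID, P_ID) REFERENCES WORKS_FOR(E_ID, P_ID)
);
INSERT INTO EMPLOYEE  VALUES (1);
INSERT INTO PROJECT   VALUES (10);
INSERT INTO MACHINERY VALUES (500);
INSERT INTO WORKS_FOR VALUES (1, 10);
INSERT INTO REQUIRE   VALUES (1, 10, 500);
""")

try:
    # Employee 1 does not work on project 99, so this pair cannot require machinery.
    conn.execute("INSERT INTO REQUIRE VALUES (1, 99, 500)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False
print(orphan_allowed)
```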
Conceptual Design for Large Enterprises
The conceptual design for a database involves creating a high-level model of the database that focuses
on the overall structure of the data and the relationships between the different entities in the database.
For large enterprises, the conceptual design is typically more complex and involves multiple levels of
abstraction.
One approach to conceptual design for large enterprises is the Entity-Relationship (ER) modeling
technique. ER models use three main components to represent the data in the database: entities,
attributes, and relationships. These components are used to create a diagram that represents the data and
relationships in a graphical format.
For example, consider a large enterprise that manages a chain of retail stores. The conceptual design for
this enterprise might involve creating an ER diagram that includes entities such as 'stores,' 'products,'
'employees,' and 'customers.' These entities would have attributes that describe the specific data
associated with each entity. For example, the 'stores' entity might have attributes such as 'store_id,'
'store_name,' and 'store_address.'
Relationships between the entities would be represented using connectors or lines that connect the
entities in the diagram. For example, the 'stores' entity might have a relationship with the 'employees'
entity, representing the fact that each store employs multiple employees. Similarly, the 'products' entity
might have a relationship with the 'stores' entity, representing the fact that each store carries multiple
products.
In addition to the basic components of entities, attributes, and relationships, ER models can also include
additional features such as cardinality, which specifies the number of instances of one entity that can be
related to another entity. For example, a relationship between the 'stores' and 'employees' entities might
be specified as a 'one-to-many' relationship, indicating that each store employs multiple employees, but
each employee is associated with only one store.
Overall, the conceptual design for a large enterprise involves creating a high-level model of the database
that represents the data and relationships between entities in a graphical format. This design can help to
ensure that the database is well-organized, scalable, and efficient, and that it meets the needs of the
enterprise.
Relational Model in DBMS:
The relational model represents data as tables with columns and rows. Each row is known as a tuple. Each
column of a table has a name, called an attribute.
Domain: It contains a set of atomic values that an attribute can take.
Attribute: It contains the name of a column in a particular table. Each attribute Ai must have a domain,
dom(Ai)
Relational instance: In the relational database system, the relational instance is represented by a finite
set of tuples. Relation instances do not have duplicate tuples.
Relational schema: A relational schema contains the name of the relation and name of all columns or
attributes.
Relational key: A relational key is a set of one or more attributes that can identify a row in the
relation uniquely.
Example: STUDENT Relation

NAME     ROLL_NO   PHONE_NO      ADDRESS     AGE
Ram      14795     7305758992    Noida       24
Shyam    12839     9026288936    Delhi       35
Laxman   33289     8583287182    Gurugram    20
Mahesh   27857     7086819134    Ghaziabad   27
Ganesh   17282     9028 9i3988   Delhi       40
 In the given table, NAME, ROLL_NO, PHONE_NO, ADDRESS, and AGE are the attributes.
 The instance of schema STUDENT has 5 tuples.
 t3 = <Laxman, 33289, 8583287182, Gurugram, 20>
Properties of Relations
 The name of the relation is distinct from all other relations.
 Each relation cell contains exactly one atomic (single) value.
 Each attribute has a distinct name.
 The order of attributes has no significance.
 A relation has no duplicate tuples.
 The order of tuples has no significance.
Different Types of Keys in the Relational Model
1. Candidate Key
2. Primary Key
3. Super Key
4. Alternate Key
5. Foreign Key
6. Composite Key
1. Candidate Key: The minimal set of attributes that can uniquely identify a tuple is known as a
candidate key. For example, STUD_NO in the STUDENT relation.
 It is a minimal super key, that is, a super key with no redundant attributes.
 It must contain unique values.
 A candidate key that is not chosen as the primary key may contain NULL values, depending on the DBMS.
 Every table must have at least one candidate key.
 A table can have multiple candidate keys but only one primary key.
Example:
STUD_NO is the candidate key for relation STUDENT.
Table STUDENT

STUD_NO   SNAME    ADDRESS   PHONE
1         Shyam    Delhi     123456789
2         Rakesh   Kolkata   223365796
3         Suraj    Delhi     175468965

 The candidate key can be simple (having only one attribute) or composite as well.
Example:
{STUD_NO, COURSE_NO} is a composite candidate key for relation STUDENT_COURSE.

Table STUDENT_COURSE

STUD_NO   TEACHER_NO   COURSE_NO
1         001          C001
2         056          C005

Note: In SQL Server, a unique constraint on a nullable column allows the value NULL in that
column only once. That is why the STUD_PHONE attribute can be a candidate key here, whereas a
primary key attribute can never be NULL.
2. Primary Key: A relation can have more than one candidate key, out of which one is chosen
as the primary key. For example, STUD_NO as well as STUD_PHONE are candidate keys for relation
STUDENT, but STUD_NO can be chosen as the primary key (only one out of many candidate keys).
 It is a unique key.
 Each value identifies exactly one tuple (record).
 It has no duplicate values.
 It cannot be NULL.
 A primary key need not be a single column; a combination of columns can also be the primary key of a table.
Example:
STUDENT table -> Student(STUD_NO, SNAME, ADDRESS, PHONE) , STUD_NO is a primary key
Table STUDENT

STUD_NO   SNAME    ADDRESS   PHONE
1         Shyam    Delhi     123456789
2         Rakesh   Kolkata   223365796
3         Suraj    Delhi     175468965

3. Super Key: A set of attributes that can uniquely identify a tuple is known as a super key. For example,
STUD_NO, (STUD_NO, STUD_NAME), etc. A super key may contain additional attributes that are not
needed for unique identification.
 Adding zero or more attributes to a candidate key generates a super key.
 Every candidate key is a super key, but the converse is not true.
 Attributes of a super key that are not part of the primary key may contain NULL values.
Example:
Consider the table shown above.
STUD_NO+PHONE is a super key.

Relation between Primary Key, Candidate Key, and Super Key


4. Alternate Key: A candidate key other than the primary key is called an alternate key.
 All candidate keys that are not chosen as the primary key are alternate keys.
 It is also called a secondary key.
 Like any candidate key, an alternate key must contain unique values.
Example:
Consider the table shown above.
STUD_NO as well as PHONE are candidate keys for relation STUDENT. If STUD_NO is chosen as the
primary key, PHONE becomes an alternate key.
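The difference between the primary key and an alternate key can be tried out with a small sketch using Python's built-in sqlite3 module. Declaring PHONE as UNIQUE is one way to model an alternate key; the helper try_insert is a hypothetical name introduced for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (
    STUD_NO INTEGER PRIMARY KEY,   -- candidate key chosen as primary key
    SNAME   TEXT,
    ADDRESS TEXT,
    PHONE   TEXT UNIQUE            -- remaining candidate key: alternate key
);
INSERT INTO STUDENT VALUES (1, 'Shyam',  'Delhi',   '123456789');
INSERT INTO STUDENT VALUES (2, 'Rakesh', 'Kolkata', '223365796');
""")

def try_insert(row):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO STUDENT VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

dup_primary   = try_insert((1, 'Suraj', 'Delhi', '175468965'))  # STUD_NO 1 exists
dup_alternate = try_insert((3, 'Suraj', 'Delhi', '123456789'))  # PHONE exists
fresh_row     = try_insert((3, 'Suraj', 'Delhi', '175468965'))
print(dup_primary, dup_alternate, fresh_row)
```

ADDRESS is deliberately left unconstrained: two students live in Delhi, so it cannot be a key.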

Primary Key, Candidate Key, and Alternate Key


5. Foreign Key: If an attribute can only take values that are present as values of some other attribute,
it is a foreign key to the attribute to which it refers. The relation being referenced is called the
referenced relation, and the corresponding attribute is called the referenced attribute. The relation that refers to
the referenced relation is called the referencing relation, and the corresponding attribute is called the referencing
attribute. The referenced attribute of the referenced relation should be its primary key.
 It is a primary key in one table and a non-key attribute in another table.
 It links two or more relations (tables).
 It acts as a cross-reference between the tables.
 For example, DNO is a primary key in the DEPT table and a non-key attribute in EMP.

Example:
Refer Table STUDENT shown above.
STUD_NO in STUDENT_COURSE is a foreign key to STUD_NO in STUDENT relation.
Table STUDENT_COURSE

STUD_NO   TEACHER_NO   COURSE_NO
1         005          C001
2         056          C005

It may be worth noting that, unlike the primary key of a relation, a foreign key can be NULL and
may contain duplicate values, i.e. it need not follow the uniqueness constraint. For example, STUD_NO
in the STUDENT_COURSE relation need not be unique, because the same student can appear in several tuples.
However, STUD_NO in the STUDENT relation is a primary key, so it must always be unique and
cannot be NULL.

Relation between Primary Key and Foreign Key


6. Composite Key: Sometimes a table does not have a single column/attribute that uniquely identifies all
the records of the table. To uniquely identify rows, a combination of two or more columns/attributes
can be used. A poorly chosen combination can still produce duplicate values, so we need to find a minimal set
of attributes that uniquely identifies every row in the table.
 It can act as the primary key if no single attribute can serve as one.
 Two or more attributes are used together to make a composite key.
 Different combinations of attributes differ in how reliably they identify rows uniquely.
Example:
FULLNAME + DOB can be combined to identify a student.
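A composite key can be declared as a multi-column PRIMARY KEY. The sketch below uses Python's built-in sqlite3 module and the STUDENT_COURSE relation from earlier, with {STUD_NO, COURSE_NO} as the key; the extra rows are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE STUDENT_COURSE (
    STUD_NO    INTEGER,
    TEACHER_NO TEXT,
    COURSE_NO  TEXT,
    PRIMARY KEY (STUD_NO, COURSE_NO)   -- composite key
)""")
conn.execute("INSERT INTO STUDENT_COURSE VALUES (1, '005', 'C001')")
conn.execute("INSERT INTO STUDENT_COURSE VALUES (1, '005', 'C005')")  # same student, new course
conn.execute("INSERT INTO STUDENT_COURSE VALUES (2, '056', 'C005')")  # same course, new student

try:
    conn.execute("INSERT INTO STUDENT_COURSE VALUES (1, '056', 'C001')")  # pair repeats
    pair_repeat_allowed = True
except sqlite3.IntegrityError:
    pair_repeat_allowed = False

row_count = conn.execute("SELECT COUNT(*) FROM STUDENT_COURSE").fetchone()[0]
print(pair_repeat_allowed, row_count)
```

Neither column is unique on its own; only the pair is.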

Integrity Constraints:
o Integrity constraints are a set of rules used to maintain the quality of information.
o Integrity constraints ensure that data insertion, updating, and other processes are
performed in such a way that data integrity is not affected.
o Thus, integrity constraints guard against accidental damage to the database.
Types of Integrity Constraint

1. Domain constraints
 Domain constraints can be defined as the definition of a valid set of values for an attribute.
 The data type of domain includes string, character, integer, time, date, currency, etc.
 The value of the attribute must be available in the corresponding domain.
Example:

2. Entity integrity constraints
 The entity integrity constraint states that a primary key value can't be null.
 This is because the primary key value is used to identify individual rows in a relation, and if the primary key
has a null value, then we can't identify those rows.
 A table can contain null values in fields other than the primary key.

3. Referential Integrity Constraints

 A referential integrity constraint is specified between two tables.
 Under a referential integrity constraint, if a foreign key in Table 1 refers to the primary key of Table 2, then
every value of the foreign key in Table 1 must either be null or be available in Table 2.

4. Key constraints
 A key is an attribute, or set of attributes, that is used to identify an entity within its entity set uniquely.
 An entity set can have multiple keys, out of which one key will be the primary key. A primary key must
contain unique values and cannot contain a null value in the relational table.

Integrity constraints over relations:


An integrity constraint (IC) is a condition that is specified on a database schema and restricts the data that can be
stored in an instance of the database.
Various restrictions on data can be specified on a relational database schema in the form of constraints.
A DBMS enforces integrity constraints, in that it permits only legal instances to be stored in the database.
Integrity constraints are specified and enforced at different times, as below.
1. When the DBA or end user defines a database schema, he or she specifies the ICs that must hold on any
instance of this database.
2. When a database application is run, the DBMS checks for violations and disallows changes to the data that
violate the specified ICs.
Constraints in DBMS:
Constraints enforce limits on the data or type of data that can be inserted into, updated in, or deleted from a table.
The whole purpose of constraints is to maintain data integrity during an update/delete/insert on a
table. Several types of constraints can be created in an RDBMS.
Types of constraints:
 NOT NULL
 UNIQUE
 DEFAULT
 CHECK
 Key constraints: PRIMARY KEY, FOREIGN KEY
 Domain constraints
 Mapping constraints
NOT NULL:
The NOT NULL constraint makes sure that a column does not hold a NULL value. When we don’t provide a value for
a particular column while inserting a record into a table, it takes a NULL value by default. By specifying the NOT NULL
constraint, we can be sure that a particular column cannot have NULL values.

Example:
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (235),
PRIMARY KEY (ROLL_NO)
);
UNIQUE:
UNIQUE Constraint enforces a column or set of columns to have unique values. If a column has a unique
constraint, it means that particular column cannot have duplicate values in a table.
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL UNIQUE,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (35) UNIQUE,
PRIMARY KEY (ROLL_NO)
);
DEFAULT:
The DEFAULT constraint provides a default value to a column when there is no value provided while
inserting a record into a table.
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
EXAM_FEE INT DEFAULT 10000,
STU_ADDRESS VARCHAR (35),
PRIMARY KEY (ROLL_NO)
);
CHECK:
This constraint is used for specifying a condition, such as a range of values, for a particular column of a table.
When this constraint is set on a column, it ensures that every value in that column satisfies the specified condition.
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL CHECK(ROLL_NO > 1000),
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
EXAM_FEE INT DEFAULT 10000,
STU_ADDRESS VARCHAR (35),
PRIMARY KEY (ROLL_NO)
);
In the above example we have set the check constraint on ROLL_NO column of STUDENT table. Now,
the ROLL_NO field must have the value greater than 1000.
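To see NOT NULL, DEFAULT, and CHECK at work, the STUDENT table above can be loaded into SQLite through Python's built-in sqlite3 module, as in the sketch below. The sample rows and the helper name rejected are hypothetical, and other DBMSs may report violations differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE STUDENT(
    ROLL_NO INT NOT NULL CHECK(ROLL_NO > 1000),
    STU_NAME VARCHAR (35) NOT NULL,
    STU_AGE INT NOT NULL,
    EXAM_FEE INT DEFAULT 10000,
    STU_ADDRESS VARCHAR (35),
    PRIMARY KEY (ROLL_NO)
)""")

def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# EXAM_FEE is omitted, so the DEFAULT value is stored.
conn.execute("INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE) VALUES (1001, 'A', 20)")
fee = conn.execute("SELECT EXAM_FEE FROM STUDENT WHERE ROLL_NO = 1001").fetchone()[0]

# ROLL_NO 5 fails the CHECK; a missing STU_NAME fails NOT NULL.
check_violation    = rejected("INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE) VALUES (5, 'B', 21)")
not_null_violation = rejected("INSERT INTO STUDENT (ROLL_NO, STU_AGE) VALUES (1002, 22)")
print(fee, check_violation, not_null_violation)
```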
Key constraints:
PRIMARY KEY:
Primary key uniquely identifies each record in a table. It must have unique values and cannot contain nulls.
In the below example the ROLL_NO field is marked as primary key, that means the ROLL_NO field
cannot have duplicate and null values.
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL UNIQUE,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (35) UNIQUE,
PRIMARY KEY (ROLL_NO)
);
FOREIGN KEY:
Foreign keys are the columns of a table that point to the primary key of another table. They act as a
cross-reference between tables.
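A FOREIGN KEY declaration might look like the sketch below, run through Python's built-in sqlite3 module. The STUDENT_COURSE columns are an assumption modelled on the earlier key examples. Note that SQLite only checks foreign keys after PRAGMA foreign_keys = ON; most other DBMSs enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE STUDENT (
    ROLL_NO INT NOT NULL,
    STU_NAME VARCHAR (35) NOT NULL,
    PRIMARY KEY (ROLL_NO)
);
CREATE TABLE STUDENT_COURSE (
    ROLL_NO INT,
    COURSE_NO VARCHAR (10),
    FOREIGN KEY (ROLL_NO) REFERENCES STUDENT (ROLL_NO)
);
INSERT INTO STUDENT VALUES (1, 'Shyam');
INSERT INTO STUDENT_COURSE VALUES (1, 'C001');     -- matches a student
INSERT INTO STUDENT_COURSE VALUES (1, 'C005');     -- duplicates are allowed
INSERT INTO STUDENT_COURSE VALUES (NULL, 'C005');  -- NULL is allowed
""")

try:
    conn.execute("INSERT INTO STUDENT_COURSE VALUES (9, 'C001')")  # no student 9
    dangling_allowed = True
except sqlite3.IntegrityError:
    dangling_allowed = False

rows = conn.execute("SELECT COUNT(*) FROM STUDENT_COURSE").fetchone()[0]
print(dangling_allowed, rows)
```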
Domain constraints:
Each table has a certain set of columns, and each column allows only one type of data, based on its data
type. The column does not accept values of any other data type.
Domain constraints are user-defined data types, and we can define them like this:
Domain Constraint = data type + Constraints (NOT NULL / UNIQUE / PRIMARY KEY /
FOREIGN KEY / CHECK / DEFAULT)

Enforce Integrity Constraints:
Introduction:
Integrity constraints are rules that specify the conditions that must be met for the data in a
database to be considered valid. These constraints help to ensure the accuracy and consistency of the
data by limiting the values that can be entered for a particular attribute and specifying the relationships
between entities in the database.
There are several types of integrity constraints that can be enforced in a DBMS:
 Domain constraints: These constraints specify the values that can be assigned to an attribute in a
database. For example, a domain constraint might specify that the values for an "age" attribute must be
integers between 0 and 120.
 Participation constraints: These constraints specify the relationship between entities in a database.
For example, a participation constraint might specify that every employee must be assigned to a
department.
 Entity integrity constraints: These constraints specify rules for the primary key of an entity. For
example, an entity integrity constraint might specify that the primary key cannot be null.
 Referential integrity constraints: These constraints specify rules for foreign keys in a database. For
example, a referential integrity constraint might specify that a foreign key value must match the value
of the primary key in another table.
 User-defined constraints: These constraints are defined by the database administrator and can be used
to specify custom rules for the data in a database.


Ways to Enforce Integrity Constraints
There are several ways to enforce integrity constraints in a DBMS:
 Declarative referential integrity: This method involves specifying the integrity constraints at the time
of database design and allowing the DBMS to enforce them automatically.
 Triggers: A trigger is a special type of stored procedure that is executed automatically by the DBMS
when certain events occur (such as inserting, updating, or deleting data). Triggers can be used to
enforce integrity constraints by checking for and rejecting invalid data.
 Stored procedures: A stored procedure is a pre-defined set of SQL statements that can be executed as
a single unit. Stored procedures can be used to enforce integrity constraints by performing checks on
the data before it is inserted, updated, or deleted.
 Application-level code: Integrity constraints can also be enforced at the application level by writing
code to check for and reject invalid data before it is entered into the database.
It is important to carefully consider the appropriate method for enforcing integrity constraints in a
DBMS in order to ensure the accuracy and consistency of the data.
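Two of the methods above, a trigger and an application-level check, can be combined in a small sketch using Python's built-in sqlite3 module. The person table, the check_age trigger, and the insert_checked helper are hypothetical names. The age range 0 to 120 is taken from the domain constraint example earlier in this section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (name TEXT, age INTEGER);
-- Trigger: reject rows whose age falls outside 0..120.
CREATE TRIGGER check_age BEFORE INSERT ON person
WHEN NEW.age < 0 OR NEW.age > 120
BEGIN
    SELECT RAISE(ABORT, 'age out of range');
END;
""")

def insert_checked(name, age):
    """Application-level check first, then the insert (which fires the trigger)."""
    if not isinstance(age, int):
        return "rejected by application"
    try:
        conn.execute("INSERT INTO person VALUES (?, ?)", (name, age))
        return "inserted"
    except sqlite3.IntegrityError:
        return "rejected by trigger"

results = [insert_checked("A", 30), insert_checked("B", 150), insert_checked("C", "old")]
print(results)
```

A check placed in the database guards every client, while an application-level check guards only the code path that contains it.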

Logical database design using entity-relationship modeling
Before you implement a database, you should plan or design it so that it satisfies all requirements. This first
task of designing a database is called logical design.
 Data modeling
Logical data modeling is the process of documenting the comprehensive business information
requirements in an accurate and consistent format.
 Entities for different types of relationships
In a relational database, separate entities must be defined for different types of relationships.
 Application of business rules to relationships
Whether a given relationship is one-to-one, one-to-many, many-to-one, or many-to-many, your
relationships need to make good business sense.
 Entity attributes in database design
When you define attributes for entities, you generally work with the data administrator to decide on
names, data types, and appropriate values for the attributes.
 Normalization in database design
Normalization helps you avoid redundancies and inconsistencies in your data. There are several forms
of normalization.

Views in SQL
 A view in SQL is a virtual table. Like a table, a view contains rows and columns.
 To create a view, we can select fields from one or more tables present in the database.
 A view can contain either specific rows based on a certain condition or all the rows of a table.


Sample table:
Student_Detail
STU_ID NAME ADDRESS
1 Stephan Delhi
2 Kathrin Noida
3 David Ghaziabad
4 Alina Gurugram

Student_Marks
STU_ID NAME MARKS AGE
1 Stephan 97 19
2 Kathrin 86 21
3 David 74 18
4 Alina 90 20
5 John 96 18

1. Creating view
A view can be created using the CREATE VIEW statement. We can create a view from a single table
or multiple tables.
Syntax:
CREATE VIEW view_name AS
SELECT column1, column2.....
FROM table_name
WHERE condition;
2. Creating View from a single table
Query:

CREATE VIEW DetailsView AS
SELECT NAME, ADDRESS
FROM Student_Detail
WHERE STU_ID < 4;
Just like a table, we can query the view to see the data:
SELECT * FROM DetailsView;
Output:
NAME ADDRESS
Stephan Delhi
Kathrin Noida
David Ghaziabad

3. Creating View from multiple tables


A view from multiple tables can be created by simply including multiple tables in the
SELECT statement.
In the given example, a view named MarksView is created from two tables,
Student_Detail and Student_Marks.
Query:
CREATE VIEW MarksView AS
SELECT Student_Detail.NAME, Student_Detail.ADDRESS, Student_Marks.MARKS
FROM Student_Detail, Student_Marks
WHERE Student_Detail.NAME = Student_Marks.NAME;
To display the data of view MarksView:
SELECT * FROM MarksView;
Output:
NAME ADDRESS MARKS
Stephan Delhi 97
Kathrin Noida 86
David Ghaziabad 74
Alina Gurugram 90
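The two views can be reproduced end to end with Python's built-in sqlite3 module, as in the sketch below. The column types are assumptions, since the sample tables do not state them, and ORDER BY is added only to make the row order predictable. John appears in Student_Marks but not in Student_Detail, so the join leaves him out of MarksView.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student_Detail (STU_ID INT, NAME TEXT, ADDRESS TEXT);
CREATE TABLE Student_Marks  (STU_ID INT, NAME TEXT, MARKS INT, AGE INT);
INSERT INTO Student_Detail VALUES
    (1, 'Stephan', 'Delhi'), (2, 'Kathrin', 'Noida'),
    (3, 'David', 'Ghaziabad'), (4, 'Alina', 'Gurugram');
INSERT INTO Student_Marks VALUES
    (1, 'Stephan', 97, 19), (2, 'Kathrin', 86, 21), (3, 'David', 74, 18),
    (4, 'Alina', 90, 20), (5, 'John', 96, 18);

CREATE VIEW DetailsView AS
SELECT NAME, ADDRESS FROM Student_Detail WHERE STU_ID < 4;

CREATE VIEW MarksView AS
SELECT Student_Detail.NAME, Student_Detail.ADDRESS, Student_Marks.MARKS
FROM Student_Detail, Student_Marks
WHERE Student_Detail.NAME = Student_Marks.NAME;
""")

# A view is queried exactly like a table.
details = conn.execute("SELECT * FROM DetailsView ORDER BY NAME").fetchall()
marks = conn.execute("SELECT * FROM MarksView ORDER BY MARKS DESC").fetchall()
print(details)
print(marks)
```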

4. Deleting View
A view can be deleted using the DROP VIEW statement.
Syntax:
DROP VIEW view_name;
Example:
If we want to delete the view MarksView, we can do this as:
DROP VIEW MarksView;
Uses of a View :
A good database should contain views for the following reasons:
1. Restricting data access:
Views provide an additional level of table security by restricting access to a
predetermined set of rows and columns of a table.
2. Hiding data complexity:
A view can hide the complexity that exists in a multiple-table join.
3. Simplifying commands for the user:
Views allow the user to select information from multiple tables without
requiring the user to actually know how to perform a join.
4. Storing complex queries:
Views can be used to store complex queries.
5. Renaming columns:
Views can also be used to rename columns without affecting the base tables,
provided the number of columns in the view matches the number of columns
specified in the SELECT statement. Thus, renaming helps to hide the names of the
columns of the base tables.
6. Multiple view facility:
Different views can be created on the same table for different users.
