Database Management System (VBSPU 4th Sem)
PRAVEEN KUMAR SRIVASTAVA
Database
A database management system stores data in such a way that it becomes easier to retrieve, manipulate, and
produce information.
To find out what database is, we have to start from data, which is the basic building block of any DBMS.
Data: Facts, figures, statistics etc. having no particular meaning (e.g. 1, ABC, 19 etc).
Record: Collection of related data items. In the above example the three data items had no
meaning on their own; grouped together as one record, they can represent meaningful information.
Data Processing Vs. Data Management Systems: -
Although Data Processing and Data Management Systems both refer to functions that take raw data and
transform it into usable information, the usage of the terms is very different. Data Processing is the term
generally used to describe what was done by large mainframe computers from the late 1940s until the
early 1980s (and which continues to be done in the largest organizations to a greater or lesser extent even
today): large volumes of raw transaction data fed into programs that update a master file, with fixed-
format reports written to paper.
The term Data Management Systems refers to an expansion of this concept, where the raw data,
previously copied manually from paper to punched cards, and later into data-entry terminals, is now fed
into the system from a variety of sources, including ATMs, EFT, and direct customer entry through the
Internet. The master file concept has been largely displaced by database management systems, and static
reporting replaced or augmented by ad-hoc reporting and direct inquiry, including downloading of data by
customers. The ubiquity of the Internet and the Personal Computer have been the driving force in the
transformation of Data Processing to the more global concept of Data Management Systems.
File Oriented Approach: -
The earliest business computer systems were used to process business records and produce
information. They were generally faster and more accurate than equivalent manual systems.
These systems stored groups of records in separate files, and so they were called file processing
systems. In a typical file processing system, each department has its own files, designed
specifically for its applications. The department itself, working with the data processing staff,
sets policies or standards for the format and maintenance of its files.
Programs are dependent on the files and vice-versa; that is, when the physical format of the file is
changed, the program has also to be changed. Although the traditional file oriented approach to
information processing is still widely used, it does have some very important disadvantages.
S.No. | File System | DBMS
1. | A file system is software that manages and organizes the files in a storage medium within a computer. | A DBMS is software for managing the database.
2. | Redundant data can be present in a file system. | In a DBMS, redundancy is controlled and can be kept to a minimum.
3. | It does not provide backup and recovery of data if it is lost. | It provides backup and recovery of data even if it is lost.
4. | There is no efficient query processing in a file system. | Efficient query processing is available in a DBMS.
5. | There is less data consistency in a file system. | There is more data consistency because of the process of normalization.
Concurrent Use:
A database system allows several users to access the database concurrently. Answering different
questions from different users with the same (base) data is a central aspect of an information system.
Such concurrent use of data increases the economy of a system.
An example of concurrent use is the travel database of a large travel agency. The employees of
different branches can access the database concurrently and book journeys for their clients. Each
travel agent sees on his interface whether seats are still available for a specific journey or whether it is already
fully booked.
Structured and Described Data: -
A fundamental feature of the database approach is that the database system contains not only
the data but also the complete definition and description of these data. These descriptions are basically
details about the extent, the structure, the type and the format of all data and, additionally, the
relationship between the data. This kind of stored data is called metadata ("data about data").
Transactions: -
A transaction is a bundle of actions which are done within a database to bring it from one consistent
state to a new consistent state.
A transaction is atomic, which means that it cannot be divided up any further. Within a transaction, either all
or none of the actions are carried out. Doing only a part of the actions would lead to an
inconsistent database state.
One example of a transaction is the transfer of an amount of money from one bank account to
another. The debit of the money from one account and the credit of it to the other account together
make a consistent transaction. This transaction is also atomic: the debit or the credit alone would
lead to an inconsistent state. After the transaction (debit and credit) has finished, the changes to
both accounts become persistent; the one who gave the money now has less money in his
account while the receiver now has a higher balance.
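The bank transfer described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table name `account`, the account ids, the balances, and the helper names `transfer` and `balances` are all hypothetical. The credit is deliberately executed before the debit so that a failed debit shows the already-applied credit being rolled back.

```python
import sqlite3

# In-memory database with a hypothetical `account` table (illustrative names).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id TEXT PRIMARY KEY NOT NULL,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Credit dst and debit src as one atomic unit; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # Fails the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dict, e.g. {'A': 100, 'B': 20}."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


ok = transfer(conn, "A", "B", 50)       # both updates are applied
failed = transfer(conn, "A", "B", 500)  # debit fails, so the credit is undone
final = balances(conn)
```

After the two calls, account A holds 50 and account B holds 70: the second transfer left no trace, which is the "all or none" behaviour of an atomic transaction.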
Data Persistence: -
Data persistence means that in a DBMS all data is maintained as long as it is not deleted
explicitly. The life span of data needs to be determined directly or indirectly by the user and
must not be dependent on system features. Additionally, data once stored in a database must
not be lost. Changes of a database which are done by a transaction are persistent. When a
transaction is finished even a system crash cannot put the data in danger.
Advantages of a DBMS: -
Using a DBMS to manage data has many advantages:
Reduction of Redundancy: - This is perhaps the most significant advantage of using a DBMS.
Redundancy is the problem of storing the same data item in more than one place. Redundancy
creates several problems, such as requiring extra storage space, entering the same data more than
once during insertion, and deleting data from more than one place during deletion.
Anomalies may occur in the database if insertions, deletions, etc. are not done properly.
Efficient data access: - A DBMS utilizes a variety of sophisticated techniques to store and
retrieve data efficiently. This feature is especially important if the data is stored on external storage
devices.
Data integrity and security: - If data is always accessed through the DBMS, the DBMS can
enforce integrity constraints on the data. For example, before inserting salary information for an
employee, the DBMS can check that the department budget is not exceeded. Also, the DBMS can
enforce access controls that govern what data is visible to different classes of users.
Sharing of Data: - In paper-based record keeping, data cannot be shared among many users.
But in a computerized DBMS, many users can share the same database if they are connected
via a network.
Data administration: - When several users share the data, centralizing the administration of data can offer
significant improvements. Experienced professionals who understand the nature of the data being managed, and
how different groups of users use it, can be responsible for organizing the data representation to minimize
redundancy and fine-tuning the storage of the data to make retrieval efficient.
Concurrent access and crash recovery: A DBMS schedules concurrent accesses to the data in such a
manner that users can think of the data as being accessed by only one user at a time. Further, the DBMS protects
users from the effects of system failures.
Disadvantages of a DBMS: -
Danger of an Overkill: - For small and simple applications for single users a database system is
often not advisable.
Complexity: - A database system creates additional complexity and requirements. The supply and
operation of a database management system with several users and databases is quite costly and
demanding.
Costs: - Through the use of a database system new costs are generated for the system itself but also for
additional hardware and the more complex handling of the system.
Lower Efficiency: - A database system is a multi-use software which is often less efficient than
specialized software which is produced and optimized exactly for one problem.
Data Independence: -
It is the property of the database which tries to ensure that if we make a change at any
level of schema of the database, the schema immediately above it requires minimal or no
change. It removes the extra work that would otherwise be needed to propagate a single
change to all the levels above.
Data independence can be classified into the following two types:
1. Physical Data Independence: - This means that for any change made in the physical
schema, the need to change the logical schema is minimal. This is practically easier to achieve.
2. Logical Data Independence: - This means that for any change made in the logical
schema, the need to change the external schema is minimal. As we shall see, this is a little
difficult to achieve.
Instances and Schemas: -
Databases change over time as information is inserted and deleted. The collection of information stored
in the database at a particular moment is called an instance of the database. The overall design of the
database is called the database schema.
The concept of database schemas and instances can be understood by analogy to a program written in a
programming language. A database schema corresponds to the variable declarations (along with
associated type definitions) in a program. Each variable has a particular value at a given instant. The
values of the variables in a program at a point in time correspond to an instance of a database schema.
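The analogy above can be made concrete with a small Python sketch. The class `Student`, its fields, and the sample values are hypothetical and chosen only for illustration: the class declaration plays the role of the schema, and the list of objects existing at a given moment plays the role of an instance.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Student:
    """The declaration corresponds to the schema: names and types, no data."""
    roll_no: int
    name: str
    age: int


def schema_of(record_type):
    """Return the attribute names declared by a record type."""
    return [f.name for f in fields(record_type)]


# Two instances of the same schema, taken at different moments in time.
instance_before = [Student(1, "Asha", 19)]
instance_after = instance_before + [Student(2, "Ravi", 20)]

# The stored data changed; the schema did not.
schema = schema_of(Student)
```

Inserting the second student changes the instance from one record to two, while `schema_of(Student)` returns the same three attribute names before and after.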
Database systems have several schemas, partitioned according to the levels of abstraction.
The physical schema describes the database design at the physical level, while the
logical schema describes the database design at the logical level.
A database may also have several schemas at the view level, sometimes called subschemas, that describe
different views of the database.
Three Views of Data: -
Physical Level
This is the lowest level in the three-level architecture. It is also known as the internal level.
The physical level describes how data is actually stored in the database. At the lowest level, the
data is stored on storage devices such as hard drives in the form of bits; at a slightly higher level,
it can be said that the data is stored in files and folders. The physical level also covers compression
and encryption techniques.
Conceptual Level
The conceptual level is at a higher level than the physical level. It is also known as the
logical level. It describes how the database appears to the users conceptually and the
relationships between various data tables. The conceptual level does not care for how
the data in the database is actually stored.
External Level
This is the highest level in the three level architecture and closest to the user. It is also
known as the view level. The external level only shows the relevant database content
to the users in the form of views and hides the rest of the data. So different users can
see the database as a different view as per their individual requirements.
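As a rough illustration of the external level, the sketch below defines a view over a table using Python's sqlite3 module. The table `employee`, the view `employee_directory`, and the rows are assumptions made for this example. SQLite has no user accounts, so the sketch shows only how a view hides part of the data, not how access to the base table would be restricted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Conceptual level: the full logical table (hypothetical).
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT,
        dept   TEXT,
        salary INTEGER
    );
    INSERT INTO employee VALUES (1, 'Smith', 'Marketing', 5000);
    INSERT INTO employee VALUES (2, 'Harry', 'Legal', 7000);

    -- External level: a view that leaves out the salary column.
    CREATE VIEW employee_directory AS
        SELECT emp_id, name, dept FROM employee;
    """
)

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
directory_columns = [column[0] for column in cursor.description]
directory_rows = cursor.fetchall()
```

A user who queries only `employee_directory` sees the columns `emp_id`, `name`, and `dept`; the salary stays hidden even though it is stored in the same table.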
The Three-Schema Architecture: -
The goal of the three-schema architecture is to separate the user applications and the physical
database. In this architecture, schemas can be defined at the following three levels:
1. The internal level has an internal schema, which describes the physical storage structure of the
database. The internal schema uses a physical data model and describes the complete details of
data storage and access paths for the database.
2. The conceptual level has a conceptual schema, which describes the structure of the whole
database for a community of users. The conceptual schema hides the details of physical storage
structures and concentrates on describing entities, data types, relationships, user operations, and
constraints. A high-level data model or an implementation data model can be used at this level.
3. The external or view level includes a number of external schemas or user views. Each external
schema describes the part of the database that a particular user group is interested in and hides the
rest of the database from that user group. A high-level data model or an implementation data model
can be used at this level.
Database Users: -
There are four different types of database-system users, differentiated by the way they expect to interact with
the system. Different types of user interfaces have been designed for the different types of users.
1. Naive users are unsophisticated users who interact with the system by invoking one of the application
programs that have been written previously. For example, a bank teller who needs to transfer $50 from account
A to account B invokes a program called transfer. This program asks the teller for the amount of money to be
transferred, the account from which the money is to be transferred, and the account to which the money is to be
transferred.
2. Application programmers are computer professionals who write application programs. Application
programmers can choose from many tools to develop user interfaces. Rapid application development (RAD)
tools are tools that enable an application programmer to construct forms and reports without writing a program.
There are also special types of programming languages that combine imperative control structures (for example,
for loops, while loops and if-then-else statements) with statements of the data manipulation language. These
languages, sometimes called fourth-generation languages, often include special features to facilitate the
generation of forms and the display of data on the screen. Most major commercial database systems include a
fourth generation language.
3. Sophisticated users interact with the system without writing programs. Instead, they form
their requests in a database query language. They submit each such query to a query processor,
whose function is to break down DML statements into instructions that the storage manager
understands. Analysts who submit queries to explore data in the database fall in this category.
4. Database Administrator: One of the main reasons for using DBMSs is to have central control of
both the data and the programs that access those data. A person who has such central control over
the system is called a database administrator (DBA).
Database Administrator (DBA):-
The functions of a DBA include:
1. Schema definition: - The DBA creates the original database schema by executing a set of data definition
statements in the DDL. This includes the definition of storage structures and access methods.
2. Schema and physical-organization modification: -The DBA carries out changes to the schema and physical
organization to reflect the changing needs of the organization, or to alter the physical organization to improve
performance.
3. Granting of authorization for data access: - By granting different types of authorization, the database
administrator can regulate which parts of the database various users can access. The authorization
information is kept in a special system structure that the database system consults whenever someone
attempts to access the data in the system.
4. Routine maintenance: - Examples of the database administrator’s routine maintenance activities are:
•Periodically backing up the database, either onto tapes or onto remote servers, to prevent loss of data in case
of disasters such as flooding.
•Ensuring that enough free disk space is available for normal operations, and upgrading disk space as required.
•Monitoring jobs running on the database and ensuring that performance is not degraded by very expensive
tasks submitted by some users.
Data Models
A data model is a collection of conceptual tools for describing data, data semantics, and consistency
constraints. It provides a way to describe the design of a database
at each level of data abstraction. The following four data models are used for
understanding the structure of the database:
1) Relational Data Model: This model organizes data in rows and columns within
tables. The relational model thus uses tables to represent both data and the relationships among the data. Tables
are also called relations. This model was first described by Edgar F. Codd in 1969. The relational
data model is the most widely used model and is the basis of most commercial data processing
applications.
2) Entity-Relationship Data Model: An ER model is the logical representation of data as objects and
relationships among them. These objects are known as entities, and a relationship is an association among
these entities. This model was designed by Peter Chen and published in a 1976 paper. It is widely used
in database design. A set of attributes describes each entity.
For example, student_name, student_id describes the 'student' entity. A set of the same type of entities is
known as an 'Entity set', and the set of the same type of relationships is known as 'relationship set'.
3) Object-based Data Model: An extension of the ER model with notions of functions,
encapsulation, and object identity. This model supports a rich type system that includes
structured and collection types. In the 1980s, various database systems following the
object-oriented approach were developed. Here, an object is data together with its
properties.
4) Semistructured Data Model: This data model differs from the other three data
models. It permits data specifications in which
individual data items of the same type may have different sets of attributes.
The Extensible Markup Language, also known as XML, is widely used for representing
semistructured data. Although XML was initially designed for adding markup information to
text documents, it gained importance because of its application in the exchange of data.
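A short sketch of what "different attribute sets for items of the same type" looks like in XML, using Python's standard xml.etree.ElementTree module. The document content, the element names, and the helper `attribute_sets` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Two <student> items of the same type with different sets of attributes:
# the first has two phone numbers and no email, the second has only an email.
doc = """
<students>
  <student><name>Asha</name><phone>111</phone><phone>222</phone></student>
  <student><name>Ravi</name><email>ravi@example.org</email></student>
</students>
"""


def attribute_sets(xml_text):
    """Return, for each student element, the sorted set of child tag names."""
    root = ET.fromstring(xml_text)
    return [
        sorted({child.tag for child in student})
        for student in root.findall("student")
    ]


sets_per_student = attribute_sets(doc)
```

Both items are accepted without any fixed table layout, whereas in a relational table every row would need the same columns.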
Database Language
•A DBMS has appropriate languages and interfaces to express database queries and updates.
•Database languages can be used to read, store and update the data in the database.
•DCL stands for Data Control Language. It is used to control access to the stored data, for example by
granting and revoking privileges with GRANT and REVOKE.
•Whether DCL execution is transactional depends on the DBMS: in some systems it can be rolled back.
(In the Oracle database, DCL statements are committed implicitly and cannot be rolled
back.)
•TCL stands for Transaction Control Language. It is used to manage the changes made by DML statements,
which can be grouped into a logical transaction and then committed (COMMIT) or undone (ROLLBACK).
Entity: -
An entity can be a real-world object, either animate or inanimate, that can be easily identifiable. For
example, in a school database, students, teachers, classes, and courses offered can be considered as
entities. All these entities have some attributes or properties that give them their identity.
An entity set is a collection of similar types of entities.
An entity set may contain entities with attribute sharing similar values.
For example:- a Students set may contain all the students of a school; likewise, a Teachers set may contain
all the teachers of a school from all faculties. Entity sets need not be disjoint.
Attributes: -
Entities are represented by means of their properties called attributes. All attributes have values.
For example:- a student entity may have name, class, and age as attributes. There exists a domain or range of
values that can be assigned to attributes.
For example:- a student's name cannot be a numeric value. It has to be alphabetic. A student's age cannot be
negative, etc.
Types of Attributes: -
Key Attribute –
The attribute which uniquely identifies each entity in the entity set is called the key attribute.
For example:- Roll_No will be unique for each student. In an ER diagram, a key attribute is
represented by an oval with the attribute name underlined.
Composite Attribute –
An attribute composed of several other attributes is called a composite attribute.
For example:- the Address attribute of the Student entity type consists of Street, City, State, and
Country. In an ER diagram, a composite attribute is represented by an oval connected to further ovals.
Multivalued Attribute –
An attribute that can hold more than one value for a given entity.
For example:- Phone_No (there can be more than one for a given student). In an ER diagram, a
multivalued attribute is represented by a double oval.
Derived Attribute –
An attribute which can be derived from other attributes of the entity type is known as a derived
attribute.
For example:- Age (can be derived from DOB). In an ER diagram, a derived attribute is represented by a dashed
oval.
The complete entity type Student with its attributes can be represented in a single ER diagram.
[Figure: ER diagram of the entity type Student with its attributes]
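The four attribute types can also be sketched in code. The Python classes below are a hypothetical rendering of the Student entity type, not a standard mapping: `roll_no` is the key attribute, `address` is composite, `phone_nos` is multivalued, and `age` is derived from the date of birth instead of being stored.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Address:
    """Composite attribute: made up of several simpler attributes."""
    street: str
    city: str
    state: str
    country: str


@dataclass
class Student:
    roll_no: int                                   # key attribute
    address: Address                               # composite attribute
    dob: date                                      # stored attribute
    phone_nos: list = field(default_factory=list)  # multivalued attribute

    def age(self, today):
        """Derived attribute: computed from dob, never stored."""
        years = today.year - self.dob.year
        # Subtract one year if the birthday has not yet occurred this year.
        if (today.month, today.day) < (self.dob.month, self.dob.day):
            years -= 1
        return years


student = Student(
    roll_no=1,
    address=Address("Main Street", "Jaunpur", "Uttar Pradesh", "India"),
    dob=date(2004, 6, 15),
    phone_nos=["111", "222"],
)
```

Because `age` is computed on demand, it can never disagree with `dob`, which is the usual reason for not storing derived attributes.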
Entity Set and Keys:-
Key is an attribute or collection of attributes that uniquely identifies an entity among entity set.
For example, the roll_number of a student makes him/her identifiable among students.
Relationship:-
The association among entities is called a relationship.
For example, an employee works_at a department, a student enrolls in a course. Here, Works_at and
Enrolls are called relationships.
Relationship Set
A set of relationships of similar type is called a relationship set. Like entities, a relationship too can have
attributes. These attributes are called descriptive attributes.
Degree of a relationship set:
The number of different entity sets participating in a relationship set is called the degree of the
relationship set.
Unary Relationship –
When only ONE entity set participates in a relationship, the relationship is called a unary
relationship.
For example, one person is married to only one person.
Binary Relationship –
When TWO entity sets participate in a relationship, the relationship is called a binary
relationship. For example, Student is enrolled in Course.
n-ary Relationship –
When n entity sets participate in a relationship, the relationship is called an n-ary
relationship.
Mapping Cardinalities:-
Cardinality defines the number of entities in one entity set that can be associated with entities of the
other set via a relationship set.
1. One-to-one: One entity from entity set A can be associated with at most one entity of entity set B and vice
versa.
Teacher (1) -- Teaches -- (1) Student
2. One-to-many: One entity from entity set A can be associated with more than one entity of
entity set B; however, an entity from entity set B can be associated with at most one entity of entity set A.
Teacher (1) -- Teaches -- (M) Student
3. Many-to-one: When entities in one entity set can take part only once in the relationship
set and entities in the other entity set can take part more than once in the relationship
set, the cardinality is many to one. Let us assume that a student can take only one course but
one course can be taken by many students. So the cardinality will be n to 1. It means that for
one course there can be n students but for one student, there will be only one course.
4. Many-to-many: When entities in all entity sets can take part more than once in the
relationship, the cardinality is many to many. Let us assume that a student can take more than one
course and one course can be taken by many students. So the relationship will be many to
many.
ER DIAGRAM REPRESENTATION:-
Any object, for example, entities, attributes of an entity, relationship sets, and attributes of relationship sets,
can be represented with the help of an ER diagram.
Entity
Entities are represented by means of rectangles. Rectangles are named with the entity set they represent.
[Figure: entity with attributes Roll No, Email id, Add, DOB]
If the attributes are composite, they are further divided in a tree like structure. Every node is then connected
to its attribute. That is, composite attributes are represented by ellipses that are connected with an ellipse.
[Figure: entity with attributes Roll No, Email id, Add, DOB and composite attribute Name with component First]
Multivalued attributes are depicted by double ellipse.
[Figure: entity with attributes Roll No, Email id, Add, DOB, Name (First); the multivalued attribute is drawn as a double ellipse]
Participation Constraints:-
Total Participation: Each entity is involved in the relationship. Total participation is represented by double
lines.
Partial participation: Not all entities are involved in the relationship. Partial participation is represented by
single lines.
Generalization and Specialization: -
The ER Model has the power of expressing database entities in a conceptual hierarchical manner. As the
hierarchy goes up, it generalizes the view of entities, and as we go deep in the hierarchy, it gives us the
detail of every entity included.
Going up in this structure is called generalization, where entities are clubbed together to represent a
more generalized view. For example, a particular student named Mira can be generalized along with all
the students. The entity shall be a student, and further, the student is a person. The reverse is called
specialization where a person is a student, and that student is Mira.
Generalization
As mentioned above, the process of generalizing entities, where the generalized entity contains the
properties common to all the entities it generalizes, is called generalization. In generalization, a number of entities
are brought together into one generalized entity based on their similar characteristics. For example,
pigeon, house sparrow, crow, and dove can all be generalized as Birds.
Specialization
Specialization is the opposite of generalization. In specialization, a group of entities is divided into sub-groups
based on their characteristics. Take a group ‘Person’ for example. A person has name, date of birth, gender, etc.
These properties are common to all persons. But in a company, persons can be identified as
employee, employer, customer, or vendor, based on what role they play in the company.
Aggregation
In aggregation, a relationship between two entities is treated as a single entity: the
relationship together with its participating entities is aggregated into a higher-level entity.
For example: the relationship 'Center offers Course' acts as a single entity that is itself
in a relationship with another entity, Visitor. In the real world, a visitor to a coaching center
will not enquire about the Course alone or about the Center alone; he will
enquire about both.
Reduction of ER diagram to Table
The relational model represents data as a table with columns and rows. Each row is known as a tuple. Each column of
the table has a name, called an attribute.
Attribute: It is the name of a column in a particular table. Each attribute Ai must have a domain, dom(Ai).
Relational instance: In the relational database system, the relational instance is represented by a finite set of
tuples. Relation instances do not have duplicate tuples.
Relational schema: A relational schema contains the name of the relation and the names of all its columns or
attributes.
Relational key: A relational key is a set of one or more attributes that identifies each row in the relation
uniquely.
NAME ROLL_NO PHONE_NO ADDRESS AGE
•In the given table, NAME, ROLL_NO, PHONE_NO, ADDRESS, and AGE are the attributes.
•The instance of schema STUDENT has 5 tuples.
Properties of Relations
•The name of the relation is distinct from all other relations.
•Each relation cell contains exactly one atomic (single) value.
•Each attribute has a distinct name.
•The order of attributes has no significance.
•No two tuples are duplicates.
•The order of tuples has no significance.
Relational Algebra
Relational algebra is a procedural query language. It gives a step by step process to obtain the result
of the query. It uses operators to perform queries.
Types of Relational operation
1. Select Operation:
•The select operation selects tuples that satisfy a given predicate.
•It is denoted by sigma (σ).
Notation: σ p(r)
Where:
σ is the selection operator
r is the relation
p is the selection predicate, a propositional logic formula which may use connectives such as AND, OR, and NOT
and relational operators such as =, ≠, ≥, <, >, ≤.
For example: LOAN Relation
BRANCH_NAME LOAN_NO AMOUNT
Downtown L-17 1000
Redwood L-23 2000
Perryride L-15 1500
Downtown L-14 1500
Mianus L-13 500
Roundhill L-11 900
Perryride L-16 1300
Input: σ BRANCH_NAME="Perryride" (LOAN)
Output:
BRANCH_NAME LOAN_NO AMOUNT
Perryride L-15 1500
Perryride L-16 1300
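The select operation on the LOAN relation can be sketched in Python as follows. The representation of a relation as a list of dictionaries and the function name `select` are choices made for this sketch, not part of relational algebra itself.

```python
# The LOAN relation from the example, one dictionary per tuple.
LOAN = [
    {"BRANCH_NAME": "Downtown", "LOAN_NO": "L-17", "AMOUNT": 1000},
    {"BRANCH_NAME": "Redwood", "LOAN_NO": "L-23", "AMOUNT": 2000},
    {"BRANCH_NAME": "Perryride", "LOAN_NO": "L-15", "AMOUNT": 1500},
    {"BRANCH_NAME": "Downtown", "LOAN_NO": "L-14", "AMOUNT": 1500},
    {"BRANCH_NAME": "Mianus", "LOAN_NO": "L-13", "AMOUNT": 500},
    {"BRANCH_NAME": "Roundhill", "LOAN_NO": "L-11", "AMOUNT": 900},
    {"BRANCH_NAME": "Perryride", "LOAN_NO": "L-16", "AMOUNT": 1300},
]


def select(relation, predicate):
    """sigma_p(r): keep only the tuples for which the predicate p holds."""
    return [t for t in relation if predicate(t)]


# sigma BRANCH_NAME = "Perryride" (LOAN)
perryride = select(LOAN, lambda t: t["BRANCH_NAME"] == "Perryride")

# A predicate may combine comparisons with AND, OR, and NOT.
large_elsewhere = select(
    LOAN, lambda t: t["AMOUNT"] > 1200 and not t["BRANCH_NAME"] == "Perryride"
)
```

The first call returns loans L-15 and L-16, matching the result table above; the second returns L-23 and L-14.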
Project Operation:
•This operation shows the list of those attributes that we wish to appear in the result. Rest of
the attributes are eliminated from the table.
•It is denoted by ∏.
Notation: ∏ A1, A2, ..., An (r)
Where
A1, A2, ..., An are attribute names of relation r.
Example: CUSTOMER relation
Input: ∏ NAME, CITY (CUSTOMER)
Output:
NAME CITY
Jones Harrison
Smith Rye
Hays Harrison
Curry Rye
Johnson Brooklyn
Brooks Brooklyn
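A sketch of the project operation in Python. It reuses the LOAN relation from the select example, because that relation has more columns than are kept and contains repeated branch names; the list-of-dictionaries representation and the function name `project` are assumptions of this sketch.

```python
LOAN = [
    {"BRANCH_NAME": "Downtown", "LOAN_NO": "L-17", "AMOUNT": 1000},
    {"BRANCH_NAME": "Redwood", "LOAN_NO": "L-23", "AMOUNT": 2000},
    {"BRANCH_NAME": "Perryride", "LOAN_NO": "L-15", "AMOUNT": 1500},
    {"BRANCH_NAME": "Downtown", "LOAN_NO": "L-14", "AMOUNT": 1500},
    {"BRANCH_NAME": "Mianus", "LOAN_NO": "L-13", "AMOUNT": 500},
    {"BRANCH_NAME": "Roundhill", "LOAN_NO": "L-11", "AMOUNT": 900},
    {"BRANCH_NAME": "Perryride", "LOAN_NO": "L-16", "AMOUNT": 1300},
]


def project(relation, attributes):
    """pi_{A1..An}(r): keep the listed attributes and drop duplicate tuples."""
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in result:  # a relation is a set, so duplicates vanish
            result.append(row)
    return result


branches = project(LOAN, ["BRANCH_NAME"])
branch_amounts = project(LOAN, ["BRANCH_NAME", "AMOUNT"])
```

Projecting the seven LOAN tuples onto BRANCH_NAME alone yields five tuples, because Downtown and Perryride each occur twice in the input.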
Union Operation:
•Suppose there are two relations R and S. The union operation contains all the tuples that are
either in R or in S or in both.
•It eliminates the duplicate tuples. It is denoted by ∪.
Notation: R ∪ S
A union operation must satisfy the following conditions:
•R and S must have the same number of attributes, with compatible domains.
•Duplicate tuples are eliminated automatically.
[Tables: DEPOSITOR relation and BORROW relation]
CUSTOMER_NAME
Smith
Jones
Set Difference:
•Suppose there are two relations R and S. The set difference operation contains all tuples
that are in R but not in S.
•It is denoted by minus (-).
Notation: R - S
Example: Using the above DEPOSITOR table and BORROW table
CUSTOMER_NAME
Jackson
Hayes
Willians
Curry
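Union and set difference map directly onto Python sets. The contents of the DEPOSITOR and BORROW tables are not available in these notes, so the two sets below are assumed values, chosen only so that BORROW - DEPOSITOR reproduces the result listed above; the function names are likewise illustrative.

```python
# Assumed customer-name projections (hypothetical data, see the note above).
BORROW = {("Jones",), ("Smith",), ("Hayes",), ("Jackson",), ("Curry",), ("Willians",)}
DEPOSITOR = {("Johnson",), ("Smith",), ("Turner",), ("Jones",)}


def check_compatible(r, s):
    """Both relations must have the same number of attributes."""
    arities = {len(t) for t in r | s}
    if len(arities) > 1:
        raise ValueError("relations are not union-compatible")


def union(r, s):
    """R U S: tuples in R, in S, or in both; duplicates collapse in a set."""
    check_compatible(r, s)
    return r | s


def difference(r, s):
    """R - S: tuples that are in R but not in S."""
    check_compatible(r, s)
    return r - s


all_customers = union(DEPOSITOR, BORROW)
borrowers_only = difference(BORROW, DEPOSITOR)
```

Note that the difference is not symmetric: `difference(DEPOSITOR, BORROW)` would instead give the customers who hold a deposit but no loan.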
Cartesian product
•The Cartesian product is used to combine each row in one table with each row in the other table. It is
also known as a cross product.
•It is denoted by X.
Notation: E X D
Example: EMPLOYEE X DEPARTMENT
EMPLOYEE
EMP_ID EMP_NAME EMP_DEPT
1 Smith A
2 Harry C
3 John B
DEPARTMENT
DEPT_NO DEPT_NAME
A Marketing
B Sales
C Legal
Output:
EMP_ID EMP_NAME EMP_DEPT DEPT_NO DEPT_NAME
1 Smith A A Marketing
1 Smith A B Sales
1 Smith A C Legal
2 Harry C A Marketing
2 Harry C B Sales
2 Harry C C Legal
3 John B A Marketing
3 John B B Sales
3 John B C Legal
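The EMPLOYEE X DEPARTMENT example can be reproduced with a few lines of Python. Representing tuples as Python tuples and the name `cartesian_product` are choices made for this sketch.

```python
from itertools import product

# (EMP_ID, EMP_NAME, EMP_DEPT)
EMPLOYEE = [(1, "Smith", "A"), (2, "Harry", "C"), (3, "John", "B")]
# (DEPT_NO, DEPT_NAME)
DEPARTMENT = [("A", "Marketing"), ("B", "Sales"), ("C", "Legal")]


def cartesian_product(r, s):
    """R X S: every tuple of R concatenated with every tuple of S."""
    return [a + b for a, b in product(r, s)]


pairs = cartesian_product(EMPLOYEE, DEPARTMENT)

# The product alone pairs each employee with every department. Keeping only
# the rows where EMP_DEPT equals DEPT_NO gives the meaningful combinations.
matched = [row for row in pairs if row[2] == row[3]]
```

With 3 employees and 3 departments the product has 3 x 3 = 9 tuples, of which only 3 survive the matching condition. This is why a Cartesian product is usually followed by a selection.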
Rename Operation:
•The rename operation is used to rename the output relation. It is denoted by rho (ρ).
Example: We can use the rename operator to rename STUDENT relation to STUDENT1.
ρ(STUDENT1, STUDENT)
Integrity Constraints
•Integrity constraints are a set of rules. They are used to maintain the quality of information.
•Integrity constraints ensure that the data insertion, updating, and other processes have to be
performed in such a way that data integrity is not affected.
•Thus, integrity constraint is used to guard against accidental damage to the database.
Types of Integrity Constraint
Domain constraints
•Domain constraints can be defined as the definition of a valid set of values for an attribute.
•The data type of domain includes string, character, integer, time, date, currency, etc. The value
of the attribute must be available in the corresponding domain.
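Example: if AGE is declared as an integer attribute, the value 'A' is not allowed for it because it lies outside the domain.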
Entity integrity constraints
•The entity integrity constraint states that primary key value can't be null.
•This is because the primary key value is used to identify individual rows in relation and if
the primary key has a null value, then we can't identify those rows.
•Fields other than the primary key may contain null values.
Referential Integrity Constraints
•A referential integrity constraint is specified between two tables.
•In the Referential integrity constraints, if a foreign key in Table 1 refers to the Primary Key of
Table 2, then every value of the Foreign Key in Table 1 must be null or be available in Table 2.
Key constraints
•A key is a set of attributes used to identify an entity uniquely within its entity set.
•An entity set can have multiple keys, one of which is chosen as the primary key. A primary
key value must be unique in the relational table and must not be null.
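The four kinds of constraint can be tried out with Python's sqlite3 module. This is a sketch under assumptions: the tables `department` and `student`, their columns, and the helper `try_insert` are hypothetical, and some details are specific to SQLite (foreign keys must be switched on with a PRAGMA, and NOT NULL has to be written explicitly on a non-integer primary key).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only if asked
conn.executescript(
    """
    CREATE TABLE department (
        dept_no   TEXT PRIMARY KEY NOT NULL,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE student (
        roll_no TEXT PRIMARY KEY NOT NULL,            -- entity integrity + key
        name    TEXT NOT NULL,
        age     INTEGER NOT NULL
                CHECK (typeof(age) = 'integer' AND age >= 0),  -- domain
        dept_no TEXT REFERENCES department(dept_no)   -- referential integrity
    );
    INSERT INTO department VALUES ('A', 'Marketing');
    """
)


def try_insert(row):
    """Insert one student row; report whether the DBMS accepted it."""
    try:
        with conn:
            conn.execute("INSERT INTO student VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


results = {
    "valid row": try_insert(("R1", "Asha", 19, "A")),
    "age outside domain": try_insert(("R2", "Ravi", "A", "A")),
    "null primary key": try_insert((None, "Mira", 20, "A")),
    "duplicate key": try_insert(("R1", "Kiran", 21, "A")),
    "unknown department": try_insert(("R3", "Mira", 20, "Z")),
    "null foreign key": try_insert(("R4", "Kiran", 21, None)),
}
```

Only the first and the last row are accepted. The last one shows that a foreign key may be null, as stated in the rule for referential integrity above.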
Relational keys:-
Super Keys:-
A super key is a set of attributes whose values can be used to uniquely identify a tuple within a relation. A
relation may have more than one super key, but it always has at least one: the set of all attributes that make up
the relation.
Candidate Keys:-
A candidate key is a super key that is minimal; that is, there is no proper subset that is itself a super key. A
relation may have more than one candidate key, and the different candidate keys may have a different number
of attributes. In other words, you should not interpret 'minimal' to mean the super key with the fewest
attributes.
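As a rough illustration of super keys and candidate keys, the sketch below tests attribute sets against one sample instance. The schema (roll_no, name, phone) and its rows are hypothetical. Note that a key is a property of the schema, so checking a single instance can only show that an attribute set is not a key; it cannot prove that it is one.

```python
from itertools import combinations

# Hypothetical relation instance used only for illustration.
ATTRS = ("roll_no", "name", "phone")
ROWS = [
    (1, "Ram", "111"),
    (2, "Ram", "222"),
    (3, "Sita", "111"),
]


def is_super_key(attrs, rows=ROWS, schema=ATTRS):
    """True if no two rows agree on all the given attributes."""
    idx = [schema.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(projected)


def candidate_keys(rows=ROWS, schema=ATTRS):
    """Minimal super keys: super keys with no proper subset that is a super key."""
    keys = []
    for size in range(1, len(schema) + 1):      # smaller sets are tried first
        for combo in combinations(schema, size):
            contains_smaller_key = any(set(k) <= set(combo) for k in keys)
            if not contains_smaller_key and is_super_key(combo, rows, schema):
                keys.append(combo)
    return keys


print(candidate_keys())
```

On this instance the candidate keys are {roll_no} and {name, phone}: two candidate keys with different numbers of attributes, both of them minimal.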
Properties of Candidate key:
•It must contain unique values.
•A candidate key may have multiple attributes.
•It must not contain null values.
•It should contain the minimum number of fields needed to ensure uniqueness.
•It uniquely identifies each record in a table.
Primary Key:-
The primary key of a relation is a candidate key specially selected to be the key for the relation.
In other words, it is a choice, and only one candidate key can be designated as the
primary key.
Rules for defining Primary key:
•Two rows can't have the same primary key value.
•Every row must have a primary key value.
•The primary key field cannot be null.
•The value in a primary key column can never be modified or updated if any
foreign key refers to that primary key.
Primary key vs. foreign key:
•Null values: A primary key never accepts null values. A foreign key may accept multiple null values.
•Indexing: In systems that use clustered indexes, the primary key is by default a clustered index, and the data in the table is physically organized in the sequence of that index. A foreign key does not automatically create an index, clustered or non-clustered; however, you can manually create an index on the foreign key.
•Number per table: You can have only a single primary key in a table. You can have multiple foreign keys in a table.
Oracle CREATE TABLE
In Oracle, CREATE TABLE statement is used to create a new table in the database.
To create a table, you have to name that table and define its columns and datatype for each column.
Syntax:
CREATE TABLE table_name
(
  column1 datatype [ NULL | NOT NULL ],
  column2 datatype [ NULL | NOT NULL ],
  ...
  column_n datatype [ NULL | NOT NULL ]
);
Parameters used in syntax
•table_name: It specifies the name of the table which you want to create.
•column1, column2, ... column_n: It specifies the columns which you want to add in the table. Every
column must have a datatype. Every column can be defined as either "NULL" or "NOT NULL". If
this is left blank, the column is treated as "NULL" by default.
Example:-
CREATE TABLE customers
( customer_id number(10) NOT NULL,
  customer_name varchar2(50) NOT NULL,
  city varchar2(50)
);
CREATE TABLE Example with primary key
CREATE TABLE customers
( customer_id number(10) NOT NULL,
  customer_name varchar2(50) NOT NULL,
  city varchar2(50),
  CONSTRAINT customers_pk PRIMARY KEY (customer_id)
);
Alternatively, a new table can be created by copying columns from existing tables (CREATE TABLE AS).
Syntax:
CREATE TABLE new_table
AS (SELECT column_1, column_2, ... column_n
    FROM old_table_1, old_table_2, ... old_table_n);
Let's take an example: Consider that you have already created two tables "regularcustomers"
and "irregularcustomers".
The table "regularcustomers" has three columns rcustomer_id, rcustomer_name and rc_city.
CREATE TABLE regularcustomers
  ( "RCUSTOMER_ID" NUMBER(10,0) NOT NULL ENABLE,
    "RCUSTOMER_NAME" VARCHAR2(50) NOT NULL ENABLE,
    "RC_CITY" VARCHAR2(50)
  )
/
The second table "irregularcustomers" also has three columns: ircustomer_id,
ircustomer_name and irc_city.
CREATE TABLE irregularcustomers
  ( "IRCUSTOMER_ID" NUMBER(10,0) NOT NULL ENABLE,
    "IRCUSTOMER_NAME" VARCHAR2(50) NOT NULL ENABLE,
    "IRC_CITY" VARCHAR2(50)
  )
/
In the following example, we will create a table named "newcustomers3" by copying columns from both tables.
Example:
CREATE TABLE newcustomers3
  AS (SELECT regularcustomers.rcustomer_id, regularcustomers.rc_city, irregularcustomers.ircustomer_name
      FROM regularcustomers, irregularcustomers
      WHERE regularcustomers.rcustomer_id = irregularcustomers.ircustomer_id
      AND regularcustomers.rcustomer_id < 5000);
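The effect of creating a table from a query can be checked with a small sketch. It runs on SQLite through Python's sqlite3 module instead of Oracle, so the column types are SQLite types and the SELECT is written without the surrounding parentheses; the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE regularcustomers (
        rcustomer_id   INTEGER NOT NULL,
        rcustomer_name TEXT NOT NULL,
        rc_city        TEXT
    );
    CREATE TABLE irregularcustomers (
        ircustomer_id   INTEGER NOT NULL,
        ircustomer_name TEXT NOT NULL,
        irc_city        TEXT
    );

    -- Hypothetical sample rows.
    INSERT INTO regularcustomers VALUES
        (1, 'Anil', 'Jaunpur'), (2, 'Chand', 'Lucknow'), (6000, 'Bina', 'Varanasi');
    INSERT INTO irregularcustomers VALUES
        (1, 'Deepa', 'Delhi'), (3, 'Farid', 'Patna'), (6000, 'Esha', 'Agra');

    -- Same join and filter as the Oracle example above.
    CREATE TABLE newcustomers3 AS
        SELECT r.rcustomer_id, r.rc_city, i.ircustomer_name
        FROM regularcustomers r, irregularcustomers i
        WHERE r.rcustomer_id = i.ircustomer_id
          AND r.rcustomer_id < 5000;
""")

new_rows = conn.execute("SELECT * FROM newcustomers3").fetchall()
print(new_rows)
```

Only customer 1 appears in both tables with an id below 5000, so the new table holds a single row made of columns taken from both source tables.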
ALTER TABLE Statement
In Oracle, ALTER TABLE statement specifies how to add, modify, drop or delete columns in a
table. It is also used to rename a table.
How to add column in a table
Syntax:
ALTER TABLE table_name
  ADD column_name column-definition;
Example:
Consider the already existing table customers. Now, add a new column customer_age to the
table customers.
ALTER TABLE customers
  ADD customer_age varchar2(50);
add multiple columns in the existing table
Syntax:
ALTER TABLE table_name
  ADD (column_1 column-definition,
       column_2 column-definition,
       ...
       column_n column-definition);
Example
ALTER TABLE customers
  ADD (customer_type varchar2(50),
       customer_address varchar2(50));
modify column of a table
Syntax:
ALTER TABLE table_name
  MODIFY column_name column_type;
Example:
ALTER TABLE customers
  MODIFY customer_name varchar2(100) not null;
modify multiple columns of a table
Syntax:
ALTER TABLE table_name
  MODIFY (column_1 column_type,
          column_2 column_type,
          ...
          column_n column_type);
Example:
ALTER TABLE customers
  MODIFY (customer_name varchar2(100) not null,
          city varchar2(100));
drop column of a table
Syntax:
ALTER TABLE table_name
  DROP COLUMN column_name;
Example:
ALTER TABLE customers
  DROP COLUMN customer_name;
rename column of a table
Syntax:
ALTER TABLE table_name
  RENAME COLUMN old_name TO new_name;
Example:
ALTER TABLE customers
  RENAME COLUMN customer_name to cname;
rename table
Syntax:
ALTER TABLE table_name
  RENAME TO new_table_name;
Example:
ALTER TABLE customers
  RENAME TO retailers;
DROP TABLE Statement
Oracle DROP TABLE statement is used to remove or delete a table from the Oracle database.
Syntax
DROP TABLE [schema_name.]table_name
[ CASCADE CONSTRAINTS ]
[ PURGE ];
Parameters
•schema_name: It specifies the name of the schema that owns the table.
•table_name: It specifies the name of the table which you want to remove from the Oracle database.
•CASCADE CONSTRAINTS: It is optional. If specified, it will drop all referential integrity constraints
as well.
•PURGE: It is also optional. If specified, the table and its dependent objects are purged immediately
instead of being placed in the recycle bin, so they can't be recovered.
2. Deletion anomalies: - We tried to delete a record, but parts of it were left undeleted because,
unknown to us, the data was also saved somewhere else.
3. Insert anomalies: - We tried to insert data into a record that does not exist at all. Normalization
is a method to remove all these anomalies and bring the database to a consistent state.
Decompositions:-
Intuitively, redundancy arises when a relational schema forces an association between attributes that is not
natural. Functional dependencies can be used to identify such situations and suggest refinements to the
schema. The essential idea is that many problems arising from redundancy can be addressed by replacing a
relation with a collection of smaller relations. A decomposition of a relation schema R consists of replacing
the relation schema by two (or more) relation schemas that each contain a subset of the attributes of R and
together include all attributes in R. Intuitively, we want to store the information in any given instance of R
by storing projections of the instance. This section examines the use of decompositions through several
examples. For example, we can decompose Hourly_Emps into two relations:
Hourly_Emps2 (ssn, name, lot, rating, hours_worked)
Wages (rating, hourly_wages)
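The idea of storing projections of an instance can be sketched as follows. The Hourly_Emps rows are hypothetical sample data, and the sketch assumes that in this data rating determines hourly_wages. It projects the instance onto the two smaller schemas and then checks that a natural join on rating gives back exactly the original rows.

```python
# Hypothetical instance of Hourly_Emps; rating is assumed to determine hourly_wages.
SCHEMA = ("ssn", "name", "lot", "rating", "hourly_wages", "hours_worked")
HOURLY_EMPS = [
    ("111", "Amit", 48, 8, 10, 40),
    ("222", "Bela", 22, 8, 10, 30),
    ("333", "Chetan", 35, 5, 7, 30),
    ("444", "Divya", 35, 5, 7, 32),
]


def project(rows, schema, attrs):
    """Relational projection: keep only attrs and drop duplicate rows."""
    idx = [schema.index(a) for a in attrs]
    return sorted({tuple(row[i] for i in idx) for row in rows})


def natural_join(rows1, schema1, rows2, schema2):
    """Join two relations on all the attributes they have in common."""
    common = [a for a in schema1 if a in schema2]
    out_schema = tuple(schema1) + tuple(a for a in schema2 if a not in schema1)
    joined = set()
    for t1 in rows1:
        for t2 in rows2:
            if all(t1[schema1.index(a)] == t2[schema2.index(a)] for a in common):
                merged = dict(zip(schema1, t1))
                merged.update(zip(schema2, t2))
                joined.add(tuple(merged[a] for a in out_schema))
    return out_schema, sorted(joined)


EMPS2_SCHEMA = ("ssn", "name", "lot", "rating", "hours_worked")
WAGES_SCHEMA = ("rating", "hourly_wages")
hourly_emps2 = project(HOURLY_EMPS, SCHEMA, EMPS2_SCHEMA)
wages = project(HOURLY_EMPS, SCHEMA, WAGES_SCHEMA)

joined_schema, rejoined = natural_join(hourly_emps2, EMPS2_SCHEMA, wages, WAGES_SCHEMA)
# Compare with the original instance, reordered to the joined column order.
lossless = rejoined == project(HOURLY_EMPS, SCHEMA, joined_schema)

print(wages)
print(lossless)
```

Each (rating, hourly_wages) pair is now stored once in Wages instead of once per employee, and no information is lost by the split.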
1. First Normal Form –
If a relation contains a composite or multi-valued attribute, it violates first normal form. Equivalently,
a relation is in first normal form if every attribute in that relation is a single-valued (atomic)
attribute.
Example 1 – Relation STUDENT in table 1 is not in 1NF because of the multi-valued attribute STUD_PHONE. Its decomposition into 1NF is
shown in table 2.
Example 2 – The Courses attribute below is multi-valued, so the relation is not in 1NF:
ID   Name   Courses
1    A      C1,C2
2    E      C3
3    M      C2,C3
In 1NF, each course value is stored in its own row:
ID   Name   Courses
1    A      C1
1    A      C2
2    E      C3
3    M      C2
3    M      C3
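The conversion in Example 2 can be written as a short sketch: every comma-separated value in the Courses attribute becomes a row of its own. The function name to_1nf is made up for this illustration.

```python
# The relation from Example 2, with the multi-valued Courses attribute
# held as a comma-separated string.
unnormalized = [
    (1, "A", "C1,C2"),
    (2, "E", "C3"),
    (3, "M", "C2,C3"),
]


def to_1nf(rows):
    """Emit one row per (ID, Name, single course) so every value is atomic."""
    return [
        (row_id, name, course.strip())
        for row_id, name, courses in rows
        for course in courses.split(",")
    ]


normalized = to_1nf(unnormalized)
for row in normalized:
    print(row)
```

After the conversion ID alone no longer identifies a row; the pair (ID, Courses) does.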
Functional Dependency:-
Functional dependency (FD) is a constraint between two sets of attributes in a relation. It says
that if two tuples have the same values for attributes A1, A2,..., An, then those two tuples must also have the same values
for attributes B1, B2, ..., Bn.
Functional dependency is represented by an arrow sign (→), that is, X→Y, where X functionally determines Y. The
left-hand side attributes determine the values of the attributes on the right-hand side.
A functional dependency is a property of the semantics of the attributes in a relation. The semantics indicate how
attributes relate to one another, and specify the functional dependencies between attributes. When a functional
dependency is present, the dependency is specified as a constraint between the attributes.
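The definition can be tested directly on data. The sketch below checks whether X → Y holds in one instance: whenever two rows agree on X they must agree on Y. The attribute names and rows are hypothetical, and a check on one instance can only disprove a dependency, since an FD is a statement about every legal instance.

```python
# Hypothetical rows, each represented as a dict of attribute -> value.
rows = [
    {"STUD_NO": 1, "STUD_STATE": "UP", "STUD_COUNTRY": "India"},
    {"STUD_NO": 2, "STUD_STATE": "UP", "STUD_COUNTRY": "India"},
    {"STUD_NO": 3, "STUD_STATE": "Punjab", "STUD_COUNTRY": "India"},
]


def fd_holds(rows, lhs, rhs):
    """True if rows that agree on all lhs attributes also agree on all rhs attributes."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # Remember the first right-hand value seen for each left-hand value.
        if seen.setdefault(left, right) != right:
            return False
    return True


state_determines_country = fd_holds(rows, ["STUD_STATE"], ["STUD_COUNTRY"])
country_determines_state = fd_holds(rows, ["STUD_COUNTRY"], ["STUD_STATE"])
print(state_determines_country, country_determines_state)
```

Here STUD_STATE → STUD_COUNTRY holds, but STUD_COUNTRY → STUD_STATE does not, because the same country appears with two different states.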
2. Augmentation rule: If a → b holds and y is a set of attributes, then ay → by also holds. That is, adding
the same attributes to both sides of a dependency does not change the basic dependency.
3. Transitivity rule: As with the transitive rule in algebra, if a → b holds and b → c holds, then a → c also
holds. a → b is read as "a functionally determines b".
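These inference rules are usually applied through the closure of an attribute set: the set of all attributes that can be derived from it using the given FDs. A minimal sketch follows; the function names closure and implies and the attributes a, b, c, y are chosen only for illustration.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already derived, the right side follows.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs can be derived from fds."""
    return set(rhs) <= closure(lhs, fds)


# a -> b and b -> c
fds = [({"a"}, {"b"}), ({"b"}, {"c"})]

print(sorted(closure({"a"}, fds)))
print(implies(fds, {"a"}, {"c"}))            # transitivity: a -> c
print(implies(fds, {"a", "y"}, {"b", "y"}))  # augmentation: ay -> by
print(implies(fds, {"c"}, {"a"}))            # not derivable
```

If the closure of X contains every attribute of the relation, then X is a super key.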
Second Normal Form –
To be in second normal form, a relation must be in first normal form and relation must not contain
any partial dependency. A relation is in 2NF if it has No Partial Dependency, i.e., no non-prime
attribute (attributes which are not part of any candidate key) is dependent on any proper subset of
any candidate key of the table.
Partial Dependency – If the proper subset of candidate key determines non-prime attribute, it is
called partial dependency.
report: report_no -> editor, dept_no
dept_no -> dept_name, dept_addr
author_id -> author_name, author_addr
Definition. A table is in second normal form (2NF) if and only if it is in 1NF and every non key attribute
is fully dependent on the primary key. An attribute is fully dependent on the primary key if it is on the
right side of an FD for which the left side is either the primary key itself or something that can be
derived from the primary key using the transitivity of FDs. An example of a transitive FD in report is
the following:
report_no -> dept_no
dept_no -> dept_name
The projection of report into three smaller tables has preserved the FDs and the association between
report_no and author_id that was important in the original table.
TEACHER_DETAIL table
TEACHER_ID TEACHER_AGE
25 30
47 35
83 38
Definition. A table is in third normal form (3NF) if and only if for every nontrivial functional dependency X->A, where X
and A are either simple or composite attributes, one of two conditions must hold. Either attribute X is a superkey, or
attribute A is a member of a candidate key. If attribute A is a member of a candidate key, A is called a prime attribute.
Note: a trivial FD is of the form YZ->Z.
report11: report_no -> editor, dept_no
report12: dept_no -> dept_name, dept_addr
report2: author_id -> author_name, author_addr
report3: report_no, author_id is a candidate key (no FDs)
Transitive dependency – If A->B and B->C are two FDs then A->C is called transitive dependency.
•Example 1 – In relation STUDENT given in Table 4,FD set: {STUD_NO -> STUD_NAME, STUD_NO ->
STUD_STATE, STUD_STATE -> STUD_COUNTRY, STUD_NO -> STUD_AGE}
Candidate Key: {STUD_NO}
•For this relation in table 4, STUD_NO -> STUD_STATE and STUD_STATE -> STUD_COUNTRY are true.
So STUD_COUNTRY is transitively dependent on STUD_NO, which violates the third normal form. To convert it
to third normal form, we decompose the relation
•STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_COUNTRY, STUD_AGE)
as:
STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_AGE)
STATE_COUNTRY (STATE, COUNTRY)
A functional dependency is said to be transitive if it is indirectly formed by two functional
dependencies.
<MovieListing>
The above table is not in 3NF because it has a transitive functional dependency −
Movie_ID -> Listing_Type
Movie_ID -> Listing_ID
Definition. A table R is in Boyce-Codd normal form (BCNF) if for every nontrivial FD X->A, X is a superkey.
BCNF is a stronger form of normalization than 3NF because it eliminates the second condition for 3NF, which
allowed the right side of the FD to be a prime attribute. Thus, every left side of an FD in a table must be a
superkey. Every table that is BCNF is also 3NF, 2NF, and 1NF, by the previous definitions.
A relation R is in BCNF if R is in Third Normal Form and for every FD, LHS is super key. A relation is in BCNF
iff in every non-trivial functional dependency X –> Y, X is a super key.
•Example 1 – Find the highest normal form of a relation R(A,B,C,D,E) with FD set as {BC->D, AC->BE, B->E}
Step 1. As we can see, (AC)+ = {A, C, B, E, D}, but none of its proper subsets can determine all attributes of the relation, so
AC is a candidate key. A and C can't be derived from any other attribute of the relation, so every candidate key must
contain both, and {AC} is the only candidate key.
Step 2. Prime attributes are those attributes which are part of a candidate key, {A, C} in this example; the others,
{B, D, E}, are non-prime.
Step 3. The relation R is in 1st normal form, as a relational DBMS does not allow multi-valued or composite
attributes.
The relation is in 2nd normal form because BC->D is in 2nd normal form (BC is not a proper subset of
candidate key AC), AC->BE is in 2nd normal form (AC is the candidate key), and B->E is in 2nd normal form (B
is not a proper subset of candidate key AC).
The relation is not in 3rd normal form because in BC->D neither is BC a super key nor is D a prime attribute,
and in B->E neither is B a super key nor is E a prime attribute; to satisfy 3rd normal form, either the LHS of an
FD must be a super key or the RHS must be a prime attribute.
So the highest normal form of relation will be 2nd Normal form.
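The steps of Example 1 can be reproduced with a short sketch. It assumes single-letter attribute names, FDs written as (left, right) string pairs, and a relation that is already in 1NF; the function names are made up for this illustration, and the brute-force key search is meant for small schemas only.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes determined by attrs under fds (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """Step 1: minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(attrs, fds) == set(schema):
                keys.append(attrs)
    return keys


def highest_normal_form(schema, fds):
    keys = candidate_keys(schema, fds)
    prime = set().union(*keys)  # Step 2: attributes that belong to some candidate key

    def is_super_key(attrs):
        return closure(attrs, fds) == set(schema)

    # Split every right side into single attributes and drop trivial FDs.
    single = [(set(lhs), a) for lhs, rhs in fds for a in rhs if a not in lhs]

    if all(is_super_key(lhs) for lhs, _ in single):
        return "BCNF"
    if all(is_super_key(lhs) or a in prime for lhs, a in single):
        return "3NF"
    # 2NF: no proper subset of a candidate key may determine a non-prime attribute.
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                determined = closure(set(part), fds) - set(part)
                if determined - prime:
                    return "1NF"
    return "2NF"


example_fds = [("BC", "D"), ("AC", "BE"), ("B", "E")]
print(candidate_keys("ABCDE", example_fds))
print(highest_normal_form("ABCDE", example_fds))
```

For the FD set of Example 1 the sketch finds the single candidate key {A, C} and reports 2NF, which agrees with the steps worked out by hand.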
•Example 2 –For example consider relation R(A, B, C)
A -> BC,
B ->
A and B both are super keys so above relation is in BCNF.
Key Points –
1.BCNF is free from redundancy caused by functional dependencies.
2.If a relation is in BCNF, then 3NF is also satisfied.
3. If all attributes of relation are prime attribute, then the relation is always in 3NF.
4.A relation in a Relational Database is always and at least in 1NF form.
5.Every Binary Relation ( a Relation with only 2 attributes ) is always in BCNF.
6.If a Relation has only singleton candidate keys( i.e. every candidate key consists of only 1
attribute), then the Relation is always in 2NF( because no Partial functional dependency possible).
7.Sometimes going for BCNF form may not preserve functional dependency. In that case go for
BCNF only if the lost FD(s) is not required, else normalize till 3NF only.
8.There are many more Normal forms that exist after BCNF, like 4NF and more. But in real world
database systems it’s generally not required to go beyond BCNF.
Exercise 1: Find the highest normal form in R (A, B, C, D, E) under following functional dependencies.
ABC --> D
CD --> AE
Important Points for solving above type of question.
1) It is always a good idea to start checking from BCNF, then 3 NF and so on.
2) If a functional dependency satisfies a normal form, then there is no need to check it for lower normal
forms. For example, ABC -> D is in BCNF (note that ABC is a superkey), so there is no need to check this
dependency for lower normal forms.
Candidate keys in the given relation are {ABC, BCD}
BCNF: ABC -> D is in BCNF. Let us check CD -> AE, CD is not a super key so this dependency is not in
BCNF. So, R is not in BCNF.
3NF: We don't need to check ABC -> D, as it already satisfies BCNF. Let us consider
CD -> AE. CD is not a super key and E is not a prime attribute, so the relation is not in 3NF.
2NF: In 2NF, we need to check for partial dependency. CD is a proper subset of the candidate key BCD
and it determines E, which is a non-prime attribute. So the given relation is also not in 2NF, and the highest
normal form is 1NF.