0% found this document useful (0 votes)
3 views

M1 (1)

The document provides an overview of relational database systems, including their purpose, architecture, and the various data models used to represent data. It discusses the roles of database users and administrators, as well as the entity-relationship model which is fundamental for database design. Key concepts such as instances, schemas, attributes, and relationships are explained, along with the importance of data abstraction and constraints in database management.

Uploaded by

karthikpm0412
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
3 views

M1 (1)

The document provides an overview of relational database systems, including their purpose, architecture, and the various data models used to represent data. It discusses the roles of database users and administrators, as well as the entity-relationship model which is fundamental for database design. Key concepts such as instances, schemas, attributes, and relationships are explained, along with the importance of data abstraction and constraints in database management.

Uploaded by

karthikpm0412
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 95

Module1

M O D U L E I : R E L AT I O N A L D ATA B A S E S
Introduction

A database-management system (DBMS) is a


collection of interrelated data and a set of
programs to access those data.
The collection of data, usually referred to as the
database, contains information relevant to an
enterprise.
The primary goal of a DBMS is to provide a way
to store and retrieve database information
that is both convenient and efficient.
Purpose of Database System

Before database management systems (DBMSs) were


introduced, organizations usually stored
information in file-processing system.
Keeping organizational information in a file-
processing system has a number of major
disadvantages:
 Data redundancy and inconsistency
 . In addition, it may lead to data inconsistency;
 Difficulty in accessing data.
 Data isolation
 Integrity problems. The data values stored in the database
must satisfy certain types of consistency constraints
• Atomicity problems.
Concurrent-access anomalies.
Security problems
View of Data

A database system is a collection of interrelated


data and a set of programs that allow users to
access and modify these data.
A major purpose of a database system is to
provide users with an abstract view of the
data.
That is, the system hides certain details of how
the data are stored and maintained.
Data Abstraction

developers hide the complexity from users


through several levels of abstraction, to simplify
users’ interactions with the system:
Physical level.
The lowest level of abstraction describes how the
data are actually stored. The physical level
describes complex low-level data structures in
detail.
Logical level.
The next-higher level of abstraction describes
what data are stored in the database, and what
relationships exist among those data.
The user of the logical level does not need to be
aware of the complexity of physical-level
structures. This is referred to as physical data
independence.
View level.
 The highest level of abstraction describes only
part of the entire database.
Many users of the database system do not need
all information; instead, they need to access only
a part of the database.
The system may provide many views for the
same database.
Instances and Schemas

Databases change over time as information is


inserted and deleted. The collection of
information stored in the database at a
particular moment is called an instance of the
database.
 The overall design of the database is called the
database schema
Database systems have several schemas,
partitioned according to the levels of abstraction.
The physical schema describes the database
design at the physical level, while the logical
schema describes the database design at the
logical level.
A database may also have several schemas at the
view level, sometimes called sub schemas, that
describe different views of the database.
Data Models

Underlying the structure of a database is the


data model: a collection of conceptual tools for
describing data, data relationships, data
semantics, and consistency constraints.
A data model provides a way to describe the
design of a database at the physical, logical, and
view levels.
The data models can be classified into four
different categories:
1. Relational Model.

The relational model uses a collection of tables


to represent both data and the relationships
among those data.
Each table has multiple columns, and each
column has a unique name.
Tables are also known as relations.
The relational model is an example of a record-
based model.
2. Entity-Relationship Model.

The entity-relationship (E-R) data model uses a


collection of basic objects, called entities, and
relationships among these objects.
 An entity is a “thing” or “object” in the real
world that is distinguishable from other objects.
The entity-relationship model is widely used in
data base design
3. Object-Based Data Model.

Object-oriented programming (especially in


Java,C++, or C#) has become the dominant
software-development methodology .
This led to the development of an object-oriented
data model that can be seen as extending the E-R
model with notions of encapsulation,
methods(functions), and object identity.
The object-relational data model combines
features of the object-oriented data model and
relational data model.
4. Semi structured Data Model.

The semi structured data model permits the


specification of data where individual data items
of the same type may have different sets of
attributes.
This is in contrast to the data models mentioned
earlier, where every data item of a particular
type must have the same set of attributes. The
Extensible Markup Language (XML)is widely
used to represent semi structured data.
Database Architecture

Database applications are usually partitioned


into two or three parts.
Database Users and Administrators

A primary goal of a database system is to


retrieve information from and store new
information into the database. People who work
with a database can be categorized as database
users or database administrators
Database Users and User Interfaces

There are four different types of database-system users,


differentiated by the way they expect to interact with the
system.
Different types of user interfaces have been designed for
the different types of users.
1. Naıve users are unsophisticated users who interact
with the system by invoking one of the application
programs that have been written previously. For
example, a clerk in the university
The typical user interface for naıve users is a forms
interface, where the user can fill in appropriate fields of
the form.
2. Application programmers are computer
professionals who write application programs.
Application programmers can choose from many
tools to develop user interfaces.
3. Sophisticated users interact with the system
without writing programs. In-stead, they form
their requests either using a database query
language or by using tools such as data analysis
software.
4. Specialized users are sophisticated users who
write specialized database applications that do
not fit into the traditional data-processing frame
work.
 Among these applications are computer-aided
design systems, knowledge-base and expert
systems, systems that store data with complex
data types (for example, graphics data and audio
data), and environment-modeling systems.
Database Administrator

One of the main reasons for using DBMSs is to


have central control of both the data and the
programs that access those data.
A person who has such central control over the
system is called a database administrator (DBA).
The functions of a DBA include:
Schema definition. The DBA creates the original
database schema by executing a set of data
definition statements in the DDL.
Storage structure and access-method
definition.
Schema and physical-organization
modification. The DBA carries out changes to
the schema and physical organization to reflect
the changing needs of the organization, or to
alter the physical organization to improve
performance.
Granting of authorization for data access. By
granting different types of authorization, the
database administrator can regulate which parts
of the data base various users can access. The
authorization information is kept in a special
system structure that the database system
consults whenever some one attempts to access
the data in the system.
Routine maintenance.
Examples of the database administrator’s routine
maintenance activities are:
Periodically backing up the database, either onto tapes or
onto remote servers, to prevent loss of data in case of
disasters such as flooding.
Ensuring that enough free disk space is available for
normal operations , and upgrading disk space as
required.
Monitoring jobs running on the database and ensuring
that performance is not degraded by very expensive
tasks submitted by some users.
The Entity-Relationship Model

The entity-relationship(E-R) data model was


developed to facilitate data base design.
The E-R data model employs three basic
concepts: entity sets, relationship sets, and
attributes.
Entity Sets

An entity is a “thing” or “object” in the real


world that is distinguishable from all other
objects.
For example, each person in a university is an
entity.
An entity has a set of properties, and the values
for some set of properties may uniquely identify
an entity.
For instance, a person may have a person id
property whose value uniquely identifies that
person.
An entity set is a set of entities of the same type
that share the same properties , or attributes.
The set of all people who are instructors at a
given university, for example, can be defined as
the entity set instructor.
Similarly, the entity set student might represent
the set of all students in the university
Entity sets do not need to be disjoint.
For example, it is possible to define the entity set
of all people in a university (person). A person
entity may be an instructor entity, a student
entity, both, or neither.
Attributes

An entity is represented by a set of attributes.


Attributes are descriptive properties possessed
by each member of an entity set.
Possible attributes of the instructor entity set are
ID ,name , deptname , and salary.
Possible attributes of the course entity set are
courseid , title , deptname , and credits.
Each entity has a value for each of its attributes.
Relationship Sets

A relationship is an association among several


entities.
 For example, we can define a relationship
advisor that associates instructor James with
student Shankar. This relationship specifies that
James is an advisor to student Shankar.
A relationship set is a set of relationships of the
same type.
Formally, it is a mathematical relation on n≥2
(possibly non distinct) entity sets. If E1,E2,...,En
are entity sets, then a relationship set Ris a
subset of
{(e1,e2,...,en)|e1∈E1,e2∈E2,..., en∈En} where
(e1,e2,...,en) is a relationship
The association between entity sets is referred to
as participation;
that is, the entity setsE1,E2,...,En participate in
relationship set R.
The function that an entity plays in a
relationship is called that entity’s role.
A relationship may also have attributes called
descriptive attributes .
Consider a relationship set advisor with entity
sets instructor and student.
We could associate the attribute date with that
relationship to specify the date when an
instructor became the advisor of a student.
Attributes

For each attribute, there is a set of permitted


values, called the domain,or value set, of that
attribute.
The domain of attribute courseid might be the
set of all text strings of a certain length.
An attribute, as used in the E-R model, can be
characterized by the following attribute types
 Simple and composite attributes.
 Single-valued and multi valued attributes.
 Derived attribute.
Simple and composite attributes.

Simple- that is, they have not been divided into


subparts.
Composite attributes- can be divided into
subparts (that is, other attributes).
For example, an attribute name could be
structured as a composite attribute consisting of
first name ,middle initial , and last name.
Single-valued and multi valued
attributes.

single value for a particular entity.


For instance, the studentID attributefor a specific
student entity refers to only one studentID. Such
attributes are said to be single valued.
multi valued attributes -an attribute has a set of
values for a specific entity.
Eg: a phone number attribute.
Derived attribute.

The value for this type of attribute can be


derived from the values of other related
attributes or entities.
Eg: age
Constraints

An E-R enterprise schema may define certain


constraints to which the contents of a database
must conform.
Mapping Cardinalities

Mapping cardinalities, or cardinality ratios,


express the number of entities to which another
entity can be associated via a relationship set.
For a binary relationship set R between entity
sets A and B, the mapping cardinality must be
one of the following:
One-to-one.
One-to-many.
Many-to-one.
Many-to-many.
One-to-one.
An entity in A is associated with at most one
entity in B, and an entity in B is associated with
at most one entity in A
One-to-many.
An entity in A is associated with any number
(zero or more)of entities in B. An entity in B,
however, can be associated with at most one
entity in A.
Many-to-one.
An entity in A is associated with at most one
entity in B. An entity in B, however, can be
associated with any number (zero or more) of
entities in A.
Many-to-many.
An entity in A is associated with any number
(zero or more)of entities in B, and an entity in B
is associated with any number (zero or more) of
entities in A.
Participation Constraints

The participation of an entity set E in a


relationship set R is said to be total if every
entity in E participates in at least one
relationship in R.
 If only some entities in E participate in
relationships in R, the participation of entity set
E in relationship R is said to be partial.
Keys

The values of the attribute values of an entity


must be such that they can uniquely identify the
entity.
Entity-Relationship Diagrams

An E-R diagram can express the overall logical


structure of a database graphically.
Basic Structure
An E-R diagram consists of the following major
components
Rectangles divided into two parts :represent
entity sets. The first part, contains the name of
the entity set. The second part contains the
names of all the attributes of the entity set.
Diamonds represent relationship sets.
Undivided rectangles represent the attributes
of a relationship set. Attributes that are part of
the primary key are underlined.
Lines link entity sets to relationship sets.
Dashed lines link attributes of a relationship set
to the relationship set.
Double lines indicate total participation of an
entity in a relationship set
Double diamonds represent identifying
relationship sets linked to weak entity sets
Roles
Weak Entity Sets

An entity set that does not have sufficient


attributes to form a primary key is termed a
weak entity set.
An entity set that has a primary key is termed a
strong entity set.
For a weak entity set to be meaningful, it must
be associated with another entity set, called the
identifying or owner entity set.
Every weak entity must be associated with an
identifying entity; that is, the weak entity set is
said to be existence dependent on the identifying
entity set.
The identifying entity set is said to own the weak
entity set that it identifies.
The relationship associating the weak entity set
with the identifying entity set is called the
identifying relationship
The identifying relationship is many-to-one
from the weak entity set to the identifying entity
set, and the participation of the weak entity set
in the relationship is total.
The identifying relationship set should not have
any descriptive attributes, since any such
attributes can instead be associated with the
weak entity set.
The discriminator of a weak entity set is a set of
attributes that allows this distinction to be made.
The primary key of a weak entity set is formed by
the primary key of the identifying entity set, plus the
weak entity set’s discriminator.
In E-R diagrams, a weak entity set is depicted via a
rectangle, like a strong entity set, but there are two
main differences:
The discriminator of a weak entity is underlined
with a dashed, rather than a solid, line.
The relationship set connecting the weak entity set
to the identifying strong entity set is depicted by a
double diamond.
Problem
A company database needs to store information
about employees (identified by ssn, with salary
and phone as attributes), departments (identified
by dno, with dname and budget as attributes), and
children of employees (with name and age as
attributes).
Problem
Employees work in departments; each
department is managed by an employee; a child
must be identified uniquely by name when the
parent (who is an employee; assume that only
one parent works for the company) is known.
We are not interested in information about a
child once the parent leaves the company.
Draw an ER diagram that captures this
information.
Introduction to the RelationalModel

Structure of Relational Databases


A relational database consists of a collection of
tables, each of which is assigned a unique name.
For example, consider the instructor table,which
stores information about instructors. The table
has four column headers: ID, name, deptname,
and salary.
 Each row of this table records information
about an instructor.
in the relational model the term relation is used to
refer to a table, while the term tuple is used to refer
to a row. Similarly, the term attribute refers to a
column of a table.
the term relation instance to refer to a specific
instance of a relation, i.e., containing a specific set of
rows.
For each attribute of a relation, there is a set of
permitted values, called the domain of that attribute.
The domain of the name attribute is the set of all
possible instructor names.
for all relations r, the domains of all attributes of
r be atomic.
A domain is atomic if elements of the domain
are considered to be indivisible units.
Database Schema

The database schema, which is the logical design


of the database, and the database instance,
which is a snapshot of the data in the database at
a given instant in time.
In general, a relation schema consists of a list of
attributes and their corresponding domains.
department(deptname,building,budget)
Keys

Super key is a set of one or more attributes that, taken


collectively, allow us to identify uniquely a tuple in the
relation.
Such minimal super keys are called candidate keys
The term primary key to denote a candidate key that
is chosen by the database designer as the principal
means of identifying tuples within a relation.
A relation, say r1, may include among its attributes the
primary key of an other relation, say r2. This attribute
is called a foreign key from r1, referencing r2.
Relational Query Languages

A query language is a language in which a user requests


information from the database.
These languages are usually on a level higher than that of
a standard programming language.
Query languages can be categorized as either procedural
or nonprocedural.
In a procedural language, the user instructs the system
to perform a sequence of operations on the database to
compute the desired result.
In a non procedural language, the user describes the
desired information with out giving a specific procedure
for obtaining that information
The Relational Algebra

The relational algebra is a procedural query


language.
It consists of a set of operations that take one or two
relations as input and produce a new relation as
their result.
The fundamental operations in the relational
algebra are select, project, union , set
difference ,Cartesian product , and rename.
In addition to the fundamental operations, there are
several other operations—namely, set intersection ,
naturaljoin ,and assignment.
Fundamental Operations

The select, project, and rename operations


are called unary operations, because they
operate on one relation.
The other three operations operate on pairs of
relations and are, therefore, called binary
operations
The Select Operation

The select operation selects tuples that satisfy a


given predicate.
We use the lower case Greek letter sigma (σ) to
denote selection.
The predicate appears as a subscript to σ .
 The argument relation is in parentheses after the σ
Thus, to select those tuples of the instructor relation
where the instructor is in the “Physics” department,
we write:
σ deptname=“Physics” (instructor)
We can find all instructors with salary greater than
$90,000 by writing:
σ salary>90000(instructor)
we allow comparisons using=,=,<,≤,>, and ≥ in the
selection predicate. Furthermore, we can combine
several predicates into a larger predicate by using
the connectives and (∧),or(∨), and not(¬).
Thus, to find the instructors in Physics with a
salary greater than $90,000, we write:
σ deptname=“Physics”∧salary>90000 (instructor)
The Project Operation

Projection is denoted by the uppercase Greek


letter pi (Π). We list those attributes that we wish
to appear in the result as a subscript to Π .
The argument relation follows in parentheses.
Π ID,name,salary(instructor)
Composition of Relational Operations

“Find the name of all instructors in the Physics


department.”
Πname(σdeptname=“Physics” (instructor))
The Union Operation
To find the set of all courses taught in the Fall
2009 semester, we write:
Πcourseid(σsemester=“Fall”∧year=2009(section))
To find the set of all courses taught in the Spring
2010 semester, we write:
Π courseid(σsemester=“Spring”∧year=2010(section))
Πcourseid(σsemester=“Fall”∧year=2009(section)) U Π
(σsemester=“Spring”∧year=2010(section))
courseid
The Set-Difference Operation

The set-difference operation, denoted by −,


allows us to find tuples that are in one relation
but are not in another. The expression r−s
produces a relation containing those tuples in r
but not in s.
Courses offered in the Fall 2009 semester but not
in Spring 2010 semester.
Πcourseid(σsemester=“Fall”∧year=2009(section)) - Π
courseid(σsemester=“Spring”∧year=2010(section))
The Cartesian-Product Operation

The Cartesian-product operation, denoted by a


cross (×), allows us to combine information from
any two relations. We write the Cartesian
product of relations r1 and r2 as r1×r2.
The Rename Operation

The rename operator, denoted by the lowercase


Greek letter rho (ρ),
ρ x(E)
Formal Definition of the Relational
Algebra

Let E1 and E2 be relational-algebra expressions.


Then, the following are all relational-algebra
expressions:
•E1∪E2
•E1−E2
•E1×E2
•σP(E1), where P is a predicate on attributes inE1
•ΠS(E1), where S is a list consisting of some of the
attributes inE1
ρx(E1), where x is the new name for the result of E1
Additional Relational-Algebra Operations

The Set-Intersection Operation


Suppose that we wish to find the set of all courses
taught in both theFall 2009 and the Spring 2010
semesters. Using set intersection, we can write
 Π courseid (σ semester=“Fall”∧year=2009(section)) ∩ Π
courseid (σ semester=“Spring”∧year=2010(section))
Note that we can rewrite any relational-algebra
expression that uses set intersection by replacing the
intersection operation with a pair of set-difference
operations as:
r∩s=r−(r−s)
The Natural-Join Operation

The natural join is a binary operation that allows


us to combine certain selections and a Cartesian
product into one operation.
It is denoted by the join symbol .

The natural-join operation forms a Cartesian


product of its two arguments,performs a
selection forcing equality on those attributes that
appear in both rela-tion schemas, and finally
removes duplicate attributes.
“Find the names of all instructors together with
the courseid of all courses they taught.
The Assignment Operation

It is convenient at times to write a relational-


algebra expression by assigning parts of it to
temporary relation variables.
The assignment operation, denoted by ←, works
like assignment in a programming language.
OUTER JOINs

Notice that much of the data is lost when applying a


join to two relations. In some cases this lost data
might hold useful information. An outer join retains
the information that would have been lost from the
tables, replacing missing data with nulls.
There are three forms of the outer join, depending on
which data is to be kept.
LEFT OUTER JOIN - keep data from the left-hand table
RIGHT OUTER JOIN - keep data from the right-hand
table
FULL OUTER JOIN - keep data from both tables
1. Write a relational algebra expression that
returns the food items required to cook the
recipe “Pasta and Meat-balls”. For each such food
item return the item paired with the number of
ounces required by the recipe.
2. Write a relational algebra expression that
returns food items that are sold at “Aldi” and
their price
3. Write a relational algebra expression that
returns food items (item) that are of type “Wheat
product” or oftype “Meat” and have at least 20
calories per ounce (attribute calories)

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy