Dbms Mod 3

This document provides an overview of the relational model and relational algebra. It discusses the objectives of learning about these topics, including understanding the importance of the relational model and how to interpret and write relational algebra queries. Key concepts from the relational model like relations, attributes, domains, keys, and foreign keys are defined. Relational algebra is introduced as the interface for manipulating data stored in relational databases.

Database Management System 44

Module-03

Relational Model and Algebra


3.1 Motivation
This module presents a simple, formal way to express queries against a database so that data can be accessed efficiently.

3.2 Objective
The objectives of this module are to:
• Understand why the relational model is so important
• Be able to describe the essential features of the relational model
• Map an EER model into the relational model
• Be able to interpret relational algebra database queries
• Be able to write queries in relational algebra
• Understand how a relational algebra expression interrogates a relational database

3.3 Syllabus
Prerequisites: set theory of discrete mathematics.

Syllabus                                 Duration   Self study
Mapping the ER and EER Model             1 Hr       1 Hr
to the Relational Model
Data Manipulation, Data Integrity,       1 Hr       1 Hr
Advantages of the Relational Model
Relational Algebra,                      2 Hr       1 Hr
Relational Algebra Queries
Relational Calculus                      1 Hr       1 Hr

3.4 Definition

Relational Model- The model is based on branches of mathematics called set theory and predicate logic. The
basic idea behind the relational model is that a database consists of a series of unordered tables (or
relations) that can be manipulated using non-procedural operations that return tables.

Domain- A domain D is a set of atomic values. By atomic we mean that each value in the domain is
indivisible as far as the formal relational model is concerned.
Attribute- An attribute is the name of a role played by some domain D in the relation schema R.

Relation schema- A relation schema R, denoted by R(A1, A2, ..., An), is made up of a relation name R and a
list of attributes, A1, A2, ..., An. A relation schema is used to describe a relation.
Module 3 : Relational Model and Algebra 45

Degree of a relation- The degree (or arity) of a relation is the number of attributes n of its relation schema.
A relation of degree seven, which stores information about university students, would
contain seven attributes describing each student, as follows:

STUDENT (Name, Ssn, Home_phone, Address, Office_phone, Age, Gpa)

Relational algebra – is

• the formal description of how a relational database operates
• an interface to the data stored in the database itself
• the mathematics which underpins SQL operations

Super key – A super key of an entity set is a set of one or more attributes whose values uniquely determine
each entity.
Candidate key – A candidate key of an entity set is a minimal super key.
Primary key – The primary key of a relational table uniquely identifies each record in the table.

3.5 Learning
Relational Model-
The relational data model was first introduced by Ted Codd of IBM Research in 1970, and it attracted
immediate attention due to its simplicity and mathematical foundation. The model uses the concept of a
mathematical relation—which looks somewhat like a table of values—as its basic building block, and has its
theoretical basis in set theory and first-order predicate logic. In this module we discuss the basic
characteristics of the model and its constraints.

The first commercial implementations of the relational model became available in the early 1980s, such as
the SQL/DS system on the MVS operating system by IBM and the Oracle DBMS. Since then, the model has
been implemented in a large number of commercial systems. Current popular relational DBMSs (RDBMSs)
include DB2 and Informix Dynamic Server (from IBM), Oracle and Rdb (from Oracle), Sybase DBMS (from
Sybase) and SQLServer and Access (from Microsoft). In addition, several open source systems, such as
MySQL and PostgreSQL, are available.

Data models that preceded the relational model include the hierarchical and network models. They were
proposed in the 1960s and were implemented in early DBMSs during the late 1960s and early 1970s. These
models and systems are now referred to as legacy database systems.

Relational Model Components

The relational model can be divided into three parts: objects, integrity, and operators. Each of these three
parts is discussed in detail in this module. The objects comprise the structural
aspects of the model. The integrity portion includes constraints on the data. The operators include the
various ways relational data can be manipulated.
Relations and Attributes
The fundamental object within the relational model is a relation, which is an object that contains a set of
scalar or atomic (i.e., single-valued) values in tuples. Each tuple has a fixed set of attributes. A relation
actually has two parts: a heading, with a fixed set of attribute definitions, and a body, which is the set of
tuples. The tuples comprise the actual data in the relation. An example of a relation, the CUSTOMERS
relation, is shown in the example in figure below.

In the formal relational model terminology, a row is called a tuple, a column header is called an
attribute, and the table is called a relation. The data type describing the types of values that
can appear in each column is represented by a domain of possible values.

Keys

Superkeys
All relations have one or more superkeys, which consist of one or more attributes that uniquely identify each
tuple in a relation. Thus the combined set of attribute values for each superkey is unique. For example, in the
CUSTOMERS relation, each tuple's combined values for Customer ID, Last Name, and First Name (although
this is a larger attribute set than we really need for uniqueness) would certainly distinguish each different
customer from all others.

Candidate Keys
If we remove the attributes from a superkey not necessary to provide uniqueness, we are left with a
candidate key. In our CUSTOMERS relation example, we really don't need the Last Name and First Name
attributes for unique tuple identification since the Customer ID alone can uniquely identify each customer
tuple. So, from the superkey of Customer ID, Last Name, and First Name, we can derive a candidate key of
simply Customer ID by reducing the three attributes to just one. Note that each attribute value in a
candidate key must be non-null, since having a null component for one or more of the candidate key
attributes does not give the required uniqueness. Thus candidate keys are unique, non-null, and irreducible.
(Removing any further attributes destroys the uniqueness.) A candidate key is a minimal superkey.
Candidate keys also possess a property of time-invariance, since they must remain unique for all time. This
means that even as more tuples are added to a relation, the candidate keys must always remain unique.
A relation can have more than one candidate key. For example, our CUSTOMERS relation might have had
unique identifiers of not only Customer ID but also an attribute of Customer SSN for each tuple. In this
situation, each of these attributes independently provides uniqueness. Also, remember that all tuples of a
relation are unique by definition in the relational model. In the worst case we are guaranteed always to have
a candidate key by simply including all attributes in the relation.
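
The superkey and candidate key definitions above can be checked mechanically against one relation state. The following is a minimal sketch, assuming a small hypothetical CUSTOMERS state (the attribute names and rows are invented for illustration). Uniqueness in a single state only shows that an attribute set could be a key, since a real key must stay unique in every valid state.

```python
from itertools import combinations

# Hypothetical CUSTOMERS state; each tuple is a dict from attribute name to value.
CUSTOMERS = [
    {"customer_id": 1, "last_name": "Smith", "first_name": "Ann", "ssn": "111"},
    {"customer_id": 2, "last_name": "Smith", "first_name": "Bob", "ssn": "222"},
    {"customer_id": 3, "last_name": "Jones", "first_name": "Ann", "ssn": "333"},
    {"customer_id": 4, "last_name": "Smith", "first_name": "Ann", "ssn": "444"},
]

def is_superkey(rows, attrs):
    """True if no two tuples agree on all attributes in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Return the minimal superkeys of this state, smallest attribute sets first."""
    attributes = sorted(rows[0])
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A set that contains an already-found key is not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

print(candidate_keys(CUSTOMERS))
```

In this state the names alone do not distinguish customers 1 and 4, so only the identifier attributes survive as candidate keys.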

Primary Keys
One of the candidate keys of a relation is customarily designated as the primary key. The application
determines which of multiple candidate keys becomes the primary key. If there is only one candidate key
(and there will always be at least one), then the choice is obvious. If there are multiple candidate keys to
choose from, then any candidate key not selected as the primary key is designated as an alternate key.
Whereas a relation will always have a primary key (since its tuples are unique), a table, since it can contain
duplicate rows, may not have a primary key.
Primary keys that consist of more than one attribute are known as composite primary keys. Although the
choice of primary key among the candidate keys is based on database design convenience and end-user
preference, it is usually better for data manipulation when a single attribute or at least as small a set of
attributes as possible is selected. When the number of attributes and different data types becomes unwieldy
it might be desirable to generate a surrogate key as the primary key. In database applications this can be
accomplished via the use of simple, sequential number values for an identifying column or table, where the
sequential numbers for this added column are machine-generated. The Order ID attribute of the ORDERS
relation in figure 2.2 is an example of a surrogate key. Notice that the Order ID is also the primary key of the
ORDERS relation.
Figure 2.2

In the relational model, the entity integrity constraint states that no part of a primary key may contain a null.
Therefore if the primary key of some relation is a composite of multiple different attributes, none of these
can ever be null. This constraint follows since primary keys must uniquely identify tuples. With an unknown
value this identification would not be possible. Recall our earlier statement that all attributes of a candidate
key must also be non-null.

Foreign Keys
In a relational database where there are multiple relations related to one another, a foreign key in one
relation is the method used to logically link to a primary key in another relation. Stated differently, the
foreign key in a referencing relation references a primary key in a referenced relation. For example, in the
ORDERS relation of figure 2.2, there is a Cust ID attribute that contains the Customer IDs of customers
who have placed orders. These foreign key Cust IDs reference the unique Customer ID identifiers in the
CUSTOMERS relation, where Customer ID is the primary key. Notice that Cust ID and Customer ID, even
though they have different attribute names, are linked based on matching values from the same domain.
Some interpretations of the relational model only require that a foreign key reference an existing candidate
key, and not necessarily the primary key.
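
The link between a foreign key and the key it references can be sketched as a simple membership test. The example below is illustrative only: the rows are hypothetical, and NULL foreign key values are not considered.

```python
# Hypothetical states; cust_id in ORDERS references customer_id in CUSTOMERS.
CUSTOMERS = [{"customer_id": 1}, {"customer_id": 2}]
ORDERS = [
    {"order_id": 10, "cust_id": 1},
    {"order_id": 11, "cust_id": 2},
    {"order_id": 12, "cust_id": 7},  # no such customer: a dangling reference
]

def dangling_references(referencing, fk, referenced, pk):
    """Return the referencing tuples whose foreign key matches no primary key value."""
    known = {row[pk] for row in referenced}
    return [row for row in referencing if row[fk] not in known]

print(dangling_references(ORDERS, "cust_id", CUSTOMERS, "customer_id"))
```

The two attributes have different names, yet the check works because it compares values drawn from the same domain.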

Characteristics of Relations-
• Ordering of Tuples in a Relation-
A relation is defined as a set of tuples. Mathematically, elements of a set have no order among
them; hence, tuples in a relation do not have any particular order. In other words, a relation is not
sensitive to the ordering of tuples. However, in a file, records are physically stored on disk (or in
memory), so there always is an order among the records. This ordering indicates first, second, ith, and
last records in the file. Similarly, when we display a relation as a table, the rows are displayed in a certain
order.

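• Ordering of Values within a Tuple and an Alternative Definition of a Relation-

Under an alternative definition, a relation schema R = {A1, A2, ..., An} is a set of attributes (instead of a list), and a relation state r(R)
is a finite set of mappings r = {t1, t2, ..., tm}, where each tuple ti is a mapping from R to D, and D is
the union of the attribute domains. Under this definition the order of values within a tuple does not matter,
because each value is identified by its attribute name rather than by its position.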
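
The mapping view of a tuple corresponds closely to a Python dictionary. The short sketch below uses hypothetical STUDENT values to show that two tuples with the same attribute-value pairs are the same tuple, whatever order the pairs are written in.

```python
# A tuple as a mapping from attribute names to values (hypothetical STUDENT data).
t1 = {"Name": "Ann", "Age": 20, "Gpa": 3.5}
t2 = {"Gpa": 3.5, "Name": "Ann", "Age": 20}  # same pairs, written in a different order

def as_relation(tuples):
    """Build a relation state: a set of tuples, each frozen as a set of (attribute, value) pairs."""
    return {frozenset(t.items()) for t in tuples}

r = as_relation([t1, t2])
print(t1 == t2, len(r))
```

Because the relation state is a set, the second copy of the tuple is absorbed and the state holds a single tuple.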

• Values and NULLs in the Tuples-


Each value in a tuple is an atomic value; that is, it is not divisible into components within the
framework of the basic relational model. Hence, composite and multivalued attributes are not allowed.
An important concept is that of NULL values, which are used to represent the values of attributes that
may be unknown or may not apply to a tuple.

Relations versus Tables

The rules that exist for relations in the relational model are not generally enforced for relational database
tables. Specifically, for relations there is no sense of ordering for tuples or attributes. This is not the case for
relational database tables (or files), where there is an implied "first" row and an implied "third" column.
Rearranging the tuples and attributes of a relation yields the same relation, but rearranging the rows and
columns of a table results in a new table. Once again notice the cautious use of the terms relation, tuple, and
attribute here for the relational model context and the terms table, row, and column for the relational
database context. Although these sets of terms are analogous, they are not equivalent because of the
differences between relations and tables.
Another major difference between relations and tables is that relations cannot have duplicate tuples,
whereas a table can have duplicate rows. Note that we normally would not want a relational database table
to have duplicate rows. However, in some applications, such as decision support systems (covered in the
advanced relational databases course), this can occur. For example, duplicate rows might arise during the
extraction of data from multiple legacy systems and the merger of this data into a single relational database
table as it is being prepared for a data warehouse.
The entire set of relations for a relational database is referred to as its schema. The actual values that exist at
some point in time in the database's relations are collectively called its state. A relational database therefore
can be thought of as a combination of a schema and a current state.

Relational Model Constraints


In a relational database, there will typically be many relations, and the tuples in those relations are
usually related in various ways. The state of the whole database will correspond to the states of all its
relations at a particular point in time. There are generally many restrictions or constraints on the actual
values in a database state. These constraints are derived from the rules in the miniworld that the database
represents.
Constraints on databases can generally be divided into three main categories:
1. Constraints that are inherent in the data model. We call these inherent model-based
constraints or implicit constraints.
2. Constraints that can be directly expressed in schemas of the data model, typically by
specifying them in the DDL (data definition language). We call these schema-based
constraints or explicit constraints.
3. Constraints that cannot be directly expressed in the schemas of the data model, and
hence must be expressed and enforced by the application programs. We call these
application-based or semantic constraints or business rules.

The schema-based constraints include


• domain constraints,
• key constraints,
• constraints on NULLs,
• entity integrity constraints, and
• referential integrity constraints.

A. Domain constraints
The value of each attribute Ai in a tuple must be an atomic value from dom(Ai), the domain of that attribute.

B. Key Constraints

Superkey of R: A set of attributes, SK, of R such that no two tuples in any valid relational instance,
r(R), will have the same value for SK. Therefore, for any two distinct tuples, t1 and t2 in r(R),
t1[SK] != t2[SK].

Key of R: A minimal superkey. That is, a superkey, K, of R such that the removal of ANY attribute from
K will result in a set of attributes that is not a superkey.

Example CAR( State, LicensePlateNo, VehicleID, Model, Year, Manufacturer)


This schema has two keys:
K1 = { State, LicensePlateNo}
K2 = { VehicleID }
Both K1 and K2 are superkeys.
K3 = { VehicleID, Manufacturer} is a superkey, but not a key (Why?).

If a relation has more than one key, we can select any one (arbitrarily) to be the primary key. Primary
Key attributes are underlined in the schema:
CAR(State, LicensePlateNo, VehicleID, Model, Year, Manufacturer)

C. Entity Integrity Constraints

The primary key attributes, PK, of any relation schema R in a database cannot have null values in any
tuple. In other words, every table in a database must have a key, and every row must have non-null
values for all attributes of that key. This is because PK is used to identify the individual tuples.
Mathematically, t[PK] != NULL for any tuple t ∈ r(R).
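
The entity integrity condition can be sketched as a check over one relation state. In the example below, Python's None stands in for NULL, the CAR rows are hypothetical, and the choice of {State, LicensePlateNo} as primary key is an assumption made only for illustration.

```python
def entity_integrity_violations(rows, primary_key):
    """Return the tuples in which some primary key attribute is None (standing in for NULL)."""
    return [row for row in rows if any(row[a] is None for a in primary_key)]

# Hypothetical CAR state; only some attributes of the schema are shown.
CAR = [
    {"State": "TX", "LicensePlateNo": "ABC-123", "VehicleID": "V1", "Model": "Civic"},
    {"State": None, "LicensePlateNo": "XYZ-789", "VehicleID": "V2", "Model": "Focus"},
    {"State": "NY", "LicensePlateNo": "JKL-456", "VehicleID": "V3", "Model": None},
]

bad = entity_integrity_violations(CAR, ("State", "LicensePlateNo"))
print(bad)
```

Only the second tuple is reported. The third has a NULL in Model, which is allowed because Model is not part of the key.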

D. Referential Integrity Constraints


Referential integrity constraints are used to specify the relationships between two relations in a database.
Consider a referencing relation, R1, and a referenced relation, R2. Tuples in the referencing relation, R1,
have attributes FK (called foreign key attributes) that reference
the primary key attributes of the referenced relation, R2. A tuple, t1, in R1 is said to reference a tuple, t2,
in R2 if t1[FK] = t2[PK].
A referential integrity constraint can be displayed in a relational database schema as a directed arc from
the referencing (foreign) key to the referenced (primary) key. Examples are shown in the figure below:

Codd’s 12 Rules for Relational Model:


Codd's 12 rules are a set of thirteen rules (numbered 0 to 12) proposed by Edgar F. "Ted" Codd, a pioneer of the
relational model for databases, designed to define what is required from a database management
system in order for it to be considered relational, i.e., an RDBMS.
Codd produced these rules as part of a personal campaign to prevent his vision of the relational
database being diluted, as database vendors scrambled in the early 1980s to repackage existing products
with a relational veneer.
Rule 12 was particularly designed to counter such a positioning. In fact, however, the rules are so
strict that even systems whose only interface is the SQL language fail on some of the criteria.

The Rules

Rule 0: The system must qualify as relational, as a database, and as a management system.

Rule 1: The information must be represented in one and only one way: as values in tables.

Rule 2: All data must be accessible without ambiguity.

Rule 3: The system must provide systematic treatment of null values.

Rule 4: The system must have an active online catalog based on the relational model.

Rule 5: The system must support a comprehensive data sublanguage.

Rule 6: The system must allow view updating.

Rule 7: The system must support set-at-a-time insert, update, and delete operators.

Rule 8: The system must support physical data independence.

Rule 9: The system must support logical data independence.

Rule 10: The system must provide integrity independence.

Rule 11: The system must support distribution independence.

Rule 12: The system must not provide an interface that can subvert (bypass) its integrity constraints.

Steps for Converting an E-R Diagram into a Relational Database Schema

Assume the following ER Diagram


STEP 1: For each Strong entity, create a relation (or table) that includes all of the simple attributes of that
entity. Do not include multivalued attributes or derived attributes at this time. If you have a composite
attribute, include only the component attributes. (For example, if you had a composite attribute ADDRESS
made up of the component attributes STREET, CITY, STATE, and ZIP, you would include the 4 components
and not the composite as fields.) Choose one of the candidate keys to be the primary key of the table. If the
candidate key you choose to be the primary key is a composite attribute, then all of its component attributes
together will become the primary key.

EXAMPLE: From the Company E-R diagram on p. 46 of the text, you would get the following table definitions
by following step 1:

EMPLOYEE

Fname Minit Lname SSN Bdate Address Sex Salary

DEPARTMENT

Name Number

PROJECT

Name Number Location

STEP 2: For each weak entity, create a relation that includes all simple attributes (or simple components of
composite attributes) of the weak entity. In addition, include as a foreign key attribute the primary key of
the owning entity. The primary key of this relation will be the combination of the primary key of the owning
entity and the partial key of the weak entity.

EXAMPLE: The relation for the weak entity DEPENDENT would look like this after step 2:

DEPENDENT

Name Sex Birthdate Relationship Emp_SSN (FK)

STEP 3: For each binary 1:1 relationship, identify the two entities that participate in that relationship. Choose
one of the entities -- preferably the one with total participation in the relationship -- and think of it as E1.
The other entity is E2. Take the primary key from E2 and include it as a foreign key in E1. If the relationship
has simple attributes, include those in the relation for E1.

EXAMPLE: Manages is a 1:1 relationship between EMPLOYEE and DEPARTMENT. We choose DEPARTMENT
as E1, because it participates totally in the relationship. So after step 3, the EMPLOYEE and DEPARTMENT
relations would look like this:

DEPARTMENT (E1)

Name Number Mgr_SSN (FK) Mgr_Startdate

EMPLOYEE (E2) -- NOTE: The EMPLOYEE relation is unchanged from step 1

Fname Minit Lname SSN Bdate Address Sex Salary

STEP 4: For each non-weak binary 1:N relationship, identify the entity E1 that is at the N-side (the "many"
side) of the relationship. The other entity in the relationship is E2. Include as a foreign key in E1 the primary
key of E2. Include any simple attributes (or simple components of composite attributes) of the relationship
as attributes of E1.

EXAMPLE: We have three 1:N relationships: Works_for, Controls, and Supervision.

Works_for: EMPLOYEE is on the N-side of the relationship, so after doing step 4 for Works_for, it will look
like the following:

EMPLOYEE

Fname Minit Lname SSN Bdate Address Sex Salary Dept_Num (FK)

Controls: PROJECT is on the N-side of the relationship, so after doing step 4 for Controls, it will look like the
following:

PROJECT

Name Number Location Dept_Num (FK)

Supervision: EMPLOYEE is on the N-side of the relationship in the supervisee role, so after doing step 4
for Supervision, it will look like the following:

EMPLOYEE

Fname Minit Lname SSN Bdate Address Sex Salary Dept_Num (FK) Super_SSN (FK)

Note that in this case the Foreign Key Super_SSN has actually come from the SSN in the EMPLOYEE relation,
since the EMPLOYEE entity is acting in two different roles in the Supervision relationship.

Step 5: For each binary M:N relationship, create a new relation to represent the relationship. Include in this
relation as foreign keys the primary keys of each of the entities that participates in the relationship. The
combination of these foreign keys will make up the primary key for the relation. Also include any simple
attributes (or simple components of composite attributes) of the relationship.

EXAMPLE: We have one M:N relationship, Works_On. After step 5, we will have a relation for WORKS_ON
that looks like this:

WORKS_ON

Emp_SSN (FK) Proj_Num (FK) Hours

Step 6: For each multivalued attribute, create a new relation that includes that attribute, plus the primary
key of the entity to whom that attribute belongs as a foreign key. The primary key of this new relation will be
the combination of the foreign key and the attribute itself. If the multivalued attribute is also composite, we
include only its simple components.

EXAMPLE: We have only one multivalued attribute, the Locations attribute of DEPARTMENT. We create a
relation called DEPT_LOCATIONS that will look like this:

DEPT_LOCATIONS

Location Dept_Num (FK)

(At this point we have completed the conversion of the E-R diagram -- see fig. 7.1 (fig 3.2 in 3rd ed.) to
relational tables -- see fig. 7.2 (fig 7.7 in 3rd ed.) to see the complete schema.)
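
The relations produced by steps 1 to 6 can be written down as table definitions. The sketch below uses Python's standard sqlite3 module and covers only some of the relations. The column types, the lower-case names, and the choice of SSN and Number as primary keys are assumptions, since the text does not show which attributes are underlined.

```python
import sqlite3

# One possible rendering of the mapped schema; types and key choices are assumptions.
SCHEMA = """
CREATE TABLE department (
    number        INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    mgr_ssn       TEXT REFERENCES employee(ssn),      -- step 3: 1:1 Manages
    mgr_startdate TEXT
);
CREATE TABLE employee (
    ssn       TEXT NOT NULL PRIMARY KEY,
    fname TEXT, minit TEXT, lname TEXT,
    bdate TEXT, address TEXT, sex TEXT, salary REAL,
    dept_num  INTEGER REFERENCES department(number),  -- step 4: Works_for
    super_ssn TEXT REFERENCES employee(ssn)           -- step 4: Supervision
);
CREATE TABLE project (
    number   INTEGER PRIMARY KEY,
    name     TEXT,
    location TEXT,
    dept_num INTEGER REFERENCES department(number)    -- step 4: Controls
);
CREATE TABLE works_on (                               -- step 5: M:N Works_On
    emp_ssn  TEXT NOT NULL REFERENCES employee(ssn),
    proj_num INTEGER NOT NULL REFERENCES project(number),
    hours    REAL,
    PRIMARY KEY (emp_ssn, proj_num)
);
CREATE TABLE dept_locations (                         -- step 6: multivalued Locations
    dept_num INTEGER NOT NULL REFERENCES department(number),
    location TEXT NOT NULL,
    PRIMARY KEY (dept_num, location)
);
"""

def build_company_db():
    """Create an in-memory database holding the sketched schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.executescript(SCHEMA)
    return conn

conn = build_company_db()
conn.execute("INSERT INTO department (number, name) VALUES (5, 'Research')")
conn.execute("INSERT INTO employee (ssn, lname, dept_num) VALUES ('123', 'Smith', 5)")
```

With foreign keys enabled, inserting a works_on row for an employee or project that does not exist is rejected, which is the referential integrity constraint in action.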

Step 7: For each relationship with 3 or more participating entities, create a new relation to represent the
relationship. Include the primary keys of each of the participating entities in the new relation as foreign keys.
Also include any simple attributes (or simple components of composite attributes) of the relationship. The
primary key will usually be a combination of all of the foreign keys that represent the entities that participate
in the relationship. However, if any of the participating entities are on the 1-side of the relationship, then the
primary key of the relation should not include the foreign key from that entity.

EXAMPLE: Remember back to our Supplier - Part - Project ternary relationship. See the E-R diagram in fig.
4.11a (fig 4.13a in 3rd ed.). After following the above steps, the relations would look like the following:

SUPPLIER

SName <other fields>

PART

PartNo <other fields>

PROJECT

ProjName <other fields>

SUPPLY

SName (FK) PartNo (FK) ProjName (FK) Quantity

Step 8: If you have a superclass/subclass structure, there are four options for how to translate it into
relations. Note that these options are mutually exclusive.

Option 8a: Can be used with any combination of disjoint vs. overlapping, total vs. partial. Create a new
relation for the superclass entity and include the attributes of the superclass entity. The primary key will be
the primary key of the superclass entity. Create a new relation for each of the subclass entities, including
the attributes of the particular subclass and the primary key of the superclass entity as a foreign key. The
primary key of each subclass relation will be the foreign key from the superclass relation.

Option 8b: Can be used only with disjoint subclasses with total participation. Create a new relation for each
subclass entity that includes all of the attributes of the particular subclass and all of the attributes of the
superclass. The primary key will be the primary key of the superclass entity.

Option 8c: Can only be used for disjoint subclasses. Not recommended if there are many attributes defined
at the subclass level. Create one new relation with all of the attributes from the superclass entity and all of
the attributes from each of the subclass entities. Also include a "type" attribute that will indicate the
subclass to which each instance belongs.

Option 8d: Can only be used for overlapping subclasses. Not recommended if there are many attributes
defined at the subclass level. Create one new relation with all of the attributes from the superclass entity
and all of the attributes from each of the subclass entities. For each subclass, include a True/False
(Boolean) attribute whose value indicates whether or not a particular instance belongs to that
subclass.

3.6 Key definition


Super key – A super key of an entity set is a set of one or more attributes whose values uniquely determine
each entity.
Candidate key – A candidate key of an entity set is a minimal super key.

3.7 Module theory

Symbolic Notation

• SELECT -> σ (sigma)
• PROJECT -> π (pi)
• PRODUCT -> × (times)
• JOIN -> ⋈ (bow-tie)
• UNION -> ∪ (cup)
• INTERSECTION -> ∩ (cap)
• DIFFERENCE -> - (minus)
• RENAME -> ρ (rho)

Theory-

Relational Languages

A relational language is an abstract language which provides the database user with an interface through
which they can specify data to be retrieved according to certain selection criteria. The two main relational
languages are relational algebra and relational calculus. Relational algebra is a collection of operations
on relations. Relations are operands and the result of an operation is another relation. Relational algebra
operations work on one or more relations to define another relation without changing the original
relations. Thus, both operands and results are relations, so output from one operation can become input to
another operation.

Two main collections of relational operators:


1. Set theory operations:
Union, Intersection, Difference and Cartesian product.
2. Specific Relational Operations:
Selection, Projection, Join, Division

Set Theoretic Operations

Consider the following relations R and S.

R

First Last Age
Bill Smith 22
Sally Green 28
Mary Keen 23
Tony Jones 32

S

First Last Age
Forrest Gump 36
Sally Green 28
DonJuan DeMarco 27

• Union: R ∪ S
Result: Relation with tuples from R and S with duplicates removed.

R ∪ S

First Last Age
Bill Smith 22
Sally Green 28
Mary Keen 23
Tony Jones 32
Forrest Gump 36
DonJuan DeMarco 27

• Difference: R - S
Result: Relation with tuples from R but not from S

R-S

First Last Age


Bill Smith 22
Mary Keen 23
Tony Jones 32

• Intersection: R ∩ S
Result: Relation with tuples that appear in both R and S.

R ∩ S

First Last Age
Sally Green 28
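
Because relations are sets of tuples, the three operations above map directly onto Python's built-in set operators. The sketch below uses the R and S relations from the text.

```python
# R and S from the text, each a set of (First, Last, Age) tuples.
R = {("Bill", "Smith", 22), ("Sally", "Green", 28), ("Mary", "Keen", 23), ("Tony", "Jones", 32)}
S = {("Forrest", "Gump", 36), ("Sally", "Green", 28), ("DonJuan", "DeMarco", 27)}

union = R | S          # tuples in R or S; the shared Sally Green tuple appears once
difference = R - S     # tuples in R that are not in S
intersection = R & S   # tuples in both R and S

print(len(union), sorted(difference), intersection)
```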

Union Compatible Relations-

• The attribute names of two relations need not be identical to perform union, intersection and difference
operations.
• However, the relations must have the same number of attributes (arity) and the domains of corresponding
attributes must be identical.
• Domain is the datatype and size of an attribute.
• The degree of relation R is the number of attributes it contains.
• Definition: Two relations R and S are union compatible if and only if they have the same degree and
the domains of the corresponding attributes are the same.
• Some additional properties:
o Union, Intersection and Difference operators may only be applied to union compatible
relations.
o Union and Intersection are commutative operations:
R ∪ S = S ∪ R
R ∩ S = S ∩ R
o Difference operation is NOT commutative:
R - S ≠ S - R
o The resulting relations may not have meaningful names for the attributes. Convention is to
use the attribute names from the first relation.
• Cartesian Product

Produce all combinations of tuples from two relations.

R

First Last Age
Bill Smith 22
Mary Keen 23
Tony Jones 32

S

Dinner Dessert
Steak Ice Cream
Lobster Cheesecake

R × S

First Last Age Dinner Dessert


Bill Smith 22 Steak Ice Cream
Bill Smith 22 Lobster Cheesecake
Mary Keen 23 Steak Ice Cream
Mary Keen 23 Lobster Cheesecake
Tony Jones 32 Steak Ice Cream
Tony Jones 32 Lobster Cheesecake
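
The product above can be reproduced with itertools.product from the Python standard library. This is a minimal sketch in which each tuple of R is concatenated with each tuple of S.

```python
from itertools import product

# R and S from the Cartesian product example in the text.
R = [("Bill", "Smith", 22), ("Mary", "Keen", 23), ("Tony", "Jones", 32)]
S = [("Steak", "Ice Cream"), ("Lobster", "Cheesecake")]

def cartesian_product(r, s):
    """Concatenate every tuple of r with every tuple of s."""
    return {left + right for left, right in product(r, s)}

RxS = cartesian_product(R, S)
print(len(RxS))
```

The result has 3 × 2 = 6 tuples, each of degree 3 + 2 = 5.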

Selection Operator

• Selection and Projection are unary operators.
• The selection operator is sigma: σ
• The selection operation acts like a filter on a relation, returning only the tuples that satisfy a condition.
• The resulting relation will have the same degree as the original relation.
• The resulting relation may have fewer tuples than the original relation.
• The tuples to be returned are dependent on a condition that is part of the selection operator.
• σ C (R) returns only those tuples in R that satisfy condition C.
• A condition C can be made up of any combination of comparison or logical operators that operate on
the attributes of R.
o Comparison operators: =, ≠, <, ≤, >, ≥
o Logical operators: ∧ (AND), ∨ (OR), ¬ (NOT)

Selection Examples

Assume the following relation EMP has the following tuples:

Name Office Dept Rank

Smith 400 CS Assistant

Jones 220 Econ Adjunct

Green 160 Econ Assistant

Brown 420 CS Associate

Smith 500 Fin Associate

Select only those Employees in the CS department:


σ Dept = 'CS' (EMP)
Result:

Name Office Dept Rank

Smith 400 CS Assistant

Brown 420 CS Associate

Select only those Employees with last name Smith who are assistant professors:
σ Name = 'Smith' ∧ Rank = 'Assistant' (EMP)
Result:

Name Office Dept Rank

Smith 400 CS Assistant



• Select only those Employees who are either Assistant Professors or in the Economics department:
σ Rank = 'Assistant' ∨ Dept = 'Econ' (EMP)
Result:

Name Office Dept Rank

Smith 400 CS Assistant

Jones 220 Econ Adjunct

Green 160 Econ Assistant

Select only those Employees who are neither in the CS department nor Adjuncts:
σ ¬(Rank = 'Adjunct' ∨ Dept = 'CS') (EMP)
Result:

Name Office Dept Rank

Green 160 Econ Assistant

Smith 500 Fin Associate
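The selection examples above can be reproduced with a minimal sketch in which a relation is a list of dictionaries and the condition C is an ordinary Python predicate. The function name select is an assumption chosen for illustration.

```python
ATTRS = ("Name", "Office", "Dept", "Rank")
ROWS = [
    ("Smith", 400, "CS", "Assistant"),
    ("Jones", 220, "Econ", "Adjunct"),
    ("Green", 160, "Econ", "Assistant"),
    ("Brown", 420, "CS", "Associate"),
    ("Smith", 500, "Fin", "Associate"),
]
EMP = [dict(zip(ATTRS, row)) for row in ROWS]

def select(relation, condition):
    """Return the tuples of relation that satisfy condition (degree is unchanged)."""
    return [t for t in relation if condition(t)]

# Dept = 'CS'
in_cs = select(EMP, lambda t: t["Dept"] == "CS")
# Name = 'Smith' AND Rank = 'Assistant'
smith_assistant = select(EMP, lambda t: t["Name"] == "Smith" and t["Rank"] == "Assistant")
# Rank = 'Assistant' OR Dept = 'Econ'
assistant_or_econ = select(EMP, lambda t: t["Rank"] == "Assistant" or t["Dept"] == "Econ")
# NOT (Rank = 'Adjunct' OR Dept = 'CS')
neither = select(EMP, lambda t: not (t["Rank"] == "Adjunct" or t["Dept"] == "CS"))

print([t["Name"] for t in neither])
```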

Projection Operator

• Projection is also a Unary operator.


• The Projection operator is pi: π
• Projection limits the attributes that will be returned from the original relation.
• The general syntax is: π attributes (R)
Where attributes is the list of attributes to be displayed and R is the relation.
• The resulting relation will have the same number of tuples as the original relation, unless the
projection produces duplicate tuples, which are removed.
• The degree of the resulting relation may be equal to or less than that of the original relation.

Projection Examples
Assume the same EMP relation above is used.

• Project only the names and departments of the employees:


π name, dept (EMP)
Results:

Name Dept

Smith CS

Jones Econ

Green Econ

Brown CS

Smith Fin

Combining Selection and Projection

• The selection and projection operators can be combined to perform both operations.
• Show the names of all employees working in the CS department:

π name ( σ Dept = 'CS' (EMP) )


Results:

Name

Smith

Brown

• Show the name and rank of those Employees who are neither in the CS department nor Adjuncts:

π name, rank ( σ ¬(Rank = 'Adjunct' ∨ Dept = 'CS') (EMP) )


Result:

Name Rank

Green Assistant

Smith Associate
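Projection, and its combination with selection, can be sketched the same way. The helper names project and select are assumptions for this example; project removes duplicate tuples, since a relation is a set.

```python
ATTRS = ("Name", "Office", "Dept", "Rank")
ROWS = [
    ("Smith", 400, "CS", "Assistant"),
    ("Jones", 220, "Econ", "Adjunct"),
    ("Green", 160, "Econ", "Assistant"),
    ("Brown", 420, "CS", "Associate"),
    ("Smith", 500, "Fin", "Associate"),
]
EMP = [dict(zip(ATTRS, row)) for row in ROWS]

def select(relation, condition):
    """Keep only the tuples that satisfy condition."""
    return [t for t in relation if condition(t)]

def project(relation, attributes):
    """Keep only the listed attributes and drop duplicate tuples."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attributes}
        if row not in result:
            result.append(row)
    return result

name_dept = project(EMP, ["Name", "Dept"])   # 5 tuples, no duplicates arise
depts = project(EMP, ["Dept"])               # duplicates collapse to 3 tuples
cs_names = project(select(EMP, lambda t: t["Dept"] == "CS"), ["Name"])
name_rank = project(
    select(EMP, lambda t: not (t["Rank"] == "Adjunct" or t["Dept"] == "CS")),
    ["Name", "Rank"],
)
print(cs_names, name_rank)
```

Projecting on Dept alone shows why the tuple count can shrink: the five employees share only three distinct departments.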

Aggregate Functions

• We can also apply Aggregate functions to attributes and tuples:


o SUM
o MINIMUM
o MAXIMUM
o AVERAGE, MEAN, MEDIAN
o COUNT
• Aggregate functions are sometimes written using the Projection operator or the Script
F character: ℱ

Aggregate Function Examples


Assume the relation EMP has the following tuples:

Name Office Dept Salary

Smith 400 CS 45000

Jones 220 Econ 35000

Green 160 Econ 50000

Brown 420 CS 65000

Smith 500 Fin 60000

• Find the minimum Salary: ℱ MIN (salary) (EMP)


Results:

MIN(salary)

35000

• Find the average Salary: ℱ AVG (salary) (EMP)


Results:

AVG(salary)

51000

• Count the number of employees in the CS department: ℱ COUNT (name) ( σ Dept = 'CS' (EMP) )
Results:

COUNT(name)

2

• Find the total payroll for the Economics department: ℱ SUM (salary) ( σ Dept = 'Econ' (EMP) )
Results:

SUM(salary)

85000
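The four aggregate examples can be cross-checked with Python's built-in functions. This is only a sketch over the EMP tuples listed above; the variable names are chosen for illustration.

```python
ATTRS = ("Name", "Office", "Dept", "Salary")
ROWS = [
    ("Smith", 400, "CS", 45000),
    ("Jones", 220, "Econ", 35000),
    ("Green", 160, "Econ", 50000),
    ("Brown", 420, "CS", 65000),
    ("Smith", 500, "Fin", 60000),
]
EMP = [dict(zip(ATTRS, row)) for row in ROWS]

salaries = [t["Salary"] for t in EMP]

min_salary = min(salaries)                     # MIN(salary)
avg_salary = sum(salaries) / len(salaries)     # AVG(salary)
# COUNT(name) over the selection Dept = 'CS'
cs_count = len([t["Name"] for t in EMP if t["Dept"] == "CS"])
# SUM(salary) over the selection Dept = 'Econ'
econ_payroll = sum(t["Salary"] for t in EMP if t["Dept"] == "Econ")

print(min_salary, avg_salary, cs_count, econ_payroll)
```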

Join Operation

• Join operations bring together two relations and combine their attributes and tuples in a specific
fashion.
• The generic join operator (called the Theta Join) is: ⋈
• It takes as arguments the attributes from the two relations that are to be joined.
• For example, assume we have the EMP relation as above and a separate DEPART relation with (Dept,
MainOffice, Phone):
EMP ⋈ EMP.Dept = DEPART.Dept DEPART

• The join condition can use any of the comparison operators: =, ≠, <, ≤, >, ≥
• When the join condition operator is = then we call this an Equijoin.
• Note that the attributes in common are repeated.

Join Examples
Assume we have the EMP relation from above and the following DEPART relation:

Dept MainOffice Phone

CS 404 555-1212

Econ 200 555-1234

Fin 501 555-4321

Hist 100 555-9876



• Find all information on every employee including their department info:


EMP ⋈ emp.Dept = depart.Dept DEPART
Results:

Name Office EMP.Dept Salary DEPART.Dept MainOffice Phone

Smith 400 CS 45000 CS 404 555-1212

Jones 220 Econ 35000 Econ 200 555-1234

Green 160 Econ 50000 Econ 200 555-1234

Brown 420 CS 65000 CS 404 555-1212

Smith 500 Fin 60000 Fin 501 555-4321

• Find all information on every employee including their department info where the employee works
in an office numbered less than the department main office:
EMP ⋈ (emp.office < depart.mainoffice) ∧ (emp.dept = depart.dept) DEPART
Results:

Name Office EMP.Dept Salary DEPART.Dept MainOffice Phone

Smith 400 CS 45000 CS 404 555-1212

Green 160 Econ 50000 Econ 200 555-1234

Smith 500 Fin 60000 Fin 501 555-4321
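A theta join can be sketched as a nested loop that keeps each pair of tuples satisfying the join condition. The function theta_join and its parameters are assumptions made for this example; attribute names are prefixed with the relation name so that both copies of Dept survive, as in the result tables above.

```python
EMP = [dict(zip(("Name", "Office", "Dept", "Salary"), r)) for r in [
    ("Smith", 400, "CS", 45000),
    ("Jones", 220, "Econ", 35000),
    ("Green", 160, "Econ", 50000),
    ("Brown", 420, "CS", 65000),
    ("Smith", 500, "Fin", 60000),
]]
DEPART = [dict(zip(("Dept", "MainOffice", "Phone"), r)) for r in [
    ("CS", 404, "555-1212"),
    ("Econ", 200, "555-1234"),
    ("Fin", 501, "555-4321"),
    ("Hist", 100, "555-9876"),
]]

def theta_join(r, s, condition, r_name, s_name):
    """Nested-loop join: keep every pair (t1, t2) for which condition holds."""
    result = []
    for t1 in r:
        for t2 in s:
            if condition(t1, t2):
                row = {f"{r_name}.{k}": v for k, v in t1.items()}
                row.update({f"{s_name}.{k}": v for k, v in t2.items()})
                result.append(row)
    return result

# Equijoin on Dept.
equijoin = theta_join(EMP, DEPART, lambda e, d: e["Dept"] == d["Dept"], "EMP", "DEPART")
# Same department AND employee office numbered below the department main office.
below_main = theta_join(
    EMP, DEPART,
    lambda e, d: e["Office"] < d["MainOffice"] and e["Dept"] == d["Dept"],
    "EMP", "DEPART",
)
print(len(equijoin), len(below_main))
```

The Hist department never appears in either result because no EMP tuple matches it.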

Natural Join

• The Natural Join operation removes these duplicate attributes.


• The natural join operator is: *
• With *, the join condition is implicitly = on the attributes that the two relations have in common.
• Example: EMP * DEPART
Results:

Name Office Dept Salary MainOffice Phone



Smith 400 CS 45000 404 555-1212

Jones 220 Econ 35000 200 555-1234

Green 160 Econ 50000 200 555-1234

Brown 420 CS 65000 404 555-1212

Smith 500 Fin 60000 501 555-4321
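The natural join can be sketched as an equijoin on every shared attribute name, keeping a single copy of each shared attribute. The helper natural_join is an assumption for this example and expects both relations to be non-empty lists of dictionaries.

```python
EMP = [dict(zip(("Name", "Office", "Dept", "Salary"), r)) for r in [
    ("Smith", 400, "CS", 45000),
    ("Jones", 220, "Econ", 35000),
    ("Green", 160, "Econ", 50000),
    ("Brown", 420, "CS", 65000),
    ("Smith", 500, "Fin", 60000),
]]
DEPART = [dict(zip(("Dept", "MainOffice", "Phone"), r)) for r in [
    ("CS", 404, "555-1212"),
    ("Econ", 200, "555-1234"),
    ("Fin", 501, "555-4321"),
    ("Hist", 100, "555-9876"),
]]

def natural_join(r, s):
    """Join on all attribute names common to r and s, keeping one copy of each."""
    common = [a for a in r[0] if a in s[0]]  # assumes r and s are non-empty
    result = []
    for t1 in r:
        for t2 in s:
            if all(t1[a] == t2[a] for a in common):
                result.append({**t1, **t2})  # shared attributes appear once
    return result

joined = natural_join(EMP, DEPART)
print(list(joined[0].keys()))
```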

Outer Join

• In the Join operations so far, only those tuples from both relations that satisfy the join condition are
included in the output relation.
• The Outer join includes other tuples as well according to a few rules.
• Three types of outer joins:
1. Left Outer Join includes all tuples in the left hand relation and includes only those
matching tuples from the right hand relation.
2. Right Outer Join includes all tuples in the right hand relation and includes only those
matching tuples from the left hand relation.
3. Full Outer Join includes all tuples from both the left hand relation and the right hand
relation.
• Examples:

Assume we have two relations: PEOPLE and MENU:

PEOPLE:

Name    Age   Food
Alice   21    Hamburger
Bill    24    Pizza
Carl    23    Beer
Dina    19    Shrimp

MENU:

Food        Day
Pizza       Monday
Hamburger   Tuesday
Chicken     Wednesday
Pasta       Thursday
Tacos       Friday

• Left outer join: PEOPLE ⟕ people.food = menu.food MENU



Name Age people.Food menu.Food Day

Alice 21 Hamburger Hamburger Tuesday

Bill 24 Pizza Pizza Monday

Carl 23 Beer NULL NULL

Dina 19 Shrimp NULL NULL

• Right outer join: PEOPLE ⟖ people.food = menu.food MENU

Name Age people.Food menu.Food Day

Bill 24 Pizza Pizza Monday

Alice 21 Hamburger Hamburger Tuesday

NULL NULL NULL Chicken Wednesday

NULL NULL NULL Pasta Thursday

NULL NULL NULL Tacos Friday

• Full outer join: PEOPLE ⟗ people.food = menu.food MENU

Name Age people.Food menu.Food Day

Alice 21 Hamburger Hamburger Tuesday

Bill 24 Pizza Pizza Monday

Carl 23 Beer NULL NULL

Dina 19 Shrimp NULL NULL

NULL NULL NULL Chicken Wednesday



NULL NULL NULL Pasta Thursday

NULL NULL NULL Tacos Friday
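The three outer joins can be sketched with one function that pads unmatched tuples with None, which stands in for NULL here. The function outer_join and its how parameter are assumptions made for this example, and the join attribute is fixed to Food for brevity.

```python
PEOPLE = [dict(zip(("Name", "Age", "Food"), r)) for r in [
    ("Alice", 21, "Hamburger"), ("Bill", 24, "Pizza"),
    ("Carl", 23, "Beer"), ("Dina", 19, "Shrimp"),
]]
MENU = [dict(zip(("Food", "Day"), r)) for r in [
    ("Pizza", "Monday"), ("Hamburger", "Tuesday"), ("Chicken", "Wednesday"),
    ("Pasta", "Thursday"), ("Tacos", "Friday"),
]]

def outer_join(left, right, how="left"):
    """Outer join on Food; how is 'left', 'right' or 'full'. None plays the role of NULL."""
    def row(p, m):
        p = p or {"Name": None, "Age": None, "Food": None}
        m = m or {"Food": None, "Day": None}
        return {"Name": p["Name"], "Age": p["Age"], "people.Food": p["Food"],
                "menu.Food": m["Food"], "Day": m["Day"]}

    result = []
    matched_right = set()
    for p in left:
        hits = [i for i, m in enumerate(right) if m["Food"] == p["Food"]]
        matched_right.update(hits)
        if hits:
            result.extend(row(p, right[i]) for i in hits)
        elif how in ("left", "full"):
            result.append(row(p, None))      # unmatched left tuple, padded
    if how in ("right", "full"):
        result.extend(row(None, m) for i, m in enumerate(right)
                      if i not in matched_right)  # unmatched right tuples, padded
    return result

left_rows = outer_join(PEOPLE, MENU, "left")
right_rows = outer_join(PEOPLE, MENU, "right")
full_rows = outer_join(PEOPLE, MENU, "full")
print(len(left_rows), len(right_rows), len(full_rows))
```

Row order may differ from the tables above; a relation is a set of tuples, so the order carries no meaning.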

Outer Union

• The Outer Union operation is applied to partially union compatible relations.


• Operator is: ∪*
• Example: PEOPLE ∪* MENU

Name Age Food Day

Alice 21 Hamburger NULL

Bill 24 Pizza NULL

Carl 23 Beer NULL

Dina 19 Shrimp NULL

NULL NULL Pizza Monday

NULL NULL Hamburger Tuesday

NULL NULL Chicken Wednesday

NULL NULL Pasta Thursday

NULL NULL Tacos Friday

Case Study:
1) Convert the following ER model into a relational model.

Solution:

2) Suppose that we decompose the schema R = (A, B, C, D, E) into


(A, B, C)
(A, D, E).
Show that this decomposition is a lossless-join decomposition if the
following set F of functional dependencies holds:
A → BC
CD → E
B→D
E→A

Answer: A decomposition {R1, R2} is a lossless-join decomposition if R1 ∩ R2 → R1 or R1 ∩ R2 → R2.
Let R1 = (A, B, C) and R2 = (A, D, E), so R1 ∩ R2 = A. A is a candidate key (see Practice
Exercise 8.6): A → BC gives B and C, B → D gives D, and CD → E gives E, so A → ABCDE.
Therefore R1 ∩ R2 → R1, and the decomposition is lossless.
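The claim that A determines every attribute of R can be checked by computing the attribute closure of A under F. The sketch below uses the standard fixed-point closure algorithm; the function name closure and the string encoding of attribute sets are assumptions made for this example.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, the right side is too.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# F = {A -> BC, CD -> E, B -> D, E -> A}; each attribute is one character.
F = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]

a_closure = closure("A", F)
r1 = set("ABC")
r2 = set("ADE")
common = r1 & r2

# Lossless if the common attributes determine all of R1 or all of R2.
lossless = r1 <= closure(common, F) or r2 <= closure(common, F)
print(sorted(a_closure), lossless)
```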

3) Use Armstrong’s axioms to prove the soundness of the union rule. (Hint: Use the
augmentation rule to show that, if a → b, then a → ab. Apply the augmentation rule again, using
a → g, and then apply the transitivity rule.)

Answer: To prove that:

if a → b and a → g then a → bg

Following the hint, we derive:

a → b      given
aa → ab    augmentation rule
a → ab     union of identical sets
a → g      given
ab → gb    augmentation rule
a → bg     transitivity rule and set union commutativity

3.8 Objective Questions


In the relational model, cardinality is termed as:

(A) Number of tuples. (B) Number of attributes.


(C) Number of tables. (D) Number of constraints.

Cartesian product in relational algebra is


(A) a Unary operator. (B) a Binary operator.
(C) a Ternary operator. (D) not defined.

In case of entity integrity, the primary key may be


(A) not Null (B) Null
(C) both Null & not Null. (D) any value.
Ans: A

What is a relationship called when it is maintained between two entities?


(A) Unary (B) Binary
(C) Ternary (D) Quaternary

Which of the following operations is used if we are interested in only certain columns of a
table?
(A) PROJECTION (B) SELECTION
(C) UNION (D) JOIN
The RDBMS terminology for a row is
(A) tuple. (B) relation.
(C) attribute. (D) degree.

The natural join is equal to :


(A) Cartesian Product
(B) Combination of Union and Cartesian product
(C) Combination of selection and Cartesian product
(D) Combination of projection and Cartesian product

Which of the following is a group of one or more attributes that uniquely identifies a row?
(A) Key (B) Determinant (C) Tuple
A relation is considered a:
A. Column.

B. one-dimensional table.

C. two-dimensional table.

D. three-dimensional table.

In the relational model, relationships between relations or tables are created by using:
A. composite keys.

B. determinants.

C. candidate keys.

D. foreign keys.
A key:
A. must always be composed of two or more columns.

B. can only be one column.

C. identifies a row.

D. identifies a column.

1. SQL provides a constraint called______


a) Secondary key b) Primary key c) Key
2. A logical table that derives data from other table is called _______
a) Cursor b) View c) Database
3. Set of operations that allows retrieval of data is called______
a) Relational Algebra b) Cardinality c) Relational Calculus
5. A key consisting of two or more columns is called_____
a) Composite key b) candidate key c) primary key

3.9 Subjective Question


1. Consider the following relational schema: (7)
PERSON (SS#, NAME, ADDRESS)
CAR (REGISTRATION_NUMBER, YEAR, MODEL)
ACCIDENT (DATE, DRIVER, CAR_REG_NO)
OWNS (SS#, LICENSE)
Construct the following relational algebra queries:
(i) Find the names of persons who are involved in an accident.
(ii) Find the registration number of cars which were not involved in any accident.
2. Explain the concepts of relational data model. Also discuss its advantages and disadvantages.
3. Consider the following relational schema:
Doctor(DName,Reg_no)
Patient(Pname, Disease)
Assigned_To (Pname,Dname)

Give expression in relational algebra for each of the queries:


(i) Get the names of patients who are assigned to more than one doctor.
(ii) Get the names of doctors who are treating patients with ‘Polio’.
4. Consider the relational database given below, where the primary keys are underlined.
Give an expression in the relational algebra to express each of the following queries:
a. Find the names of all employees who work for First Bank Corporation.
b. Find the names and cities of residence of all employees who work for First
Bank Corporation.
c. Find the names, street address, and cities of residence of all employees who work
for First Bank Corporation and earn more than $10,000 per annum.
d. Find the names of all employees in this database who live in the same city
as the company for which they work.
e. Find the names of all employees who live in the same city and on the same
street as do their managers.
f. Find the names of all employees in this database who do not work for First
Bank Corporation.
g. Find the names of all employees who earn more than every employee of
Small Bank Corporation.
h. Assume the companies may be located in several cities. Find all companies
located in every city in which Small Bank Corporation is located.
employee (person-name, street, city)
works (person-name, company-name, salary)
company (company-name, city)
manages (person-name, manager-name)
5. The outer-join operations extend the natural-join operation so that tuples from the participating
relations are not lost in the result of the join. Describe how the theta join operation can be extended
so that tuples from the left, right, or both relations are not lost from the result of a theta join.
6. Consider the relational database given below. Give an expression in the relational
algebra for each request:
a. Modify the database so that Jones now lives in Newtown.
b. Give all employees of First Bank Corporation a 10 percent salary raise.
c. Give all managers in this database a 10 percent salary raise.
d. Give all managers in this database a 10 percent salary raise, unless the salary
would be greater than $100,000. In such cases, give only a 3 percent raise.
e. Delete all tuples in the works relation for employees of Small Bank Corporation.
employee (person-name, street, city)

works (person-name, company-name, salary)


company (company-name, city)
manages (person-name, manager-name)
7. Using the bank example, write relational-algebra queries to find the account held by more than
two customers in the following ways:
a. Using an aggregate function.
b. Without using any aggregate functions.
8. Consider the relational database given below. Give a relational-algebra expression for each of
the following queries:
a. Find the company with the most employees.
b. Find the company with the smallest payroll.
c. Find those companies whose employees earn a higher salary, on average, than the
average salary at First Bank Corporation.
employee (person-name, street, city)
works (person-name, company-name, salary)
company (company-name, city)
manages (person-name, manager-name)
