Chapter 01 Relational Algebra
Query Languages
● A query language is a language in which a user requests information from the database.
● Categories of languages
● Procedural
● Non-procedural, or declarative
● “Pure” languages:
● Relational algebra
● Tuple relational calculus
● Domain relational calculus
● Pure languages form the underlying basis of the query languages that people use.
Relational Algebra
● Procedural language
● Six basic operators
● select: σ
● project: ∏
● union: ∪
● set difference: –
● Cartesian product: x
● rename: ρ
● The operators take one or two relations as inputs and produce a new
relation as a result.
Relational Algebra Example
● Consider Player Relation
● Example of selection:
σ branch_name=“Perryridge”(account)
Select Operation – Example
● Relation r:
    A  B  C   D
    α  α  1   7
    α  β  5   7
    β  β  12  3
    β  β  23  10
● σ A=B ∧ D>5 (r):
    A  B  C   D
    α  α  1   7
    β  β  23  10
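The selection in this example can be sketched in Python. This is a minimal illustration, not part of the original slides: it assumes the predicate A = B ∧ D > 5, which is consistent with the two result rows shown, and the helper name `select` is hypothetical.

```python
# Relation r from the example, one dict per tuple.
r = [
    {"A": "α", "B": "α", "C": 1, "D": 7},
    {"A": "α", "B": "β", "C": 5, "D": 7},
    {"A": "β", "B": "β", "C": 12, "D": 3},
    {"A": "β", "B": "β", "C": 23, "D": 10},
]


def select(relation, predicate):
    """σ_predicate(relation): keep the tuples for which the predicate holds."""
    return [t for t in relation if predicate(t)]


# Assumed predicate: A = B and D > 5.
result = select(r, lambda t: t["A"] == t["B"] and t["D"] > 5)
for t in result:
    print(t["A"], t["B"], t["C"], t["D"])
```

Selection never changes the schema; it only filters rows, so the result has the same four attributes as r.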
Project π
● Project / Projection (π): It returns the relation restricted to the listed attributes. Attributes that are not listed
are dropped, so it can reduce the arity of the relation. Because a relation is a set, duplicate rows that
remain after dropping attributes are removed.
● Relation r:
    A  B   C
    α  10  1
    α  20  1
    β  30  1
    β  40  2
● ∏A,C (r):
    A  C        A  C
    α  1        α  1
    α  1    =   β  1
    β  1        β  2
    β  2
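The duplicate elimination in this example can be sketched as follows. The representation (a tuple of attribute names plus a set of rows) and the helper name `project` are assumptions made for illustration only.

```python
# Relation r(A, B, C) from the example: attribute names plus a set of rows.
r_attrs = ("A", "B", "C")
r = {("α", 10, 1), ("α", 20, 1), ("β", 30, 1), ("β", 40, 2)}


def project(attrs, rows, keep):
    """∏_keep(rows): keep only the listed attributes.

    Building a set removes the duplicate rows that appear once
    the other attributes are dropped.
    """
    idx = [attrs.index(a) for a in keep]
    return {tuple(row[i] for i in idx) for row in rows}


result = project(r_attrs, r, ("A", "C"))
print(sorted(result))
```

Four input rows collapse to three because (α, 1) occurs twice after B is dropped.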
Project Operation
● Notation: ∏ A1, A2, …, Ak (r), where A1, …, Ak are attribute names and r is a relation name.
Union Operation – Example
● Relations r, s:
    r:  A  B        s:  A  B
        α  1            α  2
        α  2            β  3
        β  1
● r ∪ s:
    A  B
    α  1
    α  2
    β  1
    β  3
Union Operation
● Notation: r ∪ s
● Defined as:
r ∪ s = {t | t ∈ r or t ∈ s}
● For r ∪ s to be valid:
1. r, s must have the same arity (same number of attributes)
2. The attribute domains must be compatible (example: 2nd column
of r deals with the same type of values as does the 2nd
column of s)
● Example: to find all customers with either an account or a loan
∏customer_name (depositor) ∪ ∏customer_name (borrower)
● Union operation (∪): If R and S are compatible relations, then the union of R and S is their set-theoretic
union. The result P = R ∪ S contains every tuple that is in R, in S, or in both. Union eliminates all
duplicates. The result has the same arity as R and S, but its cardinality may differ from that of
either input.
    Depositor                Borrower
    Acc_No  Name             Loan_No  Name
    A-231   Rahul            P-3261   Sachin
    A-432   Omkar            Q-6934   Raj
    R-321   Sachin           S-4321   Ramesh
    S-231   Raj              T-6281   Anil
    T-239   Sumit
● Find the names of customers having an account or a loan:
    π Name (Depositor) ∪ π Name (Borrower)
    Name
    Rahul
    Omkar
    Raj
    Sumit
    Sachin
    Ramesh
    Anil
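The Depositor/Borrower query can be sketched in Python with built-in sets. The helper names `project_name` and `union` are hypothetical, and only the arity condition is checked; domain compatibility is assumed.

```python
# Depositor(Acc_No, Name) and Borrower(Loan_No, Name) from the example.
depositor = {
    ("A-231", "Rahul"), ("A-432", "Omkar"), ("R-321", "Sachin"),
    ("S-231", "Raj"), ("T-239", "Sumit"),
}
borrower = {
    ("P-3261", "Sachin"), ("Q-6934", "Raj"),
    ("S-4321", "Ramesh"), ("T-6281", "Anil"),
}


def project_name(rows):
    """π_Name: keep only the second attribute of each tuple."""
    return {(name,) for (_, name) in rows}


def union(r, s):
    """r ∪ s for relations of the same arity (domains are not checked here)."""
    if len({len(t) for t in r | s}) > 1:
        raise ValueError("union requires relations of the same arity")
    return r | s


names = union(project_name(depositor), project_name(borrower))
print(sorted(n for (n,) in names))
```

Sachin and Raj appear in both inputs but only once in the result, so the cardinality is 7 rather than 5 + 4 = 9, while the arity stays 1.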
Set Difference Operation – Example
● Relations r, s:
    r:  A  B        s:  A  B
        α  1            α  2
        α  2            β  3
        β  1
● r – s:
    A  B
    α  1
    β  1
Set Difference Operation
● Notation r – s
● Defined as:
r – s = {t | t ∈ r and t ∉ s}
Cartesian-Product Operation – Example
● Relations r, s:
    r:  A  B        s:  C  D   E
        α  1            α  10  a
        β  2            β  10  a
                        β  20  b
                        γ  10  b
● r x s:
    A  B  C  D   E
    α  1  α  10  a
    α  1  β  10  a
    α  1  β  20  b
    α  1  γ  10  b
    β  2  α  10  a
    β  2  β  10  a
    β  2  β  20  b
    β  2  γ  10  b
Cartesian-Product Operation
● Notation r x s
● Defined as:
r x s = {t q | t ∈ r and q ∈ s}
● Assume that attributes of r(R) and s(S) are disjoint. (That is, R ∩ S = ∅).
● If attributes of r(R) and s(S) are not disjoint, then renaming must be
used.
● Cartesian product operation: The Cartesian product of two relations pairs every tuple of the first relation with
every tuple of the second and concatenates each pair. It is denoted by x. If R and S are two relations, then
P = R x S contains all possible combinations of tuples in R and S. The Cartesian product does not require
compatible relations.
    Employee                 Project
    Emp_id  Name             Project_Id
    101     Sachin           DBMS1
    103     Rahul            DBMS2
    104     Omkar
    R = Employee x Project
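The product R = Employee x Project can be computed with a short sketch. It reads the flattened tables as Employee(Emp_id, Name) with three tuples and Project(Project_Id) with two; that reading, and the names `project_rel` and `cartesian`, are assumptions.

```python
from itertools import product

# Employee(Emp_id, Name) and Project(Project_Id), as read from the tables.
employee = {(101, "Sachin"), (103, "Rahul"), (104, "Omkar")}
project_rel = {("DBMS1",), ("DBMS2",)}


def cartesian(r, s):
    """r x s = { t q | t ∈ r and q ∈ s }: concatenate every pair of tuples."""
    return {t + q for t, q in product(r, s)}


result = cartesian(employee, project_rel)
for row in sorted(result):
    print(row)
```

The result has arity 2 + 1 = 3 and cardinality 3 × 2 = 6: every employee is paired with every project, whether or not the pairing is meaningful.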
Set-Intersection Operation – Example
● Relations r, s:
    r:  A  B        s:  A  B
        α  1            α  2
        α  2            β  3
        β  1
● r ∩ s:
    A  B
    α  2
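Set difference and set intersection on these two relations can be checked with a small Python sketch. It also illustrates the standard identity r ∩ s = r – (r – s), which shows that intersection needs no operator beyond set difference.

```python
# Relations r(A, B) and s(A, B) from the examples.
r = {("α", 1), ("α", 2), ("β", 1)}
s = {("α", 2), ("β", 3)}

# r – s = { t | t ∈ r and t ∉ s }
difference = {t for t in r if t not in s}

# r ∩ s written with set difference only: r – (r – s)
intersection = r - (r - s)

print(sorted(difference))
print(sorted(intersection))
```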
Rename Operation
● Allows us to name, and therefore to refer to, the results of
relational-algebra expressions.
● Allows us to refer to a relation by more than one name.
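● Example:
    ρ X (E)
  returns the expression E under the name X.
● If a relational-algebra expression E has arity n, then
    ρ X(A1, A2, …, An) (E)
  returns the result of expression E under the name X, and with the
  attributes renamed to A1, A2, …, An.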
Natural Join Operation
● Natural join is a binary operation, denoted ⋈, that combines a Cartesian product and a selection
into one operation. It forms the Cartesian product of its two arguments, then performs a selection
forcing equality on those attributes that appear in both relations, and finally removes the
duplicate attributes. Natural join is associative.
    Employee                 Salary
    ID   Name                ID   Salary
    101  Sachin              101  65000
    103  Rahul               103  35000
    104  Kapil               104  22000
    107  Ajay                107  21910
● Employee ⋈ Salary:
    ID   Name    Salary
    101  Sachin  65000
    103  Rahul   35000
    104  Kapil   22000
    107  Ajay    21910
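The three steps of a natural join (product, equality on shared attributes, removal of the duplicate columns) can be sketched as one function. The representation (attribute-name tuple plus a set of rows) and the name `natural_join` are assumptions made for illustration.

```python
def natural_join(r_attrs, r, s_attrs, s):
    """Join r and s on every attribute name they share."""
    common = [a for a in r_attrs if a in s_attrs]
    s_only = [a for a in s_attrs if a not in r_attrs]
    out_attrs = tuple(r_attrs) + tuple(s_only)
    out = set()
    for t in r:                      # Cartesian product: every pair (t, q)
        for q in s:
            # Selection: the shared attributes must agree.
            if all(t[r_attrs.index(a)] == q[s_attrs.index(a)] for a in common):
                # Keep each shared attribute only once.
                out.add(t + tuple(q[s_attrs.index(a)] for a in s_only))
    return out_attrs, out


employee_attrs = ("ID", "Name")
employee = {(101, "Sachin"), (103, "Rahul"), (104, "Kapil"), (107, "Ajay")}
salary_attrs = ("ID", "Salary")
salary = {(101, 65000), (103, 35000), (104, 22000), (107, 21910)}

attrs, rows = natural_join(employee_attrs, employee, salary_attrs, salary)
print(attrs)
for row in sorted(rows):
    print(row)
```

If the two relations share no attribute name, the equality test is vacuously true and the function returns the plain Cartesian product.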
Composition of Operations
● Can build expressions using multiple operations
● Example: σA=C(r x s)
● r x s
A B C D E
α 1 α 10 a
α 1 β 10 a
α 1 β 20 b
α 1 γ 10 b
β 2 α 10 a
β 2 β 10 a
β 2 β 20 b
β 2 γ 10 b
● σA=C(r x s)
A B C D E
α 1 α 10 a
β 2 β 10 a
β 2 β 20 b
Formal Definition
● A basic expression in the relational algebra consists of either one of the following:
● A relation in the database
● A constant relation
● Let E1 and E2 be relational-algebra expressions; the following are all relational-algebra expressions:
● E1 ∪ E2
● E1 – E2
● E1 x E2
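Additional Operations
● The following operations add no power to the relational algebra, but simplify common queries:
● Set intersection
● Natural join
● Division
● Assignment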
Division Operation – Example
● Relations r, s:
    r:  A  B        s:  B
        α  1            1
        α  2            2
        α  3
        β  1
        γ  1
        δ  1
        δ  3
        δ  4
        ε  6
        ε  1
        β  2
● r ÷ s:
    A
    α
    β
Another Division Example
● Relations r, s:
    r:  A  B  C  D  E        s:  D  E
        α  a  α  a  1            a  1
        α  a  γ  a  1            b  1
        α  a  γ  b  1
        β  a  γ  a  1
        β  a  γ  b  3
        γ  a  γ  a  1
        γ  a  γ  b  1
        γ  a  β  b  1
● r ÷ s:
    A  B  C
    α  a  γ
    γ  a  γ
Division Operation (Cont.)
● Property
● Let q = r ÷ s
● Then q is the largest relation satisfying q x s ⊆ r
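● Definition in terms of the basic algebra operations
  Let r(R) and s(S) be relations, and let S ⊆ R
    r ÷ s = ∏R-S (r) – ∏R-S ((∏R-S (r) x s) – ∏R-S,S (r))
● To see why:
  ● ∏R-S,S (r) simply reorders the attributes of r
  ● ∏R-S ((∏R-S (r) x s) – ∏R-S,S (r)) gives those tuples t in ∏R-S (r) such that
    for some tuple u ∈ s, tu ∉ r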
Natural Join Operation – Example
● Relations r, s:
    r:  A  B  C  D        s:  B  D  E
        α  1  α  a            1  a  α
        β  2  γ  a            3  a  β
        γ  4  β  b            1  a  γ
        α  1  γ  a            2  b  δ
        δ  2  β  b            3  b  ε
● r ⋈ s:
    A  B  C  D  E
    α  1  α  a  α
    α  1  α  a  γ
    α  1  γ  a  α
    α  1  γ  a  γ
    δ  2  β  b  δ
Division Operation
● Notation: r ÷ s
● Suited to queries that include the phrase “for all”.
● Let r and s be relations on schemas R and S respectively
where
● R = (A1, …, Am , B1, …, Bn )
● S = (B1, …, Bn)
The result of r ÷ s is a relation on schema
R – S = (A1, …, Am)
r ÷ s = { t | t ∈ ∏ R-S (r) ∧ ∀ u ∈ s ( tu ∈ r ) }
Where tu means the concatenation of tuples t and u to
produce a single tuple
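The set-builder definition of division translates fairly directly into code. The sketch below is one possible implementation under assumed conventions (attribute-name tuple plus a set of rows; the name `divide` is hypothetical). It groups the S-values seen with each candidate t and keeps t only if every tuple u of s is among them. It is checked against the first division example, with the last attribute value written as ε.

```python
def divide(r_attrs, r, s_attrs, s):
    """r ÷ s = { t | t ∈ ∏_{R-S}(r) and for all u ∈ s, tu ∈ r }."""
    keep = [a for a in r_attrs if a not in s_attrs]       # schema R - S
    keep_idx = [r_attrs.index(a) for a in keep]
    s_idx = [r_attrs.index(a) for a in s_attrs]           # positions of S inside R
    paired = {}                                           # t -> set of u with tu ∈ r
    for row in r:
        t = tuple(row[i] for i in keep_idx)
        u = tuple(row[i] for i in s_idx)
        paired.setdefault(t, set()).add(u)
    # Keep t only when every u in s was seen together with t.
    return tuple(keep), {t for t, us in paired.items() if s <= us}


r_attrs = ("A", "B")
r = {
    ("α", 1), ("α", 2), ("α", 3), ("β", 1), ("γ", 1), ("δ", 1),
    ("δ", 3), ("δ", 4), ("ε", 6), ("ε", 1), ("β", 2),
}
s_attrs = ("B",)
s = {(1,), (2,)}

attrs, q = divide(r_attrs, r, s_attrs, s)
print(attrs, sorted(q))
```

Only α and β are paired with both 1 and 2, which is why "for all" queries map naturally onto division.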
Banking Example
branch (branch_name, branch_city, assets)
● Find the loan number for each loan of an amount greater than $1200
    ∏loan_number (σamount > 1200 (loan))
● Find the names of all customers who have a loan at the Perryridge branch but do not have an
account at any branch of the bank
    ∏customer_name (σbranch_name = “Perryridge”
        (σborrower.loan_number = loan.loan_number (borrower x loan)))
    – ∏customer_name (depositor)
Example Queries
● Find the names of all customers who have a loan at the Perryridge branch.
● Query 1
∏customer_name (σbranch_name = “Perryridge” (
σborrower.loan_number = loan.loan_number (borrower x loan)))
● Query 2
∏customer_name(σloan.loan_number = borrower.loan_number (
(σbranch_name = “Perryridge” (loan)) x borrower))
Example Queries
● Find the largest account balance
● Strategy:
    1. Find those balances that are not the largest
       – Rename account relation as d so that we can compare each
         account balance with all others
    2. Use set difference to find those account balances that were not found
       in the earlier step.
● The query is:
    ∏balance (account) – ∏account.balance
        (σaccount.balance < d.balance (account x ρd (account)))
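The two-step strategy can be traced in Python. The sketch assumes an account relation with the schema (branch_name, account_number, balance) and borrows the five tuples from the account table used in the aggregation example of this chapter; the variable names are illustrative.

```python
# account(branch_name, account_number, balance); data assumed from the
# account table in the aggregation example.
account = {
    ("Perryridge", "A-102", 400),
    ("Perryridge", "A-201", 900),
    ("Brighton", "A-217", 750),
    ("Brighton", "A-215", 750),
    ("Redwood", "A-222", 700),
}

# ∏balance(account)
balances = {(t[2],) for t in account}

# Step 1: account x ρd(account), select account.balance < d.balance,
# then project account.balance: these balances are not the largest.
not_largest = {(t[2],) for t in account for d in account if t[2] < d[2]}

# Step 2: set difference leaves only the largest balance.
largest = balances - not_largest
print(largest)
```

The self-product compares every balance with every other one, which is why the rename to d is needed: without it the two copies of account could not be told apart.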
Bank Example Queries
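● Find the names of all customers who have a loan and an account at the bank.
    ∏customer_name (borrower) ∩ ∏customer_name (depositor)
● Find the names of all customers who have a loan at the bank, together with the
loan amount.
    ∏customer_name, loan_number, amount (borrower ⋈ loan)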
Outer Join – Example
● Relation loan
    loan_number  branch_name  amount
    L-170        Downtown     3000
    L-230        Redwood      4000
    L-260        Perryridge   1700
● Relation borrower
    customer_name  loan_number
    Jones          L-170
    Smith          L-230
    Hayes          L-155
● Join
  loan ⋈ borrower
    loan_number  branch_name  amount  customer_name
    L-170        Downtown     3000    Jones
    L-230        Redwood      4000    Smith
● Right Outer Join
  loan ⟖ borrower
    loan_number  branch_name  amount  customer_name
    L-170        Downtown     3000    Jones
    L-230        Redwood      4000    Smith
    L-155        null         null    Hayes
● Full Outer Join
  loan ⟗ borrower
    loan_number  branch_name  amount  customer_name
    L-170        Downtown     3000    Jones
    L-230        Redwood      4000    Smith
    L-260        Perryridge   1700    null
    L-155        null         null    Hayes
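The outer joins can be sketched by computing the ordinary join first and then padding the unmatched tuples with nulls (None in Python). The loan tuples below are an assumption: they are read off the full outer join result, whose rows with a non-null branch_name must come from loan.

```python
# loan(loan_number, branch_name, amount), assumed from the full outer join rows.
loan = {
    ("L-170", "Downtown", 3000),
    ("L-230", "Redwood", 4000),
    ("L-260", "Perryridge", 1700),
}
# borrower(customer_name, loan_number)
borrower = {("Jones", "L-170"), ("Smith", "L-230"), ("Hayes", "L-155")}

# Ordinary join on loan_number; result schema:
# (loan_number, branch_name, amount, customer_name)
inner = {l + (b[0],) for l in loan for b in borrower if l[0] == b[1]}
matched = {row[0] for row in inner}

# Loans with no borrower, padded with a null customer_name.
loan_only = {l + (None,) for l in loan if l[0] not in matched}
# Borrowers with no loan tuple, padded with null branch_name and amount.
borrower_only = {(b[1], None, None, b[0]) for b in borrower if b[1] not in matched}

left_outer = inner | loan_only
right_outer = inner | borrower_only
full_outer = inner | loan_only | borrower_only

for row in sorted(full_outer, key=lambda t: t[0]):
    print(row)
```

The left outer join keeps L-260 with a null customer, the right outer join keeps Hayes with null loan details, and the full outer join keeps both.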
Null Values
● It is possible for tuples to have a null value, denoted by null, for some
of their attributes
● null signifies an unknown value or that a value does not exist.
● The result of any arithmetic expression involving null is null.
● Aggregate functions simply ignore null values (as in SQL)
● For duplicate elimination and grouping, null is treated like any other
value, and two nulls are assumed to be the same (as in SQL)
Null Values
● Comparisons with null values return the special truth value: unknown
● If false was used instead of unknown, then not (A < 5)
would not be equivalent to A >= 5
● Three-valued logic using the truth value unknown:
● OR: (unknown or true) = true,
(unknown or false) = unknown
(unknown or unknown) = unknown
● AND: (true and unknown) = unknown,
(false and unknown) = false,
(unknown and unknown) = unknown
● NOT: (not unknown) = unknown
● In SQL “P is unknown” evaluates to true if predicate P evaluates
to unknown
● Result of select predicate is treated as false if it evaluates to unknown
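The three-valued truth tables can be written out in Python, using None to stand for unknown. The function names (`and3`, `or3`, `not3`, `less_than`, `select`) are hypothetical; the sketch only mirrors the rules listed above.

```python
def and3(a, b):
    """Three-valued AND: false dominates, then unknown."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a, b):
    """Three-valued OR: true dominates, then unknown."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def not3(a):
    """Three-valued NOT: (not unknown) = unknown."""
    return None if a is None else (not a)


def less_than(x, y):
    """A comparison with null yields unknown."""
    if x is None or y is None:
        return None
    return x < y


def select(values, predicate):
    """A select keeps a value only if the predicate is true; unknown counts as false."""
    return [v for v in values if predicate(v) is True]


values = [3, None, 7]
print(select(values, lambda a: less_than(a, 5)))        # A < 5
print(select(values, lambda a: not3(less_than(a, 5))))  # not (A < 5)
```

The null value is returned by neither selection, which is the behaviour the unknown truth value is designed to give: not (A < 5) and A >= 5 stay equivalent.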
Aggregate Functions and Operations
● Aggregation function takes a collection of values and returns a single value as a result.
avg: average value
min: minimum value
max: maximum value
sum: sum of values
count: number of values
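● Aggregate operation in relational algebra
    G1, G2, …, Gn g F1(A1), F2(A2), …, Fn(An) (E)
  ● E is any relational-algebra expression
  ● G1, G2, …, Gn is a list of attributes on which to group (can be empty)
  ● Each Fi is an aggregate function
  ● Each Ai is an attribute name
Aggregate Operation – Example
● Relation r:
    A  B  C
    α  α  7
    α  β  7
    β  β  3
    β  β  10
● g sum(C) (r):
    sum(C)
    27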
● Relation account grouped by branch_name:
    branch_name  account_number  balance
    Perryridge   A-102           400
    Perryridge   A-201           900
    Brighton     A-217           750
    Brighton     A-215           750
    Redwood      A-222           700
● branch_name g sum(balance) (account):
    branch_name  sum(balance)
    Perryridge   1300
    Brighton     1500
    Redwood      700
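Grouping by branch_name and summing balance can be sketched with a dictionary of running totals. The helper name `group_sum` and its positional arguments are assumptions for illustration.

```python
from collections import defaultdict

# account(branch_name, account_number, balance) from the example.
account = [
    ("Perryridge", "A-102", 400),
    ("Perryridge", "A-201", 900),
    ("Brighton", "A-217", 750),
    ("Brighton", "A-215", 750),
    ("Redwood", "A-222", 700),
]


def group_sum(rows, group_idx, value_idx):
    """Group rows on one attribute and sum another within each group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_idx]] += row[value_idx]
    return dict(totals)


result = group_sum(account, 0, 2)
for branch, total in result.items():
    print(branch, total)
```

Note that the two Brighton balances of 750 are both counted: aggregation works on the multiset of values in each group, not on the set of distinct values.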
Aggregate Functions (Cont.)
● Result of aggregation does not have a name
● Can use rename operation to give it a name
● For convenience, we permit renaming as part of aggregate operation
    branch_name g sum(balance) as sum_balance (account)
Modification of the Database
● The content of the database may be modified using the following
operations:
● Deletion
● Insertion
● Updating
● All these operations are expressed using the assignment
operator.
Deletion
● A delete request is expressed similarly to a query, except instead of displaying tuples to the user, the
selected tuples are removed from the database.
● Can delete only whole tuples; cannot delete values on only particular attributes
● A deletion is expressed in relational algebra by:
    r ← r – E
where r is a relation and E is a relational algebra query.
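Deletion as an assignment can be sketched in a few lines. The example assumes an account relation with schema (branch_name, account_number, balance), borrowing the tuples from the aggregation example, and takes E to be a selection on branch_name = "Perryridge".

```python
# account(branch_name, account_number, balance); data assumed from the
# account table in the aggregation example.
account = {
    ("Perryridge", "A-102", 400),
    ("Perryridge", "A-201", 900),
    ("Brighton", "A-217", 750),
    ("Brighton", "A-215", 750),
    ("Redwood", "A-222", 700),
}

# E: the query that picks the tuples to delete.
E = {t for t in account if t[0] == "Perryridge"}

# r ← r – E: the relation name is rebound to the difference.
account = account - E

for t in sorted(account):
    print(t)
```

Whole tuples are removed; there is no way in this scheme to blank out a single attribute of a tuple.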
Deletion Examples
● Delete all account records in the Perryridge branch.
    account ← account – σ branch_name = “Perryridge” (account)
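Updating
● A mechanism to change a value in a tuple without changing all values in the tuple
● Use the generalized projection operator to do this task
    r ← ∏ F1, F2, …, Fl (r)
● Each Fi is either
  ● the i-th attribute of r, if the i-th attribute is not updated, or,
  ● if the attribute is to be updated, Fi is an expression, involving only
    constants and the attributes of r, which gives the new value for the
    attribute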
Update Examples