Sql:Advance Queries: Structured Query Language (SQL) Was Designed and Implemented at IBM Research
MODULE 3
SQL: ADVANCED QUERIES
3.1 Data Definition, Constraints, and Schema Changes in SQL2
Structured Query Language (SQL) was designed and implemented at IBM Research.
It was created in the late 1970s under the name SEQUEL.
A standard version of SQL (ANSI 1986) is called SQL-86 or SQL1.
A revised version of the standard is called SQL2 (or SQL-92).
SQL is being extended with object-oriented and other recent database concepts.
SQL consists of:
o A Data Definition Language (DDL) for declaring database schemas
o A Data Manipulation Language (DML) for modifying and querying database instances
In SQL, relation, tuple, and attribute are called table, row, and column, respectively.
The SQL commands for data definition are CREATE, ALTER, and DROP.
The CREATE TABLE Command is used to specify a new table by giving it a name and
specifying its attributes (columns) and constraints.
Data types available for attributes are:
o Numeric: integer, real (formatted, such as DECIMAL(10,2))
o Character string: fixed-length and varying-length
o Bit string: fixed-length and varying-length
o Date: in the form YYYY-MM-DD
o Time: in the form HH:MM:SS
o Timestamp: includes both the DATE and TIME fields
o Interval: a relative value used to increase or decrease a date, time, or timestamp
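As a minimal sketch of CREATE TABLE with typed columns and constraints, the following uses Python's built-in sqlite3 module as a stand-in DBMS. The DEPARTMENT columns and the BUDGET attribute are assumptions chosen for illustration, and SQLite treats the declared types more loosely than SQL2 prescribes.

```python
import sqlite3

# In-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")

# Declare a table with typed columns and key constraints.
# BUDGET is a hypothetical column added to show DECIMAL(10,2).
conn.execute("""
    CREATE TABLE DEPARTMENT (
        DNAME        VARCHAR(15)   NOT NULL,
        DNUMBER      INTEGER       NOT NULL,
        MGRSSN       CHAR(9),
        MGRSTARTDATE DATE,
        BUDGET       DECIMAL(10,2),
        PRIMARY KEY (DNUMBER),
        UNIQUE (DNAME)
    )
""")
conn.execute(
    "INSERT INTO DEPARTMENT VALUES ('Research', 5, '333445555', '1988-05-22', 100000.00)"
)

# A second department with the same name violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO DEPARTMENT VALUES ('Research', 6, NULL, NULL, NULL)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)
```

The constraint is enforced by the database itself, so the application does not need its own check for duplicate department names.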
SQL allows a table (relation) to have two or more tuples that are identical in all their
attribute values. Hence, an SQL table is not a set of tuples, because a set does not allow
two identical members; rather, it is a multiset of tuples.
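The multiset behaviour can be observed directly. This sketch, using Python's sqlite3 module and made-up rows, stores the same tuple twice and compares a plain SELECT with SELECT DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (LNAME TEXT, SALARY INTEGER)")

# Two of the three rows are identical in every attribute (sample data).
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [("Smith", 30000), ("Wong", 40000), ("Smith", 30000)],
)

# A plain SELECT keeps duplicates; DISTINCT removes them.
all_rows = conn.execute("SELECT LNAME, SALARY FROM EMPLOYEE").fetchall()
distinct_rows = conn.execute("SELECT DISTINCT LNAME, SALARY FROM EMPLOYEE").fetchall()

print(len(all_rows), len(distinct_rows))
```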
A basic query statement in SQL is the SELECT statement.
The SELECT statement in SQL is not the same as the SELECT operation of
relational algebra.
Some examples:
Query 0: Retrieve the birth date and address of the employee(s) whose name is 'John B. Smith'.
FROM EMPLOYEE
Query 1: Retrieve the name and address of all employees who work for the 'Research' department.
Query 2: For every project located in 'Stafford', list the project number, the controlling
department number, and the department manager's last name, address, and birth date.
'Stafford'
When attributes in different relations have the same name, the ambiguity is resolved by
qualifying the attribute name with the relation name, using a dot separator.
Ambiguity also arises in queries that refer to the same relation twice; it is resolved by
giving each reference its own alias.
Query 8: For each employee, retrieve the employee's first and last name and the first and last
name of his or her immediate supervisor.
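One way to write Query 8 is sketched below with Python's sqlite3 module. The aliases E and S, the SUPERSSN column name, and the sample rows are assumptions for illustration; each alias is a separate reference to EMPLOYEE, and the dot notation says which copy an attribute comes from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (FNAME TEXT, LNAME TEXT, SSN TEXT, SUPERSSN TEXT)"
)
# Sample rows: Borg has no supervisor, so SUPERSSN is NULL.
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [
        ("John", "Smith", "1", "2"),
        ("Franklin", "Wong", "2", "3"),
        ("James", "Borg", "3", None),
    ],
)

# E plays the role of the employee, S the role of the supervisor.
q8 = conn.execute("""
    SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME
    FROM EMPLOYEE AS E, EMPLOYEE AS S
    WHERE E.SUPERSSN = S.SSN
    ORDER BY E.SSN
""").fetchall()

for row in q8:
    print(row)
```

Borg does not appear in the result because a NULL SUPERSSN never satisfies the join condition.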
A missing WHERE clause indicates no condition, which means all tuples are selected. If
two or more tables are specified, all possible tuple combinations (the cross product) are selected.
Example: Q10: Select all EMPLOYEE SSNs, and all combinations of EMPLOYEE SSN and
DEPARTMENT DNAME.
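The effect of omitting the WHERE clause can be checked by counting rows. In this sketch (Python's sqlite3 module, invented sample data), three employees and two departments yield 3 x 2 = 6 combinations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (SSN TEXT)")
conn.execute("CREATE TABLE DEPARTMENT (DNAME TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?)",
    [("123456789",), ("333445555",), ("888665555",)],
)
conn.executemany(
    "INSERT INTO DEPARTMENT VALUES (?)",
    [("Research",), ("Administration",)],
)

# No WHERE clause on one table: every tuple is selected.
ssns = conn.execute("SELECT SSN FROM EMPLOYEE").fetchall()

# No WHERE clause on two tables: every combination is selected.
pairs = conn.execute("SELECT SSN, DNAME FROM EMPLOYEE, DEPARTMENT").fetchall()

print(len(ssns), len(pairs))
```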
SELECT *
FROM EMPLOYEE
WHERE DNO=5
(NAME LIKE '%a_') is true for all names having 'a' as the second letter from the end.
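The pattern can be tried on a few names. This is a small sketch with Python's sqlite3 module; the names are sample data, and note that SQLite's LIKE ignores case for ASCII letters, which other systems may not do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (LNAME TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?)",
    [("Jabbar",), ("Narayan",), ("Smith",), ("Zelaya",)],
)

# '%' matches any run of characters, '_' matches exactly one character,
# so '%a_' requires an 'a' immediately before the last character.
matches = [
    row[0]
    for row in conn.execute(
        "SELECT LNAME FROM EMPLOYEE WHERE LNAME LIKE '%a_' ORDER BY LNAME"
    )
]
print(matches)
```

'Zelaya' is not matched: its last letter is 'a', but the letter before it is 'y'.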
To list all employees who were born during the 1960s, we have the following:
SELECT FNAME, LN
Examples: Show the resulting salaries if every employee working on the 'ProductX' project is
given a 10 percent raise.
Retrieve all employees in department number 5 whose salary is between $30000 and $40000.
SELECT *
FROM EMPLOYEE
WHERE (SALARY BETWEEN 30000 AND 40000) AND DNO=5;
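The query above can be run against a few sample rows to see that BETWEEN includes both endpoints. This is a sketch using Python's sqlite3 module; the table contents are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (LNAME TEXT, SALARY INTEGER, DNO INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [
        ("English", 25000, 5),
        ("Smith", 30000, 5),    # exactly on the lower bound
        ("Narayan", 38000, 5),
        ("Wallace", 40000, 4),  # in range, but wrong department
        ("Wong", 43000, 5),
    ],
)

in_range = conn.execute("""
    SELECT LNAME, SALARY
    FROM EMPLOYEE
    WHERE (SALARY BETWEEN 30000 AND 40000) AND DNO = 5
    ORDER BY SALARY
""").fetchall()
print(in_range)
```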
SQL does not automatically eliminate duplicates, because:
o Duplicate elimination is an expensive operation (sort and delete)
o The user may be interested in seeing the duplicate tuples in the result of a query
o When an aggregate function is applied, we usually do not want to eliminate duplicates
Examples:
Q11: Retrieve the salary of every employee, and (Q12) all distinct salary values.
FROM EMPLOYEE
FROM EMPLOYEE
Example: Q4: Make a list of project numbers for projects that involve an employee whose last
name is 'Smith', either as a worker or as a manager of the department that controls the project.
FROM PROJECT
FROM WORKS_ON
FROM WORKS_ON
WHERE SSN='123456789'
In addition to the IN operator, a number of other comparison operators can be used to compare a
single value v to a set or multiset V.
v > ALL V returns TRUE if v is greater than all the values in V.
Select the names of employees whose salary is greater than the salary of all the
employees in department 5:
SELECT LNAME, FNAME
FROM EMPLOYEE
WHERE SALARY > ALL (SELECT SALARY
FROM EMPLOYEE
WHERE DNO=5);
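SQLite, used here through Python's sqlite3 module, does not accept the ALL quantifier, so this sketch substitutes a comparison against MAX(SALARY). The two forms agree only when the nested query returns at least one row and no NULL salaries, which is assumed here; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (LNAME TEXT, SALARY INTEGER, DNO INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [
        ("Smith", 30000, 5),
        ("Wong", 40000, 5),
        ("Wallace", 43000, 4),
        ("Borg", 55000, 1),
        ("Zelaya", 25000, 4),
    ],
)

# Stand-in for "SALARY > ALL (SELECT SALARY ... WHERE DNO=5)":
# being greater than every salary means being greater than the largest one.
higher_paid = [
    row[0]
    for row in conn.execute("""
        SELECT LNAME
        FROM EMPLOYEE
        WHERE SALARY > (SELECT MAX(SALARY) FROM EMPLOYEE WHERE DNO = 5)
        ORDER BY LNAME
    """)
]
print(higher_paid)
```

On a DBMS that supports ALL, the query shown in the notes can be used as written.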
Whenever a condition in the WHERE clause of a nested query references some attribute of a relation
declared in the outer query, the two queries are said to be correlated. The result of a correlated nested
query is different for each tuple (or combination of tuples) of the relation(s) in the outer query.
In general, a query written with nested SELECT ... FROM ... WHERE blocks and using the = or IN
comparison operators can always be rewritten as a single block query.
Query 12: Retrieve the name of each employee who has a dependent with the same first name as the employee.
E.FNAME=DEPENDENT_NAME)
In Q12, the nested query has a different result for each tuple in the outer query.
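The sketch below gives one possible formulation of Query 12 as a correlated nested query, together with a single-block rewrite, and checks that both return the same rows. It uses Python's sqlite3 module; the ESSN column name, the alias names, and the sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (SSN TEXT, FNAME TEXT, LNAME TEXT)")
conn.execute("CREATE TABLE DEPENDENT (ESSN TEXT, DEPENDENT_NAME TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [("1", "John", "Smith"), ("2", "Franklin", "Wong")],
)
conn.executemany(
    "INSERT INTO DEPENDENT VALUES (?, ?)",
    [("1", "John"), ("1", "Alice"), ("2", "Joy")],
)

# Nested form: the inner query refers to E.SSN and E.FNAME from the
# outer query, so it is evaluated once per EMPLOYEE tuple.
nested = conn.execute("""
    SELECT E.FNAME, E.LNAME
    FROM EMPLOYEE AS E
    WHERE E.SSN IN (SELECT ESSN
                    FROM DEPENDENT
                    WHERE ESSN = E.SSN AND E.FNAME = DEPENDENT_NAME)
""").fetchall()

# Single-block form: the same conditions expressed as a join.
single_block = conn.execute("""
    SELECT E.FNAME, E.LNAME
    FROM EMPLOYEE AS E, DEPENDENT AS D
    WHERE E.SSN = D.ESSN AND E.FNAME = D.DEPENDENT_NAME
""").fetchall()

print(nested, single_block)
```

The join form could list an employee more than once if two dependents shared the employee's first name; the sample data avoids that case.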
The original SQL as specified for SYSTEM R also had a CONTAINS comparison operator, which
is used in conjunction with nested correlated queries. This operator was dropped from the language,
possibly because of the difficulty in implementing it efficiently. Most implementations of SQL do
not have this operator. The CONTAINS operator compares two sets of values and returns TRUE
if one set contains all values in the other set (reminiscent of the division operation of algebra).
Query 3: Retrieve the name of each employee who works on all the projects controlled by
department number 5.
In Q3, the second nested query, which is not correlated with the outer query, retrieves the project
numbers of all projects controlled by department 5.
The first nested query, which is correlated, retrieves the project numbers on which the employee
works, which is different for each employee tuple because of the correlation.
EXISTS is used to check whether the result of a correlated nested query is empty (contains no
tuples) or not. We can formulate Query 12 in an alternative form that uses EXISTS, as in Q12B below.
Query 12: Retrieve the name of each employee who has a dependent with the same first name as
the employee.
In Q6 (retrieve the names of employees who have no dependents), the correlated nested query
retrieves all DEPENDENT tuples related to an EMPLOYEE tuple. If none exist, the EMPLOYEE tuple
is selected. EXISTS is necessary for the expressive power of SQL.
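Both uses of EXISTS can be sketched with Python's sqlite3 module. The queries follow the descriptions of Q12B and Q6 in these notes; the ESSN column name and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (SSN TEXT, FNAME TEXT, LNAME TEXT)")
conn.execute("CREATE TABLE DEPENDENT (ESSN TEXT, DEPENDENT_NAME TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [("1", "John", "Smith"), ("2", "Franklin", "Wong"), ("3", "Joyce", "English")],
)
conn.executemany(
    "INSERT INTO DEPENDENT VALUES (?, ?)",
    [("1", "John"), ("2", "Joy")],
)

# Q12B: keep an employee if at least one matching dependent exists.
q12b = conn.execute("""
    SELECT FNAME, LNAME
    FROM EMPLOYEE
    WHERE EXISTS (SELECT *
                  FROM DEPENDENT
                  WHERE SSN = ESSN AND FNAME = DEPENDENT_NAME)
""").fetchall()

# Q6: keep an employee only if the nested query returns no tuples.
q6 = conn.execute("""
    SELECT FNAME, LNAME
    FROM EMPLOYEE
    WHERE NOT EXISTS (SELECT *
                      FROM DEPENDENT
                      WHERE SSN = ESSN)
""").fetchall()

print(q12b, q6)
```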
It is also possible to use an explicit (enumerated) set of values in the WHERE clause rather than a
nested query.
Query 13: Retrieve the social security numbers of all employees who work on project
number 1, 2, or 3.
Null example
SQL allows queries that check whether a value is NULL (missing, undefined, or not applicable). SQL
uses IS or IS NOT to compare NULLs because it considers each NULL value distinct from other
NULL values, so equality comparison is not appropriate.
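The difference between = NULL and IS NULL is easy to demonstrate. This sketch uses Python's sqlite3 module; the SUPERSSN column and the two rows are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (LNAME TEXT, SUPERSSN TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [("Borg", None), ("Wong", "888665555")],  # None is stored as NULL
)

# An equality comparison with NULL is never true, so no row qualifies.
with_equals = conn.execute(
    "SELECT COUNT(*) FROM EMPLOYEE WHERE SUPERSSN = NULL"
).fetchone()[0]

# IS NULL is the test that actually finds the missing value.
with_is_null = conn.execute(
    "SELECT COUNT(*) FROM EMPLOYEE WHERE SUPERSSN IS NULL"
).fetchone()[0]

print(with_equals, with_is_null)
```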
FROM EMPLOYEE
Note: If a join condition is specified, tuples with NULL values for the join attributes are not
included in the result.
Retrieve the name and address of every employee who works for the 'Research' department.
Aggregate Functions
Query 15: Find the sum of the salaries of all employees of the 'Research' department, as well as
the maximum salary, the minimum salary, and the average salary:
FROM EMPLOYEE
Query 16: Find the maximum salary, the minimum salary, and the average salary among
employees who work for the 'Research' department.
Queries 17 and 18: Retrieve the total number of employees in the company (Q17), and the
number of employees in the 'Research' department (Q18).
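The aggregate functions named in these queries can be combined in one SELECT. The sketch below, using Python's sqlite3 module, restricts them to the 'Research' department through a join; the DNUMBER column name and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (LNAME TEXT, SALARY INTEGER, DNO INTEGER)")
conn.execute("CREATE TABLE DEPARTMENT (DNAME TEXT, DNUMBER INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [("Smith", 30000, 5), ("Wong", 40000, 5), ("Zelaya", 25000, 4)],
)
conn.executemany(
    "INSERT INTO DEPARTMENT VALUES (?, ?)",
    [("Research", 5), ("Administration", 4)],
)

# The join selects the Research employees; each function then
# summarizes that set of tuples into a single value.
total, highest, lowest, average, headcount = conn.execute("""
    SELECT SUM(SALARY), MAX(SALARY), MIN(SALARY), AVG(SALARY), COUNT(*)
    FROM EMPLOYEE, DEPARTMENT
    WHERE DNO = DNUMBER AND DNAME = 'Research'
""").fetchone()

print(total, highest, lowest, average, headcount)
```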
Example of grouping
In many cases, we want to apply the aggregate functions to subgroups of tuples in a relation. Each
subgroup consists of the set of tuples that have the same value for the grouping attribute(s).
SQL has a GROUP BY clause for specifying the grouping attributes, which must also appear in the
SELECT clause.
For each project, select the project number, the project name, and the number of employees
who work on that project.
WHERE PNUMBER=PNO
In Q20, the EMPLOYEE tuples are divided into groups, each group having the same value for the
grouping attribute DNO. The COUNT and AVG functions are applied to each such group of tuples
separately. The SELECT clause includes only the grouping attribute and the functions to be applied
on each group of tuples. A join condition can be used in conjunction with grouping.
Query 21: For each project, retrieve the project number, project name, and the number of employees who work on
that project.
Q21: SELECT PNUMBER, PNAME, COUNT (*)
FROM PROJECT, WORKS_ON
WHERE PNUMBER=PNO
GROUP BY PNUMBER, PNAME
In this case, the grouping and functions are applied after the joining of the two relations.
THE HAVING CLAUSE:
Sometimes we want to retrieve the values of these functions for only those groups that satisfy
certain conditions. The HAVING clause is used for specifying a selection condition on groups
(rather than on individual tuples).
Query 22: For each project on which more than two employees work, retrieve the project number,
project name, and the number of employees who work on that project.
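The sketch below runs Q21 as given in these notes and then adds a HAVING clause as one way to express Query 22. It uses Python's sqlite3 module; the ORDER BY clause and the sample rows are additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PROJECT (PNUMBER INTEGER, PNAME TEXT)")
conn.execute("CREATE TABLE WORKS_ON (ESSN TEXT, PNO INTEGER)")
conn.executemany(
    "INSERT INTO PROJECT VALUES (?, ?)",
    [(1, "ProductX"), (2, "ProductY"), (10, "Computerization")],
)
conn.executemany(
    "INSERT INTO WORKS_ON VALUES (?, ?)",
    [("a", 1), ("b", 1), ("a", 2), ("b", 2), ("c", 2), ("c", 10)],
)

# Q21: join first, then group the joined tuples by project and count them.
q21 = conn.execute("""
    SELECT PNUMBER, PNAME, COUNT(*)
    FROM PROJECT, WORKS_ON
    WHERE PNUMBER = PNO
    GROUP BY PNUMBER, PNAME
    ORDER BY PNUMBER
""").fetchall()

# Query 22: HAVING filters whole groups after the counts are computed.
q22 = conn.execute("""
    SELECT PNUMBER, PNAME, COUNT(*)
    FROM PROJECT, WORKS_ON
    WHERE PNUMBER = PNO
    GROUP BY PNUMBER, PNAME
    HAVING COUNT(*) > 2
""").fetchall()

print(q21)
print(q22)
```

WHERE removes individual joined tuples before grouping, while HAVING removes entire groups afterwards.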
More on Insert
WHERE LNAME='Brown'
Change the location and controlling department number of project number 10 to 'Bellaire' and
5, respectively.
UPDATE PROJECT
WHERE PNUMBER=10;
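A complete UPDATE of this kind needs a SET clause naming the columns to change. The sketch below uses Python's sqlite3 module; the column names PLOCATION and DNUM are assumptions, since the notes do not show the PROJECT schema, and the initial row is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# PLOCATION and DNUM are assumed names for the location and the
# controlling department number of a project.
conn.execute(
    "CREATE TABLE PROJECT (PNAME TEXT, PNUMBER INTEGER, PLOCATION TEXT, DNUM INTEGER)"
)
conn.execute("INSERT INTO PROJECT VALUES ('Computerization', 10, 'Stafford', 4)")

# SET lists the new values; WHERE selects the tuples to modify.
conn.execute("""
    UPDATE PROJECT
    SET PLOCATION = 'Bellaire', DNUM = 5
    WHERE PNUMBER = 10
""")

updated = conn.execute(
    "SELECT PLOCATION, DNUM FROM PROJECT WHERE PNUMBER = 10"
).fetchone()
print(updated)
```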
WHERE DNUMBER=DNO
GROUP BY DNAME
More on view
FROM WORKS_ON1
WHERE PNAME='PROJECTX'
More on View
Updating of Views
In general, an update on a view defined on a single table without any aggregate functions can be
mapped to an update on the base table.
More on Views
A view with a single defining table is updatable if the view attributes contain the primary key (PK)
or a candidate key (CK) of the base table.
Embedding SQL statements in a general-purpose language (C, C++, COBOL, PASCAL)
SQL can also be used in conjunction with a general purpose programming language, such as
PASCAL, COBOL, or PL/I. The programming language is called the host language. The
embedded SQL statement is distinguished from programming language statements by prefixing it
with a special character or command so that a preprocessor can extract the SQL statements. In PL/I
the keywords EXEC SQL precede any SQL statement. In some implementations, SQL statements
are passed as parameters in procedure calls. We will use PASCAL as the host programming
language, and a "$" sign to identify SQL statements in the program. Within an embedded SQL
command, we may refer to program variables, which are prefixed by a "%" sign. The programmer
should declare program variables to match the data types of the database attributes that the program
will process. These program variables may or may not have names that are identical to their
corresponding attributes.
Example: Write a program segment (loop) that reads a social security number and prints out
some information from the corresponding EMPLOYEE tuple.
E1: LOOP := 'Y';
while LOOP = 'Y' do
begin
writeln('input social security number:');
readln(SOC_SEC_NUM);
$SELECT FNAME, MINIT, LNAME, SSN,
BDATE, ADDRESS, SALARY
INTO %E.FNAME, %E.MINIT, %E.LNAME, %E.SSN,
%E.BDATE, %E.ADDRESS, %E.SALARY
FROM EMPLOYEE
WHERE SSN=%SOC_SEC_NUM ;
writeln( E.FNAME, E.MINIT, E.LNAME,
E.SSN, E.BDATE, E.ADDRESS, E.SALARY);
writeln('more social security numbers (Y or N)? ');
readln(LOOP)
end;
In E1, a single tuple is selected by the embedded SQL query; that is why we are able to assign its
attribute values directly to program variables. In general, an SQL query can retrieve many tuples.
The concept of a cursor is used to allow tuple-at-a-time processing by the PASCAL program.
CURSORS: We can think of a cursor as a pointer that points to a single tuple (row) from
the result of a query. The cursor is declared when the SQL query command is specified. A
subsequent OPEN cursor command fetches the query result and sets the cursor to a position before
the first row in the result of the query; this becomes the current row for the cursor. Subsequent
FETCH commands in the program advance the cursor to the next row and copy its attribute values
into PASCAL program variables specified in the FETCH command. An implicit variable
SQLCODE communicates to the program the status of SQL embedded commands. An SQLCODE
of 0 (zero) indicates successful execution. Different codes are returned to indicate exceptions and
errors. A special END_OF_CURSOR code is used to terminate a loop over the tuples in a query
result. A CLOSE cursor command is issued to indicate that we are done with
the result of the query. When a cursor is defined for rows that are to be updated, the clause FOR
UPDATE OF must be in the cursor declaration, and a list of the names of any attributes that will
be updated follows. The condition WHERE CURRENT OF cursor specifies that the current tuple
is the one to be updated (or deleted).
Example: Write a program segment that reads (inputs) a department name, then lists the names of
employees who work in that department, one at a time. The program reads a raise amount for each
employee and updates the employee's salary by that amount.
while SQLCODE = 0 do
begin
writeln('employee name: ', E.FNAME, E.MINIT, E.LNAME);
writeln('enter raise amount: '); readln(RAISE);
$UPDATE EMPLOYEE SET SALARY = SALARY +
%RAISE WHERE CURRENT OF EMP;
$FETCH EMP INTO %E.SSN, %E.FNAME, %E.MINIT,
%E.LNAME, %E.SAL;
end;
$CLOSE CURSOR EMP;
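The same tuple-at-a-time pattern appears in other host languages. The following sketch uses Python's sqlite3 module rather than embedded SQL in PASCAL: fetchone() plays the role of FETCH, and a None result plays the role of the END_OF_CURSOR code. It only reads rows and does not reproduce the WHERE CURRENT OF update; the sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (SSN TEXT, LNAME TEXT, SALARY INTEGER, DNO INTEGER)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [("1", "Smith", 30000, 5), ("2", "Wong", 40000, 5), ("3", "Zelaya", 25000, 4)],
)

# Executing the query corresponds to declaring and opening the cursor.
cursor = conn.execute(
    "SELECT LNAME, SALARY FROM EMPLOYEE WHERE DNO = ? ORDER BY LNAME", (5,)
)

names = []
total = 0
row = cursor.fetchone()        # first FETCH
while row is not None:         # None signals that no rows are left
    lname, salary = row        # copy attribute values into program variables
    names.append(lname)
    total += salary
    row = cursor.fetchone()    # advance to the next row

cursor.close()                 # done with the query result
print(names, total)
```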
Impedance Mismatch
Incompatibilities between a host programming language and the database model, e.g.,
o type mismatch and incompatibilities; requires a new binding for each language
o set vs. record-at-a-time processing; need special iterators to loop over query results and
manipulate individual values
Client program opens a connection to the database server
Client program submits queries to and/or updates the database
When database access is no longer needed, client program closes (terminates) the connection
Most SQL statements can be embedded in a general-purpose host programming language such as
COBOL, C, or Java.
An embedded SQL statement is distinguished from the host language statements by
enclosing it between EXEC SQL or EXEC SQL BEGIN and a matching END-EXEC or
EXEC SQL END (or semicolon).
Syntax may vary with language.
Shared variables (used in both languages) are usually prefixed with a colon (:) in SQL.
Variables inside DECLARE are shared and can appear (while prefixed by a colon) in SQL
statements.
SQLCODE is used to communicate errors/exceptions between the database and the program
int loop;
Connection (multiple connections are possible but only one is active):
CONNECT TO server-name AS connection-name
AUTHORIZATION user-account-info;
Change from an active connection to another one:
SET CONNECTION connection-name;
Disconnection:
DISCONNECT connection-name;
loop = 1;
while (loop) {
EXEC SQL
END-EXEC
}
A cursor (iterator) is needed to process multiple tuples
FETCH commands move the cursor to the next tuple
CLOSE CURSOR indicates that the processing of query results has been completed
Objective:
varchar sqlupdatestring[256];
EXECUTE sqlcommand;
Environment record:
Keeps track of database connections
Connection record:
Keeps track of info needed for a particular connection
Statement record:
Keeps track of info needed for one SQL statement
Description record:
Keeps track of tuples
Load SQL/CLI libraries
Declare record handle variables for the above components (called: SQLHSTMT, SQLHDBC,
SQLHENV, SQLHDESC)
Set up an environment record using SQLAllocHandle
Set up a connection record using SQLAllocHandle
Set up a statement record using SQLAllocHandle
Prepare a statement using SQL/CLI function SQLPrepare
Bind parameters to program variables
Execute SQL statement via SQLExecute
Bind query columns to a C variable via SQLBindCol
Use SQLFetch to retrieve column values into C variables
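The steps above have a rough counterpart in most call-level database interfaces. The sketch below is not SQL/CLI; it uses Python's sqlite3 module as an analogy, with a parameter marker in the statement text, a program variable bound to it at execution time, and the fetched columns copied into program variables. The table and its single row are sample data.

```python
import sqlite3

# Connection setup (SQL/CLI uses environment and connection records here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY, LNAME TEXT, SALARY INTEGER)")
conn.execute("INSERT INTO EMPLOYEE VALUES ('123456789', 'Smith', 30000)")

# Statement text with one parameter marker ('?').
query = "SELECT LNAME, SALARY FROM EMPLOYEE WHERE SSN = ?"

ssn = "123456789"              # program variable to bind to the marker
cur = conn.cursor()            # plays the role of a statement handle
cur.execute(query, (ssn,))     # bind the parameter and execute

row = cur.fetchone()           # fetch one result tuple
lname, salary = row            # column values copied into program variables

cur.close()
conn.close()
print(lname, salary)
```

Passing the value through a parameter marker, rather than pasting it into the SQL string, keeps the statement text fixed and lets the interface handle quoting.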
Persistent procedures/functions (modules) are stored locally and executed by the database
server
As opposed to execution by clients
Advantages:
If the procedure is needed by many applications, it can be invoked by any of them (thus
reducing duplication)
Execution by the server reduces communication costs
Enhance the modeling power of views
Disadvantages:
Every DBMS has its own syntax and this can make the system less portable
A stored procedure
CREATE PROCEDURE procedure-name
(params) local-declarations
procedure-body;
A stored function
CREATE FUNCTION fun-name
(params) RETURNS return-type
local-declarations
function-body;
Calling a procedure or function
CALL procedure-name/fun-name (arguments);
SQL/PSM:
Part of the SQL standard for writing persistent stored modules
SQL + stored procedures/functions + additional programming constructs
E.g., branching and looping statements
Enhance the power of SQL
RETURNS VARCHAR[7]
ENDIF;