Introduction to SQL: Structured Query Language (‘Sequel’)


CS 5614: Basic Data Definition and Modification in SQL 67

Introduction to SQL

† Structured Query Language (‘Sequel’)


† Serves as DDL as well as DML
† Declarative
† Say what you want without specifying how to do it
† One of the main reasons for commercial success of DBMSs

† Many standards and implementations


† ANSI SQL
† SQL-92/SQL-2 (Null operations, Outerjoins etc.)
† SQL3 (Recursion, Triggers, Objects)
† Vendor specific implementations

† “Bag Semantics” instead of “Set Semantics”


† Used in commercial RDBMSs

CS 5614: Basic Data Definition and Modification in SQL 68

Example:

† Create a Relation/Table in SQL

CREATE TABLE Students


(sid CHAR(9),
name VARCHAR(20),
login CHAR(8),
age INTEGER,
gpa REAL);

† Support for Basic Data Types


† CHAR(n)
† VARCHAR(n)
† BIT(n)
† BIT VARYING(n)
† INT/INTEGER
† FLOAT
† REAL, DOUBLE PRECISION
† DECIMAL(p,d)
† DATE, TIME etc.
CS 5614: Basic Data Definition and Modification in SQL 69

More Examples

† And one for Courses

CREATE TABLE Courses


(courseid CHAR(6),
department CHAR(20));

† And one for their relationship!

CREATE TABLE takes


(sid CHAR(9),
courseid CHAR(6));
† Why?

† Can also provide default values

CREATE TABLE Students


(sid CHAR(9),
....
age INTEGER DEFAULT 21,
gpa REAL);
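
A minimal sketch of the DEFAULT clause in action, run through Python's built-in sqlite3 module (using SQLite is an assumption made for illustration; the elided columns are filled in from the earlier Students definition):

```python
import sqlite3

# In-memory SQLite database, chosen only so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Students
    (sid   CHAR(9),
     name  VARCHAR(20),
     login CHAR(8),
     age   INTEGER DEFAULT 21,
     gpa   REAL)
""")

# Leave age out of the INSERT so that the DEFAULT clause supplies it.
conn.execute(
    "INSERT INTO Students (sid, name, login, gpa) VALUES (?, ?, ?, ?)",
    ("53688", "Mark", "mark2345", 3.9),
)
default_age = conn.execute("SELECT age FROM Students").fetchone()[0]
print(default_age)
```

The row comes back with age 21 even though the INSERT never mentioned it.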

CS 5614: Basic Data Definition and Modification in SQL 70

Examples Contd.

† DATE and TIME


† Implementations vary widely
† Typically treated as strings of a special form
† Allows comparisons of an ordinal nature (<, > etc.)

† DATE Example
† ‘1999-03-03’ (No Y2K problems)
† TIME Examples
† ‘15:30:29’
† ‘15:30:29.3875’
† Deleting a Relation/Table in SQL

DROP TABLE Students;


CS 5614: Basic Data Definition and Modification in SQL 71

Modifying Relation Schemas

† ‘Drop’ an attribute (column)

ALTER TABLE Students DROP login;

† ‘Add’ an attribute (column)

ALTER TABLE Students ADD phone CHAR(7);

† What happens to the new entry for the old records?


† Default is NULL, or say
ALTER TABLE Students ADD phone CHAR(7)
DEFAULT 'unknown';

† Always begin with ‘ALTER TABLE <TABLE_Name>’

† Can use DEFAULT even with regular definition (as in Slide 69)
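
What old records see after ADD can be checked directly. The sketch below assumes SQLite (via Python's sqlite3), which spells the statement ADD COLUMN; the fax column is a hypothetical second attribute added only to show the no-default case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid CHAR(9), name VARCHAR(20))")
conn.execute("INSERT INTO Students VALUES ('53688', 'Mark')")  # an "old" record

# One new column with a DEFAULT, one without.
conn.execute("ALTER TABLE Students ADD COLUMN phone CHAR(7) DEFAULT 'unknown'")
conn.execute("ALTER TABLE Students ADD COLUMN fax CHAR(7)")

old_row = conn.execute("SELECT sid, name, phone, fax FROM Students").fetchone()
print(old_row)
```

The old record reads 'unknown' for phone and NULL (None in Python) for fax.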

CS 5614: Basic Data Definition and Modification in SQL 72

How do you enter/modify data?

† INSERT command

INSERT
INTO Students
VALUES ('53688','Mark','mark2345',23,3.9);

† Cumbersome (use bulk loading; described later)


† DELETE command

DELETE
FROM Students S
WHERE S.name = 'Smith';

† UPDATE command

UPDATE Students S
SET S.age=S.age+1, S.gpa=S.gpa-1
WHERE S.sid = '53688';
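
The three commands can be tried end to end as follows. This is a sketch under the assumption of SQLite; it drops the alias S and writes unqualified column names in SET, since not every system accepts the aliased form. The second student row is made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students "
    "(sid CHAR(9), name VARCHAR(20), login CHAR(8), age INTEGER, gpa REAL)"
)
# INSERT: one row per tuple (executemany simply repeats the statement).
conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", [
    ("53688", "Mark", "mark2345", 23, 3.9),
    ("53650", "Smith", "smith123", 19, 3.2),
])
# DELETE: removes every tuple that satisfies the WHERE condition.
conn.execute("DELETE FROM Students WHERE name = 'Smith'")
# UPDATE: the right-hand sides are evaluated against the old values.
conn.execute("UPDATE Students SET age = age + 1, gpa = gpa - 1 WHERE sid = '53688'")

rows = conn.execute("SELECT sid, name, age, gpa FROM Students").fetchall()
print(rows)
```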
CS 5614: Basic Data Definition and Modification in SQL 73

Domains

† Domains: Similar to Structs and other user-defined types

CREATE DOMAIN Email AS CHAR(8) DEFAULT 'unknown';


....
login Email -- instead of login CHAR(8) DEFAULT 'unknown'

† Advantages: can be reused

junkaddress Email,
fromaddress Email,
toaddress Email,
....

† Can DROP DOMAINS too!

DROP DOMAIN Email;


† Affects only future declarations

CS 5614: Basic Data Definition and Modification in SQL 74

Keys

† To Specify Keys
† Use PRIMARY KEY or UNIQUE
† Declare alongside attribute
† For multiattribute keys, declare as a separate line
CREATE TABLE takes
( sid CHAR(9),
courseid CHAR(6),
PRIMARY KEY (sid,courseid)
);

† What’s the difference between PRIMARY KEY and UNIQUE?


† Typically only one PRIMARY KEY but any number of UNIQUE keys
† Implementor allowed to attach special significance
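
A small sketch of a multiattribute key being enforced, assuming SQLite through Python's sqlite3 (the course numbers are sample data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE takes
    (sid      CHAR(9),
     courseid CHAR(6),
     PRIMARY KEY (sid, courseid))
""")
conn.execute("INSERT INTO takes VALUES ('53688', 'CS5614')")
conn.execute("INSERT INTO takes VALUES ('53688', 'CS5604')")  # same sid, new course: allowed

try:
    conn.execute("INSERT INTO takes VALUES ('53688', 'CS5614')")  # repeats the whole key
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM takes").fetchone()[0]
print(duplicate_rejected, row_count)
```

Only the combination (sid, courseid) has to be unique; either attribute alone may repeat.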
CS 5614: Basic Data Definition and Modification in SQL 75

Creating Indices/Indexes

† Why?
† Speeds up query processing time
† For Students

CREATE INDEX indexone ON Students(sid);


CREATE INDEX indextwo ON Students(login);

† How to decide attributes to place indices on?


† One is (typically) created by default on PRIMARY KEY
† Creation of indices on UNIQUE attributes is implementation-dependent
† In general, physical database design/tuning is very difficult!
† Use Tools: Microsoft SQL Server has an index selection Wizard

† Why not place indices on all attributes?


† Too cumbersome for insertions/deletions/updates

† Like all things in computer science, there is a tradeoff! :-)
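
One way to see whether an index is actually used is to ask the system for its query plan. The sketch below assumes SQLite, whose EXPLAIN QUERY PLAN statement is vendor-specific; other systems expose plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid CHAR(9), name VARCHAR(20), login CHAR(8))")
conn.execute("CREATE INDEX indexone ON Students(sid)")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable plan step.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

indexed_plan = plan("SELECT * FROM Students WHERE sid = '53688'")
unindexed_plan = plan("SELECT * FROM Students WHERE name = 'Mark'")
print(indexed_plan)
print(unindexed_plan)
```

The lookup on sid mentions indexone, while the lookup on name has no index to use and falls back to scanning the table.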

CS 5614: Basic Data Definition and Modification in SQL 76

Other Properties

† ‘NOT NULL’ instead of DEFAULT

CREATE TABLE Students


(sid CHAR(9),
name VARCHAR(20),
login CHAR(8),
age INTEGER,
gpa REAL);

† Can insert a tuple without a value for gpa


† NULL will be inserted
† If we had specified

gpa REAL NOT NULL);

† insert cannot be made!

In this module, we will study three different query languages/representations used in conjunction with the relational model. The first is relational algebra, an algebraic and procedural way for creating new relations from given ones. The second is Datalog, which is logical and declarative in nature (students familiar with the PROLOG programming language will find this all too natural). In fact, there are close correspondences between database systems and PROLOG. PROLOG can be thought of as a database system where all the data fits into main memory. In contrast, a distinguishing feature of DBMSs is that they operate on secondary storage. Other than that (and some differences in query processing), there are excellent analogs to both cultures. Both PROLOG and DBMSs are declarative. What we know to be relations are referred to as ‘predicates’ in PROLOG. A tuple is called a ground fact in PROLOG. A table is called an ‘extensional definition’ in PROLOG and so on. (Do not get bogged down by these specifics; we include them here just so that you can make the connection, if you are already familiar with PROLOG. Else, nothing to worry.) The third query representation is, of course, SQL that was introduced earlier. In the remainder of this document, we will introduce basic operations and manipulations that we can perform on relations. For each such basic operation, we will show how it is represented in each of the three different notations.

1. Union of Relations: The union of two relations R and S is the set of elements that are in R or in S or both. We assume that the schemas of R and S are alike (of course) and that their columns are also ordered alike (of course, again). We now give the three representations of the union relation:

R ∪ S (simple, right?)

T(x,y,z,w) <- R(x,y,z,w).
T(x,y,z,w) <- S(x,y,z,w).

Notice that the variables x,y,z,w are merely ‘placeholders’ used for pattern matching; we could have written the above two as:

T(a,b,c,d) <- R(a,b,c,d).
T(e,f,g,h) <- S(e,f,g,h).

(SELECT *
FROM R)
UNION
(SELECT *
FROM S)

2. Difference of Relations: The difference R − S of two relations R and S is the set of elements that are in R but not in S. As usual, we assume that the schemas of R and S are alike and that their columns are also ordered alike. Moreover, notice that R − S is not (generally) the same as S − R.

R − S

T(x,y,z,w) <- R(x,y,z,w), NOT S(x,y,z,w).

(SELECT *
FROM R)
EXCEPT
(SELECT *
FROM S)

3. Intersection of Relations: The intersection R ∩ S of two relations R and S is the set of elements that are in both R and S. Again, we assume that the schemas of R and S are alike and that their columns are also ordered alike. Notice that R ∩ S = R − (R − S).

R ∩ S

T(x,y,z,w) <- R(x,y,z,w), S(x,y,z,w).

(SELECT *
FROM R)
INTERSECT
(SELECT *
FROM S)

4. Projection: Operates on a single relation and removes some of the columns. Useful for restricting information. Assume that we want only the name and address from relation R.

π_{name,address}(R)

T(x,y) <- R(x,y,z,w).

Thus z,w become irrelevant attributes. We could instead reinforce this by writing

T(x,y) <- R(x,y,_,_).

SELECT name, address
FROM R

5. Selection: Operates on a single relation and removes some of the rows. The removal is based on some condition specified by the user. For example, suppose we want all the tuples from R where the name is ‘Michael’.

σ_{name='Michael'}(R)

T(x,y,z,w) <- R(x,y,z,w), x = 'Michael'.

SELECT *
FROM R
WHERE name = 'Michael'

6. More Selections: Selections become complicated when the conditions get longer, particularly with Datalog (the other two forms are pretty straightforward). Consider, for example, when we want all the tuples from R where the name is ‘Michael’ OR when the gender is ‘M’. We write this in Datalog as:

T(x,y,z,w) <- R(x,y,z,w), x = 'Michael'.
T(x,y,z,w) <- R(x,y,z,w), z = 'M'.

Notice that we ‘split’ the condition across two rules, just as we do in the union case (come to think of it, the OR is indeed the union of two conditions). Similarly, the comma in each of the above Datalog rules models the AND condition. Let’s consider a more complicated condition: Select all the tuples from R that are neither male nor have the name ‘Fox’.

T(x,y,z,w) <- R(x,y,z,w), z <> 'M', x <> 'Fox'.

While selecting the tuples from R that are not both male and have the name ‘Fox’ is achieved by (why?):

T(x,y,z,w) <- R(x,y,z,w), z <> 'M'.
T(x,y,z,w) <- R(x,y,z,w), x <> 'Fox'.

7. Cartesian Product: This is the set of ‘pairs’ formed by choosing the first element from R and the second element from S. In case of confusions among attribute names, disambiguate them by prefixing them with the relation name. The cartesian product of two relations R and S is given by:

R × S

T(x1,y1,z1,w1,x2,y2,z2,w2) <- R(x1,y1,z1,w1), S(x2,y2,z2,w2).

SELECT R.name, R.address, R.gender, R.birthdate,
S.name, S.address, S.gender, S.birthdate
FROM R, S

Notice how we disambiguate attributes in the SQL version. Also notice that if R has m tuples and S has n tuples, then R × S will have mn tuples.

8. Theta-Join: This is just like the cartesian product but goes a step further. After forming the mn tuples, it selects only a subset of them to include in its answer, based on some condition. Thus, the theta-join of two relations has the same number of columns as the cartesian product but not necessarily the same number of rows (some of the rows will be removed because they dissatisfy some condition). Consider, for example, that we want to find ‘pairs’ of students such that the first person in the pair is always ‘Michael’. We get:

R ⋈_{R.name='Michael'} S

T(x1,y1,z1,w1,x2,y2,z2,w2) <- R(x1,y1,z1,w1), S(x2,y2,z2,w2), x1 = 'Michael'.

SELECT R.name, R.address, R.gender, R.birthdate,
S.name, S.address, S.gender, S.birthdate
FROM R, S
WHERE R.name = 'Michael'

Notice that we introduce the ‘bow tie’ symbol ⋈ suffixed by the condition for indicating the theta-join. Also, realize that

R ⋈_C S = σ_C(R × S)

9. Natural Join: This is just a cleverer way to combine two relations into one. The basic idea is that if the two relations have some column(s) in common, then we can ‘collapse’ them into the same column in the final output. Moreover, we can do this only if the two tuples from the two relations agree in those common columns. Thus, it is similar to the cartesian product, but we ‘join’ only those pairs that match in their common attributes. Consider that we want to find the name, address, gender, gpa and birthdate of students in a single relation. Notice that gpa is available from L but the other four attributes are present in R. So, we need a way to intelligently combine these two relations:

R ⋈ L

T(x,y,z,v,w) <- R(x,y,z,w), L(x,y,v).

SELECT R.name, R.address, R.gender, L.gpa, R.birthdate
FROM R, L
WHERE R.name = L.name
AND R.address = L.address

10. Renaming: This is just a cool thing, in case we have too many naming conflicts and confusions arising. We can use this operator to rename a relation’s name and/or one or more of its attributes. For example, assume we want to rename R to S and make its columns to be called a1, a2 and a3. This is most useful with relational algebra, like so:

ρ_{S(a1,a2,a3)}(R)
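
The set operations, selection and the cartesian product can be tried out on small sample data. The sketch below assumes SQLite through Python's sqlite3; the rows and the schema (name, address, gender, birthdate) are made up for illustration, and the parentheses around each SELECT are dropped because SQLite does not accept them inside a compound query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("R", "S"):
    conn.execute(f"CREATE TABLE {table} (name, address, gender, birthdate)")
conn.executemany("INSERT INTO R VALUES (?, ?, ?, ?)", [
    ("Michael", "Blacksburg", "M", "1975-01-01"),
    ("Amy", "Roanoke", "F", "1976-02-02"),
])
conn.executemany("INSERT INTO S VALUES (?, ?, ?, ?)", [
    ("Amy", "Roanoke", "F", "1976-02-02"),
    ("Fox", "Radford", "M", "1974-03-03"),
])

def names(sql):
    """Run a query and return the first column, sorted, for easy comparison."""
    return sorted(row[0] for row in conn.execute(sql))

union = names("SELECT name FROM (SELECT * FROM R UNION SELECT * FROM S)")
difference = names("SELECT name FROM (SELECT * FROM R EXCEPT SELECT * FROM S)")
intersection = names("SELECT name FROM (SELECT * FROM R INTERSECT SELECT * FROM S)")
# Intersection rewritten as R - (R - S).
via_difference = names(
    "SELECT name FROM (SELECT * FROM R EXCEPT "
    "SELECT * FROM (SELECT * FROM R EXCEPT SELECT * FROM S))"
)
selection = names("SELECT * FROM R WHERE name = 'Michael'")
product_size = conn.execute("SELECT COUNT(*) FROM R, S").fetchone()[0]
print(union, difference, intersection, selection, product_size)
```

With two tuples in each relation the product has 2 x 2 = 4 tuples, and the rewritten intersection agrees with INTERSECT.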
CS 5614: Misc. SQL Stuff, Safety in Queries 81

What is still to be covered


(and will be)

† Declaring constraints
† Domain Constraints
† Referential Integrity (Foreign Keys)
† More SQL Stuff
† Subqueries
† Aggregation

† SQL Peculiarities
† Strange Phenomena
† More on Bag Semantics
† Ifs and Buts

† Embedding SQL in a Programming Environment


† Accessing DBs from within a PL
† (will be covered in Module 3)

CS 5614: Misc. SQL Stuff, Safety in Queries 82

What will be mentioned


(but not covered in detail)

† Triggers
† Read Cow Book or Boat Book
† More SQL Gory Details

† Recursive Queries (SQL3)


† Why do we need these?
† Security

† Authorization and Privacy

† Trends towards Object Oriented DBMSs


CS 5614: Misc. SQL Stuff, Safety in Queries 83

Tuple-Based Domain Constraints

† Already Seen
† NOT NULL
† UNIQUE, PRIMARY KEY etc.
† In General

CREATE TABLE Students


(sid CHAR(9),
name VARCHAR(20),
login CHAR(8),
age INTEGER,
gpa REAL,
CHECK (gpa >= 0.0)
);

† Note: Implementations vary, but this is the general idea

† Other Complicated Forms


† Constraints on whole relations, Assertions
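
A sketch of the CHECK constraint being enforced, assuming SQLite via Python's sqlite3 (the second student row is sample data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Students
    (sid   CHAR(9),
     name  VARCHAR(20),
     login CHAR(8),
     age   INTEGER,
     gpa   REAL,
     CHECK (gpa >= 0.0))
""")
conn.execute("INSERT INTO Students VALUES ('53688', 'Mark', 'mark2345', 23, 3.9)")

try:
    # A negative gpa violates the CHECK and the whole INSERT is refused.
    conn.execute("INSERT INTO Students VALUES ('53650', 'Smith', 'smith123', 19, -1.0)")
    negative_rejected = False
except sqlite3.IntegrityError:
    negative_rejected = True

stored = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
print(negative_rejected, stored)
```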

CS 5614: Misc. SQL Stuff, Safety in Queries 84

Referential Integrity Constraints

† Foreign Keys
† An attribute a of R1 is a foreign key if it “references”
the primary key (say b) of another relation R2
† In addition, there is a ref. integrity constraint from R1 to R2.
† Example
† login is a FOREIGN KEY for Students
CREATE TABLE Students
(sid CHAR(9) PRIMARY KEY,
name VARCHAR(20),
login CHAR(8)
REFERENCES Accounts(acct),
age INTEGER,
gpa REAL
);

CREATE TABLE Accounts


(
acct CHAR(8) PRIMARY KEY
);
CS 5614: Misc. SQL Stuff, Safety in Queries 85

Alternatively

† Can use “FOREIGN KEY” construct

CREATE TABLE Students


(sid CHAR(9) PRIMARY KEY,
name VARCHAR(20),
login CHAR(8),
age INTEGER,
gpa REAL,
FOREIGN KEY (login)
REFERENCES Accounts(acct)
);

CREATE TABLE Accounts


(
acct CHAR(8) PRIMARY KEY
);

† Note: acct should be declared as PRIMARY KEY for Accounts!


† in both cases
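
A sketch of the referential integrity check, assuming SQLite via Python's sqlite3. Two SQLite-specific points are assumptions of this example rather than part of the slides: foreign keys are only enforced after PRAGMA foreign_keys = ON, and Accounts is created first for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE Accounts (acct CHAR(8) PRIMARY KEY)")
conn.execute("""
    CREATE TABLE Students
    (sid   CHAR(9) PRIMARY KEY,
     name  VARCHAR(20),
     login CHAR(8),
     age   INTEGER,
     gpa   REAL,
     FOREIGN KEY (login) REFERENCES Accounts(acct))
""")
conn.execute("INSERT INTO Accounts VALUES ('mark2345')")
conn.execute("INSERT INTO Students VALUES ('53688', 'Mark', 'mark2345', 23, 3.9)")

try:
    # 'nobody' is not an existing account, so the reference dangles.
    conn.execute("INSERT INTO Students VALUES ('53650', 'Smith', 'nobody', 19, 3.2)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True
print(dangling_rejected)
```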

CS 5614: Misc. SQL Stuff, Safety in Queries 86

SQL Subqueries

† Given
Students(sid,name,login,age,gpa)
HasCar(sid,carname)
† Find
† the car of the student with login='mark'
† Traditional Way

SELECT carname
FROM Students, HasCar
WHERE Students.login='mark'
AND Students.sid=HasCar.sid;

† The ‘Subway’

SELECT carname
FROM HasCar
WHERE sid=
(SELECT sid FROM Students
WHERE login='mark');
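
Both formulations can be run side by side to confirm they agree. This is a sketch assuming SQLite via Python's sqlite3, with made-up students and cars:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid, name, login, age, gpa)")
conn.execute("CREATE TABLE HasCar (sid, carname)")
conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", [
    ("53688", "Mark", "mark", 23, 3.9),
    ("53650", "Smith", "smith", 19, 3.2),
])
conn.executemany("INSERT INTO HasCar VALUES (?, ?)", [
    ("53688", "Civic"),
    ("53650", "Beetle"),
])

join_way = conn.execute("""
    SELECT carname
    FROM Students, HasCar
    WHERE Students.login = 'mark'
      AND Students.sid = HasCar.sid
""").fetchall()

subquery_way = conn.execute("""
    SELECT carname
    FROM HasCar
    WHERE sid = (SELECT sid FROM Students WHERE login = 'mark')
""").fetchall()
print(join_way, subquery_way)
```

The comparison sid = (subquery) relies on the subquery yielding a single value, which holds here because only one student has the login 'mark'.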
CS 5614: Misc. SQL Stuff, Safety in Queries 87

Aggregation

† Given
Students(sid,name,login,age,gpa)

† Find
† the average of the ages of all the students
† Solution

SELECT AVG(age)
FROM Students;

† Other Operations
† SUM (summation of all the values in a column)
† MIN (least value)
† MAX (highest value)
† COUNT (the number of values), e.g.
SELECT COUNT(*)
FROM Students;

† COUNTs the number of Students!

CS 5614: Misc. SQL Stuff, Safety in Queries 88

Ordering

† Given
Students(sid,name,login,age,gpa)

† List
† the students in (ascending) alphabetical order of name
† Solution

SELECT *
FROM Students
ORDER BY name;

† and that for DESCending ORDER is

SELECT *
FROM Students
ORDER BY name DESC;

† Default is ASC
CS 5614: Misc. SQL Stuff, Safety in Queries 89

Grouping

† Given
Students(sid,name,login,age,gpa)

† Find
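† the students with gpa=4.0 and
† group people with like ages together (one result row per age)

† Solution

SELECT age, COUNT(*)
FROM Students
WHERE gpa=4.0
GROUP BY age;

† Grouping is by age; in standard SQL the SELECT list may then contain only age and aggregates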

CS 5614: Misc. SQL Stuff, Safety in Queries 90

More on Grouping

† Given
Students(sid,name,login,age,gpa)

† Find
† the students with gpa=4.0 and
† group people with like ages together and
† show only those groups that have more than 2 students in them

† Solution

SELECT age, COUNT(*)
FROM Students
WHERE gpa=4.0
GROUP BY age
HAVING COUNT(*) > 2;
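
A sketch of grouping by age with and without HAVING, assuming SQLite via Python's sqlite3 and six made-up students. It also shows the execution order at work: WHERE removes rows before the groups are formed, HAVING removes groups afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid, name, login, age, gpa)")
conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", [
    ("1", "Ann", "ann", 20, 4.0),
    ("2", "Bob", "bob", 20, 4.0),
    ("3", "Cat", "cat", 20, 4.0),
    ("4", "Dan", "dan", 21, 4.0),
    ("5", "Eve", "eve", 21, 4.0),
    ("6", "Fay", "fay", 20, 3.0),   # dropped by WHERE before grouping
])

by_age = conn.execute("""
    SELECT age, COUNT(*)
    FROM Students
    WHERE gpa = 4.0
    GROUP BY age
    ORDER BY age
""").fetchall()

large_groups = conn.execute("""
    SELECT age, COUNT(*)
    FROM Students
    WHERE gpa = 4.0
    GROUP BY age
    HAVING COUNT(*) > 2
""").fetchall()
print(by_age, large_groups)
```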
CS 5614: Misc. SQL Stuff, Safety in Queries 91

Summary of SQL Syntax

† General Form

SELECT <attribute(s)>
FROM <relation(s)>
WHERE <condition(s)>
GROUP BY <attribute(s)>
HAVING <grouping condition(s)>

† Order of Execution
† FROM
† WHERE
† GROUP BY
† HAVING
† SELECT

CS 5614: Misc. SQL Stuff, Safety in Queries 92

Views

† Can be viewed as temporary relations


† do not exist physically BUT
† can be queried and modified (sometimes) just like normal relations

† Example:

CREATE VIEW GoodStudents(id,name) AS


SELECT sid,name
FROM Students
WHERE gpa=4.0;

SELECT *
FROM GoodStudents
WHERE name='Mark';

† Can we update the original relation using the GoodStudents VIEW?
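
A sketch of a view in use, assuming SQLite via Python's sqlite3. Whether a view can be updated is system-dependent: SQLite refuses updates through a view unless triggers are defined, while other systems accept them for simple views like this one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid CHAR(9), name VARCHAR(20), gpa REAL)")
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)", [
    ("53688", "Mark", 4.0),
    ("53650", "Smith", 3.2),
])
conn.execute("""
    CREATE VIEW GoodStudents(id, name) AS
    SELECT sid, name FROM Students WHERE gpa = 4.0
""")
view_rows = conn.execute("SELECT * FROM GoodStudents WHERE name = 'Mark'").fetchall()

# The view stores no rows of its own: a change to Students shows up immediately.
conn.execute("UPDATE Students SET gpa = 4.0 WHERE name = 'Smith'")
view_size = conn.execute("SELECT COUNT(*) FROM GoodStudents").fetchone()[0]

try:
    conn.execute("UPDATE GoodStudents SET name = 'Marcus' WHERE id = '53688'")
    view_updatable = True
except sqlite3.Error:
    view_updatable = False   # the outcome in SQLite
print(view_rows, view_size, view_updatable)
```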


CS 5614: Misc. SQL Stuff, Safety in Queries 93

Beginning of Weird Stuff

† SQL uses Bag Semantics


† meaning: does not normally eliminate duplicates
† e.g. the SELECT clause

† BUT (a big BUT)

† this doesn’t apply to


† UNION, INTERSECT and EXCEPT (difference), which eliminate duplicates by default

† Either way, it provides facilities to do whatever we want

† If you want duplicates eliminated in SELECT clause


† use SELECT DISTINCT .....

† If you want to prevent elimination of duplicates in UNION etc.


† use (SELECT ...) UNION ALL (SELECT ...)
† Likewise INTERSECT ALL and EXCEPT ALL
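
The duplicate behaviour is easy to observe on a relation that holds the same value twice. A sketch assuming SQLite via Python's sqlite3 (SQLite supports UNION ALL but not INTERSECT ALL or EXCEPT ALL, so only UNION is shown):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (A)")
conn.executemany("INSERT INTO R VALUES (?)", [(1,), (1,), (2,)])

def count(sql):
    return len(conn.execute(sql).fetchall())

plain = count("SELECT A FROM R")                                  # bag: duplicates kept
distinct = count("SELECT DISTINCT A FROM R")                      # set: duplicates removed
union = count("SELECT A FROM R UNION SELECT A FROM R")            # set semantics
union_all = count("SELECT A FROM R UNION ALL SELECT A FROM R")    # bag semantics
print(plain, distinct, union, union_all)
```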

CS 5614: Misc. SQL Stuff, Safety in Queries 94

... and that’s just the tip of the iceberg

† What happens with the following code?

SELECT R.A
FROM R,S,T
WHERE R.A = S.A OR R.A = T.A;

† when R(A) has {2,3}, S(A) has {3,4} and T(A) is {}

† Confusion Reigns!
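
Running the query settles the question. The sketch assumes SQLite via Python's sqlite3; the value 9 inserted into T afterwards is arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("R", "S", "T"):
    conn.execute(f"CREATE TABLE {table} (A)")
conn.executemany("INSERT INTO R VALUES (?)", [(2,), (3,)])
conn.executemany("INSERT INTO S VALUES (?)", [(3,), (4,)])

query = "SELECT R.A FROM R, S, T WHERE R.A = S.A OR R.A = T.A"

# T is empty, so the product R x S x T has no tuples for WHERE to test.
with_empty_t = conn.execute(query).fetchall()

# Any tuple in T, even an unrelated one, brings the expected answer back.
conn.execute("INSERT INTO T VALUES (9)")
with_nonempty_t = conn.execute(query).fetchall()
print(with_empty_t, with_nonempty_t)
```

One might expect the intersection of R with the union of S and T, that is {3}. But FROM builds the cartesian product of all three relations first, and a product with an empty relation is empty, so the first result is empty.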
CS 5614: Misc. SQL Stuff, Safety in Queries 95

Safety in Queries

† Some queries are inherently “unsafe”


† should not be permitted in DB access
† Example
† Given only the following relation
Students(id)

† Find all those who are not students

† Easy to distinguish unsafe queries via common-sense


† Final result is not closed
† Is there an automatic way to determine “safety”?

CS 5614: Misc. SQL Stuff, Safety in Queries 96

Answer: Yes!

† Easiest to spot when written in Datalog

Answer(id) <- NOT Students(id).

† Golden Rule
† Any variable that appears anywhere must also
appear in a non-negated body part
† In this case, id causes the query to be unsafe
† Example of a Safe Query

Answer(id) <- People(id), NOT Students(id)

† This produces all those people who are NOT students


† safe because the People relation provides a reference point
† id which appears in a negated body part also appears non-negated
CS 5614: Misc. SQL Stuff, Safety in Queries 97

More Dangers

† Problem not restricted to negated body parts


† occurs even with arithmetic body parts (why?)
† Given
† only the following relation
Students(id,age)

† Find all those numbers that are greater than the age of some student
Answer(x) <- Students(id,age), x>age.

† Extension to previous rule


† Any variable that appears anywhere must also
appear in a non-negated, non-arithmetic body part
† In this case, x causes the query to be unsafe
because it doesn’t appear in a non-negated, non-arithmetic part
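
The extended safety rule is mechanical enough to check in a few lines. The sketch below uses a hypothetical representation in which a rule is described only by the variables of its head and of each body part; the function name and argument layout are assumptions made for illustration.

```python
def unsafe_variables(head, positive, negated=(), arithmetic=()):
    """Return the variables that break the safety rule.

    head:       variables of the rule head
    positive:   one sequence of variables per non-negated relational body part
    negated:    one sequence of variables per negated body part
    arithmetic: one sequence of variables per comparison such as x > age
    """
    # Only non-negated, non-arithmetic body parts can "limit" a variable.
    limited = {v for subgoal in positive for v in subgoal}
    mentioned = set(head) | limited
    for subgoal in (*negated, *arithmetic):
        mentioned.update(subgoal)
    return mentioned - limited

# Answer(id) <- NOT Students(id).
only_negated = unsafe_variables(["id"], [], negated=[["id"]])
# Answer(id) <- People(id), NOT Students(id).
with_people = unsafe_variables(["id"], [["id"]], negated=[["id"]])
# Answer(x) <- Students(id,age), x>age.
with_arithmetic = unsafe_variables(["x"], [["id", "age"]], arithmetic=[["x", "age"]])
print(only_negated, with_people, with_arithmetic)
```

An empty result means the rule is safe; otherwise the returned variables are the ones with no reference point.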

CS 5614: Misc. SQL Stuff, Safety in Queries 98

One More Example

† Given
† a relation Composite(x)
† which lists all the composite numbers
† Write a query to find
† the prime numbers
† Wrong Way
† Prime(x) <- NOT Composite(x).
† Right Way
† Prime(x) <- Number(x), NOT Composite(x).
† Safety in Other Notations
† Relational Algebra: via the subtraction operator
† SQL: via the EXCEPT construct

† Notice how SQL and Relational Algebra do not allow unsafe queries
† because there is no way to write such queries with the given constructs
† how clever, eh? :-)
† It is always amazing how “languages” force you to think in a certain manner
† a problem long studied by philosophers
CS 5614: Misc. SQL Stuff, Safety in Queries 99

Recursion in Queries

† Used to specify an indefinite number of “applications” of a relation

† Example
† Given only the following relation
Person(name,parent)

† Find all the ancestors of “Mark”

† Easy to find an ancestor at a predefined level


† parent: Use Person
† grandparent: Join Person with Person
† great-grandparent: Join Person with Person with Person
† and so on.

† To find an ancestor at no predefined level


† Need to join Person with Person an “indefinite” number of times

† SQL3 provides support for recursive definitions

CS 5614: Misc. SQL Stuff, Safety in Queries 100

Solution in Datalog

† First, the base case

Ancestor(x,y) <- Person(x,y).

† Then, the inductive step

Ancestor(x,y) <- Person(x,z), Ancestor(z,y).

† Can also write the previous rule as

Ancestor(x,y) <- Ancestor(x,z), Ancestor(z,y).

† why?
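
The two rules can be evaluated bottom-up by applying them repeatedly until no new facts appear. A minimal sketch of this naive fixpoint iteration in plain Python, with a made-up Person(name,parent) relation stored as a set of pairs:

```python
def ancestors(person):
    """Naive bottom-up evaluation of the Ancestor rules over Person(name, parent)."""
    ancestor = set(person)                  # Ancestor(x,y) <- Person(x,y).
    while True:
        derived = {                         # Ancestor(x,y) <- Person(x,z), Ancestor(z,y).
            (x, y)
            for (x, z) in person
            for (z2, y) in ancestor
            if z == z2
        }
        if derived <= ancestor:             # nothing new: fixpoint reached
            return ancestor
        ancestor |= derived

person = {("Mark", "Ann"), ("Ann", "Bob"), ("Bob", "Cat")}
marks_ancestors = sorted(y for (x, y) in ancestors(person) if x == "Mark")
print(marks_ancestors)
```

Each pass of the loop reaches one generation further back, and the loop stops as soon as a pass adds nothing.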
CS 5614: Misc. SQL Stuff, Safety in Queries 101

Recursion in SQL3

† Use the “WITH RECURSIVE ... SELECT” construct

† Example

WITH RECURSIVE Ancestor(name,ans) AS
(
(SELECT *
FROM Person)
UNION
(SELECT Person.name,Ancestor.ans
FROM Person, Ancestor
WHERE Person.parent=Ancestor.name)
)

SELECT * FROM Ancestor;

† Use with caution: Some kinds of recursive queries will not be allowed!
† example: the following Datalog query might not be allowed in SQL3
Ancestor(x,y) <- Ancestor(x,z), Ancestor(z,y).

† because the rule involves 2 applications of the recursively defined predicate


† “Linear recursion” allows only one (as in the SQL code above)
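
A linear recursive query of this kind can be run in SQLite, which supports WITH RECURSIVE. The sketch assumes SQLite via Python's sqlite3 and a made-up Person table; the inner parentheses around each SELECT are omitted because SQLite does not accept them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (name, parent)")
conn.executemany("INSERT INTO Person VALUES (?, ?)", [
    ("Mark", "Ann"),
    ("Ann", "Bob"),
    ("Bob", "Cat"),
])

rows = conn.execute("""
    WITH RECURSIVE Ancestor(name, ans) AS (
        SELECT name, parent FROM Person          -- base case
        UNION
        SELECT Person.name, Ancestor.ans         -- inductive step, one use of Ancestor
        FROM Person, Ancestor
        WHERE Person.parent = Ancestor.name
    )
    SELECT ans FROM Ancestor WHERE name = 'Mark' ORDER BY ans
""").fetchall()
marks_ancestors = [ans for (ans,) in rows]
print(marks_ancestors)
```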

CS 5614: Misc. SQL Stuff, Safety in Queries 102

Final Example

† Be careful when combining negation, aggregation and recursion


† perfect recipe for disaster!
† Mutual Recursion
† Odd(x) <- Number(x), NOT Even(x).
† Even(x) <- Number(x), NOT Odd(x).
† What are the problems?
† Notice that the query appears “safe” (per Slide 96)
† cycles indefinitely!; no proper base cases

† Illegal in SQL3
† not because of mutual recursion
† but due to the fact that there is no “unique interpretation” to the query
† Eg: 6 could be either in Odd or in Even; both are acceptable!

† Sometimes mutual recursion is good and fruitful, if written properly


† with proper limiting constraints and base cases
CS 5614: Misc. SQL Stuff, Safety in Queries 103

Introduction to Deductive DBMSs

† Intersection of traditional RDBMSs and Logic Programming

† Example Systems
† CORAL (Univ. Wisc.)
† LDL++ (MCC)
† XSB Systems (SUNY, Stony Brook)
† Can be viewed as
† extending PROLOG-type systems with secondary storage
† extending RDBMSs with deductive functionality
† Mappings: Commonalities between PROLOG and DBMSs
† Predicate: Relation
† Argument: Attribute
† Ground Fact: Tuple
† Extensional Definition: Table (defined by data)
† Intensional Definition: Table (defined by a view)

CS 5614: Misc. SQL Stuff, Safety in Queries 104

PROLOG vs. RDBMSs

† Characteristics of PROLOG
† Tuple-at-a-time
† Backward Chaining
† Top-Down
† Goal-Oriented
† Fixed-Evaluation Strategy (Depth-First)
† Characteristics of RDBMSs
† Set-at-a-time (recall relational algebra)
† Forward Chaining
† Bottom-Up
† Query Optimizer figures a good evaluation strategy

† Example
† ancestor(X,X). parent(amy,bob).
† ancestor(X,Y) <- parent(X,Z), ancestor(Z,Y).
† Query
† Find the ancestors of bob: ancestor(X,bob)?
CS 5614: Misc. SQL Stuff, Safety in Queries 105

PROLOG Pitfalls

† Previous Example
† Linear Recursion
† Tail Recursion
† What if we reverse the order of clauses in
† ancestor(X,Y) <- parent(X,Z), ancestor(Z,Y).
† PROLOG goes into an infinite loop (why?)
† What if we make it
† ancestor(X,Y) <- ancestor(X,Z), ancestor(Z,Y).
† “Not Linear” Recursion
† Inference = Resolution + Unification
† Entailment in First Order Logic is Semi-decidable

CS 5614: Misc. SQL Stuff, Safety in Queries 106

Example of Deductive Query Optimization

† Same-Generation: Hello World of DDBMSs


† sg(X,Y) <- flat(X,Y).
† sg(X,Y) <- up(X,U),sg(U,V),down(V,Y).
† Magic: A Rewriting Technique
† Rewrite query such that advantages of bottom-up evaluation and
goal-oriented behavior are combined

† Example: For the query


† sg(john,Z)?
† Magic produces
† sg(X,Y) <- magic_sg(X),flat(X,Y).
† sg(X,Y) <- magic_sg(X),up(X,U),sg(U,V),down(V,Y).
† magic_sg(john).
† magic_sg(U) <- magic_sg(X),up(X,U). (passes the goal on to the recursive call)
† How do you know when to stop?
† Iterative Fixpoint Evaluation (when the answer stops changing)
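
A minimal sketch of iterative fixpoint evaluation over the magic-rewritten program, in plain Python with made-up up, flat and down facts. It includes the usual propagation rule magic_sg(U) <- magic_sg(X), up(X,U), which the recursive subgoal sg(U,V) needs in order to be covered by a magic fact.

```python
def same_generation(seed, up, flat, down):
    """Bottom-up evaluation of the magic-rewritten sg program for sg(seed, Z)?"""
    magic = {seed}                      # magic_sg(john).
    sg = set()
    while True:
        # magic_sg(U) <- magic_sg(X), up(X,U).
        new_magic = {u for (x, u) in up if x in magic}
        # sg(X,Y) <- magic_sg(X), flat(X,Y).
        new_sg = {(x, y) for (x, y) in flat if x in magic}
        # sg(X,Y) <- magic_sg(X), up(X,U), sg(U,V), down(V,Y).
        new_sg |= {
            (x, y)
            for (x, u) in up if x in magic
            for (u2, v) in sg if u2 == u
            for (v2, y) in down if v2 == v
        }
        if new_magic <= magic and new_sg <= sg:
            return sg                   # the answer stopped changing
        magic |= new_magic
        sg |= new_sg

up = {("john", "p1"), ("zed", "q1")}
flat = {("p1", "p2"), ("q1", "q2")}
down = {("p2", "mary"), ("q2", "yan")}

facts = same_generation("john", up, flat, down)
answers = sorted(y for (x, y) in facts if x == "john")
print(answers)
```

The facts about zed are never touched: the magic predicate restricts the bottom-up computation to what the query sg(john,Z) can reach.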
CS 5614: Misc. SQL Stuff, Safety in Queries 107

SQL in a Programming Environment

† Incorporating SQL in a complete application

† Why?
† There are some things we cannot do with SQL alone
† e.g. preserving complex states, looping, branching etc.
† Typically embed SQL in a host-language interface

† Problems: Impedance Mismatch


† SQL operates on sets of tuples
† Languages such as C, C++ operate on an individual basis

† Solution
† easy when SELECT returns only one row

† When more than one row is returned


† design an iterator to “run” over the results
† called a “cursor”
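
A sketch of a cursor bridging the mismatch, using Python's sqlite3 as the host-language interface (an assumption for illustration; the students are made up). The query returns a set of rows, and the host language walks over them one at a time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (name VARCHAR(20), gpa REAL)")
conn.executemany("INSERT INTO Students VALUES (?, ?)", [
    ("Ann", 3.9),
    ("Bob", 2.8),
    ("Cat", 3.6),
])

# execute() returns a cursor: an iterator that "runs" over the result rows.
cursor = conn.execute("SELECT name, gpa FROM Students ORDER BY name")

honor_roll = []
for name, gpa in cursor:        # one row per iteration
    if gpa >= 3.5:              # branching is done in the host language
        honor_roll.append(name)
print(honor_roll)
```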

CS 5614: Misc. SQL Stuff, Safety in Queries 108

How are these implemented?

† Vendor-Specific Implementations
† ORACLE: PL/SQL (procedural extensions to SQL)

† Open Database Connectivity Standard


† Provides a standard API for transparent database access
† used when “database independence” is important
† used when required to “connect” to diverse data sources
CS 5614: Misc. SQL Stuff, Safety in Queries 109

Tradeoffs

† ODBC
† originated by Microsoft in 1991
† adds one more abstraction layer
† not as fast as a native API (does not exploit “special features”)
† least-common denominator approach
† constantly evolving

† PL/SQL etc.
† “tailored” to the details of the underlying DBMS
† might not extend to heterogeneous domains
† modeled after a specific programming language (e.g. Ada for PL/SQL)

CS 5614: Misc. SQL Stuff, Safety in Queries 110

In Between: Stored Procedures

† Used for developing “tightly-coupled” applications


† ”push computations” selectively into the database system
† avoid performance degradation
† work in database address space instead of application address space
† Advantages
† No sending SQL statements to and fro
† eliminate pre-processing
† speedup by an order of magnitude
† Example Applications
† Database Administration
† Integrity Maintenance and Checks
† Database Mining
† Disadvantages
† Non-standard implementation
† Difficult to enforce transactional synchronization
† Without traditional SQL optimization, can lead to performance degradation
CS 5614: Misc. SQL Stuff, Safety in Queries 111

Introduction to Query Optimization

† Helps attain declarativeness of RDBMSs


† One of the main reasons for commercial success of DBMSs
† A motivating example
† Find all students with 4.0 gpa enrolled in CS5614

SELECT name
FROM Students, Classroll
WHERE Students.name = Classroll.studentname
AND Students.gpa = 4.0
AND Classroll.coursename = 'CS5614'

† Two Strategies
† Join first, then filter out the tuples with gpa <> 4.0 or course <> CS5614
† First filter out the tuples with gpa <> 4.0 or course <> CS5614, then join
† Which is Better?
† Almost always good to “push selections” as far down the query parse tree as possible
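
The difference between the two strategies shows up in the size of the intermediate result. A small sketch in plain Python, with made-up Students(name, gpa) and Classroll(studentname, coursename) tuples and a simple nested-loop join:

```python
students = [("Ann", 4.0), ("Bob", 3.0), ("Cat", 4.0), ("Dan", 2.5)]
classroll = [
    ("Ann", "CS5614"), ("Ann", "CS5604"), ("Bob", "CS5614"),
    ("Cat", "CS5604"), ("Dan", "CS5614"), ("Dan", "CS5604"),
]

def join(left, right):
    """Nested-loop join on Students.name = Classroll.studentname."""
    return [(s, c) for s in left for c in right if s[0] == c[0]]

# Strategy 1: join first, then filter.
joined = join(students, classroll)
late_filter = [s[0] for (s, c) in joined if s[1] == 4.0 and c[1] == "CS5614"]

# Strategy 2: filter each relation first, then join.
good = [s for s in students if s[1] == 4.0]
enrolled = [c for c in classroll if c[1] == "CS5614"]
filtered_join = join(good, enrolled)
early_filter = [s[0] for (s, c) in filtered_join]

print(len(joined), len(filtered_join), late_filter, early_filter)
```

Both strategies give the same answer, but the first builds six intermediate tuples and the second only one.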

CS 5614: Misc. SQL Stuff, Safety in Queries 112

How does a Query Optimizer work?

† Three Requirements
† A Search Space of “Plans”
† A Cost Model (for Plan evaluation)
† An Enumeration Algorithm
† Ideally
† Search Space: contains both good and efficient plans
† Cost Models: cheap to compute and accurate
† Enumeration Algorithm: efficient (not a monkey-typewriter algorithm)
† Example of a Search Space
† See Previous Slide
† Examples of Cost Models
† #(tuples) evaluation
† #(main memory locations) etc.
† Example of an enumeration algorithm
† Sequential enumeration of a lattice of plans
† Dynamic Programming vs. Greedy Approaches
CS 5614: Misc. SQL Stuff, Safety in Queries 113

A Simple Measure of “Cost”

† #(Tuples) in a query

† Easiest to compute for


† Cartesian Product: #(R X S) = #(R)#(S)
† Projection: #(Pi(R)) = #(R)
† A Notation for Other Operations
† V(R,A) = Number of distinct values of attribute “A” in R
† formulas assume that all values of “A” are equally likely in R
† Holds in average case for most distributions (e.g. Zipf)
† Selectivity Factors for Selection Operations
† Equality Tests: Use 1/V(R,A)
† < or > Tests: Use 1/3
† “<>” Test: Use (V(R,A)-1)/V(R,A)
† AND Conditions: Multiply Selectivity Factors
† OR Conditions: Three Choices
† Sum of results from individual selectivity factors
† Min(sum, total size of relation): why?
† n(1-(1-m1/n)(1-m2/n)) formula : most accurate
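
The selectivity factors translate directly into small estimator functions. A sketch in plain Python; the function names and the sample statistics (1000 tuples, V(R,A) = 50, V(R,B) = 10) are assumptions made for illustration.

```python
def eq_size(n, v):
    """Estimated size of an equality selection A = c: n / V(R,A)."""
    return n / v

def ne_size(n, v):
    """Estimated size of A <> c: n (V(R,A) - 1) / V(R,A)."""
    return n * (v - 1) / v

def range_size(n):
    """Estimated size of A < c or A > c: n / 3."""
    return n / 3

def and_size(n, *selectivities):
    """AND: multiply the selectivity factors of the individual conditions."""
    for s in selectivities:
        n *= s
    return n

def or_size(n, m1, m2):
    """OR of two conditions matching m1 and m2 tuples: n(1 - (1 - m1/n)(1 - m2/n))."""
    return n * (1 - (1 - m1 / n) * (1 - m2 / n))

n = 1000
m1 = eq_size(n, 50)        # A = c, with V(R,A) = 50
m2 = eq_size(n, 10)        # B = c, with V(R,B) = 10
both = and_size(n, 1 / 50, 1 / 10)
either = or_size(n, m1, m2)
capped_sum = min(n, m1 + m2)
print(m1, m2, both, either, capped_sum)
```

The probabilistic OR estimate (about 118) is slightly below the plain sum (120) because it does not count tuples that satisfy both conditions twice.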

CS 5614: Misc. SQL Stuff, Safety in Queries 114

Estimating the Size of a Join

† Assume: R(X,Y) Join S(Y,Z)

† Range of Values
† Minimum: 0
† In-between: #(R) (if Y is a foreign key for R and a key for S)
† Maximum: #(R)#(S) (if Y’s in R and S are all the same)
† Assumptions for Join Size Estimation
† Containment of Value Sets
† Preservation of Value Sets
† Containment of Value Sets
† If V(R,Y) <= V(S,Y) then the Y’s in R are a subset of the Y’s in S
† Satisfied when Y is a foreign key in R and a key in S
† Preservation of Value Sets
† V(R Join S,X) = V(R,X)
† V(R Join S,Z) = V(S,Z)
† why is this reasonable?
CS 5614: Misc. SQL Stuff, Safety in Queries 115

The Actual Estimate

† Assume that V(R,Y) <= V(S,Y)


† A given tuple of R joins with a given tuple of S with probability 1/V(S,Y)
† So each tuple of R is expected to join with #(S)/V(S,Y) tuples of S
† Over all tuples of R, the expected join size is #(R)#(S)/V(S,Y)
† What if V(S,Y) <= V(R,Y)
† Answer: #(R)#(S)/V(R,Y)
† In general: #(R)#(S)/(max (V(S,Y),V(R,Y)))
† What if there are multiple join attributes
† Have a “max” factor in the denominator for each such attribute!
† How to Estimate #(R Join S Join T)?
† Does it matter which we do first?
† Surprise!
† Estimation formula preserves associativity of Joins!
† In other words, “it takes care of itself!”
† Thus, for a Join attribute appearing > 2 times
† 3 times: Use two highest values
† 4 times: Use three highest values etc.

CS 5614: Misc. SQL Stuff, Safety in Queries 116

More on Join Associativity and Commutativity

† Which is better: (R Join S) or (S Join R)


† Good to put the smaller relation on the left
† Why? Most Join algorithms are asymmetric
† Example:
† Construct a “good query tree” for the following

SELECT movietitle
FROM Actors,ActedIn
WHERE Actors.name = ActedIn.actorname
AND Actors.age = 23

† Number of Possible Trees of n Relations


† Arises from the shape of the trees: T(n)
† Arises from permuting the leaves: n!
† Total choices: n! T(n)
CS 5614: Misc. SQL Stuff, Safety in Queries 117

What is T(n)?

† Sample Values
† 1: 1
† 2: 1
† 3: 2
† 4: 5
† 5: 14
† A formula
† T(1) = 1 (Basis)
† T(n) = T(1)T(n-1) + T(2)T(n-2) + ..... + T(n-1)T(1)
† Classifications
† Left-Deep Trees: All right children are leaves
† Right-Deep Trees: All left children are leaves
† Bushy Trees: Neither Left-Deep nor Right-Deep
† Choosing a Join Order: Restricted to Left-Deep Trees
† By Dynamic Programming: O(n!)
† Greedy Approach: Make local selections
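
The recurrence for T(n) can be evaluated directly. A short sketch in plain Python (the function name is an assumption) that reproduces the sample values and the total number of trees n! T(n):

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def tree_shapes(n):
    """T(n): number of binary tree shapes with n leaves."""
    if n == 1:
        return 1                                    # T(1) = 1 (basis)
    # T(n) = T(1)T(n-1) + T(2)T(n-2) + ... + T(n-1)T(1)
    return sum(tree_shapes(i) * tree_shapes(n - i) for i in range(1, n))

shapes = [tree_shapes(n) for n in range(1, 6)]
total_for_4 = factorial(4) * tree_shapes(4)         # shapes times leaf orderings
print(shapes, total_for_4)
```

Each term splits the n leaves into i on the left and n - i on the right, which is why the shapes multiply.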

CS 5614: Misc. SQL Stuff, Safety in Queries 118

Example

† Consider
† R(a,b): #(R) = 1000, V(R,a) = 100, V(R,b) = 200
† S(b,c): #(S) = 1000, V(S,b) = 100, V(S,c) = 500
† T(c,d): #(T) = 1000, V(T,c) = 20, V(T,d) = 50
† Possible Join Orders
† (R Join S) Join T
† (S Join R) Join T (same as above; why?)
† (R Join T) Join S
† (T Join R) Join S (same as above)
† (S Join T) Join R
† (T Join S) Join R (same as above)
† Cost Estimation = Sizes of Intermediate Relations
† (R Join S) Join T: 5000
† (R Join T) Join S: 1000000
† (S Join T) Join R: 2000
† Best Plan = (S Join T) Join R
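
The three intermediate sizes follow from the estimate #(R)#(S)/max(V(R,Y),V(S,Y)). A sketch in plain Python (the function and variable names are assumptions) that recomputes them:

```python
def join_size(n_left, n_right, v_left=None, v_right=None):
    """Estimated size of a two-way join on one shared attribute.

    With no shared attribute (v_left is None) the join is a cartesian product.
    """
    if v_left is None:
        return n_left * n_right
    return n_left * n_right / max(v_left, v_right)

# R(a,b): V(R,b) = 200   S(b,c): V(S,b) = 100, V(S,c) = 500   T(c,d): V(T,c) = 20
intermediate = {
    "(R Join S) Join T": join_size(1000, 1000, 200, 100),   # joined on b
    "(R Join T) Join S": join_size(1000, 1000),             # no common attribute
    "(S Join T) Join R": join_size(1000, 1000, 500, 20),    # joined on c
}
best_plan = min(intermediate, key=intermediate.get)
print(intermediate, best_plan)
```

R and T share no attribute, so joining them first degenerates into a product of 1000 x 1000 tuples, which is why that order is by far the worst.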
CS 5614: Misc. SQL Stuff, Safety in Queries 119

Database Tuning: Why?

† Two Families of Queries


† OLTP (Access small number of records)
† OLAP (Summarize from a large number of records)

† Sources of Poor Performance


† Imprecise Data Searches
† Random vs. Sequential Disk Accesses
† Short Bursts of Database Interaction
† Delays due to Multiple Transactions

† What can be done?


† Tune Hardware Architecture
† Tune OS
† Tune Data Structures and Indices

CS 5614: Misc. SQL Stuff, Safety in Queries 120

Examples of Database Tuning

† To normalize or not to
† Sacrificing Redundancy Elimination
† Sacrificing Dependency Elimination
† Several Choices of Normalized Schemas
† Vertical Partitioning Applications
† Recomputing Indices
† Histograms etc. might be outdated
† Restricting Uses of Subqueries
† Unnesting query blocks by Joins
† Declining the Use of Indices
† Table Scans for Small Tables
† Rule-based optimization: Rewrite A=6 as “A+0=6”
† Provide Redundant Tables
† Decision-Support/ Data Mining Queries
