0% found this document useful (0 votes)
44 views14 pages

10 DistQueryOptimization

The document discusses distributed database management systems and distributed query processing. It outlines topics on distributed DBMS architecture, design, query processing, transaction management, and building distributed database systems. It also covers distributed query processing methodology including query decomposition, optimization, and processing distributed queries.

Uploaded by

RonaldMartinez
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
44 views14 pages

10 DistQueryOptimization

The document discusses distributed database management systems and distributed query processing. It outlines topics on distributed DBMS architecture, design, query processing, transaction management, and building distributed database systems. It also covers distributed query processing methodology including query decomposition, optimization, and processing distributed queries.

Uploaded by

RonaldMartinez
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 14

Outline

 Introduction
 Background
 Distributed DBMS Architecture
 Distributed Database Design
 Distributed Query Processing
 Query Processing Methodology
 Distributed Query Optimization
 Distributed Transaction Management (Extensive)
 Building Distributed Database Systems (RAID)
 Mobile Database Systems
 Privacy, Trust, and Authentication
 Peer to Peer Systems

Distributed DBMS © 1998 M. Tamer Özsu & Patrick Valduriez Page 7-9. 1
Useful References
 Textbook Principles of Distributed Database Systems,
Chapter 7

Distributed DBMS © 1998 M. Tamer Özsu & Patrick Valduriez Page 7-9. 2
Distributed Query Processing
Methodology
Calculus Query on Distributed
Relations

Query
Query
GLOBAL
GLOBAL
Decomposition
Decomposition SCHEMA
SCHEMA

Algebraic Query on Distributed


Relations
CONTROL
Data FRAGMENT
SITE Data FRAGMENT
Localization
Localization
SCHEMA
SCHEMA

Fragment Query

Global STATS
STATSON
ON
Global
Optimization
Optimization
FRAGMENTS
FRAGMENTS

Optimized Fragment Query


with Communication Operations

LOCAL Local LOCAL


LOCAL
Local
SITES Optimization
Optimization
SCHEMAS
SCHEMAS

Optimized Local
Queries

Distributed DBMS © 1998 M. Tamer Özsu & Patrick Valduriez Page 7-9. 3
Restructuring
 Convert relational calculus to relational ENAME Project
algebra
 Make use of query trees
 Example DUR=12 OR DUR=24
Find the names of employees other than J. Doe
who worked on the CAD/CAM project for
either 1 or 2 years. PNAME=“CAD/CAM” Select
SELECT ENAME
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO ENAME≠“J. DOE”
AND ASG.PNO = PROJ.PNO
AND ENAME ≠ “J. Doe”
PNO
AND PNAME = “CAD/CAM”
AND (DUR = 12 OR DUR = 24)
ENO Join

PROJ ASG EMP

Distributed DBMS © 1998 M. Tamer Özsu & Patrick Valduriez Page 7-9. 4
Restructuring –Transformation
Rules
 Commutativity of binary operations
 RSSR
 R SS R
 RSSR
 Associativity of binary operations
 ( R  S )  T  R  (S  T)
 ( R S ) T  R (S T )
 Idempotence of unary operations
 A’(A’(R)) A’(R)
 p1(A1)(p2(A2)(R)) = p1(A1)  p2(A2)(R)
where R[A] and A'  A, A"  A and A'  A"
 Commuting selection with projection
Distributed DBMS © 1998 M. Tamer Özsu & Patrick Valduriez Page 7-9. 5
Restructuring – Transformation
Rules
 Commuting selection with binary operations
 p(A)(R  S)  (p(A) (R))  S
 p(Ai)(R (Aj,Bk) S)  (p(Ai) (R)) (Aj,Bk) S
 p(Ai)(R  T)  p(Ai) (R)  p(Ai) (T)

where Ai belongs to R and T

 Commuting projection with binary operations


 C(R  S)  A’(R)  B’(S)
 C(R (Aj,Bk) S)  A’(R) (Aj,Bk) B’(S)
 C(R  S)  C (R)  C (S)
where R[A] and S[B]; C = A'  B' where A'  A, B'  B
Distributed DBMS © 1998 M. Tamer Özsu & Patrick Valduriez Page 7-9. 6
Example
Recall the previous example: ENAME Project
Find the names of employees other than J.
Doe who worked on the CAD/CAM
project for either one or two years. DUR=12 OR DUR=24

SELECT ENAME PNAME=“CAD/CAM” Select


FROM PROJ, ASG, EMP
WHERE ASG.ENO=EMP.ENO
AND ASG.PNO=PROJ.PNO
ENAME≠“J. DOE”
AND ENAME≠“J. Doe”
AND PROJ.PNAME=“CAD/CAM” PNO

AND (DUR=12 OR DUR=24)


ENO Join

PROJ ASG EMP

Distributed DBMS © 1998 M. Tamer Özsu & Patrick Valduriez Page 7-9. 7
Equivalent Query
ENAME

PNAME=“CAD/CAM” (DUR=12  DUR=24) ENAME≠“J. DOE”

PNO ENO

ASG PROJ EMP

Distributed DBMS © 1998 M. Tamer Özsu & Patrick Valduriez Page 7-9. 8
Restructuring
ENAME

PNO

PNO,ENAME

ENO

PNO PNO,ENO PNO,ENAME

PNAME = "CAD/CAM" DUR =12 DUR=24 ENAME ≠ "J. Doe"

PROJ ASG EMP

Distributed DBMS © 1998 M. Tamer Özsu & Patrick Valduriez Page 7-9. 9
Cost Functions
 Total Time (or Total Cost)
 Reduce each cost (in terms of time) component individually
 Do as little of each cost component as possible
 Optimizes the utilization of the resources


Increases system throughput
 Response Time
 Do as many things as possible in parallel
 May increase total time because of increased total activity

Distributed DBMS © 1998 M. Tamer Özsu & Patrick Valduriez Page 7-9. 10
Total Cost
Summation of all cost factors

Total cost = CPU cost + I/O cost + communication cost

CPU cost = unit instruction cost  no.of instructions

I/O cost = unit disk I/O cost  no. of disk I/Os

communication cost = message initiation + transmission

Distributed DBMS © 1998 M. Tamer Özsu & Patrick Valduriez Page 7-9. 11
Total Cost Factors

 Wide area network


 message initiation and transmission costs high
 local processing cost is low (fast mainframes or
minicomputers)
 ratio of communication to I/O costs = 20:1

 Local area networks


 communication and local processing costs are more or less
equal
 ratio = 1:1.6

Distributed DBMS © 1998 M. Tamer Özsu & Patrick Valduriez Page 7-9. 12
Response Time

Elapsed time between the initiation and the completion of a query

Response time = CPU time + I/O time + communication time

CPU time = unit instruction time  no. of sequential instructions

I/O time = unit I/O time  no. of sequential I/Os

communication time = unit msg initiation time  no.


of sequential msg + unit transmission time  no. of
sequential bytes

Distributed DBMS © 1998 M. Tamer Özsu & Patrick Valduriez Page 7-9. 13
Example
Site 1
x units

Site 3

Site 2 y units

Assume that only the communication cost is considered


Total time = 2  message initialization time + unit transmission time
 (x+y)
Response time = max {time to send x from 1 to 3, time to send y
from 2 to 3}
time to send x from 1 to 3 = message initialization time + unit
transmission time  x
time to send y from 2 to 3 = message initialization time + unit
transmission time  y
Distributed DBMS © 1998 M. Tamer Özsu & Patrick Valduriez Page 7-9. 14

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy