
DATABASE SYSTEMS
Query Execution II
15-445/645 FALL 2024 PROF. ANDY PAVLO


ADMINISTRIVIA
Project #2 is due Sunday Oct 27th @ 11:59pm
→ Saturday Office Hours on Oct 26th @ 3:00-5:00pm

Homework #4 is due Sunday Nov 3rd @ 11:59pm


PARALLEL QUERY EXECUTION

The database is spread across multiple resources to:
→ Deal with large data sets that don't fit on a single machine/node.
→ Achieve higher performance.
→ Provide redundancy / fault-tolerance.

Appears as a single logical database instance to the application, regardless of physical organization.
→ A SQL query for a single-resource DBMS should generate the same result on a parallel or distributed DBMS.


PARALLEL VS. DISTRIBUTED


Parallel DBMSs
→ Resources are physically close to each other.
→ Resources communicate over high-speed interconnect.
→ Communication is assumed to be cheap and reliable.

Distributed DBMSs
→ Resources can be far from each other.
→ Resources communicate using slow(er) interconnect.
→ Communication costs and problems cannot be ignored.


TODAY’S AGENDA
Process Models
Execution Parallelism
I/O Parallelism
DB Flash Talk: ClickHouse


PROCESS MODEL
A DBMS's process model defines how the system is architected to support concurrent requests / queries.

A worker is the DBMS component responsible for executing tasks on behalf of the client and returning the results.


PROCESS MODEL
Approach #1: Process per DBMS Worker

Approach #2: Thread per DBMS Worker [most common]

Approach #3: Embedded DBMS


PROCESS PER WORKER


Each worker is a separate OS process.
→ Relies on the OS dispatcher.
→ Use shared-memory for global data structures.
→ A process crash does not take down the entire system.
→ Examples: IBM DB2, Postgres, Oracle

[Diagram: the application connects to a dispatcher, which routes SQL commands to worker processes]


THREAD PER WORKER


Single process with multiple worker threads.
→ DBMS (mostly) manages its own scheduling.
→ May or may not use a dispatcher thread.
→ Thread crash (may) kill the entire system.
→ Examples: MSSQL, MySQL, DB2, Oracle (2014)

Almost every DBMS created in the last 20 years!

[Diagram: the application connects to a dispatcher thread, which routes SQL commands to worker threads]


EMBEDDED DBMS
The DBMS runs inside the same address space as the application. The application is (primarily) responsible for threads and scheduling.
The application may support outside connections.
→ Examples: BerkeleyDB, SQLite, RocksDB, LevelDB

[Diagram: the DBMS is embedded inside the application process]
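To make the embedded model concrete, here is a small sketch using SQLite through Python's standard sqlite3 module: the DBMS code runs inside the application's own process and address space, with no separate server.

import sqlite3

# The "database server" is just a library linked into this process.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
conn.execute("INSERT INTO kv VALUES (?, ?)", ("hello", "world"))
print(conn.execute("SELECT v FROM kv WHERE k = ?", ("hello",)).fetchone())
conn.close()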

SCHEDULING
For each query plan, the DBMS decides where, when, and how to execute it.
→ How many tasks should it use?
→ How many CPU cores should it use?
→ What CPU core should the tasks execute on?
→ Where should a task store its output?

The DBMS nearly always knows more than the OS.


PROCESS MODELS
Advantages of a multi-threaded architecture:
→ Less overhead per context switch.
→ Do not have to manage shared memory.

The thread-per-worker model does not mean that the DBMS supports intra-query parallelism.

DBMSs from the last 15 years use native OS threads unless they are Redis or Postgres forks.


PARALLEL EXECUTION
The DBMS executes multiple tasks simultaneously to improve hardware utilization.
→ Active tasks do not need to belong to the same query.
→ The high-level approaches do not differ based on whether the DBMS is multi-threaded, multi-process, or multi-node.

Approach #1: Inter-Query Parallelism
Approach #2: Intra-Query Parallelism


INTER-QUERY PARALLELISM
Improve overall performance by allowing multiple queries to execute simultaneously.
→ Most DBMSs use a simple first-come, first-served policy.

If queries are read-only, then this requires almost no explicit coordination between the queries.
→ The buffer pool can handle most of the sharing if necessary.

If multiple queries are updating the database at the same time, then this is tricky to do correctly…


INTRA-QUERY PARALLELISM
Improve the performance of a single query by executing its operators in parallel.
→ Think of the organization of operators in terms of a producer/consumer paradigm.

Approach #1: Intra-Operator (Horizontal)
Approach #2: Inter-Operator (Vertical)

These techniques are not mutually exclusive.

There are parallel versions of every operator.
→ Can either have multiple threads access centralized data structures or use partitioning to divide the work up.

PARALLEL GRACE HASH JOIN

Use a separate worker to perform the join for each level of buckets for R and S after partitioning.

[Diagram: R(id,name) and S(id,value,cdate) are each partitioned with hash function h1 into buckets 0…max (HTR and HTS); matching bucket pairs are then joined independently]
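A rough sketch of the idea (with made-up data, not the course implementation): partition R and S on the join key with the same hash function, then let a separate worker join each matching pair of partitions.

from concurrent.futures import ThreadPoolExecutor

NUM_PARTITIONS = 4

def partition(rows, key):
    # split the input into buckets using the same hash function for both tables
    parts = [[] for _ in range(NUM_PARTITIONS)]
    for row in rows:
        parts[hash(row[key]) % NUM_PARTITIONS].append(row)
    return parts

def join_partition(r_part, s_part):
    # classic in-memory hash join on one bucket pair
    ht = {}
    for r in r_part:
        ht.setdefault(r["id"], []).append(r)
    return [(r, s) for s in s_part for r in ht.get(s["id"], [])]

R = [{"id": i, "name": f"r{i}"} for i in range(10)]
S = [{"id": i % 5, "value": i} for i in range(10)]

r_parts, s_parts = partition(R, "id"), partition(S, "id")
with ThreadPoolExecutor(max_workers=NUM_PARTITIONS) as pool:
    results = pool.map(join_partition, r_parts, s_parts)   # one worker per bucket pair
print(sum(len(r) for r in results), "joined tuples")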


INTRA-QUERY PARALLELISM
Approach #1: Intra-Operator (Horizontal) [most common]

Approach #2: Inter-Operator (Vertical) [less common]

Approach #3: Bushy [higher-end systems]


INTRA-OPERATOR PARALLELISM
Approach #1: Intra-Operator (Horizontal)
→ Operators are decomposed into independent instances that perform the same function on different subsets of data.

The DBMS inserts an exchange operator into the query plan to coalesce/split results from multiple children/parent operators.
→ Postgres calls this "Gather".


INTRA-OPERATOR PARALLELISM
SELECT A.id, B.value
FROM A JOIN B
ON A.id = B.id
WHERE A.value < 99
AND B.value > 100

[Diagram: tables A and B are each split into three fragments (A1, A2, A3 and B1, B2, B3); each fragment is scanned and filtered by its own worker; the A-side workers build the hash table (Build HT) while the B-side workers probe it (Probe HT); exchange operators coalesce the build inputs and the join outputs]
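A toy sketch of intra-operator (horizontal) parallelism, assuming the table is already split into fragments: each worker runs the same filter over its own fragment, and a gather-style exchange concatenates the partial results.

from concurrent.futures import ThreadPoolExecutor
from itertools import chain

# Table A pre-split into three fragments (A1, A2, A3)
fragments = [[{"id": i, "value": i * 7} for i in range(lo, lo + 100)]
             for lo in (0, 100, 200)]

def scan_and_filter(fragment):
    # each worker applies the same predicate to its own subset of the data
    return [row for row in fragment if row["value"] < 99]

with ThreadPoolExecutor(max_workers=3) as pool:
    partial_results = pool.map(scan_and_filter, fragments)

# exchange (gather): coalesce the workers' outputs into one stream
result = list(chain.from_iterable(partial_results))
print(len(result), "matching rows")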


EXCHANGE OPERATOR
Exchange Type #1 – Gather
→ Combine the results from multiple workers into a single output stream.

Exchange Type #2 – Distribute
→ Split a single input stream into multiple output streams.

Exchange Type #3 – Repartition
→ Shuffle multiple input streams across multiple output streams.
→ Some DBMSs always perform this step after every pipeline (e.g., Dremel/BigQuery).

[Diagram: Gather fans several operator outputs into one stream; Distribute fans one operator's output out to several operators; Repartition reshuffles rows between sets of operators. Source: Craig Freedman]
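The three exchange types can be sketched as plain functions over streams of rows; this is only a toy illustration, not any particular system's implementation. The repartition variant hashes each row's key to decide which output stream it goes to.

from itertools import chain

def gather(input_streams):
    # many inputs -> one output stream
    return list(chain.from_iterable(input_streams))

def distribute(input_stream, n):
    # one input -> n output streams (round-robin)
    outs = [[] for _ in range(n)]
    for i, row in enumerate(input_stream):
        outs[i % n].append(row)
    return outs

def repartition(input_streams, n, key):
    # many inputs -> n outputs, shuffled by a hash of the key
    outs = [[] for _ in range(n)]
    for row in chain.from_iterable(input_streams):
        outs[hash(row[key]) % n].append(row)
    return outs

streams = [[{"id": i} for i in range(w, 10, 3)] for w in range(3)]
print([len(s) for s in repartition(streams, 2, "id")])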


INTER-OPERATOR PARALLELISM
Approach #2: Inter-Operator (Vertical)
→ Operations are overlapped to pipeline data from one stage to the next without materialization.
→ Workers execute multiple operators from different segments of a query plan at the same time.
→ Still need exchange operators to combine intermediate results from segments.

Also called pipelined parallelism.


INTER-OPERATOR PARALLELISM
SELECT A.id, B.value
FROM A JOIN B
ON A.id = B.id
WHERE A.value < 99
AND B.value > 100

[Diagram: pipelined plan over tables A and B]
Segment 1 (join):
for r1 ∊ outer:
  for r2 ∊ inner:
    emit(r1 ⨝ r2)

Segment 2 (projection, consumes segment 1's output):
for r ∊ incoming:
  emit(π(r))
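A minimal sketch of pipelined (inter-operator) parallelism: the join segment and the projection segment run as separate threads connected by a queue, so tuples flow downstream without materializing the intermediate result. The names and data here are illustrative, not from the lecture code.

import queue, threading

SENTINEL = object()
pipe = queue.Queue()

A = [(i, f"a{i}") for i in range(5)]        # (id, name)
B = [(i, i * 10) for i in range(3, 8)]      # (id, value)

def join_segment():
    # segment 1: nested-loop join, emits tuples downstream as they are produced
    for a in A:
        for b in B:
            if a[0] == b[0]:
                pipe.put((a[0], b[1]))
    pipe.put(SENTINEL)

def project_segment():
    # segment 2: consumes the join output and projects (A.id, B.value)
    while (row := pipe.get()) is not SENTINEL:
        print(row)

t1 = threading.Thread(target=join_segment)
t2 = threading.Thread(target=project_segment)
t1.start(); t2.start()
t1.join(); t2.join()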

BUSHY PARALLELISM
Approach #3: Bushy Parallelism
→ Hybrid of intra- and inter-operator parallelism where workers execute multiple operators from different segments of a query plan at the same time.
→ Still need exchange operators to combine intermediate results from segments.


BUSHY PARALLELISM
SELECT *
FROM A
JOIN B
JOIN C
JOIN D

[Diagram: the four-table join plan is split into segments 1-4 connected by exchange operators; e.g., one set of workers computes A ⨝ B while another computes C ⨝ D at the same time, and exchange operators feed those results into the upper joins]

OBSERVATION
Using additional processes/threads to execute queries in parallel won't help if the disk is always the main bottleneck.

It can sometimes make the DBMS's performance worse if workers are accessing different segments of the disk at the same time.


I/O PARALLELISM
Split the DBMS across multiple storage devices to improve disk bandwidth and latency.

Many different options that have trade-offs:
→ Multiple Disks per Database
→ One Database per Disk
→ One Relation per Disk
→ Split Relation across Multiple Disks

Some DBMSs support this natively. Others require the admin to configure it outside of the DBMS.


MULTI-DISK PARALLELISM
Store data across multiple disks to improve performance + durability.

[Diagram: a file of 6 pages (logical view). With striping (RAID 0), the pages are spread across three disks: pages 1, 2, 3 go to disks 1, 2, 3 and pages 4, 5, 6 fill the next slot on each disk. With mirroring (RAID 1), every disk stores a full copy of each page.]

Hardware-based: an I/O controller makes multiple physical devices appear as a single logical device.
→ Transparent to the DBMS (e.g., RAID).

Software-based:
→ Faster and more flexible.
→ Uses erasure codes at the file/object level.

[Diagram: trade-off triangle between Performance, Durability, and Capacity]
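A minimal sketch (not from the lecture) of how RAID 0-style striping maps logical page numbers to disks, assuming simple round-robin placement across a fixed number of disks.

NUM_DISKS = 3

def disk_for_page(page_no: int) -> int:
    # which disk holds this logical page
    return page_no % NUM_DISKS

def slot_on_disk(page_no: int) -> int:
    # position of the page within its disk
    return page_no // NUM_DISKS

for p in range(6):
    print(f"page {p} -> disk {disk_for_page(p)}, slot {slot_on_disk(p)}")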


DATABASE PARTITIONING
Some DBMSs allow you to specify the disk location of each individual database.
→ The buffer pool manager maps a page to a disk location.

This is also easy to do at the filesystem level if the DBMS stores each database in a separate directory.
→ The DBMS recovery log file might still be shared if transactions can update multiple databases.


PARTITIONING
Split a single logical table into disjoint physical segments that are stored/managed separately.

Partitioning should (ideally) be transparent to the application.
→ The application should only access logical tables and not have to worry about how things are physically stored.

We will cover this further when we talk about distributed databases.
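As a tiny illustration of the idea (the partition count and helper names here are made up), a logical table can be split into disjoint physical segments by hashing the partitioning key, while the application keeps issuing queries against the single logical table.

NUM_SEGMENTS = 4

def segment_for(key):
    # maps a tuple's partitioning key to the physical segment that stores it
    return hash(key) % NUM_SEGMENTS

segments = [[] for _ in range(NUM_SEGMENTS)]
for row in [{"id": i, "value": i * 3} for i in range(12)]:
    segments[segment_for(row["id"])].append(row)

print([len(s) for s in segments])   # disjoint physical segments of one logical table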


CONCLUSION
Parallel execution is important, which is why (almost) every major DBMS supports it.

However, it is hard to get right:
→ Coordination Overhead
→ Scheduling
→ Concurrency Issues
→ Resource Contention


NEXT CLASS
Query Optimization
→ Logical vs Physical Plans
→ Search Space of Plans
→ Cost Estimation of Plans
