14-queryexecution2
Systems
Query Execution II
15-445/645 FALL 2024 PROF. ANDY PAVLO
ADMINISTRIVIA
Project #2 is due Sunday Oct 27th @ 11:59pm
→ Saturday Office Hours on Oct 26th @ 3:00-5:00pm
Distributed DBMSs
→ Resources can be far from each other.
→ Resources communicate using slow(er) interconnect.
→ Communication costs and problems cannot be ignored.
TODAY’S AGENDA
Process Models
Execution Parallelism
I/O Parallelism
DB Flash Talk: ClickHouse
PROCESS MODEL
A DBMS’s process model defines how the system
is architected to support concurrent requests /
queries.
PROCESS MODEL
Approach #1: Process per DBMS Worker
[Figure: applications connect to the DBMS and send SQL commands to worker processes.]
EMBEDDED DBMS
DBMS runs inside the same address space as the
application. Application is (primarily) responsible
for threads and scheduling.
The application may support outside connections.
→ Examples: BerkeleyDB, SQLite, RocksDB, LevelDB
SCHEDULING
For each query plan, the DBMS decides where,
when, and how to execute it.
→ How many tasks should it use?
→ How many CPU cores should it use?
→ What CPU core should the tasks execute on?
→ Where should a task store its output?
PROCESS MODELS
Advantages of a multi-threaded architecture:
→ Less overhead per context switch.
→ Do not have to manage shared memory.
PARALLEL EXECUTION
The DBMS executes multiple tasks simultaneously
to improve hardware utilization.
→ Active tasks do not need to belong to the same query.
→ The high-level approaches are the same whether the DBMS
is multi-threaded, multi-process, or multi-node.
INTER-QUERY PARALLELISM
Improve overall performance by allowing multiple
queries to execute simultaneously.
→ Most DBMSs use a simple first-come, first-served policy.
INTRA-QUERY PARALLELISM
Improve the performance of a single query by
executing its operators in parallel.
→ Think of the organization of operators in terms of a
producer/consumer paradigm.
INTRA-QUERY PARALLELISM
Approach #1: Intra-Operator (Horizontal), the most common approach
INTRA-OPERATOR PARALLELISM
Approach #1: Intra-Operator (Horizontal)
→ Operators are decomposed into independent instances that
perform the same function on different subsets of data.
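The decomposition described above can be sketched in a few lines. This is a minimal illustration, not how any particular DBMS implements it: the names `scan_filter` and `parallel_scan`, the round-robin split, and the three-worker default are all assumptions made for the example. Each worker runs the same filter (`A.value < 99`) on its own disjoint subset of the rows.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_filter(partition):
    """One operator instance: scan a partition and keep rows with value < 99."""
    return [row for row in partition if row["value"] < 99]


def parallel_scan(rows, num_workers=3):
    """Run the same scan+filter operator on disjoint subsets of the input."""
    # Split the table into disjoint subsets (round-robin is an arbitrary choice).
    partitions = [rows[i::num_workers] for i in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = list(pool.map(scan_filter, partitions))
    # Concatenate the per-worker outputs into a single result.
    return [row for part in results for row in part]


# Hypothetical table A with 20 rows whose values are 0, 10, ..., 190.
table_a = [{"id": i, "value": i * 10} for i in range(20)]
matches = parallel_scan(table_a)
```

Because the subsets are disjoint and the operator is stateless, the workers need no coordination until their outputs are combined.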
INTRA-OPERATOR PARALLELISM
SELECT A.id, B.value
FROM A JOIN B
ON A.id = B.id
WHERE A.value < 99
AND B.value > 100
[Figure: query plan joining A and B (⨝); A is divided into partitions A1, A2, A3, handled by workers 1, 2, 3.]
INTRA-OPERATOR PARALLELISM
SELECT A.id, B.value
FROM A JOIN B
ON A.id = B.id
WHERE A.value < 99
AND B.value > 100
[Figure: the query plan for this query with Exchange operators inserted.]
EXCHANGE OPERATOR
Exchange Type #1 – Gather
→ Combine the results from multiple workers
into a single output stream.
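A Gather exchange can be sketched with a shared queue. This is a minimal sketch under stated assumptions: the names `worker`, `gather`, and the `DONE` sentinel are hypothetical, and threads stand in for whatever worker abstraction a real DBMS uses. Each worker pushes its rows into the queue, and the gather side drains it into one stream until every worker has signalled completion.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of one worker's output


def worker(worker_id, rows, out):
    """Emit this worker's rows into the shared queue, then signal completion."""
    for row in rows:
        out.put((worker_id, row))
    out.put(DONE)


def gather(partitions):
    """Combine the outputs of one worker per partition into a single stream."""
    out = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(i, part, out))
        for i, part in enumerate(partitions)
    ]
    for t in threads:
        t.start()
    remaining = len(threads)
    while remaining:
        item = out.get()
        if item is DONE:
            remaining -= 1  # one more worker has finished
        else:
            yield item
    for t in threads:
        t.join()


gathered = list(gather([[1, 2], [3], [4, 5, 6]]))
```

The order of rows across workers is not deterministic here; only the order within a single worker's output is preserved.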
INTER-OPERATOR PARALLELISM
Approach #2: Inter-Operator (Vertical)
→ Operations are overlapped to pipeline data from one stage
to the next without materialization.
→ Workers execute multiple operators from different
segments of a query plan at the same time.
→ Still need exchange operators to combine intermediate
results from segments.
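The pipelining idea above can be sketched as two stages connected by a bounded queue. This is an illustrative sketch only: `join_stage`, `project_stage`, the `END` sentinel, and the sample rows are assumptions, and the join is a plain nested loop. The projection stage starts consuming joined tuples as soon as they are produced, so the join output is never fully materialized.

```python
import queue
import threading

END = object()  # sentinel marking the end of the stream


def join_stage(outer, inner, out):
    """Stage 1: nested-loop join that emits each match as soon as it is found."""
    for r1 in outer:
        for r2 in inner:
            if r1["id"] == r2["id"]:
                out.put({"id": r1["id"], "b_value": r2["value"]})
    out.put(END)


def project_stage(incoming, results):
    """Stage 2: consume joined tuples as they arrive and project two columns."""
    while True:
        r = incoming.get()
        if r is END:
            break
        results.append((r["id"], r["b_value"]))


# Hypothetical sample data.
a_rows = [{"id": 1, "value": 5}, {"id": 2, "value": 7}]
b_rows = [{"id": 2, "value": 200}, {"id": 3, "value": 300}]

pipe = queue.Queue(maxsize=4)  # small bound: the consumer keeps pace with the producer
projected = []
producer = threading.Thread(target=join_stage, args=(a_rows, b_rows, pipe))
consumer = threading.Thread(target=project_stage, args=(pipe, projected))
producer.start()
consumer.start()
producer.join()
consumer.join()
```

The bounded queue is the design choice that matters: it applies back-pressure so the producer cannot run arbitrarily far ahead of the consumer.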
INTER-OPERATOR PARALLELISM
SELECT A.id, B.value
FROM A JOIN B
ON A.id = B.id
WHERE A.value < 99
AND B.value > 100
Worker 1 (⨝):
for r1 ∊ outer:
    for r2 ∊ inner:
        emit(r1⨝r2)
INTER-OPERATOR PARALLELISM
SELECT A.id, B.value
FROM A JOIN B
ON A.id = B.id
WHERE A.value < 99
AND B.value > 100
Worker 2 (π):
for r ∊ incoming:
    emit(π(r))
Worker 1 (⨝):
for r1 ∊ outer:
    for r2 ∊ inner:
        emit(r1⨝r2)
BUSHY PARALLELISM
Approach #3: Bushy Parallelism
→ Hybrid of intra- and inter-operator parallelism where
workers execute multiple operators from different
segments of a query plan at the same time.
→ Still need exchange operators to combine intermediate
results from segments.
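BUSHY PARALLELISM
SELECT *
FROM A
JOIN B
JOIN C
JOIN D
[Figure: workers 1 and 2 join A with B and C with D; Exchange operators feed their outputs to the upper joins run by workers 3 and 4, whose results are combined by a final Exchange.]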
OBSERVATION
Using additional processes/threads to execute
queries in parallel won’t help if the disk is always the
main bottleneck.
I/O PARALLELISM
Split the DBMS across multiple storage devices to
improve disk bandwidth and latency.
MULTI-DISK PARALLELISM
Store data across multiple disks to improve performance + durability.
File of 6 pages (logical view): page1, page2, page3, page4, page5, page6
Striping (RAID 0)
Mirroring (RAID 1)
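The two layouts can be contrasted with a small page-placement sketch for the six-page file. The function names `stripe` and `mirror`, the zero-based disk numbering, and the choice of three disks are assumptions for illustration; the slides do not say how many disks the figure uses.

```python
def stripe(page_no, num_disks):
    """RAID 0: each page lives on exactly one disk, assigned round-robin."""
    return [(page_no - 1) % num_disks]


def mirror(page_no, num_disks):
    """RAID 1: every disk holds a copy of every page."""
    return list(range(num_disks))


pages = range(1, 7)   # the 6-page file from the logical view
NUM_DISKS = 3         # assumed disk count, not given in the slides

striped = {p: stripe(p, NUM_DISKS) for p in pages}
mirrored = {p: mirror(p, NUM_DISKS) for p in pages}
```

With striping, a sequential scan can read from all disks at once but losing one disk loses a third of the pages; with mirroring, any single disk can serve any page and the file survives the loss of all but one disk, at the cost of storing every page on each disk.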
MULTI-DISK PARALLELISM
Store data across multiple disks to
improve performance + durability.
Hardware-based: I/O controller
makes multiple physical devices
appear as a single logical device.
→ Transparent to DBMS (e.g., RAID).
Software-based: DBMS manages
erasure codes at the file/object level.
→ Faster and more flexible.
[Figure: trade-off between performance, durability, and capacity.]
DATABASE PARTITIONING
Some DBMSs allow you to specify the disk location
of each individual database.
→ The buffer pool manager maps a page to a disk location.
PARTITIONING
Split a single logical table into disjoint physical
segments that are stored/managed separately.
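One common way to produce disjoint segments is to hash a key column. The sketch below is a hypothetical illustration, not a scheme the slides prescribe: `hash_partition`, the `id` key, and the three-partition count are assumptions. Every row lands in exactly one segment, and rows with the same key always land in the same segment.

```python
def hash_partition(rows, key, num_partitions):
    """Assign each row to exactly one partition based on a hash of its key."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[hash(row[key]) % num_partitions].append(row)
    return partitions


# Hypothetical table with integer ids 0..9, split into 3 segments.
table = [{"id": i, "payload": f"row-{i}"} for i in range(10)]
segments = hash_partition(table, "id", 3)
```

Range partitioning (splitting on key intervals) is an alternative that keeps nearby keys together, which can help range scans, whereas hashing tends to spread load more evenly.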
CONCLUSION
Parallel execution is important, which is why
(almost) every major DBMS supports it.
NEXT CLASS
Query Optimization
→ Logical vs Physical Plans
→ Search Space of Plans
→ Cost Estimation of Plans