The non-recursive with
clause, views, and derived tables (from
clause subqueries) provide very similar functionality. The main difference is their visibility. Consequently, the principle of least astonishment suggests that refactoring a derived table into a with
query should not have any side effects on performance. The tested databases generally obey this principle.
The following sections outline the most important optimizations that interfere with the with
clause and present an overview of which databases apply them to with
queries, views, and derived tables.
Predicate Pushdown
Predicate pushdown is an optimization that applies conditions (predicates) as early as possible—ideally preventing any loading of unneeded rows.
The following example uses a derived table to demonstrate this:
SELECT c1, c2, ...
FROM (SELECT c1, c2, ...
FROM base_table
) derived_table
WHERE c1 = ?
If the subquery is executed in isolation, it returns all rows of base_table
. Even if there is an index on c1
, the isolated subquery doesn’t know about the restriction on c1
and so it cannot use the index.
Predicate pushdown passes the condition to the subquery so that the query effectively looks like this:
SELECT c1, c2, ...
FROM (SELECT c1, c2, ...
FROM base_table
WHERE c1 = ?
) derived_table
In this example, one could ague that the where
clause should be written in the subquery anyway. However, this argument doesn’t apply when querying a view instead of a derived table.
Predicate pushdown is subject to some logical constraints. For example, join predicates cannot be pushed below the first join they affect. On the other hand, predicates can be pushed through group by
and window functions if they are among the grouping- or partitioning keys.
Predicate pushdown is particularly important to make better use of indexes.
Projection Pushdown
Projection pushdown or column elimination aim to eliminate unneeded columns as early as possible—ideally not fetching them at all.
Consider the following example, which uses a view for demonstration:
CREATE VIEW view_name AS
SELECT *
FROM base_table
WHERE deleted = 0
SELECT count(*)
FROM view_name
Note that count(*)
does not need any table columns.0 If the database can push this information down, i.e. the columns that are actually needed, the database can avoid fetching unneded columns.
In the above example, projection pushdown is particularly important if there is an index on the deleted
column: in that case that database can satisfy the query from the index alone (the so-called Index-Only Scan) and does not need to fetch the other columns from the table itself. This can improve query speed by an order of magnitude. Learn more about the Index-Only Scan on Use The Index, Luke!.
If you like this page, you might also like …
… to subscribe my mailing lists, get free stickers, buy my book or join a training.
Sort Elimination
Sort elimination removes redundant sort operations. Redundant sort operations can arise out of multiple order by
clauses:
WITH cte
AS (SELECT id
FROM feature_with
ORDER BY id
)
SELECT id
FROM cte
ORDER BY id
There is no reason to sort the result twice. Even if the order by
clauses were different, only the outermost order by
clause determines the row order in the result. With some exceptions,1 any order by
clause except the outermost one can be eliminated.
Redundant sort operations are less obvious when they arise out of various order by
, group by
(potentially sort-based), over (partition by… order by…)
, and join (sort-merge) operations. Additionally, indexes might yield rows in a useful order and might thus make a sort operation obsolete.
In the following example, a database might sort the rows on (c1, c2)
to resolve the row_number()
window function.2 In that case, the order produced by the with
query already satisfies the outer order by
clause so that the database doesn’t need to run another sort operation.
WITH cte
AS (SELECT c1
, c2
, ROW_NUMBER()
OVER(PARTITION BY c1 ORDER BY c2) rn
FROM table_name
)
SELECT c1, rn
FROM cte
ORDER BY c1, c2
Note that the outer order by
clause is still required because the over
clause only defines the relevant order for the window function. If the database executes the window function differently, e.g., distributing the partitions to parallel threads, then the database cannot eliminate the last sort operation. Omitting the outer order by
clause would then return the rows in undefined order.
Comparison of Products
Not all tested databases optimize with
clauses in the same way that they optimize views or derived tables.
- Not with window functions
- Not with window functions • When using
top
in the inner query - When using
top
in the inner query - When using a subquery in the view to allow
order by
inside the view