PG18 Hacktober: 31 Days of New Features – Skip scan on multicolumn B-tree indexes

Welcome to the 2nd blog post in the “Indexing & Constraints” segment of PG18 Hacktober!

With the release of PostgreSQL 18, developers and database administrators gain access to several powerful indexing enhancements designed to push performance even further, especially for large datasets and complex queries.

If you have not read the earlier blog post on “Parallelized CREATE INDEX for GIN Indexes” it is available here

One of the most important additions is Skip Scan on Multicolumn B-tree Indexes, a long-awaited optimization that enables PostgreSQL to leverage existing indexes more intelligently, even when a query doesn’t filter on the leading column.

This feature fundamentally improves how PostgreSQL navigates multicolumn indexes by allowing index scans to “skip” irrelevant sections, avoiding unnecessary reads and dramatically speeding up query execution.

Introduction

Skip scan allows B-tree index scans to efficiently navigate through multicolumn indexes when one or more of the leading columns lack an equality condition.

It works by internally generating an = condition on underspecified columns, effectively splitting the index scan into multiple smaller ‘index searches.’

This means PostgreSQL can skip over large, irrelevant sections of the index, particularly effective when the skipped column has relatively few distinct values.

Why this matters

Traditionally, PostgreSQL could not use multicolumn B-tree indexes efficiently unless the leading columns were part of the filter condition. With skip scan, that limitation is finally lifted.

This enhancement allows multi-column indexes to be useful in far more query scenarios, especially when:

The query doesn’t restrict the first indexed column.

The restrictions on the leading column are non-equality conditions (e.g., <, >, or LIKE), while later columns have useful equality filters.

By enabling PostgreSQL to perform targeted subindex scans, the skip scan optimization improves real-world workloads where composite indexes exist but query patterns.

Key benefits

Better Index Utilization: Makes existing multicolumn indexes useful for a wider variety of queries.
Faster Query Execution: Reduces costly full index or sequential scans.
Smarter Planning: PostgreSQL dynamically determines when skip scans will outperform regular scans.
Improved I/O Efficiency: Skips irrelevant index sections, saving CPU and disk operations.

How it works

Skip Scan treats a multicolumn index as if it were composed of multiple logical subindexes, each representing a unique value in the skipped (unfiltered) leading column.

PostgreSQL then performs multiple small index scans, one for each distinct value, instead of scanning the entire index sequentially.

Examples:

In PostgreSQL 18
postgres=# EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM sales
WHERE category = 'cat1' AND sale_date > CURRENT_DATE - 30;
                                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on sales  (cost=1416.34..1816.53 rows=167 width=100) (actual time=0.163..1.517 rows=2056.00 loops=1)
   Recheck Cond: ((category = 'cat1'::text) AND (sale_date > (CURRENT_DATE - 30)))
   Heap Blocks: exact=326
   Buffers: shared hit=332 read=8
   ->  Bitmap Index Scan on idx_sales_region_category_date  (cost=0.00..1416.30 rows=167 width=0) (actual time=0.127..0.128 rows=2056.00 loops=1)
         Index Cond: ((category = 'cat1'::text) AND (sale_date > (CURRENT_DATE - 30)))
         Index Searches: 7
         Buffers: shared hit=6 read=8
 Planning:
   Buffers: shared hit=24 read=1
 Planning Time: 0.186 ms
 Execution Time: 2.564 ms
(12 rows)
In PostgreSQL 17
postgres=# EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM sales
WHERE category = 'cat1' AND sale_date > CURRENT_DATE - 30;
                                                                  QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on sales  (cost=1416.34..1816.53 rows=167 width=100) (actual time=0.838..2.770 rows=2056 loops=1)
   Recheck Cond: ((category = 'cat1'::text) AND (sale_date > (CURRENT_DATE - 30)))
   Heap Blocks: exact=326
   Buffers: shared hit=326 read=103
   ->  Bitmap Index Scan on idx_sales_region_category_date  (cost=0.00..1416.30 rows=167 width=0) (actual time=0.802..0.802 rows=2056 loops=1)
         Index Cond: ((category = 'cat1'::text) AND (sale_date > (CURRENT_DATE - 30)))
         Buffers: shared read=103
 Planning:
   Buffers: shared hit=24 read=10
 Planning Time: 1.724 ms
 Execution Time: 3.753 ms
(11 rows)

Key Observations:

Pre – PostgreSQL 18:
  • PostgreSQL 17 does not support Index Skip Scan or report Index Searches in the execution plan.
  • The planner performs a traditional Bitmap Index Scan, which reads more blocks from disk.
  • Buffer statistics show lower cache hits (326) and higher reads (103), indicating more disk I/O and slower execution.
  • The overall execution time is higher because PostgreSQL 17 must scan a larger portion of the index sequentially.
PostgreSQL 18:
  • PostgreSQL 18 introduces Index Skip Scan and enhanced index probe metrics.
  • The new field Index Searches shows how many internal index probes were performed.
  • This leads to faster lookups, fewer disk reads, and better cache utilization.
  • It allows PostgreSQL to reuse existing multi-column indexes even when the query does not filter on all leading columns.
  • In the example, PostgreSQL 18 shows Index Searches: 7, indicating efficient probing within the index.
  • Buffers show high cache hits (332) and low reads (8), meaning most data was fetched from memory.

Conclusion:

The Skip Scan optimization in PostgreSQL 18 represents a major leap in query performance and indexing intelligence. More than just a performance enhancement, it redefines how the query planner interacts with multicolumn indexes, allowing PostgreSQL to intelligently skip irrelevant index sections, reduce I/O, and speed up execution without requiring changes to query structure.

This smarter, more adaptable behavior empowers developers and DBAs to write efficient queries even in complex schemas and large-scale datasets. As PostgreSQL 18 continues to evolve into a modern, high-performance database engine, Skip Scan stands out as a game-changing feature for anyone aiming to optimize speed, scalability, and resource efficiency in demanding environments.

What’s next? We’re going to look at the new Replication features – Logical Replication of Stored Generated Columns. Stay tuned!

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top