PG18 Hacktober: 31 Days of New Features: pg_overexplain – See the Real Plans

Welcome to Day 26 of the PG18 Hacktober series!

Have you had this discussion with Devs?

I wrote a clean SQL query. I even EXPLAINed it. Everything looks fine. But the actual query execution time? Way off. What’s going on?

The answer lies in what PostgreSQL actually executes, not what your query looks like.

Now, pg_overexplain, a powerful new extension in PostgreSQL 18 that reveals the true, rewritten, planner-visible SQL and the execution plan that actually gets run.  It adds two new EXPLAIN options:

  • EXPLAIN (DEBUG) reveals hidden details about the planner’s internal structures and decision-making.
  • EXPLAIN (RANGE_TABLE) shows an internal representation used to manage the data sources involved in a query.

This is a must-have for:

  • Query tuning and optimization
  • Debugging hidden query rewrites
  • Understanding view expansion
  • Analyzing rule-based transformations

Enabling pg_overexplain

You can load it dynamically for your current session:

postgres=# LOAD 'pg_overexplain';
LOAD

Or you can also preload it into some or all sessions by including pg_overexplain in the session_preload_libraries or shared_preload_libraries parameters in postgresql.conf file.  The session_preload_libraries and  shared_preload_libraries parameters are available under.

session_preload_libraries = 'pg_overexplain'

Understanding EXPLAIN(DEBUG)

The (DEBUG) option shows low-level planner details for each plan node. including flags, parameters, and internal identifiers that normal EXPLAIN hides.

It includes:

  • Command Type (e.g., SELECT, UPDATE)
  • Flags (planner attributes such as hasReturning, canSetTag)
  • Relation OIDs (internal table identifiers)
  • Plan Node ID (unique node ID for coordination in parallel queries)
  • Parallel Safe (whether it’s safe under Gather nodes)
  • extParam / allParam (parameter dependencies)

Steps to check how pg_overexplain works

Standard EXPLAIN

#We are checking with EXPLAIN
postgres=# EXPLAIN  
SELECT e.name, e.salary, d.dept_name
FROM employees e
JOIN departments d ON e.department_id = d.id
WHERE e.salary > 120000;                                   QUERY PLAN 
-----------------------------------------------------------------------------
 Hash Join  (cost=29.12..2192.55 rows=30108 width=52)
 Hash Cond: (e.department_id = d.id)
 ->  Seq Scan on employees e  (cost=0.00..2084.00 rows=30108 width=24)
         Filter: (salary > '120000'::numeric)
   ->  Hash  (cost=18.50..18.50 rows=850 width=36)
         ->  Seq Scan on departments d  (cost=0.00..18.50 rows=850 width=36)
(6 rows)
Time: 0.404 ms

Simple and readable, perfect for performance tuning.

EXPLAIN (DEBUG)

postgres=# EXPLAIN (DEBUG)
SELECT e.name, e.salary, d.dept_name
FROM employees e
JOIN departments d ON e.department_id = d.id
WHERE e.salary > 120000;                                                                      QUERY PLAN    
-----------------------------------------------------------------------------
 Hash Join  (cost=29.12..2192.55 rows=30108 width=52)
   Hash Cond: (e.department_id = d.id)
   Disabled Nodes: 0
   Parallel Safe: true
   Plan Node ID: 0
   ->  Seq Scan on employees e  (cost=0.00..2084.00 rows=30108 width=24)
         Filter: (salary > '120000'::numeric)
         Disabled Nodes: 0
         Parallel Safe: true
         Plan Node ID: 1
   ->  Hash  (cost=18.50..18.50 rows=850 width=36)
         Disabled Nodes: 0
         Parallel Safe: true
         Plan Node ID: 2
         ->  Seq Scan on departments d  (cost=0.00..18.50 rows=850 width=36)
               Disabled Nodes: 0
               Parallel Safe: true
               Plan Node ID: 3
 PlannedStmt:
   Command Type: select
   Flags: canSetTag
   Subplans Needing Rewind: none
   Relation OIDs: 16411 16400
   Executor Parameter Types: none
   Parse Location: 0 to end
(25 rows)
Time: 0.564 ms

 What this means:

  • Command Type: Query identified as SELECT
  • Flags: canSetTag: The command sets a result tag (“SELECT n”)
  • Relation OIDs: The internal object ID of the customers table
  • Plan Node ID: Unique internal identifier for this node
  • Parallel Safe: True can safely participate in parallel plans
  • extParam / allParam: No parameter dependencies here

Understanding EXPLAIN (RANGE_TABLE)

Every SQL query in PostgreSQL has an internal structure called a Range Table. a list of all relations (tables, subqueries, CTEs, joins) involved in the query.

Each entry is called a Range Table Entry (RTE) and is identified by a Range Table Index (RTI).

These entries guide the planner in understanding:

  • Which relations participate in the query
  • What columns are visible
  • What permissions are required
  • How joins and subqueries are linked

The EXPLAIN (RANGE_TABLE) option lets you inspect this internal structure directly.

Simple SELECT query

EXPLAIN (ANALYZE, COSTS OFF, VERBOSE, RANGE_TABLE)
SELECT d.dept_name, COUNT(e.id) AS num_employees
FROM departments d
JOIN employees e ON d.id = e.department_id
GROUP BY d.dept_name;
                           QUERY PLAN                                           
────────────────────────────────────────────────────────────────── 
HashAggregate (actual time=40.951..40.956 rows=10.00 loops=1)
   Output: d.dept_name, count(e.id)
   Group Key: d.dept_name
   Batches: 1  Memory Usage: 32kB
   Buffers: shared hit=835
   ->  Hash Join (actual time=0.035..24.106 rows=100000.00 loops=1)
         Output: d.dept_name, e.id
         Inner Unique: true
         Hash Cond: (e.department_id = d.id)
         Buffers: shared hit=835
         ->  Seq Scan on public.employees e (actual time=0.011..5.594 rows=100000.00 loops=1)
               Output: e.id, e.name, e.department_id, e.salary, e.hire_date
               Buffers: shared hit=834               
Scan RTI: 2
         ->  Hash (actual time=0.018..0.019 rows=10.00 loops=1)
               Output: d.dept_name, d.id
               Buffers: shared hit=1
               ->  Seq Scan on public.departments d (actual time=0.006..0.008 rows=10.00 loops=1)
                     Output: d.dept_name, d.id
                     Buffers: shared hit=1
                     Scan RTI: 1
 Planning Time: 0.135 ms
 RTI 1 (relation, in-from-clause):
   Alias: d ()
   Eref: d (id, dept_name, location)
   Relation: public.departments
   Relation Kind: relation
   Relation Lock Mode: AccessShareLock
   Permission Info Index: 1
 RTI 2 (relation, in-from-clause):
   Alias: e ()
   Eref: e (id, name, department_id, salary, hire_date)
   Relation: public.employees
   Relation Kind: relation
   Relation Lock Mode: AccessShareLock
   Permission Info Index: 2
 RTI 3 (join, in-from-clause):
   Eref: unnamed_join (id, dept_name, location, id, name, department_id, salary, hire_date)
   Join Type: Inner
 RTI 4 (group):
   Eref: "*GROUP*" (dept_name)
 Unprunable RTIs: 1 2
 Execution Time: 40.994 ms
(44 rows)

Explanation:

  • RTI 1, RTI 2: Base relations (departments, employees) appearing in the FROM clause.
  • RTI 3: Represents the join between departments and employees.
  • RTI 4: Represents the GROUP BY level of aggregation.
  • Unprunable RTIs: Relations that must remain in the plan (cannot be optimized away).

This is how PostgreSQL’s planner internally records the FROM clause before query optimization.

Subquery

EXPLAIN (ANALYZE, COSTS OFF, VERBOSE, RANGE_TABLE)
SELECT sq.department_id, COUNT(*) AS high_salary_employees
FROM (
    SELECT department_id
    FROM employees
    WHERE salary > 120000
) AS sq
GROUP BY sq.department_id;
                          QUERY PLAN                             
──────────────────────────────────────────────────────────────────
 HashAggregate (actual time=17.175..17.178 rows=10.00 loops=1)
   Output: employees.department_id, count(*)
   Group Key: employees.department_id
   Buffers: shared hit=834
   ->  Seq Scan on public.employees (actual time=0.009..12.859 rows=29999.00 loops=1)
         Filter: (employees.salary > '120000'::numeric)
         Rows Removed by Filter: 70001
         Scan RTI: 3
 Planning Time: 0.244 ms
 RTI 1 (subquery, in-from-clause):
   Alias: sq ()
   Eref: sq (department_id)
 RTI 2 (group):
   Eref: "*GROUP*" (department_id)
 RTI 3 (relation, in-from-clause):
   Eref: employees (id, name, department_id, salary, hire_date)
   Relation: public.employees
   Relation Kind: relation
   Relation Lock Mode: AccessShareLock
   Permission Info Index: 1
 Unprunable RTIs: 3
 Execution Time: 17.207 ms

Explanation:

  • RTI 1: Represents the subquery sq (a derived table).
  • RTI 3: Refers to the base table employees used inside the subquery.
  • RTI 2: Represents the GROUP BY aggregation scope.

Subqueries appear as RTEs of type subquery in PostgreSQL’s planner.

JOIN Query

EXPLAIN (ANALYZE, COSTS OFF, VERBOSE, RANGE_TABLE)
SELECT d.dept_name, p.project_name, COUNT(ep.employee_id) AS num_people
FROM departments d
JOIN projects p ON d.id = p.department_id
JOIN employee_projects ep ON ep.project_id = p.id
GROUP BY d.dept_name, p.project_name;
                          QUERY PLAN                     
────────────────────────────────────────────────────────────────── 
HashAggregate (actual time=125.507..125.798 rows=1000.00 loops=1)
   Output: d.dept_name, p.project_name, count(ep.employee_id)
   Group Key: d.dept_name, p.project_name
   Buffers: shared hit=1194
   ->  Hash Join (actual time=0.653..82.000 rows=199789.00 loops=1)
         Hash Cond: (p.department_id = d.id)
         ->  Hash Join (actual time=0.612..51.042 rows=199789.00 loops=1)
               Hash Cond: (ep.project_id = p.id)
               ->  Seq Scan on public.employee_projects ep
                     Scan RTI: 4
               ->  Seq Scan on public.projects p
                     Scan RTI: 2
         ->  Seq Scan on public.departments d
               Scan RTI: 1
 Planning Time: 1.539 ms
 RTI 1 (relation): departments
 RTI 2 (relation): projects
 RTI 3 (join):     d ⨝ p
 RTI 4 (relation): employee_projects
 RTI 5 (join):     (d ⨝ p) ⨝ ep
 RTI 6 (group):    *GROUP*
 Unprunable RTIs: 1 2 4
 Execution Time: 126.715 ms

Explanation:

PostgreSQL builds multiple range table entries:

  • RTI 1, RTI 2, RTI 4: Base relations.
  • RTI 3, RTI 5: Represent the join combinations (d ⨝ p and (d ⨝ p) ⨝ ep).
  • RTI 6: Represents the GROUP BY aggregation level.

Each join RTE explicitly records its join type, participating tables, and visible columns — helping the planner reason about join ordering and cost estimation.

Why RANGE_TABLE matters

Understanding Range Tables helps:

  • Extension developers write planner hooks or analyze query structure.
  • Researchers study how PostgreSQL represents joins, CTEs, or subqueries.
  • DBAs debug unexpected join expansions or CTE materialization issues.

It’s essentially an X-ray view of the PostgreSQL planner.

Use cases of pg_overexplain

  • Understanding view performance: If a view contains joins, filters, or aggregates, you might miss how filters are being pushed down. pg_overexplain helps clarify whether WHERE clauses are being applied early or late in the plan.
  • Debugging rewrite rules: When using RULE-based views or conditional logic, things may not always behave as expected. With pg_overexplain, you can see how your logic is interpreted after rewrite.
  • Analyzing query planning discrepancies: Sometimes, queries with identical output perform very differently due to rewrite order or how joins are expanded. Use this extension to pinpoint where that divergence happens.
  • Teaching & learning tool: This is also a phenomenal tool for teaching PostgreSQL internals, showing how logical query rewriting affects planning and execution in real time.

Summary

pg_overexplain is the tool you never knew you needed, until those carefully tuned query starts misbehaving.You’ll be able to:

  • Reveals the real query plan
  • Explains why the planner made certain choices
  • Bridges the gap between what you wrote and what PostgreSQL runs

Whether you’re debugging views, optimizing joins, or just curious about PostgreSQL internals, this extension is pure gold.

What’s Next?

As we near the final days of PG18 Hacktober, the next posts will focus on another powerful enhancement, “Internationalization” features to handle diverse character sets, sorting rules, and language-specific formatting options introduced in PG18.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top