Optimizing PostgreSQL Delete Operations: A Deep Dive into Performance Troubleshooting

Our client was experiencing frustratingly slow delete operations in their PostgreSQL database. What should have been a monthly data cleanup task was taking an unacceptable amount of time, impacting their application’s performance and blocking other critical operations.

Our Investigation Approach

When dealing with performance issues, we followed a systematic diagnostic methodology rather than making blind optimizations. Here’s how we tackled this challenge:

Step 1: Examining the Delete Query

The first step was understanding what we were working with. We requested the actual DELETE statement being executed. This gave us insight into:

  • The WHERE clause logic: Were we filtering efficiently?
  • Subquery structure: Were there nested queries that could be problematic?
  • Join conditions: How many tables were involved in identifying rows to delete?

A typical problematic delete query might look like this:

-- This is a sample delete query; we can't disclose the client's actual query
DELETE FROM orders 
WHERE customer_id IN (
    SELECT customer_id 
    FROM customers 
    WHERE last_purchase_date < '2023-01-01'
)
AND order_status = 'completed';
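
Before tuning anything, it helps to know how many rows a statement like this would actually touch. A simple count against the same conditions (shown here against the sample query above, not the client's real one) gives a rough sense of the blast radius:

SELECT count(*)
FROM orders
WHERE customer_id IN (
    SELECT customer_id
    FROM customers
    WHERE last_purchase_date < '2023-01-01'
)
AND order_status = 'completed';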

Step 2: Analyzing Table Structure and Indexes

Next, we retrieved the Data Definition Language (DDL) for all tables involved. This revealed:

  • Primary keys and their definitions
  • Foreign key constraints (which can significantly slow down deletes)
  • Existing indexes on columns used in WHERE clauses
  • Table size and row counts

We used PostgreSQL commands like:

\d+ table_name

Or queried the information schema:

SELECT * FROM pg_indexes WHERE tablename = 'orders';
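
For the table size and row counts mentioned above, a query along these lines works (a sketch; the row counts in pg_stat_user_tables are estimates maintained by the statistics collector, not exact counts):

-- Approximate sizes and row counts for the tables involved
SELECT relname,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
       n_live_tup,
       n_dead_tup
FROM pg_stat_user_tables
WHERE relname IN ('orders', 'customers');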

Key Finding: We discovered that while there were indexes on the main table, the columns used in the subquery conditions weren’t properly indexed.
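
Had the missing index turned out to be the main bottleneck, the fix would have looked roughly like this (a hypothetical index on the sample query's filter column, not something we applied for the client):

-- CONCURRENTLY builds the index without blocking writes on the table
CREATE INDEX CONCURRENTLY idx_customers_last_purchase_date
    ON customers (last_purchase_date);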

Step 3: Reviewing the Execution Plan

The EXPLAIN plan is PostgreSQL’s way of telling you how it will execute your query. We ran:

EXPLAIN DELETE FROM orders 
WHERE customer_id IN (
    SELECT customer_id 
    FROM customers 
    WHERE last_purchase_date < '2023-01-01'
)
AND order_status = 'completed';

Why not EXPLAIN ANALYZE?

This is crucial: EXPLAIN ANALYZE actually executes the query and measures real performance. For a DELETE operation that was already taking too long, we couldn’t afford to run it—we’d just recreate the problem. EXPLAIN (without ANALYZE) shows the planned execution without actually running it.
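
If you genuinely need real timings for a destructive statement, one common workaround is to wrap EXPLAIN ANALYZE in a transaction and roll it back. The data is restored, but the query still runs to completion, so this only makes sense on a test copy or when the runtime is acceptable:

BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
DELETE FROM orders
WHERE customer_id IN (
    SELECT customer_id
    FROM customers
    WHERE last_purchase_date < '2023-01-01'
)
AND order_status = 'completed';
ROLLBACK;  -- undoes the delete, but the full execution cost was still paid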

What we looked for in the plan:

  • Sequential scans (bad) vs. index scans (good)
  • Estimated row counts
  • Nested loop joins that might be inefficient
  • Subplan execution methods

The plan revealed sequential scans on both the main table and the subquery table, indicating PostgreSQL wasn’t leveraging indexes effectively.

Step 4: Investigating Table Statistics

Here’s where we found the smoking gun. PostgreSQL’s query planner relies on statistics about table data to make execution decisions. We queried:

SELECT 
    schemaname,
    relname,
    last_vacuum,
    last_autovacuum,
    last_analyze,
    last_autoanalyze,
    n_dead_tup,
    n_live_tup
FROM pg_stat_user_tables
WHERE relname IN ('orders', 'customers');

Critical Discovery:

  1. The orders table: last_analyze and last_vacuum were NULL—meaning statistics had never been gathered on this table
  2. The customers table (used in the subquery): Last analyzed 10 days ago, but with significant data changes since then

Why this matters:

Without current statistics, PostgreSQL’s query planner was essentially flying blind. It might think a table has 100 rows when it actually has 100,000, or vice versa. This leads to catastrophically poor execution plans—choosing sequential scans when indexes would be far more efficient, or allocating insufficient memory for operations.
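
One quick way to see how far off the planner might be is to compare its stored estimate with a real count (a full count(*) can be slow on large tables, so treat this as a one-off diagnostic rather than a routine check):

-- reltuples is the planner's row-count estimate; count(*) is ground truth
-- (a reltuples of -1 means the table has never been vacuumed or analyzed on PostgreSQL 14+)
SELECT c.relname,
       c.reltuples::bigint AS planner_estimate,
       (SELECT count(*) FROM orders) AS actual_rows
FROM pg_class c
WHERE c.relname = 'orders';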

Step 5: The Solution – VACUUM and ANALYZE

We executed maintenance operations on both tables:

-- Update statistics for the query planner
ANALYZE orders;
ANALYZE customers;

-- Clean up dead tuples and update statistics
VACUUM ANALYZE orders;
VACUUM ANALYZE customers;

What VACUUM does:

  • Removes dead tuples (rows that have been updated or deleted but still occupy space); in this case, nearly 500,000 of them
  • Marks space as reusable
  • Updates the visibility map
  • Can prevent transaction ID wraparound issues

What ANALYZE does:

  • Samples table data
  • Updates statistics in pg_statistic
  • Helps the planner estimate row counts, data distribution, and cardinality
  • Determines the most efficient execution plan
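
After ANALYZE, the gathered statistics can be inspected through the pg_stats view, a readable wrapper over pg_statistic:

-- Per-column statistics the planner will now use for the orders table
SELECT attname, n_distinct, null_frac, most_common_vals
FROM pg_stats
WHERE tablename = 'orders';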

The Result

After updating the table statistics, we re-ran the EXPLAIN plan (still not executing the actual delete). The new plan showed:

  • Index scans replacing sequential scans
  • More accurate row count estimates
  • A dramatically different execution strategy

When the client executed the actual DELETE query, the execution time dropped significantly, from what had likely been tens of minutes or even hours down to a manageable timeframe.

Key Takeaways

1. Statistics are Critical

PostgreSQL’s autovacuum daemon should handle this automatically, but in high-write environments or with specific configuration settings, tables can fall behind. Regular monitoring of pg_stat_user_tables is essential.
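
A simple monitoring query along these lines can flag tables that have never been auto-analyzed or that carry a high ratio of dead tuples (the 20% threshold is just an example, not a universal rule):

SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum,
       last_autoanalyze
FROM pg_stat_user_tables
WHERE last_autoanalyze IS NULL
   OR n_dead_tup > n_live_tup * 0.2
ORDER BY n_dead_tup DESC;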

2. VACUUM vs. ANALYZE vs. VACUUM ANALYZE

  • Use ANALYZE when you just need updated statistics
  • Use VACUUM when you need to reclaim space
  • Use VACUUM ANALYZE (or VACUUM (ANALYZE) in newer syntax) to do both
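
In practice, the three variants look like this:

ANALYZE orders;            -- refresh planner statistics only
VACUUM orders;             -- reclaim dead-tuple space only
VACUUM (ANALYZE) orders;   -- both, using the parenthesized option syntax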

3. The Diagnostic Process Matters

We didn’t jump straight to “add an index” or “rewrite the query.” We systematically examined each layer:

  • Query logic
  • Schema design
  • Execution plans
  • Database statistics

4. Preventive Measures

After this fix, we recommended:

  • Monitoring autovacuum: Check that it’s running and completing successfully
  • Adjusting autovacuum settings if necessary for high-transaction tables (see the sketch after this list)
  • Setting up alerts for tables with stale statistics
  • Scheduling manual VACUUM ANALYZE during maintenance windows for critical tables
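
For high-transaction tables, per-table autovacuum settings can be tightened so maintenance kicks in sooner than the global defaults allow; the values below are illustrative, not a recommendation for every workload:

-- Hypothetical per-table tuning: vacuum after ~5% churn, analyze after ~2%
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.05,
    autovacuum_analyze_scale_factor = 0.02
);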

Conclusion

Database performance issues are rarely about a single “silver bullet” solution. This case demonstrated that even with proper indexes and well-written queries, outdated statistics can cripple performance. By following a methodical troubleshooting approach and understanding how PostgreSQL’s query planner works, we identified and resolved the root cause efficiently—without the need for schema changes or query rewrites.

The lesson? Sometimes the best optimization is simply giving your database the information it needs to make smart decisions.

Beginners can check out this post to understand the concepts of VACUUM.
