Our client was experiencing frustratingly slow delete operations in their PostgreSQL database. What should have been a monthly data cleanup task was taking an unacceptable amount of time, impacting their application’s performance and blocking other critical operations.
Our Investigation Approach
When dealing with performance issues, we followed a systematic diagnostic methodology rather than making blind optimizations. Here’s how we tackled this challenge:
Step 1: Examining the Delete Query
The first step was understanding what we were working with. We requested the actual DELETE statement being executed. This gave us insight into:
- The WHERE clause logic: Were we filtering efficiently?
- Subquery structure: Were there nested queries that could be problematic?
- Join conditions: How many tables were involved in identifying rows to delete?
A typical problematic delete query might look like this:
-- This is a sample delete query; we can't disclose the client's actual query
DELETE FROM orders
WHERE customer_id IN (
    SELECT customer_id
    FROM customers
    WHERE last_purchase_date < '2023-01-01'
)
AND order_status = 'completed';
Step 2: Analyzing Table Structure and Indexes
Next, we retrieved the Data Definition Language (DDL) for all tables involved. This revealed:
- Primary keys and their definitions
- Foreign key constraints (which can significantly slow down deletes)
- Existing indexes on columns used in WHERE clauses
- Table size and row counts
We used PostgreSQL commands like:
\d+ table_name
Or queried the information schema:
SELECT * FROM pg_indexes WHERE tablename = 'orders';
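The DDL and index listings don't show how big the tables are. For the table size and row counts mentioned above, a quick catalog query along these lines (illustrative) does the job:
SELECT relname,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
       n_live_tup AS approx_row_count
FROM pg_stat_user_tables
WHERE relname IN ('orders', 'customers');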
Key Finding: We discovered that while there were indexes on the main table, the columns used in the subquery conditions weren’t properly indexed.
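For reference, had missing indexes turned out to be the root cause, an index on the subquery's filter column would have looked like the sketch below; as the rest of this post shows, the real fix needed no schema change at all.
-- Hypothetical index on the subquery's filter column (not part of the actual fix)
CREATE INDEX CONCURRENTLY idx_customers_last_purchase_date
    ON customers (last_purchase_date);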
Step 3: Reviewing the Execution Plan
The EXPLAIN plan is PostgreSQL’s way of telling you how it will execute your query. We ran:
EXPLAIN DELETE FROM orders
WHERE customer_id IN (
    SELECT customer_id
    FROM customers
    WHERE last_purchase_date < '2023-01-01'
)
AND order_status = 'completed';
Why not EXPLAIN ANALYZE?
This is crucial: EXPLAIN ANALYZE actually executes the query and measures real performance. For a DELETE operation that was already taking too long, we couldn’t afford to run it—we’d just recreate the problem. EXPLAIN (without ANALYZE) shows the planned execution without actually running it.
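As an aside: when the runtime is affordable, EXPLAIN ANALYZE on a DELETE can be made safe for the data by wrapping it in a transaction and rolling back. It still pays the full execution cost, which is exactly why we skipped it here.
-- Only viable if you can afford to actually run the delete
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
DELETE FROM orders
WHERE customer_id IN (
    SELECT customer_id
    FROM customers
    WHERE last_purchase_date < '2023-01-01'
)
AND order_status = 'completed';
ROLLBACK;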
What we looked for in the plan:
- Sequential scans (bad) vs. index scans (good)
- Estimated row counts
- Nested loop joins that might be inefficient
- Subplan execution methods
The plan revealed sequential scans on both the main table and the subquery table, indicating PostgreSQL wasn’t leveraging indexes effectively.
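For illustration only (these numbers are made up, not the client's plan), a plan of that shape looks roughly like this:
Delete on orders  (cost=22000.00..98000.00 rows=0 width=0)
  ->  Hash Semi Join  (cost=22000.00..98000.00 rows=120000 width=12)
        Hash Cond: (orders.customer_id = customers.customer_id)
        ->  Seq Scan on orders  (cost=0.00..55000.00 rows=800000 width=10)
              Filter: (order_status = 'completed'::text)
        ->  Hash  (cost=18000.00..18000.00 rows=150000 width=10)
              ->  Seq Scan on customers  (cost=0.00..18000.00 rows=150000 width=10)
                    Filter: (last_purchase_date < '2023-01-01'::date)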
Step 4: Investigating Table Statistics
Here’s where we found the smoking gun. PostgreSQL’s query planner relies on statistics about table data to make execution decisions. We queried:
SELECT
schemaname,
relname,
last_vacuum,
last_autovacuum,
last_analyze,
last_autoanalyze,
n_dead_tup,
n_live_tup
FROM pg_stat_user_tables
WHERE relname IN ('orders', 'customers');
Critical Discovery:
- The orders table: last_analyze and last_vacuum were NULL—meaning statistics had never been gathered on this table
- The customers table (used in the subquery): last analyzed 10 days ago, but with significant data changes since then
Why this matters:
Without current statistics, PostgreSQL’s query planner was essentially flying blind. It might think a table has 100 rows when it actually has 100,000, or vice versa. This leads to catastrophically poor execution plans—choosing sequential scans when indexes would be far more efficient, or allocating insufficient memory for operations.
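One way to see how far off the planner can be is to compare its stored row estimate against the current tuple counts (a rough check, since both figures are themselves estimates):
SELECT c.relname,
       c.reltuples::bigint AS planner_row_estimate,
       s.n_live_tup AS live_tuples,
       s.n_dead_tup AS dead_tuples
FROM pg_class c
JOIN pg_stat_user_tables s ON s.relid = c.oid
WHERE c.relname IN ('orders', 'customers');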
Step 5: The Solution – VACUUM and ANALYZE
We executed maintenance operations on both tables:
-- Update statistics for the query planner
ANALYZE orders;
ANALYZE customers;
-- Clean up dead tuples and update statistics
VACUUM ANALYZE orders;
VACUUM ANALYZE customers;
What VACUUM does:
- Removes dead tuples (rows that have been updated or deleted but still occupy space); in this case, nearly 500,000 of them
- Marks space as reusable
- Updates the visibility map
- Can prevent transaction ID wraparound issues
What ANALYZE does:
- Samples table data
- Updates statistics in pg_statistic
- Helps the planner estimate row counts, data distribution, and cardinality
- Determines the most efficient execution plan
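The effect of ANALYZE is visible directly in the pg_stats view; for example, a query like this (illustrative) shows the distribution statistics the planner now has for the subquery's filter column:
SELECT attname, null_frac, n_distinct, most_common_vals, histogram_bounds
FROM pg_stats
WHERE tablename = 'customers'
  AND attname = 'last_purchase_date';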
The Result
After updating the table statistics, we re-ran the EXPLAIN plan (still not executing the actual delete). The new plan showed:
- Index scans replacing sequential scans
- More accurate row count estimates
- A dramatically different execution strategy
When the client executed the actual DELETE query, the execution time dropped significantly—from what had likely been tens of minutes, if not hours, down to a manageable timeframe.
Key Takeaways
1. Statistics are Critical
PostgreSQL’s autovacuum daemon should handle this automatically, but in high-write environments or with specific configuration settings, tables can fall behind. Regular monitoring of pg_stat_user_tables is essential.
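A simple staleness check along these lines (the 7-day threshold is illustrative) can be run on a schedule or wired into alerting:
-- Tables whose statistics haven't been refreshed recently, or ever
SELECT relname,
       last_analyze,
       last_autoanalyze,
       n_dead_tup
FROM pg_stat_user_tables
WHERE COALESCE(GREATEST(last_analyze, last_autoanalyze), 'epoch'::timestamptz)
      < now() - interval '7 days';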
2. VACUUM vs. ANALYZE vs. VACUUM ANALYZE
- Use ANALYZE when you just need updated statistics
- Use VACUUM when you need to reclaim space
- Use VACUUM ANALYZE (or VACUUM (ANALYZE) in the newer option syntax) to do both
3. The Diagnostic Process Matters
We didn’t jump straight to “add an index” or “rewrite the query.” We systematically examined each layer:
- Query logic
- Schema design
- Execution plans
- Database statistics
4. Preventive Measures
After this fix, we recommended:
- Monitoring autovacuum: Check that it’s running and completing successfully
- Adjusting autovacuum settings if necessary for high-transaction tables (see the sketch after this list)
- Setting up alerts for tables with stale statistics
- Scheduling manual VACUUM ANALYZE during maintenance windows for critical tables
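For those high-transaction tables, per-table autovacuum tuning is done with storage parameters; the thresholds below are an illustrative sketch, not the client's settings:
-- Analyze after ~2% of rows change and vacuum after ~5%, instead of the 10%/20% defaults
ALTER TABLE orders SET (
    autovacuum_analyze_scale_factor = 0.02,
    autovacuum_vacuum_scale_factor = 0.05
);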
Conclusion
Database performance issues are rarely about a single “silver bullet” solution. This case demonstrated that even with proper indexes and well-written queries, outdated statistics can cripple performance. By following a methodical troubleshooting approach and understanding how PostgreSQL’s query planner works, we identified and resolved the root cause efficiently—without the need for schema changes or query rewrites.
The lesson? Sometimes the best optimization is simply giving your database the information it needs to make smart decisions.
If you're new to these concepts, check out this post for an introduction to VACUUM.
