Introduction
pgstattuple is an open-source utility in PostgreSQL, provided as an extension. It helps manage and clear table bloat, which in turn improves table performance. It’s ability to provide detailed insights into the extent of bloat, makes pgstattuple a crucial tool for optimizing database efficiency and ensuring that your tables operate smoothly.
PostgreSQL is renowned for its robustness, flexibility, and advanced features. One such feature, pgstattuple, provides invaluable insights into the health and efficiency of your database tables and indexes. In this blog, we’ll delve into what pgstattuple is, how it works, and how it can be a game-changer for database maintenance.
What is pgstattuple?
pgstattuple is an extension for PostgreSQL that helps you analyze the physical storage of database tables. It provides detailed statistics on the space usage within your database, including how much space is wasted due to bloat. This information is crucial for maintaining optimal performance and ensuring efficient use of disk space.
Why is pgstattuple Important?
Over time, PostgreSQL databases can accumulate bloat due to various operations like inserts, updates, and deletes. This bloat results in unused space within tables and indexes, which can degrade performance and lead to inefficient storage use. pgstattuple helps you identify and quantify this bloat, allowing you to take corrective actions such as vacuuming or reindexing.
Step 1
Creating the Extension and Table
postgres=# create Extension pgstattuple;
CREATE EXTENSION
postgres=# create table test (id int, name varchar(90));
CREATE TABLE
Install the pgstattuple extension, which is necessary for analyzing bloat in tables. A table named test is created with two columns: id and name.
Step 2
Insert Data into the Table
postgres=# INSERT INTO test (id, name)
SELECT generate_series(1, 50) AS id, 'some_text' AS name;
INSERT 0 50
We insert 50 rows into the test
table. Each row has an id
from 1 to 50, and the name
is set to 'some_text'
.
Step 3
Initial Bloat Check
Check Table Bloat Using pgstattuple:
postgres=# select * from pgstattuple ('test');
-[ RECORD 1 ]------+------
table_len | 8192
tuple_count | 50
tuple_len | 1900
tuple_percent | 23.19
dead_tuple_count | 0
dead_tuple_len | 0
dead_tuple_percent | 0
free_space | 5964
free_percent | 72.8
- The initial check shows:
- table_len: Total table size in bytes (8192 bytes).
- tuple_count: Number of live tuples (50).
- tuple_len: Total size of live tuples (1900 bytes).
- dead_tuple_count: Number of dead tuples (0).
- free_space: Free space available in the table (5964 bytes).
- At this stage, there are no dead tuples, meaning there is no bloat.
Step4
Creating Bloat
Delete Some Rows:
postgres=# DELETE FROM test WHERE id BETWEEN 41 AND 50;
DELETE 10
We deleted 10 rows from the table where the id is between 41 and 50.
Check Bloat
postgres=# select * from pgstattuple ('test');
-[ RECORD 1 ]------+------
table_len | 8192
tuple_count | 40
tuple_len | 1520
tuple_percent | 18.55
dead_tuple_count | 10
dead_tuple_len | 380
dead_tuple_percent | 4.64
free_space | 5964
free_percent | 72.8
- After the deletion:
- dead_tuple_count: The number of dead tuples is now 10, indicating bloat.
- dead_tuple_len: Size of dead tuples (380 bytes).
- free_space: Free space remains the same (5964 bytes), but bloat is now present.
Step 5
Removing Bloat
- Vacuum the Table:
postgres=# vacuum test;
VACUUM
Running VACUUM helps to reclaim space from dead tuples, thus reducing bloat.
Check Bloat After Vacuuming:
postgres=# select * from pgstattuple ('test');
-[ RECORD 1 ]------+------
table_len | 8192
tuple_count | 40
tuple_len | 1520
tuple_percent | 18.55
dead_tuple_count | 0
dead_tuple_len | 0
dead_tuple_percent | 0
free_space | 6404
free_percent | 78.17
After vacuuming:
- dead_tuple_count: The dead tuples are now 0, meaning bloat has been cleared.
- free_space: The free space has increased to 6404 bytes
Note: It is recommended to schedule the VACUUM
activity during off-peak hours.
Conclusion:
pgstattuple is an indispensable tool for PostgreSQL database administrators aiming to maintain optimal performance and efficient storage. By offering detailed insights into table and space utilization, it helps identify areas where space is being wasted and performance may be degraded. Regular use of pgstattuple can reveal critical information about dead tuples, free space, and overall storage efficiency.
Incorporating pgstattuple into your database maintenance routine allows for proactive management, helping to mitigate issues related to bloat before they impact performance. By analyzing the results, you can make informed decisions about vacuuming, reindexing, and adjusting autovacuum settings, ultimately leading to a healthier and more efficient PostgreSQL environment. In this blog, we used pgstattuple to monitor and manage table bloat effectively in order to ensure optimal database performance.
Leave a Reply