Managing large datasets in PostgreSQL can quickly become a challenge, especially as tables grow and query performance starts to degrade. Partitioning is a proven approach to tackle this, but native PostgreSQL partitioning can be complex and time-consuming to set up and maintain. Here comes pg_partman, a robust extension that automates and simplifies partition management, making it accessible to everyone from database novices to seasoned DBAs.
What is pg_partman?
pg_partman (PostgreSQL Partition Manager) is an open-source extension designed to automate the creation and maintenance of partitioned tables in PostgreSQL. It supports both time-based and serial-based (number-based) partitioning, and is actively maintained by the PostgreSQL community.
With pg_partman, we can:
- Automatically create new partitions as data grows.
- Drop old partitions based on retention policies.
- Seamlessly manage both new and existing tables.
- Eliminate the need for external schedulers (like cron) through its built-in background worker process
Installation
pg_partman is available on github. The prerequisities are:
postgresql-develpackagegcc,make, and Git
Once we have prerequisties , execute the following commands with root or user with sudo privileges
# 1. Navigate to a temporary directory
cd /tmp
# 2. Clone the pg_partman GitHub repository
git clone https://github.com/pgpartman/pg_partman.git
cd pg_partman
# 3. Ensure PostgreSQL 16 binaries are in your PATH
export PATH=/usr/pgsql-16/bin:$PATH
# 4.1 Build and install the extension
make install
# 4.2 If we want to build and install the extension without background worker
make NO_BGW=1 install
Once the installation is done, connect to the database and enable extension
CREATE SCHEMA partman;
CREATE EXTENSION pg_partman SCHEMA partman;
postgres=# \dx
List of installed extensions
Name | Version | Schema | Description
------------+---------+------------+------------------------------------------------------
pg_partman | 5.2.4 | partman | Extension to manage partitioned tables by time or ID
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
(2 rows)
Time-Based and Serial-Based Partitioning in pg_partman
pg_partman supports two primary types of range partitioning: time-based and serial-based. Both approaches help manage large tables by dividing them into smaller, more manageable partitions, but they differ in their partitioning keys, use cases, and configuration.
| Feature | Time-Based Partitioning | Serial-Based Partitioning |
|---|---|---|
| Partition Key | Date or timestamp column | Integer or serial (ID) column |
| Partition Interval | Time intervals (e.g., day, month) | Numeric intervals (e.g., every 100,000) |
| Use Cases | Time-series data, logs, IoT, events | Sequential IDs, batch imports, archiving |
| Partition Naming | Based on date/time (e.g., 20240601) | Based on ID range (e.g., 100000-200000) |
| Configuration | p_interval set to time unit | p_interval set to numeric value |
| Typical Examples | Partition by day, week, month, year | Partition by every N IDs |
Let’s create a table and partition them using pg_partman
Time-based partition
#Create a table and partition using timestamp
postgres=# CREATE TABLE public.time_events (
event_id BIGSERIAL,
event_time TIMESTAMPTZ NOT NULL,
event_type TEXT,
details JSONB,
PRIMARY KEY (event_id, event_time)
) PARTITION BY RANGE (event_time);
CREATE TABLE
postgres=# SELECT partman.create_parent(
p_parent_table := 'public.time_events',
p_control := 'event_time',
p_interval := '1 day'
);
create_parent
---------------
t
(1 row)
postgres=# \d+ public.time_events
Partitioned table "public.time_events"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
------------+--------------------------+-----------+----------+-----------------------------------------------+----------+-------------+--------------+-------------
event_id | bigint | | not null | nextval('time_events_event_id_seq'::regclass) | plain | | |
event_time | timestamp with time zone | | not null | | plain | | |
event_type | text | | | | extended | | |
details | jsonb | | | | extended | | |
Partition key: RANGE (event_time)
Indexes:
"time_events_pkey" PRIMARY KEY, btree (event_id, event_time)
Partitions: time_events_p20250613 FOR VALUES FROM ('2025-06-13 00:00:00+05:30') TO ('2025-06-14 00:00:00+05:30'),
time_events_p20250614 FOR VALUES FROM ('2025-06-14 00:00:00+05:30') TO ('2025-06-15 00:00:00+05:30'),
time_events_p20250615 FOR VALUES FROM ('2025-06-15 00:00:00+05:30') TO ('2025-06-16 00:00:00+05:30'),
time_events_p20250616 FOR VALUES FROM ('2025-06-16 00:00:00+05:30') TO ('2025-06-17 00:00:00+05:30'),
time_events_p20250617 FOR VALUES FROM ('2025-06-17 00:00:00+05:30') TO ('2025-06-18 00:00:00+05:30'),
time_events_p20250618 FOR VALUES FROM ('2025-06-18 00:00:00+05:30') TO ('2025-06-19 00:00:00+05:30'),
time_events_p20250619 FOR VALUES FROM ('2025-06-19 00:00:00+05:30') TO ('2025-06-20 00:00:00+05:30'),
time_events_p20250620 FOR VALUES FROM ('2025-06-20 00:00:00+05:30') TO ('2025-06-21 00:00:00+05:30'),
time_events_p20250621 FOR VALUES FROM ('2025-06-21 00:00:00+05:30') TO ('2025-06-22 00:00:00+05:30'),
time_events_default DEFAULT
Series-based partition
#Create a table and partition using serial
postgres=# CREATE TABLE public.serial_items (
item_id BIGSERIAL PRIMARY KEY,
item_name TEXT,
created_at TIMESTAMPTZ DEFAULT now()
) PARTITION BY RANGE (item_id);
CREATE TABLE
postgres=# SELECT partman.create_parent(
p_parent_table := 'public.serial_items',
p_control := 'item_id',
p_interval := '1000'
);
-[ RECORD 1 ]-+--
create_parent | t
postgres=# \d+ serial_items
Partitioned table "public.serial_items"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
------------+--------------------------+-----------+----------+-----------------------------------------------+----------+-------------+--------------+-------------
item_id | bigint | | not null | nextval('serial_items_item_id_seq'::regclass) | plain | | |
item_name | text | | | | extended | | |
created_at | timestamp with time zone | | | now() | plain | | |
Partition key: RANGE (item_id)
Indexes:
"serial_items_pkey" PRIMARY KEY, btree (item_id)
Partitions: serial_items_p0 FOR VALUES FROM ('0') TO ('1000'),
serial_items_p1000 FOR VALUES FROM ('1000') TO ('2000'),
serial_items_p2000 FOR VALUES FROM ('2000') TO ('3000'),
serial_items_p3000 FOR VALUES FROM ('3000') TO ('4000'),
serial_items_p4000 FOR VALUES FROM ('4000') TO ('5000'),
serial_items_default DEFAULT
pg_partman maintains a configuration table partman.part_config. After setting up the partitions, querying this table shows details about each partitioned parent:
postgres=# select * from partman.part_config;
-[ RECORD 1 ]--------------+-------------------------------------
parent_table | public.time_events
control | event_time
time_encoder |
time_decoder |
partition_interval | 1 day
partition_type | range
premake | 4
automatic_maintenance | on
template_table | partman.template_public_time_events
retention |
retention_schema |
retention_keep_index | t
retention_keep_table | t
epoch | none
constraint_cols |
optimize_constraint | 30
infinite_time_partitions | f
datetime_string | YYYYMMDD
jobmon | t
sub_partition_set_full | f
undo_in_progress | f
inherit_privileges | f
constraint_valid | t
ignore_default_data | t
date_trunc_interval |
maintenance_order |
retention_keep_publication | f
maintenance_last_run |F
-[ RECORD 2 ]--------------+-------------------------------------
parent_table | public.serial_items
control | item_id
time_encoder |
time_decoder |
partition_interval | 1000
partition_type | range
premake | 4
automatic_maintenance | on
template_table | partman.template_public_serial_items
retention |
retention_schema |
retention_keep_index | t
retention_keep_table | t
epoch | none
constraint_cols |
optimize_constraint | 30
infinite_time_partitions | f
datetime_string |
jobmon | t
sub_partition_set_full | f
undo_in_progress | f
inherit_privileges | f
constraint_valid | t
ignore_default_data | t
date_trunc_interval |
maintenance_order |
retention_keep_publication | f
maintenance_last_run |
Understanding key columns of partman.part_config
parent_table: name of the parent (e.g.,public.time_events,public.serial_items)control: column used for partitioning (event_timeoritem_id)partition_interval: daily for time, or 1000 for seriestemplate_table: auto-generated internal template used to create each new child partitionautomatic_maintenance: whether pg_partman should auto-maintain partitions
We can also configure retention and infinite partitioning for the parent table managed by pg_partman by updating the partman.part_config. It enables infinite_time_partitions, allowing partitions to extend indefinitely into the future, and sets up automatic retention by specifying a retention period. Expired partitions will be moved to the specified schema instead of being dropped, with both table structures and indexes preserved by setting retention_keep_table and retention_keep_index to true.
Most frequenctly used procedures of pg_partman
Here are some of the frequently used procedures of pg_partman
| Function Name | Purpose |
|---|---|
create_parent | Register an existing table as a partition parent and configure partitioning |
undo_partition | Remove partitioning from a table and optionally merge data back to the parent |
run_maintenance | Manually trigger partition maintenance (create new partitions, drop old ones, etc.) |
show_partitions | List all child partitions for a given parent table |
pg_partman supports automatic partition maintenance to proactively create future partitions and avoid data falling into the default partition. This can be achieved using the run_maintenance() function, either through a background worker or a cron scheduler. To enable the background worker, the shared_preload_libraries must include 'pg_partman_bgw', which requires a PostgreSQL restart. Once enabled, configuration parameters such as pg_partman_bgw.interval, dbname, role, and analyze allow fine control over how and when partition maintenance runs. Alternatively, a cron job like 00 01,12 * * * psql -c "SELECT pg_partman.run_maintenance()" can be scheduled to run the maintenance function periodically. The key advantages of pg_partman include automated data retention policies (archive or drop partitions), improved query performance by narrowing scans to relevant partitions, better vacuum efficiency due to smaller partition sizes, and the ability to attach or detach partitions concurrently without service disruption.
Before we conclude, here are some blogs that explore manual approaches to partitioning and archiving :
- PostgreSQL Partitioning Made Easy: Features, Benefits, and Tips
- Efficient Data Management: Overcoming the Challenges of Large Tables with an Archival Strategy
Conclusion
Managing large-scale datasets in PostgreSQL becomes significantly more efficient with pg_partman, a powerful and flexible extension that simplifies and automates table partitioning. By supporting both time-based and serial-based partitioning, pg_partman caters to a wide range of use cases—from time-series logs to bulk data imports. It not only automates partition creation and retention but also provides tools like run_maintenance(), background workers, and cron integration to maintain optimal performance without manual intervention. With easy setup, customizable configurations, and powerful features like infinite partitions and non-destructive retention, pg_partman greatly reduces administrative overhead while improving query performance, vacuum efficiency, and operational flexibility. Whether you’re a PostgreSQL beginner or a seasoned DBA, pg_partman is an essential tool for scaling your data infrastructure with confidence.
