PG18 Hacktober: 31 Days of New Features : Ansible based Installation & Replication

PG18 Hacktober shifts gears from database features to DevOps. We’ll explore how to automate the setup of PostgreSQL 18 and configure replication using modern infrastructure-as-code techniques, the DevOps way. As PostgreSQL continues to evolve with every release, automation has become an essential part of modern database deployment.Manual setup of PostgreSQL clusters especially with streaming replication can be error-prone, time-consuming, and hard to reproduce in large environments.

In this post, we’ll explore how to use Ansible to automate the installation and replication setup of PostgreSQL 18 on RHEL. By codifying each step from package installation and configuration tuning to initializing primary and standby nodes ,we can ensure repeatable, version-controlled, and quick deployments.

This hands-on guide demonstrates how automation not only simplifies administration but also helps teams scale faster, reduce human errors, and achieve predictable PostgreSQL environments across development, staging, and production.

Prerequisites

We are considering One Primary and two standbys.

RHEL 9/10 servers (1 control node, 1 primary PostgreSQL server, and 2 PostgreSQL standbys.)
Ansible Installed on control node.
Manged nodes must and should have python latest version.
SSH access with sudo privileges to all nodes
Basic understanding of PostgreSQL and Linux

Architecture Overview

Primary Node :- Accepts read/write traffic. Maintains WAL(Write Ahead Logs).
Standby Nodes :- Receive WAL from the primary and stay in sync. Can serve read-only queries if hot_standby is enabled.
Replication Slot :- Ensures Wal required for standby is not removed prematurely.

Primary (PostgreSQL 18)
WAL Streaming
———————–
| |
Standby1 Standby2
(read-only) (read-only)

Key Configuration Concepts

postgresql.conf
- wal_level = replica -> Required for replication
- max_wal_senders = 10 -> Number of standbys that can connect
- wal_keep_size = 128MB -> Ensures WAL segments remain long enough for standby.
- hot_standby = on -> Allows read queries on standby.
pg_hba.conf
- Configure replication user access for standby IPs
Replication User
- Must have REPLICATION LOGIN privileges
Replication Slots
- Prevents WAL segments from being removed before standby consumes them.

Ansible Playbook Overview

I automated the following:

Start and configure PostgreSQL on primary.
Add replication entries in pg_hba.conf file.
Create replication user if it does not exist.
Prepare standby nodes:
- Remove old data directories
- Create .pgpass file for passwordless replication.
- Run pg_basebackup to clone primary
- Start standby PostgreSQL.
Enable PostgreSQL service at boot for both primary and standby nodes.

What does the Playbook do?

Here I am using the following 2 playbooks and an additional inventory.ini:

PostgreSQL.yaml
replication.yaml

`PostgreSQL.yaml`

Setting up PostgreSQL manually on RHEL-based systems can be repetitive and error-prone. To simplify the process, I created an Ansible playbook that automates the installation and initialization of PostgreSQL 18 using the official PGDG (PostgreSQL Global Development Group) repository.

What This Playbook Does:

Installs prerequisites like curl and ca-certificates.
Adds the official PostgreSQL YUM repository (PGDG).
Imports the PGDG GPG key for package verification.
Installs PostgreSQL 17 server and client packages.
Initializes the database cluster.
Enables and starts the PostgreSQL service automatically.

`replication.yaml`

Objective:

Configure PostgreSQL 18 as a primary and multiple standby nodes.
Automate replication setup, .pgpass creation, and slot management.

Key Features:

Automatically configures postgresql.conf and pg_hba.conf for replication
Creates the replication user securely
Handles WAL and replication slot cleanup
Runs pg_basebackup to initialize standby nodes
Starts and enables PostgreSQL services on both primary and standbys

These vars control how PostgreSQL is installed and configured

- name: PostgreSQL 18 Streaming Replication Setup (RHEL PGDG)
  hosts: managed
  become: true
  vars:
    pg_version: 18
    pg_port_primary: 5432
    pg_port_standby: 5433
    replication_user: "replicator"
    replication_password: "ReplPass@123"
    pg_data_dir: "/var/lib/pgsql/{{ pg_version }}/data"
    standby_data_dir: "/var/lib/pgsql/{{ pg_version }}/standby_data"
    pg_service_name: "postgresql-{{ pg_version }}"
    pg_basebackup_path: "/usr/pgsql-{{ pg_version }}/bin/pg_basebackup"
    pg_ctl_path: "/usr/pgsql-{{ pg_version }}/bin/pg_ctl"
    primary_ip: "{{ hostvars['pg_primary']['ansible_host'] }}"

What they mean:

pg_data_dir: main PostgreSQL data directory on the primary.
standby_data_dir: where standby data will be restored.
replication_user: special user for replication connections.
pg_basebackup_path: path to tool used for full base backup.

Primary Node Setup

These tasks run only on hosts in the [primary] group.

Add entries in pg_hba.conf file
- host replication replicator <IP_addr>/32 scram-sha-256
- This uses the standby nodes authenticate to the primary using the replication user.
Create replication user
- CREATE ROLE replicator REPLICATION LOGIN PASSWORD '<Password>';
- The standby connects with the help of this user for replication.
Restart PostgreSQL
- Applies all configuration changes.

Standby Node Setup

These tasks run on [standby] group nodes.

Stop PostgreSQL on standby
- Ensures PostgreSQL is not running before overwriting its data directory.
Create .pgpass file
- Stores replication credentials securely
- :5432:*:replicator:<password>
This avoids interactive password prompts during backup.

Running pg_basebackup

This is the most important step — it copies the full database from the primary to standby.

pg_basebackup -h  -p 5432 -D /var/lib/pgsql/18/standby_data \
  -U replicator -P -v -R -C -S standby_slot_{{inventory_hostname}}

-h → primary host
-p → primary port
-D → data directory on standby
-U → replication user
-R → automatically creates a standby.signal file for recovery mode
-C and -S → create replication slot on primary for this standby

After this,

The standby data directory becomes a copy of the primary.
Update standby configuration
PostgreSQL uses the new data directory (standby_data_dir)
Standby runs on a different port (5433) so both can run simultaneously on one host (for testing)
Start standby PostgreSQL
- pg_ctl -D /var/lib/pgsql/18/standby_data start
This brings up the standby in recovery mode, which continuously applies WAL logs from the primary.

`inventory.ini`

[primary]
pg_primary ansible_host=172.31.26.90

[standby]
pg_standby1 ansible_host=172.31.23.32
pg_standby2 ansible_host=172.31.19.112

[managed:children]
primary
standby

In the inventory.ini file, we can define and manage any number of nodes like 1 Primary, and n number of Standby nodes.

Why Use PGDG?

The PGDG repository ensures you get the latest stable PostgreSQL versions independent of your OS’s default repositories. This is ideal for testing new features and maintaining consistency across environments.

Key features of the playbooks

Handles multiple standby nodes dynamically
Creates replication slots safely via `pg_basebackup`.
Avoids common standby ports to avoid conflicts.

Example Playbook Snippets

#Add replication entries dynamically

- name: Add replication entries to pg_hba.conf
  blockinfile:
    path: "{{ pg_data_dir }}/pg_hba.conf"
    block: |
      {% for host in groups['standby'] %}
      host replication {{ replication_user }} {{ hostvars[host]['ansible_host'] }}/32 trust
      {% endfor %}
  when: "'primary' in group_names"
```
`Run pg_basebackup safely`
```
- name: Run pg_basebackup on standby
  shell: >
    sudo -iu postgres bash -c 'export PGPASSFILE=/var/lib/pgsql/.pgpass &&
    {{ pg_basebackup_path }} -h {{ primary_ip }} -p {{ pg_port_primary }} -D {{ standby_data_dir }}
    -U {{ replication_user }} -P -v -R -C -S standby_slot_{{ inventory_hostname }} -X stream'
  args:
    executable: /bin/bash
  when: "'standby' in group_names"

Verifying Replication

1. Check replication slots on primary

postgres=# SELECT slot_name, active FROM pg_replication_slots;
        slot_name         | active
--------------------------+--------
 standby_slot_pg_standby1 | t
 standby_slot_pg_standby2 | t
(2 rows)

2. Confirm standby is streaming WAL

postgres=# SELECT * FROM pg_stat_replication;
  pid  | usesysid |  usename   | application_name |  client_addr  | client_hostname | client_port |         backend_start         | backend_xmin |   state   | sent_lsn  | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state |          reply_time
-------+----------+------------+------------------+---------------+-----------------+-------------+-------------------------------+--------------+-----------+-----------+-----------+-----------+------------+-----------+-----------+------------+---------------+------------+-------------------------------
 14668 |    16388 | replicator | walreceiver      | 172.31.23.32  |                 |       50894 | 2025-10-13 07:25:46.699096+00 |              | streaming | 0/B000168 | 0/B000168 | 0/B000168 | 0/B000168  |           |           |            |             0 | async      | 2025-10-13 10:05:10.07311+00
 14669 |    16388 | replicator | walreceiver      | 172.31.19.112 |                 |       53426 | 2025-10-13 07:25:46.708208+00 |              | streaming | 0/B000168 | 0/B000168 | 0/B000168 | 0/B000168  |           |           |            |             0 | async      | 2025-10-13 10:05:09.888348+00
(2 rows)

3. Test read-only queries on standby

[postgres@ip-172-31-23-32 ~]$ psql -p 5433
psql (18.0)
Type "help" for help.

postgres=# SELECT now(), pg_is_in_recovery();
              now              | pg_is_in_recovery
-------------------------------+-------------------
 2025-10-13 10:07:37.997023+00 | t
(1 row)


[postgres@ip-172-31-19-112 ~]$ psql -p 5433
psql (18.0)
Type "help" for help.

postgres=# SELECT now(), pg_is_in_recovery();
              now              | pg_is_in_recovery
-------------------------------+-------------------
 2025-10-13 10:08:07.835761+00 | t
(1 row)

pg_is_in_recovery() returns true on standby

You Can Find GitHub Repo of the Playbooks.
Here, https://github.com/opensource-db
Note : You can go and click the Automation projects repo for this whole setup.

Conclusion

Automating PostgreSQL 18 installation and streaming replication with Ansible transforms what was once a complex, manual process into a fast, reliable, and repeatable workflow. By codifying configurations and replication tasks, DBAs can deploy consistent PostgreSQL environments across multiple nodes and data centers with minimal effort.

This approach not only eliminates human error but also enables scalable, version-controlled, and production-ready PostgreSQL clusters. As PostgreSQL continues to evolve, combining its powerful new features with automation tools like Ansible is the key to modern, resilient database operations.

What’s next?

As we gear up to celebrate Diwali, PG18 Hacktober is also bringing some fireworks of its own. In Day 20, we unwrap a feature that’s truly a cracker for anyone working with foreign data wrappers:
CREATE FOREIGN TABLE … LIKE – a clean, efficient way to define a foreign table by copying the structure of an existing table. Whether you’re federating data across PostgreSQL instances or integrating external sources, this feature will lighten your workload and make your FDW setup sparkle with simplicity. Stay tuned!!