Fixing Standby Slot Problems with pg_auto_failover

Dealing with failover in PostgreSQL clusters has always been tricky, but pg_auto_failover makes it considerably easier: it monitors the cluster, orchestrates failover, and takes care of fencing the old primary.

So, how does it work? pg_auto_failover's simplest deployment is the single standby architecture, which keeps your PostgreSQL service available with just three nodes: one primary PostgreSQL node, one standby node kept in sync via streaming replication and ready to take over, and a monitor node that watches both and decides when a failover is needed.

We had an existing primary running with a hot standby replica and wanted to bring it under pg_auto_failover. As part of that process, we created a monitor node and then registered the primary and standby with it.

Create monitor:

nohup /usr/pgsql-14/bin/pg_autoctl create monitor --pgctl /usr/pgsql-14/bin/pg_ctl --pgdata /tmp/monitor_node/ --pgport 5433 --auth trust --run  --no-ssl > /dev/null 2>&1 &

Register the nodes with the monitor (the command for node2 is shown; the primary is registered the same way with its own --name and --hostname):

pg_autoctl create postgres --name node2 --pgport 5432 --dbname pg_auto_failover --monitor postgres://autoctl_node@192.168.10.101:5433/pg_auto_failover --auth md5 --ssl-self-signed --hostname 192.168.10.102
-bash-4.2$ pg_autoctl show state
 Name |  Node |           Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
------+-------+---------------------+----------------+--------------+---------------------+--------------------
node1 |     1 | 192.168.10.101:5432 |   1: 0/501D530 |   read-write |        wait_primary |             primary
node2 |     8 | 192.168.10.102:5432 |   1: 0/5000000 |    read-only |           secondary |           secondary

Everything went well and pg_autoctl was showing the correct status. After a few days of smooth operation, however, we noticed our pg_wal directory was filling up fast. Upon investigation, we discovered that a replication slot created by pg_auto_failover was to blame: although it was inactive and unused, its restart_lsn still pinned old WAL, so PostgreSQL could not recycle those segments.
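To see how much WAL an inactive slot is holding back, you can compare the server's current write position with each slot's restart_lsn. A query sketch (it must run on the primary, against a live server):

```sql
-- How much WAL each inactive replication slot is still pinning.
SELECT slot_name,
       active,
       restart_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
WHERE NOT active;
```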

Initially, we tried to manually drop the replication slot (pgautofailover_standby_8) that pg_auto_failover had created, but that did not work: as the session below shows, the pg_drop_replication_slot() call returned without error, yet the slot was still present immediately afterwards.

postgres=# select * from pg_replication_slots ;
-[ RECORD 1 ]-------+-------------------------
slot_name           | manual_slot
plugin              |
slot_type           | physical
datoid              |
database            |
temporary           | f
active              | t
active_pid          | 8492
xmin                | 739
catalog_xmin        |
restart_lsn         | 0/501D530
confirmed_flush_lsn |
wal_status          | reserved
safe_wal_size       |
two_phase           | f
-[ RECORD 2 ]-------+-------------------------
slot_name           | pgautofailover_standby_8
plugin              |
slot_type           | physical
datoid              |
database            |
temporary           | f
active              | f
active_pid          |
xmin                |
catalog_xmin        |
restart_lsn         | 0/501D448
confirmed_flush_lsn |
wal_status          | reserved
safe_wal_size       |
two_phase           | f

postgres=# select pg_drop_replication_slot('pgautofailover_standby_8');
-[ RECORD 1 ]------------+-
pg_drop_replication_slot |
postgres=#  select * from pg_replication_slots ;
-[ RECORD 1 ]-------+-------------------------
slot_name           | manual_slot
plugin              |
slot_type           | physical
datoid              |
database            |
temporary           | f
active              | t
active_pid          | 8492
xmin                | 739
catalog_xmin        |
restart_lsn         | 0/501D530
confirmed_flush_lsn |
wal_status          | reserved
safe_wal_size       |
two_phase           | f
-[ RECORD 2 ]-------+-------------------------
slot_name           | pgautofailover_standby_8
plugin              |
slot_type           | physical
datoid              |
database            |
temporary           | f
active              | f
active_pid          |
xmin                |
catalog_xmin        |
restart_lsn         | 0/501D448
confirmed_flush_lsn |
wal_status          | reserved
safe_wal_size       |
two_phase           | f

Additionally, we discovered that slots weren't always removed during failover or switchover events. They were only deleted when pg_rewind actually had something to fix, because pg_rewind clears out several directories, including pg_replslot/, where slot state is stored on disk, and that is what removes the slot.

Solution:

To resolve the issue, we dismantled the existing disaster recovery setup and rebuilt it with pg_auto_failover from scratch. This reflects how pg_auto_failover works: it expects to control the entire replication process, including the replication slots.

bash-4.2$ pg_autoctl drop node --destroy
06:02:27 28290 INFO  Removing node with name "node1" in formation "default" from the monitor
06:02:27 28290 INFO  An instance of pg_autoctl is running with PID 5346, waiting for it to stop.
06:02:28 28290 INFO  The pg_autoctl instance with pid 5346 has now terminated.
06:02:28 28290 INFO  This node with id 1 in formation "default" and group 0 has been dropped from the monitor
06:02:28 28290 INFO  Stopping PostgreSQL at "/var/lib/pgsql/14/data"
06:02:28 28290 INFO  /usr/pgsql-14/bin/pg_ctl --pgdata /var/lib/pgsql/14/data --wait stop --mode fast
06:02:28 28290 INFO  /usr/pgsql-14/bin/pg_ctl status -D /var/lib/pgsql/14/data [3]
06:02:28 28290 INFO  pg_ctl: no server running
06:02:28 28290 INFO  pg_ctl stop failed, but PostgreSQL is not running anyway
06:02:28 28290 INFO  Removing "/var/lib/pgsql/14/data"
06:02:28 28290 INFO  Removing "/var/lib/pgsql/.config/pg_autoctl/var/lib/pgsql/14/data/pg_autoctl.cfg"

postgres=#  select * from pg_replication_slots ;
 slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn |
wal_status | safe_wal_size | two_phase
-----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+-
-----------+---------------+-----------
(0 rows)


-bash-4.2$ pg_autoctl create postgres --name node1 --pgport 5432 --dbname pg_auto_failover --monitor postgres://autoctl_node@192.168.10.101:5433/pg_auto_failover --auth trust --ssl-self-signed --pgdata /var/lib/pgsql/14/data

-bash-4.2$ pg_autoctl show state
 Name |  Node |           Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
------+-------+---------------------+----------------+--------------+---------------------+--------------------
node1 |     1 | 192.168.10.101:5432 |   1: 0/501D530 |   read-write |        wait_primary |             primary
node2 |     8 | 192.168.10.102:5432 |   1: 0/5000000 |    read-only |           secondary |           secondary

postgres=#  select * from pg_replication_slots ;
-[ RECORD 1 ]-------+-------------------------
slot_name           | pgautofailover_standby_9
plugin              |
slot_type           | physical
datoid              |
database            |
temporary           | f
active              | t
active_pid          | 8492
xmin                | 739
catalog_xmin        |
restart_lsn         | 0/501D530
confirmed_flush_lsn |
wal_status          | reserved
safe_wal_size       |
two_phase           | f

pg_auto_failover is built to tolerate the failure of any single node in the system. When a failover occurs, PostgreSQL switches to a new timeline, so a given Log Sequence Number (LSN) may refer to a history that has since diverged. Patroni manages failover in a similar manner.
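An LSN such as 0/501D530 is simply a 64-bit byte position in the WAL stream, written as two hexadecimal halves. A small illustrative helper (not part of pg_auto_failover; written here only to show the arithmetic) computes the distance between the two restart_lsn values seen in pg_replication_slots above:

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN like '0/501D530' to an absolute byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) + int(lo, 16)

# Distance between the two restart_lsn values from the slot listing above.
retained = lsn_to_bytes("0/501D530") - lsn_to_bytes("0/501D448")
print(retained)  # 232
```

This is the same calculation pg_wal_lsn_diff() performs on the server.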

Despite these challenges, we found that pg_auto_failover handles node failures effectively, ensuring continued operation even if one component goes down. However, it's essential to monitor and manage replication slots properly to prevent unnecessary WAL bloat in the pg_wal directory.
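As a safety net against this kind of bloat, PostgreSQL 13 and later can cap how much WAL any replication slot may retain; a slot that falls further behind is invalidated (its wal_status becomes "lost") instead of filling the disk. A sketch, with the 10GB value chosen arbitrarily for illustration:

```ini
# postgresql.conf: limit WAL kept for replication slots (PostgreSQL 13+).
# Slots needing more than this are invalidated rather than filling pg_wal.
max_slot_wal_keep_size = '10GB'
```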

In conclusion, while implementing pg_auto_failover may come with its challenges, the benefits far outweigh the drawbacks. By ensuring seamless failover management, pg_auto_failover simplifies the task of maintaining PostgreSQL clusters.
