In our previous blog, we explored the high availability (HA) architecture for PostgreSQL using Patroni, etcd, HAProxy, and PgBouncer. We discussed how these components work together to ensure automatic failover, connection pooling, and efficient traffic distribution. Additionally, we covered the installation of each component to set up the HA environment.
In this blog, we will focus on the configuration of Patroni with etcd, detailing how to properly set up and fine-tune these components for a resilient PostgreSQL cluster. We will also run a quick test to verify the setup and demonstrate the failover process to ensure seamless database availability.
Prerequisites
Here are the IP addresses of the three servers that we used to configure Patroni with etcd.
- server1 – 192.168.80.160
- server2 – 192.168.80.161
- server3 – 192.168.80.162
Certain ports must be opened between the servers so that all components can function seamlessly together. Here are the commands to open them (run these on all five servers, including the two HAProxy servers).
#The port used for PostgreSQL client connections
firewall-cmd --zone=public --add-port=5432/tcp --permanent
#The port used for PgBouncer client connections
firewall-cmd --zone=public --add-port=6432/tcp --permanent
#The port used for the Patroni REST API
firewall-cmd --zone=public --add-port=8008/tcp --permanent
#The port used for etcd client communication
firewall-cmd --zone=public --add-port=2379/tcp --permanent
#The port used for etcd peer communication
firewall-cmd --zone=public --add-port=2380/tcp --permanent
#The port used by HAProxy to route client connections to the primary
firewall-cmd --zone=public --add-port=5000/tcp --permanent
#The port used by HAProxy to route client connections to the read replicas
firewall-cmd --zone=public --add-port=5001/tcp --permanent
#Reload the firewalld services
firewall-cmd --reload
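To confirm the rules took effect, you can list the open ports on each server after the reload:
#Verify that the required ports are now open in the public zone
firewall-cmd --zone=public --list-ports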
Configuration of etcd
After completing the installation of etcd on all three nodes, verify it by checking the version.
etcd --version
#Output
etcd Version: 3.4.34
Git SHA: c123b3ea3
Go Version: go1.22.7
Go OS/Arch: linux/amd64
Server1
Edit the etcd configuration file on server1. As we are configuring a three-node Patroni cluster, we add all three IPs to ETCD_INITIAL_CLUSTER; adjust this according to your cluster.
[root@server1 ~]# cd /etc/etcd
[root@server1 etcd]# ls -ltr
total 8
-rw-r--r-- 1 root root 1486 Jan 26 22:33 etcd.conf.orig
-rw-r--r-- 1 root root 881 Feb 6 14:11 etcd.conf
[root@server1 etcd]# vi etcd.conf
# Specify the name of this etcd member
ETCD_NAME=server1
# Configure the data directory
ETCD_DATA_DIR="/var/lib/etcd/server1"
# Configure the listen URLs
ETCD_LISTEN_PEER_URLS="http://192.168.80.160:2380,http://127.0.0.1:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.80.160:2379,http://127.0.0.1:2379"
# Configure the advertise URLs
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.80.160:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.80.160:2379"
# Configure the initial cluster
ETCD_INITIAL_CLUSTER="server1=http://192.168.80.160:2380,server2=http://192.168.80.161:2380,server3=http://192.168.80.162:2380"
# Configure the initial cluster state
ETCD_INITIAL_CLUSTER_STATE="new"
# Configure the initial cluster token
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
# Configure v2 API
ETCD_ENABLE_V2="true"
Server2
Edit the etcd configuration file in server2.
[root@server2 ~]# cd /etc/etcd
[root@server2 etcd]# ls -ltr
total 8
-rw-r--r-- 1 root root 1486 Jan 26 22:33 etcd.conf.orig
-rw-r--r-- 1 root root 881 Feb 6 14:12 etcd.conf
[root@server2 etcd]# vi etcd.conf
# Specify the name of this etcd member
ETCD_NAME=server2
# Configure the data directory
ETCD_DATA_DIR="/var/lib/etcd/server2"
# Configure the listen URLs
ETCD_LISTEN_PEER_URLS="http://192.168.80.161:2380,http://127.0.0.1:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.80.161:2379,http://127.0.0.1:2379"
# Configure the advertise URLs
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.80.161:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.80.161:2379"
# Configure the initial cluster
ETCD_INITIAL_CLUSTER="server1=http://192.168.80.160:2380,server2=http://192.168.80.161:2380,server3=http://192.168.80.162:2380"
# Configure the initial cluster state
ETCD_INITIAL_CLUSTER_STATE="new"
# Configure the initial cluster token
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
# Configure v2 API
ETCD_ENABLE_V2="true"
Server3
Edit the etcd configuration file in server3.
[root@server3 ~]# cd /etc/etcd/
[root@server3 etcd]# ls -ltr
total 8
-rw-r--r-- 1 root root 1486 Jan 26 22:33 etcd.conf.orig
-rw-r--r-- 1 root root 881 Feb 6 14:13 etcd.conf
[root@server3 etcd]# vi etcd.conf
# Specify the name of this etcd member
ETCD_NAME=server3
# Configure the data directory
ETCD_DATA_DIR="/var/lib/etcd/server3"
# Configure the listen URLs
ETCD_LISTEN_PEER_URLS="http://192.168.80.162:2380,http://127.0.0.1:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.80.162:2379,http://127.0.0.1:2379"
# Configure the advertise URLs
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.80.162:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.80.162:2379"
# Configure the initial cluster
ETCD_INITIAL_CLUSTER="server1=http://192.168.80.160:2380,server2=http://192.168.80.161:2380,server3=http://192.168.80.162:2380"
# Configure the initial cluster state
ETCD_INITIAL_CLUSTER_STATE="new"
# Configure the initial cluster token
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
# Configure v2 API
ETCD_ENABLE_V2="true"
After configuring etcd on all three servers, enable and start the etcd service using systemctl.
systemctl start etcd
systemctl enable etcd
systemctl status etcd
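Once the services are started, a quick membership check confirms that all three nodes have joined the cluster; querying any single endpoint is enough (adjust the IP to one of your own nodes):
#List the cluster members via one endpoint
etcdctl member list --write-out=table --endpoints=192.168.80.160:2379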
With the etcd service up and running on all three servers, check the endpoint status using the command below, after adding the server and endpoint details to the .bash_profile on all three servers.
#Showing server3 for reference
[root@server3 ~]# vi .bash_profile
server1=192.168.80.160
server2=192.168.80.161
server3=192.168.80.162
ENDPOINTS=$server1:2379,$server2:2379,$server3:2379
[root@server3 ~]# . .bash_profile
#After adding, check the status using the command
[root@server3 etcd]# etcdctl endpoint status --write-out=table --endpoints=$ENDPOINTS
+---------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+---------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.80.160:2379 | 16660bd8f0b37be2 | 3.5.18 | 234 kB | false | false | 16 | 788 | 788 | |
| 192.168.80.161:2379 | a1b073e6045e4de5 | 3.5.18 | 234 kB | true | false | 16 | 788 | 788 | |
| 192.168.80.162:2379 | e9aa0dcba4631717 | 3.5.18 | 234 kB | false | false | 16 | 788 | 788 | |
+---------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
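In addition to the status table, etcdctl can probe the health of each endpoint; every member should report healthy:
#Each healthy member reports a successfully committed proposal
etcdctl endpoint health --endpoints=$ENDPOINTS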
You can now see that server2 is the leader of the cluster. Let's break down what happens if server2 goes down.
Step1 – Check the endpoint status of the etcd cluster, then stop the etcd service on server2, as it is the current leader of the cluster.
[root@server2 ~]# etcdctl endpoint status --write-out=table --endpoints=$ENDPOINTS
+---------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+---------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.80.160:2379 | 16660bd8f0b37be2 | 3.5.18 | 234 kB | false | false | 16 | 788 | 788 | |
| 192.168.80.161:2379 | a1b073e6045e4de5 | 3.5.18 | 234 kB | true | false | 16 | 788 | 788 | |
| 192.168.80.162:2379 | e9aa0dcba4631717 | 3.5.18 | 234 kB | false | false | 16 | 788 | 788 | |
+---------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
[root@server2 ~]# systemctl stop etcd
Step2 – Now that etcd is stopped on server2, check the endpoint status again.
[root@server2 ~]# etcdctl endpoint status --write-out=table --endpoints=$ENDPOINTS
{"level":"warn","ts":"2025-02-19T14:42:58.041855+0530","logger":"etcd-client","caller":"v3@v3.5.18/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003fc000/192.168.80.160:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing: dial tcp 192.168.80.161:2379: connect: connection refused\""}
Failed to get the status of endpoint 192.168.80.161:2379 (context deadline exceeded)
+---------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+---------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.80.160:2379 | 16660bd8f0b37be2 | 3.5.18 | 234 kB | false | false | 17 | 789 | 789 | |
| 192.168.80.162:2379 | e9aa0dcba4631717 | 3.5.18 | 234 kB | true | false | 17 | 789 | 789 | |
+---------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
Note: As etcdctl is a client utility that talks to the etcd endpoints over the network, we can still use it from server2 even though the local etcd service is stopped there.
You can now see that server3 is the new leader of the cluster.
Step3 – Now, start the etcd service on server2 and check whether server2 rejoins the cluster.
[root@server2 ~]# systemctl start etcd
[root@server2 ~]# etcdctl endpoint status --write-out=table --endpoints=$ENDPOINTS
+---------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+---------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.80.160:2379 | 16660bd8f0b37be2 | 3.5.18 | 234 kB | false | false | 17 | 790 | 790 | |
| 192.168.80.161:2379 | a1b073e6045e4de5 | 3.5.18 | 234 kB | false | false | 17 | 790 | 790 | |
| 192.168.80.162:2379 | e9aa0dcba4631717 | 3.5.18 | 234 kB | true | false | 17 | 790 | 790 | |
+---------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
Now we can see that server2 has rejoined the cluster as a follower.
Configuration of Patroni
After completing the installation of Patroni on all three nodes, verify it by checking the version.
patroni --version
#Output
patroni 4.0.4
Server1
Edit the Patroni configuration file on server1. If you look at the configuration file, it contains the etcd, bootstrap, pg_hba, PostgreSQL, parameters, and watchdog sections.
[root@server1 ~]# cd /etc/patroni/
[root@server1 patroni]# ls -ltr
total 4
drwxr-xr-x 2 root root 26 Feb 6 14:46 callbacks
-rw-r--r-- 1 root root 1194 Feb 6 15:10 patroni.yml
[root@server1 patroni]# vi patroni.yml
scope: postgres
namespace: /db/
name: server1

restapi:
  listen: 192.168.80.160:8008
  connect_address: 192.168.80.160:8008

etcd3:
  hosts: 192.168.80.160:2379,192.168.80.161:2379,192.168.80.162:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
  pg_hba:
  - host replication replicator 192.168.80.160/32 md5
  - host replication replicator 192.168.80.161/32 md5
  - host replication replicator 192.168.80.162/32 md5
  - host replication all 0.0.0.0/0 md5
  - host all all 0.0.0.0/0 md5

postgresql:
  listen: 192.168.80.160:5432
  connect_address: 192.168.80.160:5432
  data_dir: /u01/pgsql/17
  bin_dir: /usr/pgsql-17/bin
  authentication:
    replication:
      username: replicator
      password: replicator
    superuser:
      username: postgres
      password: postgres
  parameters:
    unix_socket_directories: '/run/postgresql/'

watchdog:
  mode: required
  device: /dev/watchdog
  safety_margin: 5

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
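Before starting the service, the file can be sanity-checked with Patroni's built-in validator, which reports YAML syntax and schema errors:
#Validate the Patroni configuration file
patroni --validate-config /etc/patroni/patroni.yml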
If you need to change any postgresql.conf parameters, such as shared_buffers or work_mem, mention them in the parameters section according to your compute capacity and requirements. Watchdog devices are software or hardware mechanisms that reset the whole system when they do not receive a keepalive heartbeat within a specified timeframe. This adds an additional layer of fail-safe in case the usual Patroni split-brain protection mechanisms fail.
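Because the watchdog mode is set to required above, Patroni will not promote a node that cannot open /dev/watchdog. On machines without a hardware watchdog, the kernel softdog module is a common stand-in; here is a minimal sketch (note that the chown must be reapplied after each reboot unless you add a udev rule):
#Load the software watchdog module and hand the device to the postgres user
modprobe softdog
chown postgres /dev/watchdog
#Load the module automatically at boot (systemd-based systems)
echo softdog > /etc/modules-load.d/softdog.conf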
Server2
Edit the configuration file in server2.
[root@server2 ~]# vi /etc/patroni/patroni.yml
scope: postgres
namespace: /db/
name: server2

restapi:
  listen: 192.168.80.161:8008
  connect_address: 192.168.80.161:8008

etcd3:
  hosts: 192.168.80.160:2379,192.168.80.161:2379,192.168.80.162:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
  pg_hba:
  - host replication replicator 192.168.80.160/32 md5
  - host replication replicator 192.168.80.161/32 md5
  - host replication replicator 192.168.80.162/32 md5
  - host replication all 0.0.0.0/0 md5
  - host all all 0.0.0.0/0 md5

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 192.168.80.161:5432
  data_dir: /u01/pgsql/17
  bin_dir: /usr/pgsql-17/bin
  authentication:
    replication:
      username: replicator
      password: replicator
    superuser:
      username: postgres
      password: postgres
  parameters:
    unix_socket_directories: '/run/postgresql/'

watchdog:
  mode: required
  device: /dev/watchdog
  safety_margin: 5

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
Server3
Edit the configuration file in server3.
[root@server3 ~]# vi /etc/patroni/patroni.yml
scope: postgres
namespace: /db/
name: server3

restapi:
  listen: 192.168.80.162:8008
  connect_address: 192.168.80.162:8008

etcd3:
  hosts: 192.168.80.160:2379,192.168.80.161:2379,192.168.80.162:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
  pg_hba:
  - host replication replicator 192.168.80.160/32 md5
  - host replication replicator 192.168.80.161/32 md5
  - host replication replicator 192.168.80.162/32 md5
  - host replication all 0.0.0.0/0 md5
  - host all all 0.0.0.0/0 md5

postgresql:
  listen: 192.168.80.162:5432
  connect_address: 192.168.80.162:5432
  data_dir: /u01/pgsql/17
  bin_dir: /usr/pgsql-17/bin
  authentication:
    replication:
      username: replicator
      password: replicator
    superuser:
      username: postgres
      password: postgres
  parameters:
    unix_socket_directories: '/run/postgresql/'

watchdog:
  mode: required
  device: /dev/watchdog
  safety_margin: 5

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
After configuring Patroni on all three servers, start the patroni service using systemctl.
systemctl start patroni
Here, Patroni will initialize the database cluster according to the configuration given in the patroni.yml file, on the server chosen as the leader. We can see a message like this in the log file:
The database cluster will be initialized with locale "en_US.UTF-8"
On the replica servers, Patroni simply takes a base backup of the leader. We can see a message like this in the log file:
INFO: replica has been created using basebackup
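Since Patroni runs as a systemd service in this setup, these bootstrap and base-backup messages can be followed live with journalctl:
#Follow the Patroni logs on any node
journalctl -u patroni -f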
To list the servers in the Patroni cluster, use the following command.
#Checking on server3
[root@server3 ~]# patronictl -c /etc/patroni/patroni.yml list
+ Cluster: postgres (7468243839154533834) ------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+---------+----------------+---------+----------+----+-----------+
| server1 | 192.168.80.160 | Replica | running | 4 | 0 |
| server2 | 192.168.80.161 | Replica | running | 3 | 0 |
| server3 | 192.168.80.162 | Leader | running | 5 | |
+---------+----------------+---------+----------+----+-----------+
Here, if we observe, server3 is the current leader. We will test failover after a quick replication test.
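Besides patronictl, each node exposes its role through the Patroni REST API on port 8008 (opened in the prerequisites); for example, the /primary endpoint returns HTTP 200 only on the leader:
#Returns 200 on the leader (server3 here) and 503 on replicas
curl -s -o /dev/null -w "%{http_code}\n" http://192.168.80.162:8008/primary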
Quick test
In this quick test, let's create a table on the current leader (server3) and check for it on one of the replicas.
[root@server3 ~]# su - postgres
[postgres@server3 ~]$ psql
psql (17.2)
Type "help" for help.
postgres=# create table emp (id int);
CREATE TABLE
postgres=# \dt+
List of relations
Schema | Name | Type | Owner | Persistence | Access method | Size | Description
--------+------+-------+----------+-------------+---------------+---------+-------------
public | emp | table | postgres | permanent | heap | 0 bytes |
(1 row)
Now, let's check whether the table was created on server1.
[root@server1 ~]# su - postgres
[postgres@server1 ~]$ psql
psql (17.2)
postgres=# \dt+
List of relations
Schema | Name | Type | Owner | Persistence | Access method | Size | Description
--------+------+-------+----------+-------------+---------------+---------+-------------
public | emp | table | postgres | permanent | heap | 0 bytes |
(1 row)
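As a further sanity check, a write attempted on a replica should be rejected, since Patroni keeps the replicas in read-only recovery mode; for example, on server1:
postgres=# insert into emp values (1);
ERROR:  cannot execute INSERT in a read-only transaction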
As the table was created on the replica automatically, let's move on to the failover part now.
Failover
As the current leader is server3, let's check whether one of the replicas is promoted to primary once the Patroni service is stopped on server3.
Step1 – Check the status of Patroni and stop the Patroni service on server3.
[root@server3 ~]# patronictl -c /etc/patroni/patroni.yml list
+ Cluster: postgres (7468243839154533834) ------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+---------+----------------+---------+----------+----+-----------+
| server1 | 192.168.80.160 | Replica | running | 4 | 0 |
| server2 | 192.168.80.161 | Replica | running | 3 | 0 |
| server3 | 192.168.80.162 | Leader | running | 5 | |
+---------+----------------+---------+----------+----+-----------+
#Stop the patroni
[root@server3 ~]# systemctl stop patroni
[root@server3 ~]# patronictl -c /etc/patroni/patroni.yml list
+ Cluster: postgres (7468243839154533834) ------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+---------+----------------+---------+----------+----+-----------+
| server1 | 192.168.80.160 | Replica | running | 4 | 0 |
| server2 | 192.168.80.161 | Leader | running | 6 | |
| server3 | 192.168.80.162 | Replica | stopped | | unknown |
+---------+----------------+---------+----------+----+-----------+
Here, we can observe that server2 became the new leader, while server3 was demoted to a replica and is currently in the stopped state.
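You can corroborate the new roles at the PostgreSQL level on the running nodes, since pg_is_in_recovery() returns f on the primary and t on replicas:
#Expected output: f on server2 (the new leader), t on server1
su - postgres -c 'psql -c "select pg_is_in_recovery();"'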
Step2 – Now, start the Patroni service on server3 and check the status again.
[root@server3 ~]# systemctl start patroni
#Check the list
[root@server3 ~]# patronictl -c /etc/patroni/patroni.yml list
+ Cluster: postgres (7468243839154533834) ------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+---------+----------------+---------+----------+----+-----------+
| server1 | 192.168.80.160 | Replica | running | 4 | 0 |
| server2 | 192.168.80.161 | Leader | running | 6 | |
| server3 | 192.168.80.162 | Replica | running | 5 | 16 |
+---------+----------------+---------+----------+----+-----------+
Here we can observe that server3 is up and running again, now following server2. The advantage of using Patroni is that no manual intervention is required to rejoin the failed leader to the cluster, unlike repmgr.
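If you later want to move the leader role back to server3, there is no need to stop any services: patronictl also supports a planned switchover, prompting interactively for the candidate and a scheduled time.
#Gracefully transfer leadership to another member
patronictl -c /etc/patroni/patroni.yml switchover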
Summary
In this blog, we covered the complete configuration of Patroni with etcd to set up a highly available PostgreSQL cluster. We started by configuring etcd on three servers, ensuring proper cluster formation and leader election. We then proceeded with the Patroni configuration, integrating it with etcd to manage automatic failover and replication. After setting up Patroni, we performed a quick test by creating a table on the leader node and verifying replication across the replicas. Finally, we simulated a failover scenario, demonstrating how Patroni automatically promotes a replica to leader when the primary node goes down, and how the failed leader seamlessly rejoins the cluster once it recovers. In the next blog of this series, we will integrate HAProxy and PgBouncer with our Patroni cluster.
Stay tuned for a deeper dive into these components and their configurations!