Hello, fellow PostgreSQL enthusiasts! Welcome back to the final blog of the series “PostgreSQL HA with Patroni”.
In our previous blog of the series, we covered the complete configuration of Patroni with etcd to set up a highly available PostgreSQL cluster. We started by configuring etcd on three servers, ensuring proper cluster formation and leader election. We then proceeded with Patroni installation and configuration, integrating it with etcd to manage automatic failover and replication. After setting up Patroni, we performed a quick test by creating a table on the leader node and verifying replication across the replicas. Finally, we simulated a failover scenario, demonstrating how Patroni automatically promotes a replica to leader status when the primary node goes down. The process also highlighted how the failed leader seamlessly rejoins the cluster once it recovers.
Here is a quick recap of the previous blogs in the series.
- 1st Blog: The first blog covers the PostgreSQL High Availability architecture built on top of the open-source tools Patroni, etcd, HAProxy, and PgBouncer. Here we discussed the importance of each tool, how to install them, and how these components work together.
- 2nd Blog: In this post, we focused on configuring Patroni and etcd, and did a quick test of the configuration by creating a table. We also simulated a failover scenario by stopping the current leader.
In this blog post, we’ll take it a step further by integrating HAProxy and PgBouncer with our Patroni cluster to enhance high availability, load balancing, and connection pooling.
Why HAProxy and PgBouncer?
- HAProxy: A high-performance load balancer used to route traffic between the primary and standby PostgreSQL instances. It ensures that database connections are always directed to the current primary, providing failover capabilities by switching connections to a new primary in the event of a failure.
- PgBouncer: A lightweight connection pooler for PostgreSQL that improves performance by pooling and reusing database connections. It is particularly useful in environments with high connection churn and can be used with Patroni to optimize database connection management.
By combining Patroni, HAProxy, and PgBouncer, we can achieve a robust, scalable, and highly available PostgreSQL cluster with seamless connection handling and failover capabilities. Installations were covered in Part 1 of this series; please check it if you need any insights. If we look at the architecture in the first blog, we have PostgreSQL, Patroni, etcd, and PgBouncer installed on 3 out of 5 servers. The remaining two servers are dedicated to HAProxy. For the Patroni and etcd configuration, click here.
Configuration of PgBouncer
Once we are done with the installation of the tools and the configuration of Patroni and etcd, we will configure PgBouncer on the three database servers.
#Configuration file for PgBouncer (pgbouncer.ini)
[databases]
* = port=5432 auth_user=postgres
[pgbouncer]
logfile = /pglog/pgbouncer/pgbouncer.log
pidfile = /pghome/pgbouncer/pgbouncer.pid
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
auth_query = SELECT usename, passwd FROM pg_shadow WHERE usename=$1
admin_users = postgres
stats_users = postgres
pool_mode = session
max_client_conn = 100
default_pool_size = 25
reserve_pool_size = 5
Here, we are using pool_mode = session, but this setting should be chosen based on how your application behaves. The other settings also need to be tuned for your system configuration and workload. Make sure that your userlist.txt is up to date. If you need more insights on PgBouncer, click here.
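The userlist.txt referenced above maps role names to password hashes. Since auth_query is configured with auth_user=postgres, the postgres entry should at least be present so that PgBouncer can log in and look up other roles. Below is a minimal sketch; the hash value is a placeholder, and the psql one-liner is just one possible way to generate entries from pg_shadow.
#Sample /etc/pgbouncer/userlist.txt (hash is a placeholder)
"postgres" "md5<md5 hash of password+username>"
#One possible way to generate entries from pg_shadow on the local instance
psql -h 127.0.0.1 -p 5432 -U postgres -Atc "SELECT '\"' || usename || '\" \"' || passwd || '\"' FROM pg_shadow" > /etc/pgbouncer/userlist.txt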
Once the configuration is done, start the PgBouncer services.
systemctl start pgbouncer
systemctl enable pgbouncer
systemctl status pgbouncer
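A quick way to confirm PgBouncer is up is to connect to its built-in admin console (the virtual pgbouncer database on port 6432) and inspect the pools. The commands below assume you are running them locally on one of the database servers.
#Connect to the PgBouncer admin console and check pool and client status
psql -h 127.0.0.1 -p 6432 -U postgres pgbouncer -c "SHOW POOLS;"
psql -h 127.0.0.1 -p 6432 -U postgres pgbouncer -c "SHOW CLIENTS;"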
Configuration of HAProxy
Here is the configuration of HAProxy from our setup. Please adjust the settings based on your infrastructure and workload, and apply the HAProxy settings on the remaining 2 nodes.
#Configuration file for HAProxy (haproxy.cfg)
global
    maxconn 500
    user haproxy
    group haproxy
    daemon

defaults
    mode tcp
    log global
    option tcplog
    retries 3
    timeout queue 1m
    timeout connect 4s
    timeout client 60m
    timeout server 60m
    timeout check 5s
    maxconn 500

listen stats
    mode http
    bind *:7000
    stats enable
    stats uri /

listen primary
    bind *:5000
    option httpchk OPTIONS /master
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server server1 192.168.80.160:6432 maxconn 100 check port 8008
    server server2 192.168.80.161:6432 maxconn 100 check port 8008
    server server3 192.168.80.162:6432 maxconn 100 check port 8008

listen standby
    bind *:5001
    option httpchk OPTIONS /replica
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server server1 192.168.80.160:6432 maxconn 100 check port 8008
    server server2 192.168.80.161:6432 maxconn 100 check port 8008
    server server3 192.168.80.162:6432 maxconn 100 check port 8008
Here, we’ve configured HAProxy to handle load balancing by directing write operations (port 5000) to the primary node and read operations (port 5001) to the standby nodes. Once the configuration is done, start the HAProxy service.
systemctl start haproxy
systemctl enable haproxy
systemctl status haproxy
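HAProxy decides where to route connections based on the HTTP status returned by Patroni's REST API on port 8008: /master returns 200 only on the current leader, and /replica returns 200 only on healthy replicas. You can verify the checks yourself with curl against the node IPs from our setup, and the stats page on port 7000 shows which backends HAProxy currently considers up.
#Expect 200 from the current leader and 503 from the replicas
curl -s -o /dev/null -w "%{http_code}\n" http://192.168.80.160:8008/master
#Expect 200 from replicas and 503 from the leader
curl -s -o /dev/null -w "%{http_code}\n" http://192.168.80.160:8008/replica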
Now, HAProxy will be handling connections to the PostgreSQL cluster, load balancing between the nodes. But if we require our application to remain available even in the event of a complete data center failure, we can configure Keepalived on top of HAProxy to provide high availability across multiple data centers. This ensures that if one data center goes down, traffic will automatically fail over to the other data center, maintaining uninterrupted access to the application. Let me know if there's a need for a dedicated blog on Keepalived.
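For reference, here is a minimal sketch of what a Keepalived configuration on the first HAProxy node could look like. The interface name, virtual IP, virtual_router_id, and priority are example values you would adapt to your network; the second HAProxy node would use state BACKUP and a lower priority.
#Sketch of /etc/keepalived/keepalived.conf on the first HAProxy node (example values)
vrrp_script chk_haproxy {
    script "pidof haproxy"
    interval 2
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 101
    advert_int 1
    virtual_ipaddress {
        192.168.80.200/24
    }
    track_script {
        chk_haproxy
    }
}
With this in place, applications connect to the virtual IP instead of an individual HAProxy node, and the VIP floats to the surviving node if the active one fails.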
Do a quick test by executing some insert and select queries, and verify that they are being routed as per the configuration.
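For example, you can send writes through the primary listener on port 5000 and reads through the standby listener on port 5001. Replace <haproxy-ip> with the address of one of your HAProxy nodes; the table name is just an example.
#Write through the primary listener (routed to the current leader)
psql -h <haproxy-ip> -p 5000 -U postgres -c "CREATE TABLE ha_test (id int); INSERT INTO ha_test VALUES (1);"
#Read through the standby listener (routed to a replica)
psql -h <haproxy-ip> -p 5001 -U postgres -c "SELECT * FROM ha_test;"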
Conclusion
In this blog post, we’ve successfully integrated HAProxy and PgBouncer into our Patroni-managed PostgreSQL cluster to enhance high availability, load balancing, and connection pooling. By adding these two powerful tools to our PostgreSQL setup, we’ve created a more robust, scalable, and fault-tolerant system capable of handling high traffic and automatic failover scenarios with ease.
As we wrap up this series on PostgreSQL HA with Patroni, remember that setting up a highly available PostgreSQL cluster is a multi-faceted process involving careful configuration of various components. With our expertise, we ensure that your PostgreSQL environment is not only highly available but also optimized for peak performance and reliability. Get in touch, and let us handle the complexities of database architecture so you can focus on driving your business forward with confidence and peace of mind. Happy clustering!