How Patroni Brings High Availability to PostgreSQL

PostgreSQL gives you excellent streaming replication. You can have a primary node streaming WAL (Write-Ahead Log) to one or more standbys, and those standbys will stay remarkably close to the primary in terms of data.

But there’s a catch: if your primary goes down, nothing happens automatically. Your standby just sits there, faithfully waiting for WAL that will never come, while your application throws connection errors into the void. You need a human (or something acting like one) to promote the standby, update DNS or connection strings, and restore write availability.

This is where HA tooling comes in — and it’s exactly the problem Patroni solves.

What Patroni Actually Is

Patroni is a Python-based HA template for PostgreSQL. It wraps your Postgres instances and handles the full lifecycle: startup, replication configuration, health checking, and — crucially — automatic failover.

But Patroni doesn’t try to be clever about consensus. It leans on a proven distributed configuration store (DCS) like etcd, Consul, or ZooKeeper to handle the hard parts of leader election. This is an important architectural choice. Distributed consensus is notoriously difficult to get right, and by delegating that responsibility to a battle-tested system, Patroni sidesteps an enormous class of bugs.

The Architecture

The architecture was covered in detail in the previous post in this series, but a quick recap is useful here.

Each Patroni agent runs on the same host as a Postgres instance and communicates with the DCS to:

  • Elect a leader — the node that holds the DCS lock is the primary
  • Maintain a heartbeat — the leader must renew its lock within a configurable TTL; if it can’t, it’s considered failed
  • Trigger failover — when the lock expires, eligible standbys race to acquire it; the winner promotes itself
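Conceptually, the leader lock is just a key in the DCS with a TTL, acquired and renewed via atomic compare-and-swap. The sketch below illustrates the mechanism with an in-memory stand-in for etcd — the class, key names, and timings are purely illustrative, not Patroni’s actual implementation:

```python
import time

class FakeDCS:
    """Stand-in for etcd/Consul: a single leader key with a TTL."""
    def __init__(self):
        self.leader = None      # current lock holder
        self.expires_at = 0.0   # lock expiry timestamp

    def acquire(self, node, ttl):
        """Atomic compare-and-swap: succeeds only if the lock is free or expired."""
        now = time.monotonic()
        if self.leader is None or now >= self.expires_at:
            self.leader, self.expires_at = node, now + ttl
            return True
        return False

    def renew(self, node, ttl):
        """Heartbeat: only the current holder may extend the TTL."""
        if self.leader == node and time.monotonic() < self.expires_at:
            self.expires_at = time.monotonic() + ttl
            return True
        return False

dcs = FakeDCS()
assert dcs.acquire("node1", ttl=0.1)       # node1 takes the lock: it is primary
assert not dcs.acquire("node2", ttl=0.1)   # lock is held, so the CAS fails
time.sleep(0.15)                           # node1 stops renewing; the TTL lapses
assert dcs.acquire("node2", ttl=0.1)       # node2 wins the race and promotes itself
```

Because acquisition is a single atomic operation on the DCS, two nodes can never both hold the lock — which is exactly why split-brain is structurally prevented.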

The elegance here is that split-brain — the nightmare scenario where two nodes both believe they’re the primary — is prevented by the DCS. The lock is atomic. There can only be one holder.

Failover in Practice

When a primary fails, here’s what happens:

  • The primary’s Patroni agent stops renewing its lock (because the node is dead, or Patroni itself crashed).
  • The lock TTL expires — configurable, typically 15–30 seconds.
  • Patroni excludes any standbys lagging beyond maximum_lag_on_failover. Among the remaining eligible nodes, whichever atomically acquires the DCS lock first (via compare-and-swap) becomes the new primary. This is not a race based on WAL position — it’s a single atomic DCS operation. Without synchronous replication, the winner may not have every committed transaction, meaning data loss is possible in async mode.
  • The winner runs pg_ctl promote, rewrites its recovery configuration, registers itself as leader in the DCS, and begins accepting writes.
  • Other standbys detect the new leader and attempt to reattach. If their timeline has diverged, they may need pg_rewind to reconcile — or a full re-clone if rewind fails. Ensure use_pg_rewind: true is set in your Patroni config, or this step may require manual intervention.

The whole process typically completes in under 30 seconds. For most applications, this is well within acceptable bounds — especially with proper connection retry logic on the client side.
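Client-side retry logic can be as simple as a bounded retry loop with exponential backoff around the connection attempt. A minimal sketch — the `flaky_connect` stand-in and the timings are hypothetical, not any particular driver’s API:

```python
import time

def with_retries(fn, attempts=5, base_delay=0.5):
    """Call fn(), retrying with exponential backoff on connection failure.

    During a failover window (typically under 30 seconds), connection
    attempts fail until traffic is routed to the new primary.
    """
    for i in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if i == attempts - 1:
                raise                       # out of attempts: surface the error
            time.sleep(base_delay * (2 ** i))

# Stand-in for a real connection attempt: fails twice, then succeeds,
# mimicking a failover completing between retries.
state = {"calls": 0}
def flaky_connect():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("primary not yet available")
    return "connected"

result = with_retries(flaky_connect, base_delay=0.01)
```

In a real application, `fn` would be your driver’s connect call pointed at the HAProxy endpoint rather than a specific node.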

The Role of HAProxy in a Patroni Cluster

One thing Patroni doesn’t handle is connection routing. Your application still needs a way to always point at the current primary, without having to know which node that is at any given moment.

The typical solution is HAProxy in front of the cluster, using Patroni’s built-in REST API. Each Patroni node exposes a health endpoint:

  • GET /master — returns 200 if the node is the primary, 503 otherwise
  • GET /replica — returns 200 if the node is a healthy standby

HAProxy polls these endpoints and routes traffic accordingly. Your application connects to HAProxy, and HAProxy figures out the rest. After a failover, HAProxy’s health checks will detect the new primary within seconds and start routing writes there.
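A minimal HAProxy stanza for routing writes might look like the following — the node addresses, listen port, and the REST API port 8008 are illustrative and should be adapted to your topology:

```
listen postgres_primary
    bind *:5000
    option httpchk GET /master
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 10.0.1.1:5432 maxconn 100 check port 8008
    server node2 10.0.1.2:5432 maxconn 100 check port 8008
    server node3 10.0.1.3:5432 maxconn 100 check port 8008
```

Only the node whose `/master` check returns 200 receives traffic; after a failover, the checks flip within a few polling intervals and writes follow the new primary. A parallel `listen` block checking `/replica` can route read-only traffic the same way.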

For connection pooling, pgBouncer can be layered in as well — either in front of HAProxy, or on each node with HAProxy handling the routing above it.

Configuration Worth Knowing

Patroni is configured via a YAML file. A few settings that matter in production:

  • ttl — How long the DCS lock is held before expiry. Lower values mean faster failover but more sensitivity to network blips. 15–30 seconds is a common range.
  • loop_wait — How frequently Patroni checks cluster state. Should be less than half of ttl.
  • maximum_lag_on_failover — Patroni will refuse to promote a standby that’s too far behind the primary. This reduces the risk of promoting a significantly stale replica, but does not guarantee zero data loss — a replica within the threshold can still be missing recently committed transactions. Only synchronous replication prevents data loss.
  • synchronous_mode — If you need zero data loss, Patroni supports synchronous replication mode. This ensures at least one standby has confirmed receipt of every transaction before it’s committed. The trade-off is write latency. One important caveat: with the default synchronous_mode_strict: false, Patroni can fall back to async if no synchronous standby is available. If you need zero data loss under all conditions including when standbys are unavailable, set synchronous_mode_strict: true, accepting that writes will block rather than fall back.
  • use_pg_rewind — Set this to true. When a failed primary comes back or a standby needs to reattach after timeline divergence, pg_rewind lets Patroni reconcile the node without a full re-clone. Without it, reattachment can require manual intervention.
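Put together, these settings live under the bootstrap DCS section of the Patroni YAML. The values below are illustrative starting points, not recommendations for every workload:

```yaml
bootstrap:
  dcs:
    ttl: 30                           # leader lock TTL, in seconds
    loop_wait: 10                     # seconds between cluster-state checks
    retry_timeout: 10
    maximum_lag_on_failover: 1048576  # bytes; standbys lagging more are excluded
    synchronous_mode: true
    synchronous_mode_strict: false    # true = block writes rather than fall back to async
    postgresql:
      use_pg_rewind: true
```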

What Patroni Doesn’t Do

It’s worth being clear about scope. Patroni:

  • Does not manage your DCS — you need to run etcd/Consul/ZooKeeper separately and ensure it is highly available
  • Does not handle connection routing — that’s HAProxy, pgBouncer, or a similar tool
  • Does not replace backup and recovery — you still need pgBackRest, Barman, or equivalent

HA is one part of a resilient Postgres deployment. Patroni handles that part well. The rest is still your responsibility.

Why Patroni Over the Alternatives

There are other tools in this space — repmgr, Stolon, Pgpool-II, cloud-managed offerings — and each has its place. But Patroni has become the standard for self-managed Postgres HA for a few reasons:
  • It’s actively maintained and widely deployed.
  • It’s transparent. Patroni doesn’t abstract away Postgres — it wraps it. You can still connect directly to your Postgres instances, inspect them, and reason about their state. Nothing is hidden.
  • It delegates consensus correctly. By using etcd or Consul rather than rolling its own, Patroni inherits years of distributed systems work and testing.
  • And it’s operationally simple. The configuration is readable YAML. The REST API gives you clear visibility into cluster state. The patronictl CLI lets you perform controlled switchovers, restarts, and reinitializations with confidence.

Getting Started

If you’re building out Postgres HA and haven’t looked at Patroni yet, the official documentation at patroni.readthedocs.io is genuinely good. The quickstart with etcd is the right place to begin — get a three-node cluster running locally, kill the primary, and watch what happens. It’s a remarkably satisfying thing to see.

Reference links:
https://opensource-db.com/step-by-step-guide-to-postgresql-ha-with-patroni-part-i/
https://opensource-db.com/step-by-step-guide-to-postgresql-ha-with-patroni-part-2/
https://opensource-db.com/step-by-step-guide-to-postgresql-ha-with-patroni-part-3/

See this in action at PGConf India 2026 – Inside PostgreSQL High Availability: Quorum, Split-Brain, and Failover at Scale presented by Venkat Akhil.
