PostgreSQL logical replication is a powerful feature: elegant in concept, battle-hardened in design. But push it to replicating terabytes of data per month across hundreds of customer deployments, and the elegance starts to show cracks. This is the story of what happens far beyond the happy path.
Why Does Scale Break Logical Replication?
PostgreSQL logical replication works by decoding the write-ahead log (WAL) into a logical stream of row-level changes (INSERTs, UPDATEs, DELETEs) and shipping those changes to subscribers. Unlike physical replication, which clones raw disk blocks, logical replication is schema-aware, human-readable, and flexible.
For Change Data Capture (CDC) use cases, this is a perfect fit. We can stream changes from Postgres into data warehouses, analytics engines, or event pipelines without touching the application layer.
The problem? Every architectural decision that makes logical replication work also becomes a pressure point at scale.
Challenge 1: WAL Accumulation and Replication Slot Lag
The most dangerous failure mode in large-scale logical replication is WAL accumulation.
When a replication slot is created in Postgres, the server holds onto WAL segments until the subscriber confirms it has consumed them. If our subscriber falls behind (a network hiccup, a slow downstream consumer, a schema migration that runs long), Postgres keeps accumulating WAL. On a write-heavy database doing tens of terabytes monthly, this can fill our disk within hours.
The core tension: Postgres has no built-in backpressure mechanism for replication slots. We either consume the WAL fast enough, or we face two bad choices — drop the slot (losing our replication position) or crash the source database with a full disk.
Monitoring pg_replication_slots for confirmed_flush_lsn becomes critical operational infrastructure. Setting max_slot_wal_keep_size (introduced in Postgres 13) provides a safety valve — but tuning it requires careful judgment: too small and we risk invalidating slots; too large and we risk the disk-full scenario anyway.
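A monitoring query in this spirit, sketched below, surfaces the retained WAL per slot and (on Postgres 13+) how close each slot is to invalidation. The 200GB cap is illustrative, not a recommendation; size it from your disk headroom and peak WAL generation rate.

```sql
-- Retained WAL per slot, worst offenders first.
-- wal_status / safe_wal_size require Postgres 13+ (max_slot_wal_keep_size).
SELECT
    slot_name,
    active,
    wal_status,  -- 'reserved', 'extended', 'unreserved', or 'lost'
    pg_size_pretty(
        pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
    ) AS retained_wal,
    pg_size_pretty(safe_wal_size) AS headroom_before_invalidation
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;

-- The safety valve itself: cap how much WAL a lagging slot may retain.
ALTER SYSTEM SET max_slot_wal_keep_size = '200GB';
SELECT pg_reload_conf();
```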
Challenge 2: The Initial Snapshot Problem
Before logical replication can stream ongoing changes, it needs to establish a baseline: an initial snapshot of every table. For small databases, this is a footnote. For a customer with terabytes of data, it is a multi-day operation, and the same cost applies when a single multi-terabyte table is added to an existing logical replica.
The challenge is keeping this snapshot consistent. Postgres takes the snapshot in a long-running REPEATABLE READ transaction, which holds back vacuum on the source and bloats tables for the duration of the copy. Meanwhile, changes that arrive after the snapshot LSN must be retained — in WAL on the source, or buffered by the consumer — until the snapshot completes.
At scale, this creates a dangerous feedback loop: the longer the initial copy takes, the more WAL must be retained, the more disk pressure builds. Large tables with high write rates can make the initial snapshot window nearly impossible to close cleanly.
This process needs a deliberate strategy: sequencing large tables, parallelizing the copy where possible, and bounding WAL retention during the window.
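One mitigation is to stage the initial load rather than snapshotting everything at once. A sketch using built-in publication/subscription DDL (table, host, and publication names here are hypothetical):

```sql
-- On the source: start with a publication covering only the smaller tables.
CREATE PUBLICATION cdc_pub FOR TABLE small_table_a, small_table_b;

-- On the subscriber: begin streaming (and copying) the small tables first.
CREATE SUBSCRIPTION cdc_sub
    CONNECTION 'host=source dbname=app'
    PUBLICATION cdc_pub;

-- Later, during a quiet window, add one large table at a time.
-- On the source:
ALTER PUBLICATION cdc_pub ADD TABLE big_events;

-- On the subscriber: pick up the change; copy_data = true snapshots
-- only the newly added table, keeping each snapshot window small.
ALTER SUBSCRIPTION cdc_sub REFRESH PUBLICATION WITH (copy_data = true);
```

Sequencing tables this way keeps each copy window, and therefore the WAL that must be retained behind it, as small as possible.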
Challenge 3: Schema Changes Are Not Our Friend
Logical replication has a well-known Achilles’ heel: DDL is not replicated. Table renames, column additions, type changes — none of these are streamed through the logical replication protocol. The subscriber must apply schema changes independently, and if it falls even slightly out of sync with the source, the replication stream breaks.
Across hundreds of tenants, schema drift becomes a statistical certainty. A developer adds a NOT NULL column without a default. A migration renames a table mid-replication. A column type is widened from INTEGER to BIGINT. Each of these is an invisible landmine.
Handling this requires building schema-change detection on top of Postgres’s pg_catalog tables to spot when table or column definitions change, plus graceful degradation logic that can pause, re-snapshot, and resume replication around DDL events — without losing data or duplicating rows.
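A minimal detection approach is to fingerprint each replicated table's column layout from pg_catalog and poll for changes; a changed hash means DDL happened and that table needs special handling. This is a sketch (schema name and polling strategy are assumptions, not the author's implementation):

```sql
-- Fingerprint each ordinary table's columns: name, type, and nullability,
-- in attribute order. Dropped columns are excluded. Compare these hashes
-- across polls (or between source and subscriber) to detect drift.
SELECT
    n.nspname AS schema_name,
    c.relname AS table_name,
    md5(string_agg(
        a.attname || ':' || format_type(a.atttypid, a.atttypmod)
                  || ':' || a.attnotnull::text,
        ',' ORDER BY a.attnum
    )) AS schema_fingerprint
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_attribute a ON a.attrelid = c.oid
WHERE n.nspname = 'public'   -- assumed schema; widen as needed
  AND c.relkind = 'r'
  AND a.attnum > 0
  AND NOT a.attisdropped
GROUP BY n.nspname, c.relname;
```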
Challenge 4: The TOAST Storage Trap
Postgres uses TOAST (The Oversized Attribute Storage Technique) to store large column values — text blobs, JSONB documents, large arrays — out-of-line in a separate storage table. When a row is updated, unchanged TOAST values may not be included in the WAL record unless the table’s REPLICA IDENTITY is set to FULL.
The default REPLICA IDENTITY DEFAULT only includes primary key columns in UPDATE and DELETE records. This means our replication consumer may receive an UPDATE in which an untouched JSONB field is marked as unchanged and omitted (often surfaced downstream as NULL): not because the value is null, but because Postgres didn’t include the unchanged TOAST datum in the WAL.
The fix — ALTER TABLE ... REPLICA IDENTITY FULL — solves the data completeness problem but dramatically increases WAL volume. For tables with large JSONB or text columns, this can increase WAL generation by 5–10x. At TBs/month of replication, this tradeoff deserves serious measurement before we flip the switch.
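Before flipping anything, it helps to audit where the default identity is still in effect and to measure the WAL cost on one table at a time. A sketch (the `events` table is hypothetical):

```sql
-- Audit replica identity across user tables.
-- relreplident: 'd' = default (primary key), 'f' = full row,
--               'n' = nothing, 'i' = specific index.
SELECT n.nspname AS schema_name, c.relname AS table_name, c.relreplident
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema');

-- Flip one candidate table, then run a representative workload and
-- compare pg_current_wal_lsn() sampled before and after (diff the two
-- samples with pg_wal_lsn_diff) to quantify the 5-10x estimate on
-- your own data before rolling out fleet-wide.
ALTER TABLE events REPLICA IDENTITY FULL;
```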
Challenge 5: Observability is an Afterthought
Postgres exposes the basics through pg_stat_replication, pg_replication_slots, and pg_stat_activity — but these views tell us what is happening, not why, and they provide almost no signal about downstream consumer health.
At scale, we need end-to-end visibility: WAL lag per slot, bytes consumed per second, time-to-catchup estimates, DDL event detection, slot invalidation alerts, and per-table replication throughput. None of this exists out of the box.
Building production-grade observability means instrumenting both the Postgres side (via custom monitoring queries on pg_replication_slots, pg_stat_replication, and WAL LSN arithmetic) and the consumer side (tracking throughput, error rates, and checkpoint progress independently). Correlating the two gives us the early warning system we need before a lagging slot becomes a disk-full emergency.
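The Postgres side of that correlation can start from the built-in views; a sketch of the LSN arithmetic (the consumer-side half still has to live in the pipeline's own metrics):

```sql
-- Source-side view: byte lag and time lag per logical slot and walsender.
SELECT
    s.slot_name,
    r.application_name,
    pg_size_pretty(
        pg_wal_lsn_diff(pg_current_wal_lsn(), r.sent_lsn)
    ) AS send_lag,
    pg_size_pretty(
        pg_wal_lsn_diff(pg_current_wal_lsn(), s.confirmed_flush_lsn)
    ) AS confirmed_lag,
    r.write_lag,   -- time-based lag columns, Postgres 10+
    r.replay_lag
FROM pg_replication_slots s
LEFT JOIN pg_stat_replication r ON r.pid = s.active_pid
WHERE s.slot_type = 'logical';
```

The LEFT JOIN matters: a slot with no matching walsender row is an inactive slot, which is exactly the case that silently accumulates WAL.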
The Bigger Picture
Postgres logical replication is genuinely impressive engineering. The pgoutput plugin is stable, the protocol is well-documented, and the foundation is solid. But “solid foundation” and “production-ready at TBs/month” are different things.
The gap is filled by hard operational experience: knowing when to drop and recreate a slot versus waiting for catchup, understanding how TOAST and REPLICA IDENTITY interact, building tooling that can handle the long tail of schema change edge cases, and instrumenting the system deeply enough to act before problems become outages.
The teams operating at this scale aren’t doing anything magic. They’ve just hit every failure mode Postgres logical replication has to offer — and built the scaffolding to survive it.
See this in action at PGConf India 2026 – Operating Postgres Logical Replication at Massive Scale, presented by Sai Srirampur.
