Unlocking Initial Sync for Logical Replication in AWS RDS for PostgreSQL

PostgreSQL’s logical replication is handled through publications and subscriptions. Publications define what data to replicate, while subscriptions specify which server receives the updates.

Initial synchronization ensures the subscriber has all the data. Common methods include:

Snapshot: Copies all data at once. Simple but slow for large datasets.
Base Backup and Continuous Archiving: Makes a base copy and then tracks changes. More suitable for large datasets.

However, when dealing with massive databases in RDS PostgreSQL, such as those in the terabyte range, using pg_basebackup for base backup might not work due to security restrictions on file systems. pg_basebackup requires deep file system access, which is limited in RDS for security reasons. In such cases, replication needs to be set up using the pg_dump utility instead.

Before proceeding, ensure the following configuration parameters are set and restart the cluster:

max_replication_slots: 10
max_wal_senders: 10
track_commit_timestamp: on
wal_level: logical

Next, create a publication and add tables to it in the source database:

ALTER PUBLICATION lsr_sync_01 ADD TABLE sales_data;
ALTER PUBLICATION lsr_sync_01 ADD TABLE customer_info;
ALTER PUBLICATION lsr_sync_01 ADD TABLE transactions;

Create a replication slot:

To create a replication connection to the database and establish a replication slot, follow these steps:

Open a separate terminal session.

Keep the session active, preferably using a screen session.

Run the following command:

[postgres@primary ~]$ psql -h primary -U repmgr "dbname=osdb_lsr replication=database"
psql (10.23, server 14.10)
WARNING: psql major version 10, server major version 14.
         Some psql features might not work.
Type "help" for help.

osdb_lsr=# CREATE_REPLICATION_SLOT lsr_sync_01 LOGICAL pgoutput;
  slot_name  | consistent_point |    snapshot_name    | output_plugin
 lsr_sync_01 | 0/C000110        | 00000003-00000002-1 | pgoutput
(1 row)

Now let's proceed with backing up the database using pg_dump.

pg_dump -h primary -U repmgr -t sales_data -t customer_info -t transactions -Fd -f osdb_lsr_intial_sync -j 2 --no-publications --no-subscriptions --snapshot=00000003-00000002-1 --compress=0 -v osdb_lsr

Restore the backup on the target for LSR:

pg_restore -h standby -U repmgr -Fd -j 2 -v -d osdb_lsr osdb_lsr_intial_sync

Create a subscription:

Create the subscription on the standby server, referring to the primary database, but keep it in a disabled state.

CREATE SUBSCRIPTION lsr_sync_sub_01 CONNECTION 'host=primary port=5432 dbname=osdb_lsr user=repmgr password=*****' PUBLICATION lsr_sync_01
  copy_data = false,
  create_slot = false,
  enabled = false,
  connect = true,
  slot_name = 'lsr_sync_01'

Advance the replication origin:

Retrieve the external ID from the pg_subscription table and use it to advance the replay position to the consistent point captured during replication slot creation.

standby=# SELECT 'pg_'||oid::text AS "external_id" FROM pg_subscription WHERE subname = 'lsr_sync_sub_01';
(1 row)

Take the position of the LSN from the replication slot creation and make it advance to start the replication

standby=# SELECT pg_replication_origin_advance('pg_32119', '0/C000110') ;
(1 row)

Enable the subscription:


Check the replication status:

SELECT * FROM pg_stat_replication;

This series of steps helps to establish logical replication in PostgreSQL, ensuring data synchronization between databases efficiently.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>