Optimizing PostgreSQL Replication: Moving Tables Between Publications and Subscriptions

In PostgreSQL, logical replication allows you to replicate specific sets of tables from a primary (or source) database to a standby (or target) database. Over time, as your database and its usage evolve, certain tables might become more critical or experience different patterns of access. This can necessitate changes in your replication strategy to maintain optimal performance and manageability. As certain tables become more heavily used, they might generate a significant amount of replication traffic. If a publication contains many tables, the replication load might be unevenly distributed across all tables, leading to performance bottlenecks.

This blog will walk you through the process of separating a table from one publication to another and ensuring that the changes are reflected on the subscriber.

Assumption

Let us assume , we use a publication named pub on the primary database and a subscription named sub on the standby database. We have five tables under the publication pub on your primary database, and we want to move one table (table5) to a new publication called pub1. We also need to update the subscriber to reflect these changes.

Step 1 : On the Primary Database

Create a New Publication for the Table 5

First ,we need to create publication including Table 5.

CREATE PUBLICATION pub1 FOR TABLE table5;

Remove the Table from the Old Publication

Next, modify the existing publication pub to remove table5. This ensures that table5 will only be replicated through the new publication pub1.

ALTER PUBLICATION pub DROP TABLE table5;

Verify the New Publication

Check that the new publication pub1 has been created and contains the correct table.

SELECT * FROM pg_publication WHERE pubname = 'pub1';

Verify the Old Publication

Confirm that table5 has been removed from the old publication pub.

SELECT * FROM pg_publication WHERE pubname = 'pub';

Step 2 : On the Standby Database

Refresh the Subscription

Refresh the subscription sub to ensure it is aware of the changes in the publication. This step will synchronize the subscriber with the current state of the publication on the primary database.

ALTER SUBSCRIPTION sub REFRESH PUBLICATION;

Create a New Subscription for the New Publication

If you want to ensure that table5 is replicated through the new publication pub1, create a new subscription sub1 on the subscriber database. Note that we set copy_data=false to avoid copying existing data, assuming the table is already populated on the subscriber.

CREATE SUBSCRIPTION sub1
CONNECTION 'dbname=mydb host=primary_host user=myuser password=mypassword'
PUBLICATION pub1 WITH (copy_data=false);

Verify the New Subscription Status

Check the status of the new subscription sub1 to ensure it is active and properly connected to the new publication.

SELECT * FROM pg_stat_subscription WHERE subname = 'sub1';

Step3 : Check the data consistency

To ensure that the replication is working correctly, you may want to check the row count on the subscriber and verify that it matches the primary database. Additionally, insert a test row into table5 on the primary database and check if it appears on the subscriber.

-- Check row count on the subscriber
SELECT COUNT(*) FROM table5;

-- On the primary database, insert a test row into table5
INSERT INTO table5 (column1, column2) VALUES ('test', 123);

-- Verify the row count on the subscriber again
SELECT COUNT(*) FROM table5;

If the record count on both the primary and standby databases match, it indicates the setup is working as expected. Otherwise, check the PostgreSQL server logs for errors, and resolve them as appropriate.

Use-Cases

Here are some specific use-cases

  1. Handling Increased Load
  2. Optimizing High-Frequency Replication
  3. Customizing Data Distribution
  4. Ensuring Data Security and Compliance

Conclusion

In Realtime, as data volumes grow and access patterns change, a single publication handling multiple tables can become overloaded, leading to slower replication and increased latency. By isolating high-traffic or frequently updated tables into separate publications, we can optimize replication performance, reduce unnecessary data transfer, and better manage resources. Additionally, separating sensitive data into distinct publications allows for enhanced security and compliance, ensuring that stringent controls are applied without affecting other data.

This blog has detailed the process of managing PostgreSQL logical replication by migrating tables between publications, allowing you to optimize replication performance and data management. By creating new publications for tables that require different replication strategies and updating existing ones, you can better distribute the replication load, customize settings for each table, and control data distribution to specific subscribers. Always remember to test these changes in a staging environment before applying them to production to prevent disruptions, and ensure to monitor the replication setup post-migration to verify data consistency and performance. Stay tuned for more info

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>