Introduction:
Embarking on my first customer engagement to set up PostgreSQL Streaming Replication was an insightful journey. As a Database Engineer, this was my first hands-on experience with implementing streaming replication in a real-world environment. The stakes were high, as the customer required a robust and reliable high-availability solution to ensure their critical applications remained operational without interruption.
From the initial planning stages to the final deployment, every step of the process brought new challenges and learning opportunities. The customer’s environment was complex, with multiple databases running on different nodes, each requiring seamless synchronization to maintain data integrity and consistency. My goal was to create a replication setup that not only met their high availability needs but also provided the scalability and performance necessary to support their growing business.
In this blog post, I’ll share my experience setting up PostgreSQL Streaming Replication. Whether you’re a seasoned DBA or new to the world of database replication, I hope my journey will provide valuable insights and practical tips for successfully implementing streaming replication in your own PostgreSQL environment.
The streaming replication of a standby database server is configured to connect to the primary server, which streams WAL (Write-Ahead Logging) records to the standby as they are generated, without waiting for the WAL file to be filled. By default, streaming replication is asynchronous, where data is written to the standby server after a transaction has been committed on the primary server. This means that there is a small delay between committing a transaction in the master server and the changes becoming visible in the standby server.
Step 1 :Configure Primary Server
Open postgresql.conf file on the primary server and make this changes:
listen_addresses = '*'
wal_level = replica
max_wal_senders = 5
hot_standby=on
Step 2: Enable Replication User
Open pg_hba.conf
file and add an entry to allow replication connections from the standby server:
host replication replication_user standby_ip/32 md5
#Restart PostgreSQL to apply the changes.
Step 3: Create database
postgres=# CREATE DATABASE mydb;CREATE DATABASE
#Switch to database :
\c mydb
Step 4: Create the user
Create a user in master using whichever slave should connect for streaming the WALs. This user must have replication role
mydb=# CREATE ROLE replica WITH REPLICATION ENCRYPTED PASSWORD 'postgres';
CREATE ROLE
Step 5: Taking pg_basebackup
pg_basebackup
helps us to stream the data through the wal sender process from the master to a slave to set up replication.
[postgres@master ~]$ pg_basebackup -h 10.206.0.2 \
-p 5433 -D /var/lib/pgsql/14/data/ \
-U replica -P -v -R -X stream \
-C -S book1
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created replication slot "book1"
34874/34874 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/2000100
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed
Step 6: Check for Standby Signal
Notice that standby.signal
is created and the connection settings are appended to postgresql.auto.conf.
[postgres@master ~]$ cd /var/lib/pgsql/14/data
# Check for postgresql.conf and standby.signal file in the data directory.
[postgres@master data]$ ls -ltr
-rw-------. 1 postgres postgres 225 May 29 02:41 backup_label
-rw-------. 1 postgres postgres 180563 May 29 02:42 backup_manifest
drwx------. 6 postgres postgres 54 May 29 02:42 base
-rw-------. 1 postgres postgres 30 May 29 02:42 current_logfiles
drwx------. 2 postgres postgres 4096 May 29 02:42 global
drwx------. 2 postgres postgres 32 May 29 02:42 log
drwx------. 2 postgres postgres 6 May 29 02:42 pg_commit_ts
drwx------. 2 postgres postgres 6 May 29 02:42 pg_dynshmem
-rw-------. 1 postgres postgres 4929 May 29 02:42 pg_hba.conf
-rw-------. 1 postgres postgres 1636 May 29 02:42 pg_ident.conf
drwx------. 4 postgres postgres 68 May 29 02:42 pg_logical
drwx------. 4 postgres postgres 36 May 29 02:42 pg_multixact
drwx------. 2 postgres postgres 6 May 29 02:42 pg_notify
drwx------. 2 postgres postgres 6 May 29 02:42 pg_replslot
drwx------. 2 postgres postgres 6 May 29 02:42 pg_serial
drwx------. 2 postgres postgres 6 May 29 02:42 pg_snapshots
drwx------. 2 postgres postgres 6 May 29 02:42 pg_stat
drwx------. 2 postgres postgres 6 May 29 02:42 pg_stat_tmp
drwx------. 2 postgres postgres 6 May 29 02:42 pg_subtrans
drwx------. 2 postgres postgres 6 May 29 02:42 pg_tblspc
drwx------. 2 postgres postgres 6 May 29 02:42 pg_twophase
-rw-------. 1 postgres postgres 3 May 29 02:42 PG_VERSION
drwx------. 3 postgres postgres 60 May 29 02:42 pg_wal
drwx------. 2 postgres postgres 18 May 29 02:42 pg_xact
-rw-------. 1 postgres postgres 372 May 29 02:42 postgresql.auto.conf
-rw-------. 1 postgres postgres 28724 May 29 02:42 postgresql.conf
-rw-------. 1 postgres postgres 0 May 29 02:42 standby.signal
Step 7: Check the replication slot
Now connect the master server, you should be able to see the replication slot called book1 when you open the pg_replication_slots
view as follows.
mydb=# select * from pg_replication_slots ;
-[ RECORD 1 ]-------+----------
slot_name | book1
plugin |
slot_type | physical
datoid |
database |
temporary | f
active | t
active_pid | 3506
xmin |
catalog_xmin |
restart_lsn | 0/3018E38
confirmed_flush_lsn |
wal_status | reserved
safe_wal_size |
two_phase | f
#Now start PostgreSQL on a slave(standby) server.
Step 8: Check the status on Standby
We can check the status on standby using the below command.
mydb=# select * from pg_stat_wal_receiver ;
-[ RECORD 1 ]---------+------------------------------------------------------
pid | 3487
status | streaming
receive_start_lsn | 0/3000000
receive_start_tli | 1
written_lsn | 0/3018E38
flushed_lsn | 0/3018E38
received_tli | 1
last_msg_send_time | 2024-05-29 07:15:34.249873+00
last_msg_receipt_time | 2024-05-29 07:15:34.376013+00
latest_end_lsn | 0/3018E38
latest_end_time | 2024-05-29 03:32:03.18865+00
slot_name | book1
sender_host | 10.206.0.2
sender_port | 5433
conninfo | user=replica passfile=/var/lib/pgsql/.pgpass channel_binding=prefer dbname=replication host=10.206.0.2 port=5433 fallback_application_name=walreceiver sslmode=prefer sslcompression=0 sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres target_session_attrs=any
Conclusion:
Setting up PostgreSQL Streaming Replication for the first time was both a challenging and rewarding experience. This journey taught me the importance of meticulous planning, a thorough understanding of the underlying technology, and the ability to troubleshoot issues as they arise. Here are some key takeaways from my experience:
- Understanding the Basics: Gaining a solid grasp of how streaming replication works is crucial. Knowing the differences between synchronous and asynchronous replication, and when to use each, can make a significant impact on your setup’s reliability and performance.
- Proper Configuration: The configuration of the primary and standby servers is the backbone of a successful replication setup. Ensuring that parameters like
wal_level
,max_wal_senders
, andarchive_mode
are correctly set is vital. - Monitoring and Maintenance: Implementing monitoring tools to track replication status and performance is critical. Regular maintenance, such as vacuuming and analyzing databases, helps keep the replication environment healthy.
- Testing Failover: Regularly testing failover procedures ensures that the replication setup is reliable and that you can quickly recover in case of a primary server failure. This practice builds confidence in the system’s resilience. I’ll be covering this topic in the next blog post.
My first foray into PostgreSQL Streaming Replication was a significant learning curve, but it ultimately reinforced my skills and confidence in managing high-availability database systems. I hope this blog post has provided you with a comprehensive overview and practical insights into setting up streaming replication. Whether you are a novice or a seasoned DBA, these lessons can guide you through your own replication projects, ensuring a more resilient and reliable PostgreSQL environment.
Happy replicating!
Leave a Reply