Troubleshooting PostgreSQL Replication Setup Errors with repmgr

Introduction

Hey All

Managing a PostgreSQL replication setup involves various challenges, especially when encountering errors during configuration. This blog post explores common errors encountered while setting up replication using repmgr, a popular replication manager for PostgreSQL. We’ll delve into the causes of these errors and provide solutions to overcome them, ensuring a seamless replication setup. We had discussed setting up High Availability in Postgres using repmgr in this post. If you did not get a chance to go through it, now might be a good time for it.

These are the some of the errors/issues I faced in my initial work on repmgr as well as with ec2 instances. They might look very basic, but they reinforce the fact that minute details are important.

Error#1

This is the first brake of my smooth journey with ec2 instances & repmgr.

[postgres@ip-172-31-14-99 ~]$ /usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf primary register
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
INFO: connecting to primary database...
ERROR: connection to database failed
DETAIL:
connection to server at "52.64.241.210", port 5432 failed: timeout expired

DETAIL: attempted to connect using:
  user=repmgr connect_timeout=2 dbname=repmgr host=52.64.241.210 fallback_application_name=repmgr options=-csearch_path=

Solution

We need to edit the inbound rules of the security group and add each and every PostgreSQL EC2 instances to allow traffic to flow between those instances.
Steps to do : Instances –> Security–>Security group–>Edit Inbound rules

Add the PostgreSQL to the security group as mentioned below

Result

[postgres@ip-172-31-14-99 bin]$ /usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf primary register
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
INFO: connecting to primary database...
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
NOTICE: primary node record (ID: 1) registered

#Verifying Primary registeration 
[postgres@ip-172-31-14-99 bin]$ /usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf cluster show
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        | host=52.64.241.210 user=repmgr dbname=repmgr connect_timeout=2

Error#2

This is the basic error , we DBA’s deal with this on daily basis. “No Entries in pg_hba.conf” . I encountered this error while performing the Standby clone dry run operatiion.

[postgres@ip-172-31-7-43 ~]$ /usr/pgsql-14/bin/repmgr -h 52.64.241.210 -U repmgr -d repmgr -f /var/lib/pgsql/repmgr.conf standby clone --dry-run

DETAIL:
connection to server at "52.64.241.210", port 5432 failed: FATAL:  no pg_hba.conf entry for replication connection from host "13.210.66.27", user "repmgr", no encryption

ERROR: connection to database failed
DETAIL:
connection to server at "52.64.241.210", port 5432 failed: FATAL:  no pg_hba.conf entry for replication connection from host "13.210.66.27", user "repmgr", no encryption

Solution

It clearly indicates that we need to make an entry in pg_hba.conf of Primary server.

cd $PGDATA
vi pg_hba.conf
host    replication     repmgr          {Standby1 IP}           md5
host    repmgr          repmgr          {Standby1 IP}           md5

# We have to add details of all of the Standby nodes and the Witness node too, if applicable.

Note : As we will clone the data from the Primary to both of the Standby nodes , there is no need to make these changes in the Standby nodes. But on the Witness mode , we need to enter the IP’s in pg_hba.conf file and change the parameters in postgresql.conf file . (For more details, refer to this post)

Result

[postgres@ip-172-31-7-43 ~]$ /usr/pgsql-14/bin/repmgr -h 52.64.241.210 -U repmgr -d repmgr -f /var/lib/pgsql/repmgr.conf standby clone --dry-run

NOTICE: standby will attach to upstream node 1
HINT: consider using the -c/--fast-checkpoint option
INFO: would execute:
  pg_basebackup -l "repmgr base backup"  -D /var/lib/pgsql/14/data -h 52.64.241.210 -p 5432 -U repmgr -X stream
INFO: all prerequisites for "standby clone" are met

Error#3

I had encountered this error when I was setting up the Witness node. It is possible that even after you have done an entry in pg_hba.conf file, you get the following error:

[postgres@ip-172-31-8-219 ~]$ /usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf witness register -h 52.64.241.210
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
INFO: connecting to witness node "node4" (ID: 4)
ERROR: unable to connect to witness node "node4" (ID: 4)
DETAIL:
connection to server at "ec2-13-211-191-92.ap-southeast-2.compute.amazonaws.com" (172.31.8.219), port 5432 failed: FATAL:  role "repmgr" is not permitted to log in

HINT: the witness node must be running before it can be registered

Solution

Login to psql and check the attributes of the user

[postgres@ip-172-31-8-219 ~]$ psql
psql (14.11)
Type "help" for help.
postgres=# \du+

                                          List of roles
 Role name |                         Attributes                         | Member of | Description
-----------+------------------------------------------------------------+-----------+-------------
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}        |
 repmgr    | Superuser, Cannot login                                    | {}        |

# It shows Role "repmgr" cannot login
# Simply alter the role 
postgres=# ALTER ROLE repmgr LOGIN;
ALTER ROLE
postgres=# \du+
                                          List of roles
 Role name |                         Attributes                         | Member of | Description
-----------+------------------------------------------------------------+-----------+-------------
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}        |
 repmgr    | Superuser                                                  | {}        |
 

Result

After altering the role , retry registering the witness node.

[postgres@ip-172-31-8-219 ~]$ /usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf witness register -h 52.64.241.210
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
INFO: connecting to witness node "node4" (ID: 4)
INFO: connecting to primary node
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
INFO: witness registration complete
NOTICE: witness node "node4" (ID: 4) successfully registered

Conclusion

In PostgreSQL replication setups, encountering errors is not uncommon, but with the right knowledge and troubleshooting steps, they can be resolved effectively. In this post, we have discussed some of the common errors encountered during the configuration of replication using repmgr. By understanding the root cause of these errors and implementing the provided solutions, administrators can ensure a smooth and reliable PostgreSQL replication setup, facilitating data redundancy and high availability for their applications. These might already be known to many of the experienced DBA’s, but can serve as a useful guide to novice PostgreSQL enthusiasts, and DBAs with less experience, and save precious time. If you have encountered any other with issues while building an PostgreSQL HA systems using repmgr, please feel free to comment below, and reach out to us. As always, we’re happy to help.

Thank you and Stay tuned…

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>