
Essential PostgreSQL Maintenance Activities for Optimal Performance

Maintaining the performance and health of a PostgreSQL database is essential to ensure smooth operations and reliable data management. Regular maintenance activities help optimize performance, prevent issues, and keep your database running efficiently. In this blog post, we’ll explore the essential PostgreSQL maintenance activities that one should incorporate into their routine to achieve optimal performance.

Why Database Maintenance Matters

Database maintenance is essential for ensuring the optimal performance, reliability, and security of your database systems. Regular maintenance activities, such as vacuuming, analyzing, and backing up, help optimize query performance, reclaim disk space, and prevent data corruption. By proactively managing resources and monitoring system health, maintenance reduces the risk of downtime and performance issues, supports scalability, and ensures compliance with security standards. Investing in routine maintenance not only enhances database efficiency but also protects against potential data loss and security vulnerabilities, ultimately safeguarding your business operations and data integrity.

The following sections highlight some of the most important activities.

Vacuum

PostgreSQL’s VACUUM command has to process each table on a regular basis for several reasons:

  1. To recover or reuse disk space occupied by updated or deleted rows.
  2. To update data statistics used by the PostgreSQL query planner.
  3. To update the visibility map, which speeds up index-only scans.
  4. To protect against loss of very old data due to transaction ID wraparound or multixact ID wraparound.

Source: https://www.postgresql.org/docs/16/routine-vacuuming.html

Different types of Vacuum

  • VACUUM : Basic operation to reclaim storage and update statistics.
    VACUUM table_name;
  • VACUUM FULL: More intensive operation that rewrites the entire table, reclaiming more space but requiring more resources and locking the table.
    VACUUM FULL table_name;

Automatic VACUUM

  • Autovacuum: PostgreSQL includes an autovacuum process that automatically runs vacuum and analyze operations.
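
Autovacuum is enabled by default and can be tuned globally or per table. A minimal sketch (the table name orders and the values are purely illustrative) could be:

-- Check whether autovacuum is enabled
SHOW autovacuum;

-- Make autovacuum run more aggressively on a hypothetical busy table
ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.05,
                        autovacuum_analyze_scale_factor = 0.02);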

Analyze

ANALYZE collects statistics about the contents of tables in the database, which helps the query planner make better, more efficient decisions.

ANALYZE table_name;

The autovacuum process also performs ANALYZE automatically.

Reindex

REINDEX rebuilds indexes to optimize their performance and reduce bloat or fragmentation. It also helps resolve issues such as index corruption, ensuring data integrity.

REINDEX INDEX index_name;
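
REINDEX takes a lock on the table while it rebuilds the index; on PostgreSQL 12 and later you can reduce the disruption with the CONCURRENTLY option, for example:

-- Rebuild a single index with minimal blocking (PostgreSQL 12+)
REINDEX INDEX CONCURRENTLY index_name;

-- Rebuild all indexes of a table
REINDEX TABLE CONCURRENTLY table_name;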

Backup and Recovery

Having a backup and recovery strategy for databases is very important. Regular backups protect against data loss due to hardware failures, corruption, or human errors and also ensure you can recover data in the event of a system failure or disaster.

There is a plethora of utilities and tools available in the Postgres ecosystem to manage database backups. Here is our detailed blog on Postgres backups.
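
As a simple illustration, a logical backup and restore with the bundled pg_dump and pg_restore utilities could look like this (the database and file names are only examples):

#Logical backup of a database in custom format
pg_dump -Fc -f mydb.dump mydb

#Restore it into another (already created) database
pg_restore -d mydb_restored mydb.dump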

Monitoring and Analysis

Effective database maintenance requires comprehensive monitoring of various critical aspects. Performance metrics such as query performance, index usage, and response time are essential to ensure efficient operation. Resource utilization should be tracked, including CPU and memory usage, disk I/O, and available disk space, to prevent bottlenecks and ensure optimal performance. System health monitoring involves checking log files for errors, verifying backup status, and managing disk space. Security is paramount, and can be well maintained with regular reviews of user access, authentication logs, and patch management. Configuration settings need periodic review and adjustment, while maintenance tasks like vacuuming and reindexing are crucial for performance. Additionally, monitoring replication lag, failover mechanisms, and application activity ensures data consistency and high availability. Finally, capacity planning by analyzing growth trends helps in anticipating future needs and avoiding performance issues.

Tools/Extensions

There are a lot of tools and extensions available for monitoring and analysis. Here are some of them:

  • pg_stat_statements is used to track query performance and identify slow-running queries (see the example query after this list).
  • pgBadger is used to analyze PostgreSQL log files and generate detailed performance reports.
  • pgAdmin provides graphical monitoring of performance and health metrics, and can also perform maintenance tasks like vacuuming and reindexing.
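
For example, once pg_stat_statements is installed, a quick look at the most expensive statements might be (the total_exec_time and mean_exec_time columns apply to PostgreSQL 13 and later; older versions use total_time and mean_time):

-- Top 5 statements by total execution time
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;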

Update

Regular maintenance and updates are crucial for ensuring the security and stability of your PostgreSQL database. Applying updates and patches addresses security vulnerabilities, enhances stability, and resolves bugs to improve overall functionality.

Start by regularly checking for updates for PostgreSQL and its extensions to stay current with the latest releases. Before applying updates to your production environment, thoroughly test them in a staging environment to ensure compatibility and minimize disruptions. Finally, follow PostgreSQL’s prescribed upgrade process to apply updates and patches methodically, maintaining a robust and secure database system.

Configuration review

Reviewing and adjusting the database configuration settings to match the workload of your environment is important. However, manually reviewing and adjusting parameters like shared_buffers, work_mem, maintenance_work_mem and effective_cache_size becomes complex, especially when your environment has a varying workload.
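
As a starting point, you can inspect the current values and adjust them; a small sketch (the value shown is illustrative, not a recommendation) might be:

-- Inspect current settings
SHOW shared_buffers;
SHOW work_mem;

-- Example adjustment; work_mem only needs a configuration reload
ALTER SYSTEM SET work_mem = '64MB';
SELECT pg_reload_conf();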

DBtune

Leveraging machine learning to understand specific workload requirements, DBtune is an AI-enabled tuning engine that automates and enhances the parameter tuning process using cutting-edge technologies. This advanced methodology allows for precise tuning and continuous improvement of database performance by analysing complex patterns and making data-driven adjustments.

To know more about DBtune:
Ref1 : https://opensource-db.com/the-dna-of-dba-mastering-parameter-tuning/
Ref2 : https://opensource-db.com/postgresql-performance-tuning-from-manual-to-automatic-dbtune/

Feel free to reach out to us for a demo.

Best Practices/tips for PostgreSQL Maintenance

  1. Establish a Routine: Create a regular maintenance schedule for vacuuming, backups, and configuration reviews.
  2. Monitor Continuously: Use monitoring tools to track performance, detect issues, and make informed adjustments.
  3. Use the Latest Tools: Tools like DBtune can increase performance while reducing resource costs.
  4. Document Procedures: Maintain detailed records of maintenance activities, configurations, and issues.
  5. Test Changes: Test configuration changes and updates in a staging environment before applying them to production.
  6. Bring in Expertise: Engaging seasoned professionals like us, who add deep technical knowledge and experience, can significantly enhance your database operations.

Conclusion

Effective maintenance is key to ensuring the performance, reliability, and longevity of your PostgreSQL database. By incorporating essential maintenance activities such as vacuuming, analyzing, backing up, and monitoring, you can optimize your database and prevent potential issues.

Adopting a proactive approach to maintenance and leveraging the right tools like DBtune and best practices will help you manage your PostgreSQL database effectively, ensuring it meets the demands of your applications and users.

Happy maintaining!

Unleashing PostgreSQL Performance: Exploring the Power of pg_profile

Introduction

In the evolving landscape of database management, optimizing performance is paramount. For enterprises relying on PostgreSQL, a robust monitoring tool can make all the difference. The pg_profile extension can be a game changer in the realm of PostgreSQL performance tuning. Every millisecond counts, and for PostgreSQL users, navigating the complexities of performance tuning can be daunting. Monitoring and fine-tuning slow query execution can improve performance. In this blog, we will delve into installing and configuring pg_profile and generating a statistical report with it.

pg_profile

pg_profile is an extension designed specifically for monitoring and analyzing the performance of PostgreSQL databases. It helps database administrators identify and address performance bottlenecks, optimize queries, and fine-tune configurations to enhance the overall efficiency and responsiveness of PostgreSQL database systems. pg_profile offers key features such as query performance analysis, resource monitoring, index usage analysis and configuration parameter recommendations.

Installation

Here, we download version 4.6 of pg_profile and extract it into the PostgreSQL extension directory.

#Installing using wget
[root@localhost ~]# wget https://github.com/zubkov-andrei/pg_profile/releases/download/4.6/pg_profile--4.6.tar.gz

#Unzip into a folder
[root@localhost ~]# tar -xvf pg_profile--4.6.tar.gz --directory /usr/pgsql-16/share/extension
pg_profile--4.6.sql
pg_profile.control
pg_profile--4.5--4.6.sql
[root@localhost ~]# cd /usr/pgsql-16/share/extension
[root@localhost extension]# ls -ltr
total 1680
-rw-r--r--. 1 postgres postgres     193 Apr 27 20:10 pg_profile.control
-rw-r--r--. 1 postgres postgres 1073167 Apr 27 20:10 pg_profile--4.6.sql
-rw-r--r--. 1 postgres postgres  635910 Apr 27 20:10 pg_profile--4.5--4.6.sql

Create extension

The pg_profile extension depends on the plpgsql and dblink extensions. The only mandatory requirement for the server cluster is the ability to connect from the pg_profile database using the provided server connection string. All other requirements are optional, but they can improve the completeness of the gathered statistics. To make these extensions available, install the contrib package:

[root@localhost extension]# yum install postgresql16-contrib

#Login in to psql and create extensions
postgres=# create extension dblink;
CREATE EXTENSION
postgres=# create extension pg_stat_statements;
CREATE EXTENSION
postgres=# create extension pg_profile;
CREATE EXTENSION
postgres=# \dx
                                            List of installed extensions
        Name        | Version |   Schema   |                              Description                               
--------------------+---------+------------+------------------------------------------------------------------------
 dblink             | 1.2     | public     | connect to other PostgreSQL databases from within a database
 pg_profile         | 4.6     | public     | PostgreSQL load profile repository and report builder
 pg_stat_statements | 1.10    | public     | track planning and execution statistics of all SQL statements executed
 plpgsql            | 1.0     | pg_catalog | PL/pgSQL procedural language

Configuration

We need to update a few parameters in the postgresql.conf file:

shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 20000
pg_stat_statements.track = top

#Restart the server

To list the objects in pg_profile, use the command \dx+ pg_profile. It will display all the objects in the extension, including functions, tables and views.

Generating the statistical report

To generate the statistical report, we first need to know the list of servers registered in pg_profile.

#To know the servers list in pg_profile 
postgres=# select * from show_servers() ;
 server_name |          connstr          | enabled | max_sample_age | description 
-------------+---------------------------+---------+----------------+-------------
 local       | dbname=postgres port=5432 | t       |                | 
(1 row)

By default, there is a local server. In order to generate a report, we need to take samples at regular intervals. We can then generate a report between any two of those samples.

# To take samples


postgres=# select * from take_sample();
 server | result |   elapsed   
--------+--------+-------------
 local  | OK     | 00:00:01.38
(1 row)

#To view samples taken
postgres=# select * from show_samples();
 sample |        sample_time        | sizes_collected | dbstats_reset | clustats_reset | archstats_reset 
--------+---------------------------+-----------------+---------------+----------------+-----------------
      1 | 2024-05-17 00:38:59+05:30 | t               |               |                | 
      2 | 2024-05-17 00:40:01+05:30 | t               |               |                | 
      3 | 2024-05-17 00:40:06+05:30 | t               |               |                | 
      4 | 2024-05-17 00:43:51+05:30 | t               |               |                | 
(4 rows)

Here, we have taken four samples. We can generate a report between any pair of these samples: for example, between 1-2, 1-3, 1-4, 2-3, 2-4 or 3-4.
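
In practice you would not take samples by hand; a common approach is to schedule take_sample() at a fixed interval, for example via cron (the interval and connection details below are assumptions, adjust them to your environment):

#Example cron entry: take a pg_profile sample every 30 minutes
*/30 * * * * psql -d postgres -c "SELECT take_sample();" >/dev/null 2>&1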

#To generate a report in html

[postgres@localhost ~]$ psql -Aqtc "SELECT get_report('local',1,2)" -o Report.html
[postgres@localhost ~]$ ls -ltr
total 348
drwx------. 4 postgres postgres     51 May 15 07:25 16
-rw-r--r--. 1 postgres postgres 350993 May 17 00:56 Report.html

On viewing the generated HTML report, we can find SQL query statistics, schema object statistics, user function statistics and vacuum-related statistics.

Summary

To summarize: in the current landscape of database management, it is important to monitor and optimize database performance. In this post, we have introduced you to pg_profile, a game-changing tool tailored specifically for PostgreSQL performance monitoring. We’ve delved into its features, installation process, and seamless integration with existing PostgreSQL infrastructures, setting the stage for an immersive exploration of its capabilities.

Thank you and Stay tuned for more…

Mastering pgbench for Database Performance Tuning – Part Ⅱ

Introduction

In the realm of database performance tuning, the pursuit of optimal throughput and responsiveness is a never-ending journey. In our previous blog, we discussed pgbench, setting it up and running it for the first time, along with tips, tricks and common pitfalls of benchmarking. Building upon that foundation, we now venture into advanced optimization strategies, delving deeper into the intricacies of pgbench and its additional options. Additionally, we will explore ways to interpret pgbench results effectively, transforming raw metrics into actionable insights that drive tangible performance enhancements.

If you haven’t read our previous blog, you can read it here.

Exploring pgbench with more options

In our previous blog, we explored the basic options of pgbench: -c for the number of clients, -j for the number of threads, -t for the number of transactions and -T for the time limit. But there are many more options that the pgbench utility offers. Let’s explore a few of them and see how they can be used.

No vacuum -n

It is used to skip vacuuming before running the test. Usually, this option is used in a custom test run with a script that does not include the default tables.

#Running test run without -n
[postgres@localhost bin]$ ./pgbench db
pgbench (16.3)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>

#Running test run with -n
[postgres@localhost bin]$ ./pgbench -n db
pgbench (16.3)
transaction type: <builtin: TPC-B (sort of)>

Comparing the two runs: without the -n option, pgbench runs a vacuum first; with -n, the vacuum step is skipped.

Connect -C

This option establishes a new connection for each transaction instead of once per client session. It is useful for measuring connection overhead.

#Using -C
[postgres@localhost bin]$ ./pgbench -C db
pgbench (16.3)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
number of transactions per client: 10
number of transactions actually processed: 10/10
number of failed transactions: 0 (0.000%)
latency average = 5.255 ms
average connection time = 2.524 ms
tps = 190.284094 (including reconnection times)

Observing the tps, it includes the time taken for reconnections. If you don’t use this option, you will see tps without the initial connection time.

Rate -R & Latency limit -L

These two options are mostly used together. Rate (-R) specifies the target rate of transactions per second, while latency limit (-L) counts and reports transactions that exceed the specified limit, which is given in milliseconds.

[postgres@localhost bin]$ ./pgbench -R 10 -L 10 db
pgbench (16.3)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
number of transactions per client: 10
number of transactions actually processed: 10/10
number of failed transactions: 0 (0.000%)
number of transactions skipped: 0 (0.000%)
number of transactions above the 10.0 ms latency limit: 0/10 (0.000%)
latency average = 3.153 ms
latency stddev = 0.480 ms
rate limit schedule lag: avg 0.311 (max 0.518) ms
initial connection time = 2.973 ms
tps = 6.476235 (without initial connection time)

Here, we can see the number of transactions above the specified latency limit and the number of transactions skipped. In this case, no transactions exceeded the given limits.

Report per command -r

This option gives us the stats of each command after the benchmark finishes.

[postgres@localhost bin]$ ./pgbench -r db
pgbench (16.3)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
number of transactions per client: 10
number of transactions actually processed: 10/10
number of failed transactions: 0 (0.000%)
latency average = 1.020 ms
initial connection time = 2.315 ms
tps = 980.199961 (without initial connection time)
statement latencies in milliseconds and failures:
         0.001           0  \set aid random(1, 100000 * :scale)
         0.000           0  \set bid random(1, 1 * :scale)
         0.000           0  \set tid random(1, 10 * :scale)
         0.000           0  \set delta random(-5000, 5000)
         0.029           0  BEGIN;
         0.169           0  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
         0.053           0  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
         0.056           0  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
         0.043           0  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
         0.054           0  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
         0.611           0  END;

It gives us the latency and failures of the default script used by pgbench.

Summary

To summarize, effective benchmarking is crucial for the optimal performance and scalability of databases. By leveraging benchmarking tools such as pgbench, database administrators and developers can simulate real-world workloads, measure performance metrics, and identify areas for optimization. However, successful benchmarking requires careful planning, execution, and analysis. In this blog, we have explored a few more options for leveraging the pgbench utility. If you haven’t checked our previous blog on pgbench, here is the blog.

Thank you and stay tuned for more…

Mastering pgbench for Database Performance Tuning

Introduction

In today’s world of databases, performance is one of the key aspects that need to be taken care-of irrespective of the size of the database. If you’re in charge of managing a PostgreSQL database, ensuring peak performance is undoubtedly a top priority. Enter pgbench, PostgreSQL’s own benchmarking powerhouse, designed to help you simulate workloads and test different database strategies to optimize efficiency. Whether you’re a seasoned DBA or just starting out, understanding how to leverage pgbench can be a game-changer for your database operations. This comprehensive guide will cover everything you need to know to get started with pgbench, interpret its results, and implement its insights for optimal database performance.

pgbench

pgbench is a benchmarking tool included with PostgreSQL that simulates a standard workload: a simple program for running benchmark tests on PostgreSQL. By default, pgbench tests are based on the TPC-B benchmark, executing five SELECT, INSERT and UPDATE commands per transaction. It is available as part of the PostgreSQL installation. Using pgbench, we can simulate various workloads on PostgreSQL databases by generating multiple client connections and by using predefined scripts or sets of parameters.

Setting up and running my first pgbench test

pgbench should be invoked with the -i (initialize) option to create and populate its test tables. Initializing with pgbench -i database_name creates four tables (pgbench_accounts, pgbench_branches, pgbench_history and pgbench_tellers), dropping any existing tables of these names.

Any previous tables with these names are dropped and recreated whenever -i (initialize) is given. To scale up, a scale factor of 20 (-s 20) inserts data into the tables to represent a workload 20 times larger than the default scale; each tuple inserted represents a row of data in the database tables, as in the example below.
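
For example, initializing a test database with a scale factor of 20 could look like this (the database name mydb is illustrative):

#Initialize the pgbench tables with scale factor 20
pgbench -i -s 20 mydb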

There are various options in pgbench for performing benchmark tests. The most common are -c for the number of clients, -j for the number of threads, -t for the number of transactions, -T for the time limit and -f for specifying a custom script file.

Running pgbench

Simulating the workload with 50 client connections and 100 transactions per client against the database mydb.
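
The command for such a run would look roughly as follows:

#50 client connections, 100 transactions per client
pgbench -c 50 -t 100 mydb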

Understanding the Output

Let’s analyze the output. Some defaults are applied, such as scaling factor = 1 and number of threads = 1, if you don’t specify them as input.

The number of clients and the number of transactions per client are given as input to pgbench, and the number of transactions actually processed is the product of those two. Here, zero transactions failed.

Latency average is the time taken to respond to a given request. It is measured in milliseconds; here it is 92.234 ms.

Initial connection time is the time taken for the client to establish a connection with the server. Here it took 416.143 ms.

Most important is tps, i.e. transactions per second, which refers to the number of transactions processed by the system in a one-second interval. It varies based on the complexity of the transactions (or queries), system resources, concurrency levels, and database configuration. A higher tps is better.

Tips and Tricks

  • Understand the workloads – It is important to understand the workloads on the application and simulate them accordingly, using customized SQL scripts and by adjusting the scale factor.
  • Control the concurrency level – We can control the concurrency level of the pgbench test using the -c and -j options.
  • Monitor System Metrics – Use system monitoring tools such as top to monitor CPU usage, memory utilization, disk I/O, and network activity during pgbench tests.
  • Tune PostgreSQL configurations – Adjust postgresql.conf parameters, such as shared_buffers, max_connections, and work_mem, based on the results.
  • Run multiple tests – To get an accurate assessment of performance, run multiple pgbench tests.
  • Documentation – Documenting the methodologies and results of your testing activities will help in further analysis.

Common pitfalls to avoid in database benchmarking

  • Benchmarking without understanding the workload.
  • Running the test on insufficient CPU, memory, storage, and network resources to support the expected workload.
  • Using inappropriate benchmarking tools and methodologies, which can lead to misleading results.
  • Not testing under realistic conditions, or failing to replicate the environment, which leads to improper results.
  • Considering the result of a single test rather than the average of multiple tests.
  • Lack of proper documentation.

Summary

To summarize, effective benchmarking is crucial for the optimal performance and scalability of databases. By leveraging benchmarking tools such as pgbench, database administrators and developers can simulate real-world workloads, measure performance metrics, and identify areas for optimization. However, successful benchmarking requires careful planning, execution, and analysis. In this blog, we have explored some features and results of pgbench, including tips, tricks and common pitfalls to avoid in database benchmarking.

Thank you and stay tuned for more…

Troubleshooting PostgreSQL Replication Setup Errors with repmgr

Introduction

Hey All

Managing a PostgreSQL replication setup involves various challenges, especially when encountering errors during configuration. This blog post explores common errors encountered while setting up replication using repmgr, a popular replication manager for PostgreSQL. We’ll delve into the causes of these errors and provide solutions to overcome them, ensuring a seamless replication setup. We had discussed setting up High Availability in Postgres using repmgr in this post. If you did not get a chance to go through it, now might be a good time for it.

These are some of the errors/issues I faced in my initial work with repmgr and EC2 instances. They might look very basic, but they reinforce the fact that minute details are important.

Error#1

This was the first bump in my otherwise smooth journey with EC2 instances and repmgr.

[postgres@ip-172-31-14-99 ~]$ /usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf primary register
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
INFO: connecting to primary database...
ERROR: connection to database failed
DETAIL:
connection to server at "52.64.241.210", port 5432 failed: timeout expired

DETAIL: attempted to connect using:
  user=repmgr connect_timeout=2 dbname=repmgr host=52.64.241.210 fallback_application_name=repmgr options=-csearch_path=

Solution

We need to edit the inbound rules of the security group and add every PostgreSQL EC2 instance, so that traffic can flow between those instances.
Steps: Instances –> Security –> Security group –> Edit inbound rules

Add a rule for PostgreSQL (port 5432) to the security group.

Result

[postgres@ip-172-31-14-99 bin]$ /usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf primary register
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
INFO: connecting to primary database...
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
NOTICE: primary node record (ID: 1) registered

#Verifying Primary registration 
[postgres@ip-172-31-14-99 bin]$ /usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf cluster show
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        | host=52.64.241.210 user=repmgr dbname=repmgr connect_timeout=2

Error#2

This is a basic error we DBAs deal with on a daily basis: “No Entries in pg_hba.conf”. I encountered this error while performing the standby clone dry-run operation.

[postgres@ip-172-31-7-43 ~]$ /usr/pgsql-14/bin/repmgr -h 52.64.241.210 -U repmgr -d repmgr -f /var/lib/pgsql/repmgr.conf standby clone --dry-run

DETAIL:
connection to server at "52.64.241.210", port 5432 failed: FATAL:  no pg_hba.conf entry for replication connection from host "13.210.66.27", user "repmgr", no encryption

ERROR: connection to database failed
DETAIL:
connection to server at "52.64.241.210", port 5432 failed: FATAL:  no pg_hba.conf entry for replication connection from host "13.210.66.27", user "repmgr", no encryption

Solution

It clearly indicates that we need to make an entry in the pg_hba.conf of the primary server.

cd $PGDATA
vi pg_hba.conf
host    replication     repmgr          {Standby1 IP}           md5
host    repmgr          repmgr          {Standby1 IP}           md5

# We have to add details of all of the Standby nodes and the Witness node too, if applicable.

Note: As we will clone the data from the primary to both of the standby nodes, there is no need to make these changes on the standby nodes. But on the witness node, we need to enter the IPs in the pg_hba.conf file and change the parameters in the postgresql.conf file. (For more details, refer to this post.)

Result

[postgres@ip-172-31-7-43 ~]$ /usr/pgsql-14/bin/repmgr -h 52.64.241.210 -U repmgr -d repmgr -f /var/lib/pgsql/repmgr.conf standby clone --dry-run

NOTICE: standby will attach to upstream node 1
HINT: consider using the -c/--fast-checkpoint option
INFO: would execute:
  pg_basebackup -l "repmgr base backup"  -D /var/lib/pgsql/14/data -h 52.64.241.210 -p 5432 -U repmgr -X stream
INFO: all prerequisites for "standby clone" are met

Error#3

I encountered this error when I was setting up the witness node. It is possible that even after you have made an entry in the pg_hba.conf file, you get the following error:

[postgres@ip-172-31-8-219 ~]$ /usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf witness register -h 52.64.241.210
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
INFO: connecting to witness node "node4" (ID: 4)
ERROR: unable to connect to witness node "node4" (ID: 4)
DETAIL:
connection to server at "ec2-13-211-191-92.ap-southeast-2.compute.amazonaws.com" (172.31.8.219), port 5432 failed: FATAL:  role "repmgr" is not permitted to log in

HINT: the witness node must be running before it can be registered

Solution

Log in to psql and check the attributes of the user:

[postgres@ip-172-31-8-219 ~]$ psql
psql (14.11)
Type "help" for help.
postgres=# \du+

                                          List of roles
 Role name |                         Attributes                         | Member of | Description
-----------+------------------------------------------------------------+-----------+-------------
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}        |
 repmgr    | Superuser, Cannot login                                    | {}        |

# It shows Role "repmgr" cannot login
# Simply alter the role 
postgres=# ALTER ROLE repmgr LOGIN;
ALTER ROLE
postgres=# \du+
                                          List of roles
 Role name |                         Attributes                         | Member of | Description
-----------+------------------------------------------------------------+-----------+-------------
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}        |
 repmgr    | Superuser                                                  | {}        |
 

Result

After altering the role, retry registering the witness node.

[postgres@ip-172-31-8-219 ~]$ /usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf witness register -h 52.64.241.210
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
INFO: connecting to witness node "node4" (ID: 4)
INFO: connecting to primary node
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
INFO: witness registration complete
NOTICE: witness node "node4" (ID: 4) successfully registered

Conclusion

In PostgreSQL replication setups, encountering errors is not uncommon, but with the right knowledge and troubleshooting steps, they can be resolved effectively. In this post, we have discussed some of the common errors encountered during the configuration of replication using repmgr. By understanding the root cause of these errors and implementing the provided solutions, administrators can ensure a smooth and reliable PostgreSQL replication setup, facilitating data redundancy and high availability for their applications. These might already be known to many experienced DBAs, but they can serve as a useful guide for novice PostgreSQL enthusiasts and less experienced DBAs, saving precious time. If you have encountered any other issues while building PostgreSQL HA systems using repmgr, please feel free to comment below and reach out to us. As always, we’re happy to help.

Thank you and Stay tuned…

Building PostgreSQL HA Systems with repmgr

Introduction

Welcome to our latest blog post! Today, we’ll delve into the intricacies of setting up a PostgreSQL high availability (HA) production cluster for a prominent client in the banking sector. Our task? Designing and implementing a robust HA solution for their PODS, with PostgreSQL as the database backbone. Join us as we navigate the process of establishing a four-node repmgr cluster, ensuring continuous availability and data integrity for our esteemed client.

Why repmgr

repmgr is a popular tool used for managing PostgreSQL replication and failover in high availability (HA) environments. It simplifies the setup and management of PostgreSQL replication by providing easy-to-use commands and tools for configuration, monitoring, and failover. One of its key features is its ability to automatically detect primary node failures and promote a standby node to become the new primary.

Prerequisites

  • Provision four EC2 instances.
  • Install PostgreSQL 14 on each node (why not the latest – Application compatibility)
  • Install repmgr on each node

Setup

We split the four nodes into one primary node, two standby nodes and one witness node.

Before setting up repmgr, we need to add the IPv4 addresses of all four nodes to a single security group for the EC2 instances.

In Primary node:

Create the user and database accordingly, and also add entries for the remaining nodes in the pg_hba.conf of the primary server.
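
A minimal sketch of that preparation (the user, database and address placeholders follow the configuration shown below and are illustrative) could be:

#Create the replication user and database referenced in the repmgr configuration
createuser --superuser myuser
createdb mydb --owner=myuser

#Example pg_hba.conf entries on the primary (one pair per standby/witness node)
host    replication     myuser          {Standby IP}/32         md5
host    mydb            myuser          {Standby IP}/32         md5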

#Changes to parameters in postgresql.conf

max_wal_senders = 10
max_replication_slots = 10
wal_level ='logical' #to enable logical decoding in future
hot_standby = on
archive_mode = on
archive_command = '/bin/true'
shared_preload_libraries = 'repmgr'

#Restart the services after changing the parameters in postgresql.conf file.

Then, we need to create a repmgr configuration file with the following parameters:

cluster='clientname_prod'
node_id=1
node_name=node1
conninfo='host=node1 user=myuser dbname=mydb connect_timeout=2'
data_directory='/var/lib/pgsql/14/data/'
failover=automatic
promote_command='/usr/pgsql-14/bin/repmgr standby promote -f /var/lib/pgsql/repmgr.conf --log-to-file'
follow_command='/usr/pgsql-14/bin/repmgr standby follow -f /var/lib/pgsql/repmgr.conf --log-to-file --upstream-node-id=%n'

Now, we need to register the Primary using the command

/usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf primary register

#To check whether the primary server is registered or not,
/usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf  cluster show

Once the primary is registered, we can set up the standby nodes.

In Standby node

Two steps:

One is standby clone and the other is standby register.
Before that, let us create the repmgr configuration file in the standby server.

node_id=2
node_name=node2
conninfo='host=(Standby IP address) user=myuser dbname=mydb connect_timeout=2'
data_directory='/var/lib/pgsql/14/data'
failover=automatic
promote_command='/usr/pgsql-14/bin/repmgr standby promote -f /var/lib/pgsql/repmgr.conf --log-to-file'
follow_command='/usr/pgsql-14/bin/repmgr standby follow -f /var/lib/pgsql/repmgr.conf --log-to-file --upstream-node-id=%n'

Now, let us do a dry run before cloning using the command,

/usr/pgsql-14/bin/repmgr -h (Primary IP address) -U myuser -d mydb -f /var/lib/pgsql/repmgr.conf standby clone --dry-run

If it is successful, let us do the standby clone using the command

/usr/pgsql-14/bin/repmgr -h (Primary IP address) -U myuser -d mydb -f /var/lib/pgsql/repmgr.conf standby clone

Once the cloning is done, register the standby using the command:

/usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf standby register

Make sure that the PostgreSQL server on the standby is stopped while doing the standby clone, and start it again before registering the standby.

Repeat the same process with the second standby too.

In Witness node

On the witness node, parameter changes need to be made in the postgresql.conf file and entries need to be added to pg_hba.conf, similar to what we did on the primary; also create the repmgr configuration file accordingly.

To register the witness node, use the command

/usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf witness register -h (Primary IP)

Now, we have all of the four nodes ready and registered as needed. You could use the cluster show command to check the nodes and verify that everything is in order.

Automate the failover

To automate failover, repmgr provides the repmgrd daemon. It is a background process that runs continuously to manage PostgreSQL replication and failover using repmgr. It is a crucial component of the repmgr toolset, providing automated monitoring, management, and failover capabilities for PostgreSQL replication clusters.

To activate repmgrd, use the command

/usr/pgsql-14/bin/repmgrd -f /var/lib/pgsql/repmgr.conf

#To check the events, use the command:
/usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf cluster event

Conclusion

In conclusion, the successful implementation of the four-node repmgr cluster enabled our client to achieve HA for their microservices architecture, bolstering data resilience and ensuring uninterrupted service delivery in a mission-critical banking environment. With repmgr’s robust features and our expertise, we provided a reliable and scalable solution tailored to the client’s specific requirements. To this setup, we also added PgBouncer and keepalived. If you want to know more about the architecture and how it works with a practical approach, please stay tuned for this series.

PostgreSQL OID Datatype: Impacts and Remedies

Introduction

PostgreSQL – The World’s Most Advanced Open Source Relational Database – introduced a significant change with version 12: tables can no longer be created with the hidden OID column (WITH OIDS), effectively discontinuing the use of OIDs as row identifiers in user tables. This move has left many developers and database administrators pondering its implications and seeking remedies for databases that heavily relied on OIDs. Fear not! We’re here to dissect the what, why, and how of this transition, providing you with practical examples and remediation strategies.

PostgreSQL has been ramping down OID usage since version 12, marking a significant change for developers and database administrators. This post delves into what this means for your databases and how to smoothly transition away from using OIDs.

For instance, a mid-sized tech company we have been working with, which recently updated its legacy systems to comply with the new standards, reported a notable improvement in database performance and manageability. The initial apprehension about migrating large objects and updating application code gave way to relief and satisfaction as the process unfolded more smoothly than anticipated. Their experience underscores a common sentiment: while the shift requires effort, the clarity, scalability, and performance gains on the other side are well worth it. Feedback from SMEs highlights the resilience and adaptability of the PostgreSQL community, reminding us that while change can be challenging, it also brings opportunities for growth and improvement.

Why say goodbye to OID?
The decision to discontinue OIDs didn’t come out of the blue. It was motivated by several factors:

  • Performance and Scalability: OIDs are 32-bit values of fixed size, which can wrap around on very large tables and limit the scalability of databases. As databases grow, the need for more flexible and scalable solutions becomes apparent.
  • Best Practices: Moving away from OIDs encourages the use of explicit primary keys, aligning with modern database design practices.
  • Simplicity and Clarity: Eliminating OIDs simplifies the PostgreSQL system catalog and reduces the cognitive load on developers and database administrators.

Transitioning Away from OID:
If you’re working with older versions of PostgreSQL and relying on OID, transitioning might seem daunting. However, with a few strategic steps, you can ensure a smooth migration. Here’s how:

  1. Identifying OID Usage
    First, assess your database to identify where and how OIDs are used. You can run a query to find all tables that use OIDs (a sample query is shown right after this list).
  2. Migrating to Explicit Primary Keys
    For tables identified with OID usage, the next step is to introduce explicit primary keys. If a table does not have a primary key, you can add one as shown after the identification query.
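
For step 1, a minimal identification query might look like the following. It relies on the pg_class.relhasoids column, which exists only on PostgreSQL 11 and earlier (the versions on which user tables can still carry OIDs):

-- List user tables created WITH OIDS (works on PostgreSQL 11 or earlier)
SELECT n.nspname AS schema_name, c.relname AS table_name
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relhasoids
  AND c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema');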

For step 2, you can simply add a new column called oid that is automatically filled:

ALTER TABLE test ADD oid bigint GENERATED ALWAYS AS IDENTITY NOT NULL;

Using an identity column like this has the advantage that you get an error if you try to INSERT a value into the column manually by mistake. This command adds a new identity column (named oid here to ease migration) that can then serve as the table’s primary key.

  3. Updating Application Code
    Applications interacting with your database may need updates to work with the new primary key structure. Ensure that your application logic references the explicit primary keys instead of OIDs.
  4. Handling Large Objects (LOs)

For large objects, PostgreSQL provides the lo module, which can replace OID-based large object management. You can migrate large objects using the lo_import and lo_export functions.

Adding to the various reasons for PostgreSQL’s shift away from OIDs is the matter of logical replication. Logical replication, a feature introduced in PostgreSQL 10, allows for the replication of data changes between PostgreSQL databases in a more granular and flexible manner compared to traditional physical replication. However, it’s crucial to note that logical replication does not support the replication of OID values. This limitation underscores a significant challenge for databases relying on OIDs for data identification and relationship management.

Why is this limitation noteworthy? In the age of distributed databases and high availability, logical replication stands as a cornerstone for many PostgreSQL deployment architectures. Its inability to handle OID values seamlessly means that any system relying on OIDs for key data interactions could face hurdles in data consistency and integrity across replicated environments. This constraint further emphasizes the need for transitioning away from OID usage, advocating for a move towards explicit primary keys and more replication-friendly data management strategies.

Conclusion:
Moving away from the OID datatype in PostgreSQL 12 signifies a shift towards more modern, scalable, and clearer database practices. While the transition may require some upfront work, particularly for legacy systems, the long-term benefits in performance, scalability, and manageability are undeniable. By understanding the reasons behind this change, assessing your current database usage, and implementing the recommended remediation strategies, you can navigate this transition smoothly. Embrace the change, and you’ll find your PostgreSQL databases are all the better for it.

Unlocking Initial Sync for Logical Replication in AWS RDS for PostgreSQL

PostgreSQL’s logical replication is handled through publications and subscriptions. Publications define what data to replicate, while subscriptions specify which server receives the updates.

Initial synchronization ensures the subscriber has all the data. Common methods include:

Snapshot: Copies all data at once. Simple but slow for large datasets.
Base Backup and Continuous Archiving: Makes a base copy and then tracks changes. More suitable for large datasets.

However, when dealing with massive databases in RDS PostgreSQL, such as those in the terabyte range, using pg_basebackup for base backup might not work due to security restrictions on file systems. pg_basebackup requires deep file system access, which is limited in RDS for security reasons. In such cases, replication needs to be set up using the pg_dump utility instead.

Before proceeding, ensure the following configuration parameters are set and restart the cluster:

max_replication_slots: 10
max_wal_senders: 10
track_commit_timestamp: on
wal_level: logical

Next, create a publication and add tables to it in the source database:

CREATE PUBLICATION lsr_sync_01;
ALTER PUBLICATION lsr_sync_01 ADD TABLE sales_data;
ALTER PUBLICATION lsr_sync_01 ADD TABLE customer_info;
ALTER PUBLICATION lsr_sync_01 ADD TABLE transactions;

Create a replication slot:

To create a replication connection to the database and establish a replication slot, follow these steps:

Open a separate terminal session.

Keep the session active, preferably using a screen session.

Run the following command:

[postgres@primary ~]$ psql -h primary -U repmgr "dbname=osdb_lsr replication=database"
psql (10.23, server 14.10)
WARNING: psql major version 10, server major version 14.
         Some psql features might not work.
Type "help" for help.

osdb_lsr=# CREATE_REPLICATION_SLOT lsr_sync_01 LOGICAL pgoutput;
  slot_name  | consistent_point |    snapshot_name    | output_plugin
-------------+------------------+---------------------+---------------
 lsr_sync_01 | 0/C000110        | 00000003-00000002-1 | pgoutput
(1 row)

Now let's proceed with backing up the database using pg_dump.

pg_dump -h primary -U repmgr -t sales_data -t customer_info -t transactions -Fd -f osdb_lsr_intial_sync -j 2 --no-publications --no-subscriptions --snapshot=00000003-00000002-1 --compress=0 -v osdb_lsr

Restore the backup on the target for LSR:

pg_restore -h standby -U repmgr -Fd -j 2 -v -d osdb_lsr osdb_lsr_intial_sync

Create a subscription:

Create the subscription on the standby server, referring to the primary database, but keep it in a disabled state.

CREATE SUBSCRIPTION lsr_sync_sub_01 CONNECTION 'host=primary port=5432 dbname=osdb_lsr user=repmgr password=*****' PUBLICATION lsr_sync_01
WITH (
  copy_data = false,
  create_slot = false,
  enabled = false,
  connect = true,
  slot_name = 'lsr_sync_01'
);

Advance the replication origin:

Retrieve the external ID from the pg_subscription table and use it to advance the replay position to the consistent point captured during replication slot creation.

standby=# SELECT 'pg_'||oid::text AS "external_id" FROM pg_subscription WHERE subname = 'lsr_sync_sub_01';
 external_id
-------------
 pg_32119
(1 row)

Take the LSN (the consistent_point captured during replication slot creation) and advance the replication origin to it, so that replication starts from that point:

standby=# SELECT pg_replication_origin_advance('pg_32119', '0/C000110') ;
 pg_replication_origin_advance
-------------------------------
 
(1 row)

Enable the subscription:

ALTER SUBSCRIPTION lsr_sync_sub_01 ENABLE;

Check the replication status:

SELECT * FROM pg_stat_replication;
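
On the subscriber side, a quick check could use the pg_stat_subscription view (available since PostgreSQL 10), for example:

SELECT subname, received_lsn, latest_end_lsn, last_msg_receipt_time
FROM pg_stat_subscription;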

This series of steps helps to establish logical replication in PostgreSQL, ensuring data synchronization between databases efficiently.

Timestamp-based Change Data Capture

Introduction

Hey all!! Hope you are following the series of topics around Data Integration with PostgreSQL, The World’s Most Advanced Open Source Relational Database. In our previous blog, we explored the Change Data Capture (CDC) and its methods. If you missed it, we recommend reading the blog.

Timestamp-based CDC (Change Data Capture) is one approach to capturing and tracking changes in a database by associating a timestamp with each change. It is a versatile and widely used approach in tracking the changes in databases and supports various use cases like data replication and synchronization. In this blog, we will explore the prerequisites to execute the timestamp-based CDC and the setup with the result set.

Setup the Environment

Before diving into CDC, let’s ensure we have a timestamp column in the table (source) and the necessary components installed and configured.

PostgreSQL Database

  • Install PostgreSQL on the server or your local machine.
  • Create a database and the required tables that will be part of the CDC.
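
As a concrete sketch (the exact column types are assumptions; the post only names the columns id, name and last_update), the source and target tables used later could be created like this:

-- Source table in the first PostgreSQL database
CREATE TABLE t1 (
    id          integer PRIMARY KEY,
    name        text,
    last_update timestamp NOT NULL DEFAULT now()
);

-- Target table in the second PostgreSQL database, same structure
CREATE TABLE t2 (
    id          integer PRIMARY KEY,
    name        text,
    last_update timestamp
);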

Pentaho Data Integration (ETL tool)

  • Install PDI on the server or your local machine – you may refer to this blog for a step-by-step procedure
  • Connect with the databases.
  • Create the data pipeline for CDC.

Designing the CDC Pipeline

In this CDC pipeline, we will create a new transformation and build a pipeline that extracts data from one table (t1) in one PostgreSQL database and loads it into another table (t2) in another PostgreSQL database on an incremental basis.

Define the date range

  • Go to Design → Input → Get system info
  • Here, we are getting the start and end range of the transformation.
  • Hop the result to the next step as input.

Export data from Source

  • Go to Design → Input → Table input
  • Here, we will extract the data from the table (t1) using the timestamp column, by providing an SQL statement (an example is shown after this list).
  • Hop the result to the next step as input.
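
A typical parameterised statement for the Table input step might look like the sketch below; the two ? placeholders are filled with the start and end of the date range coming from the Get system info step (this follows the standard PDI pattern and is not the exact statement from the original transformation):

SELECT id, name, last_update
FROM t1
WHERE last_update > ?
  AND last_update <= ?;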

Import data into Target

  • Go to Design → Output → Insert/update
  • Here, we would insert the data(output from the previous step) into the target table(t2) as per the given conditions.

Input and Output

  • Assuming that both tables (t1 & t2), in different databases, with columns (id, name & last_update), have no data.
  • Insert data into the source table (t1) using INSERT commands.
  • Check the data in the target table (t2) before running the transformation.
  • Run the transformation. If the transformation is successful, you will see a ✅ at the top of each of the steps.
  • Now, check the target table (t2).
  • Compare the data in both tables (t1 & t2).

Note: As discussed in the previous blog, timestamp-based CDC will not capture DELETE operations.

Conclusion

In conclusion, the adoption of timestamp-based Change Data Capture (CDC) in PostgreSQL marks a pivotal stride in the realm of data integration. By leveraging this sophisticated approach, you can build systems to capture and propagate real-time changes efficiently. The journey through the nuances of timestamp-based CDC in PostgreSQL+Pentaho DI showcases not just a technical evolution but a strategic leap towards a more responsive and interconnected data ecosystem. As we witness the seamless synchronization of data across diverse platforms and applications, the potential for data-driven decision-making becomes boundless. In the ever-evolving landscape of data integration, embracing timestamp-based CDC in PostgreSQL is not just a best practice—it’s a transformative key to unlocking the true power of real-time data.

Stay tuned for more in Data Integration with PostgreSQL blog post series!!

Unleashing the Power of Change Data Capture

Introduction

Here I am again! Talking about the series of topics around Data Integration with PostgreSQL, The World’s Most Advanced Open Source Relational Database. If you haven’t looked at the previous blog in the series, I’d highly recommend reading the same, it is available here.

In the ever-evolving landscape of data management, organisations are constantly seeking ways to stay ahead of the modifications, additions and deletions in real-time. CDC is a transformative technology that allows organisations to achieve data integrity and consistency across all systems and deployment environments. Additionally, it allows organisations to use the right tool for the right job by moving data from legacy databases to purpose-built data platforms, such as document or search databases and data warehouses.
Change Data Capture (CDC) has emerged as a game-changer, providing a dynamic solution to keep pace with the rapidly changing world of data. This blog explores the three W’s of CDC and the methods of CDC.

Three W’s of Change Data Capture

What is Change Data Capture?

Change Data Capture (CDC) is a technique used to capture the changes made to data in a source database so that those changes can be applied to the target database as well. Replication is one of the important use cases: where data is constantly changing, CDC reduces the load by sending only incremental data.

Why Change Data Capture is important?

CDC enables real-time synchronisation of data between source and target systems. This is critical for applications where access to the most current information is essential, such as financial transactions or healthcare.

Unlike traditional methods that involve replicating entire datasets, CDC focuses only on the changes. This reduces the processing overhead, minimises network traffic, and optimises resource utilisation.

When to use Change Data Capture?

CDC is particularly used when dealing with datasets that undergo frequent changes. In scenarios where data is constantly updated, capturing only the changes helps maintain system efficiency. 

In environments where data warehouses and analytical systems need to stay current, implementing CDC ensures that updates are propagated swiftly and without the need for extensive data transfers.

Methods of Change Data Capture

Change Data Capture (CDC) comes in various types, each having its own specific use cases and requirements. The primary methods of CDC we want to discuss are:

Timestamp based CDC

In timestamp-based CDC, we use a timestamp column with a name like “last_updated”, “last_modified” or “date_modified” (as per your convenience) in the source table to identify changes.

Pros: 

  • Simple to implement, especially when timestamps are readily available.
  • No additional load on the source database compared to trigger-based CDC.

Cons:

  • The timestamp column needs to be updated on every change.
  • Delete operations cannot be identified, as there is no timestamp entry for deleted rows.

Log based CDC

In log based CDC, changes are captured directly from the transaction logs of the source database. This method is efficient because it reads the logs generated by the database management system (DBMS) to identify changes.

Pros:

  • More reliable and efficient.
  • There is no need to change the schemas of the database system or add additional columns.

Cons:

  • More complex to implement.
  • Dependency on database-specific features.

Trigger based CDC

In trigger-based CDC, changes in the source are identified using database triggers. Triggers are special stored procedures that are automatically executed when a specified event (such as an insert, update, or delete) occurs on a particular table. A minimal example is shown after the pros and cons below.

Pros:

  • Simple to implement.
  • Directly tied to database events, making it easy to understand.

Cons:

  • Can introduce additional overhead on the source database.
  • May not be as real-time as log-based approaches.
  • Complexity increases with the number of triggers.
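
To make the trigger-based approach concrete, here is a minimal sketch (the orders table, the change-log table and the function names are hypothetical) that records every insert, update and delete:

-- Change-log table that stores one row per captured change
CREATE TABLE orders_changelog (
    change_id  bigserial PRIMARY KEY,
    operation  text      NOT NULL,
    order_id   integer   NOT NULL,
    changed_at timestamp NOT NULL DEFAULT now()
);

-- Trigger function that writes a change-log row for each affected row
CREATE OR REPLACE FUNCTION capture_order_change() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO orders_changelog (operation, order_id) VALUES (TG_OP, OLD.id);
        RETURN OLD;
    ELSE
        INSERT INTO orders_changelog (operation, order_id) VALUES (TG_OP, NEW.id);
        RETURN NEW;
    END IF;
END;
$$ LANGUAGE plpgsql;

-- Attach the trigger to the source table (EXECUTE FUNCTION requires PostgreSQL 11+)
CREATE TRIGGER orders_cdc
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE FUNCTION capture_order_change();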

Conclusion

As we conclude this exploration of Change Data Capture, its uses and its methods, one thing is clear: CDC is not just a technological tool, it is a strategic asset for organisations navigating the complexities of a data-driven world. By embracing this approach, businesses can not only adapt to change but thrive in an environment where accurate, timely, and reliable data is the cornerstone of success.

If you are interested in exploring CDC in your PostgreSQL database using Pentaho Data Integration (PDI), please reach out to us; let’s connect, collaborate and extract more from your PostgreSQL database.

Thank you and Stay tuned!!