
The 101 of Postgres Indexes

Introduction

PostgreSQL is a powerful, open-source relational database system that uses and extends the SQL language combined with many features that safely store and scale the most complicated data workloads. One of the critical components of PostgreSQL that significantly enhances its performance is the use of indexes. Indexes are essential for speeding up the retrieval of rows from a table by using a pointer.

Definition

Indexes in PostgreSQL are special lookup tables that the database search engine can use to speed up data retrieval. An index is a database object that provides a fast access path to rows in a table based on the values of one or more columns.

Data Structures

PostgreSQL uses various data structures for its indexes. The choice of data structure impacts the performance and suitability of an index for different types of queries and operations. Understanding these data structures is crucial for optimizing database performance.

Types of PostgreSQL indexes:

PostgreSQL offers various index types, each designed to cater to specific data scenarios and query patterns. By understanding these index types, you can enhance query performance more effectively.

B-tree index:

B-tree is the default index type in PostgreSQL. B-tree stands for balanced tree. B-tree indexes keep values in sorted order, making them efficient for exact matches and range queries.
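As a quick illustration (the employees table and hire_date column are hypothetical):

CREATE INDEX idx_hire_date ON employees (hire_date);

-- Both queries below can use the B-tree index:
SELECT * FROM employees WHERE hire_date = '2024-01-15';
SELECT * FROM employees WHERE hire_date BETWEEN '2024-01-01' AND '2024-03-31';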

Hash index:

Hash indexes store a 32-bit hash code derived from the value of the indexed column. Therefore, hash indexes can only handle simple equality comparisons (=).
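A minimal sketch (the sessions table is hypothetical); note the USING HASH clause, since B-tree is the default:

CREATE INDEX idx_token ON sessions USING HASH (token);

-- Served by the hash index (equality only); a range predicate cannot use it:
SELECT * FROM sessions WHERE token = 'abc123';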

GIN index:

GIN indexes are inverted indexes that are suitable for composite values such as arrays, JSONB data, and full-text search. Since a GIN index stores a separate entry for each component, it can handle queries that check for the existence of a specific component.
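For example, a GIN index on a hypothetical JSONB column supports containment checks:

CREATE INDEX idx_payload ON documents USING GIN (payload);

-- True when the document contains the given key/value pair:
SELECT * FROM documents WHERE payload @> '{"status": "active"}';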

GiST index:

GiST indexes are versatile and support a wide range of data types, including geometric and full-text data. GiST indexes allow various search strategies such as nearest-neighbor and partial match searches, making them useful for specialized applications.

SP-GiST index:

SP-GiST indexes are useful for indexing data with hierarchical structures or complex data types. SP-GiST indexes partition the index space into non-overlapping regions, offering efficient search capabilities for specialized data structures.

BRIN (Block Range Index) index:

BRIN indexes are designed for very large tables where indexing every row is impractical. A BRIN index divides the table into ranges of pages and stores summarized information about each range, making it efficient for range queries on large datasets while using minimal space.
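A minimal sketch (the events table is hypothetical); BRIN pays off when the column correlates with physical row order, as with append-only timestamps:

CREATE INDEX idx_created_at ON events USING BRIN (created_at);

-- A range query that can skip whole block ranges:
SELECT count(*) FROM events WHERE created_at >= '2024-06-01';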

Index Operations

PostgreSQL provides several commands to create, modify, and manage indexes:

  • CREATE INDEX command
    • Syntax: CREATE INDEX index_name ON table_name (column_name);
    • Example: CREATE INDEX idx_name ON employee (last_name);
  • DROP INDEX command
    • Syntax: DROP INDEX index_name;
    • Example: DROP INDEX idx_name;
  • REINDEX command
    • Syntax: REINDEX INDEX index_name;
    • Example: REINDEX INDEX idx_name;

Index Usage

Indexes play a crucial role in improving query performance. They are used by the query planner to execute efficient search operations:

  • Query Planning
    • The query planner uses indexes to determine the most efficient way to execute a query.
    • Types of scans:
      • Index Scan: Directly uses the index to fetch rows.
      • Bitmap Index Scan: Combines multiple index scans into a bitmap for efficient row retrieval.
      • Sequential Scan: Full table scan when no suitable index is available.
    • Join conditions: Indexes can speed up join operations by providing fast access to the joined rows.
    • Matching columns in WHERE clause: Indexes improve performance for queries with WHERE clauses.
  • Index Selection
    • Choosing the right index for sorting operations and other query requirements is vital for performance optimization.
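To see which scan type the planner chooses, run EXPLAIN on the query. Using the employee table and idx_name index from the examples above, a typical plan looks like this (costs and row estimates are illustrative and will differ in your database):

EXPLAIN SELECT * FROM employee WHERE last_name = 'Smith';

Index Scan using idx_name on employee  (cost=0.29..8.31 rows=1 width=72)
  Index Cond: (last_name = 'Smith'::text)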

Performance Considerations

Several factors influence the performance of indexes in PostgreSQL:

  • Index Size
    • Larger indexes can lead to more maintenance overhead.
    • Balancing index size with performance benefits is essential.
  • Index Selectivity
    • Highly selective indexes offer better performance improvements.
    • Indexes should be selective enough to justify their maintenance costs.
  • Query execution plans
    • The EXPLAIN command helps analyze how queries use indexes.
  • Index usage statistics
    • Tools like pg_stat_user_indexes provide insights into index usage.
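For example, a hedged query against pg_stat_user_indexes surfaces rarely used indexes, which are candidates for review:

SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC;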

Index Monitoring

Monitoring indexes is crucial for maintaining database performance. PostgreSQL offers several tools and commands to help monitor index usage and performance.

Index Maintenance

Routine maintenance tasks are necessary to keep indexes efficient:

  • Vacuuming
    • Reclaims storage and removes dead tuples.
    • Ensures that indexes remain efficient and do not bloat.
  • Automatic Updates
    • Indexes are automatically updated during INSERT, DELETE, and UPDATE operations.

FAQs

  1. What is the primary benefit of using indexes in PostgreSQL?
    • Indexes significantly improve query performance by providing fast access paths to rows based on column values.
  2. How does a B-tree index work?
    • A B-tree index organizes data in a balanced tree structure, allowing for efficient equality and range searches.
  3. When should I use a GIN index?
    • GIN indexes are ideal for columns with multiple values, such as arrays or JSONB, and for full-text search operations.
  4. What are the performance impacts of having too many indexes?
    • Too many indexes can lead to increased maintenance overhead and slower write operations. It’s essential to balance the number of indexes with their performance benefits.
  5. How can I monitor the usage of my indexes?
    • Use tools like pg_stat_user_indexes and the EXPLAIN command to monitor index usage and analyze query execution plans.

Conclusion

Indexes are powerful tools in PostgreSQL that significantly improve query performance. By understanding the different types of indexes, their operations, and maintenance requirements, database administrators can optimize their databases for better performance and efficiency.

PostgreSQL Automatic Failover with repmgr

Introduction

Welcome to our latest blog! In today’s world of distributed databases, ensuring high availability and reliability is paramount. This blog explores the crucial concepts of automatic failover and switchover using repmgr, a popular tool for managing replication and failover in PostgreSQL environments. We know how important it is for crucial sectors like banking to have an HA solution for their PODs with PostgreSQL. In the previous blog, we covered the design and implementation of HA systems; now we will look into the automatic failover and switchover part of the story.

Automatic failover is a critical feature in database systems that ensures minimal downtime in the event of a primary server failure. When the primary node becomes unavailable due to hardware failure, network issues, or other reasons, an automatic failover mechanism seamlessly promotes a standby node to become the new primary node. This transition happens automatically without manual intervention, thereby reducing downtime and ensuring continuous service availability. Here is the detailed process of achieving HA and of bringing the old primary back.
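For reference, automatic failover relies on repmgrd running on every node with a configuration along these lines. This is a minimal repmgr.conf sketch; the IDs, paths, and connection values are illustrative and must match your environment:

node_id=1
node_name='node1'
conninfo='host=192.168.232.150 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/pgsql/15/data'
failover='automatic'
promote_command='/usr/pgsql-15/bin/repmgr standby promote -f /var/lib/pgsql/repmgr.conf --log-to-file'
follow_command='/usr/pgsql-15/bin/repmgr standby follow -f /var/lib/pgsql/repmgr.conf --log-to-file --upstream-node-id=%n'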

Step 1: Check the daemon status

[postgres@node1 ~]$ /usr/pgsql-15/bin/repmgr -f /var/lib/pgsql/repmgr.conf daemon status
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
 ID | Name    | Role    | Status    | Upstream | repmgrd | PID  | Paused? | Upstream last seen
----+---------+---------+-----------+----------+---------+------+---------+--------------------
 1  | node1   | standby |   running | standby | running | 1641 | no      | 0 second(s) ago    
 2  | standby | primary | * running |          | running | 1716 | no      | n/a            

We can see that both the primary and the standby are running repmgrd in the daemon status output.

Step 2: Stopping the Primary Server

Stop the primary PostgreSQL server.

[postgres@node1 ~]$ /usr/pgsql-15/bin/pg_ctl stop -D /var/lib/pgsql/15/data/
waiting for server to shut down.... done
server stopped

Confirms the server shutdown.

Step 3: Checking the Cluster Status

Check the cluster show output on the standby server:

[postgres@node1 ~]$ /usr/pgsql-15/bin/repmgr -f /var/lib/pgsql/repmgr.conf  cluster show
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
 ID | Name    | Role    | Status        | Upstream  | Location | Priority | Timeline | Connection string                                                               
----+---------+---------+---------------+-----------+----------+----------+----------+----------------------------------------------------------------------------------
 1  | node1   | standby |   running     | ? standby | default  | 100      | 2        | host=192.168.232.150 user=repmgr dbname=repmgr password=repmgr connect_timeout=2
 2  | standby | primary | ? unreachable | ?         | default  | 100      |          | host=192.168.232.151 user=repmgr dbname=repmgr password=repmgr connect_timeout=2

WARNING: following issues were detected
  - unable to connect to node "node1" (ID: 1)'s upstream node "standby" (ID: 2)
  - unable to determine if node "node1" (ID: 1) is attached to its upstream node "standby" (ID: 2)
  - unable to connect to node "standby" (ID: 2)
  - node "standby" (ID: 2) is registered as an active primary but is unreachable

HINT: execute with --verbose option to see connection error messages

Node “standby” (ID: 2) is registered as an active primary but is unreachable. At this point, repmgrd on the surviving node detects the failure and automatically promotes it to primary, as the next cluster show output confirms.

Step 4: Restarting the old primary and checking the cluster

[postgres@standby ~]$ /usr/pgsql-15/bin/repmgr -f /var/lib/pgsql/repmgr.conf cluster show
 ID | Name    | Role    | Status               | Upstream | Location | Priority | Timeline | Connection string                                                               
----+---------+---------+----------------------+----------+----------+----------+----------+----------------------------------------------------------------------------------
 1  | node1   | standby | ! running as primary |          | default  | 100      | 3        | host=192.168.232.150 user=repmgr dbname=repmgr password=repmgr connect_timeout=2
 2  | standby | primary | * running            |          | default  | 100      | 2        | host=192.168.232.151 user=repmgr dbname=repmgr password=repmgr connect_timeout=2

WARNING: following issues were detected
  - node "node1" (ID: 1) is registered as standby but running as primary

Step 5: Converting the old primary into a standby clone

Stop the old primary server to perform this activity.

[postgres@standby ~]$ /usr/pgsql-15/bin/repmgr -h 192.168.232.150 -U repmgr -d repmgr -f /var/lib/pgsql/repmgr.conf standby clone --dry-run
NOTICE: destination directory "/var/lib/pgsql/15/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=192.168.232.150 user=repmgr dbname=repmgr
DETAIL: current installation size is 30 MB
INFO: "repmgr" extension is installed in database "repmgr"
WARNING: target data directory appears to be a PostgreSQL data directory
DETAIL: target data directory is "/var/lib/pgsql/15/data"
HINT: use -F/--force to overwrite the existing data directory
INFO: replication slot usage not requested;  no replication slot will be set up for this standby
INFO: parameter "max_wal_senders" set to 10
NOTICE: checking for available walsenders on the source node (2 required)
INFO: sufficient walsenders available on the source node
DETAIL: 2 required, 10 available
NOTICE: checking replication connections can be made to the source server (2 required)
INFO: required number of replication connections could be made to the source server
DETAIL: 2 replication connections required
WARNING: data checksums are not enabled and "wal_log_hints" is "off"
DETAIL: pg_rewind requires "wal_log_hints" to be enabled
NOTICE: standby will attach to upstream node 1
HINT: consider using the -c/--fast-checkpoint option
INFO: would execute:
  /usr/pgsql-15/bin/pg_basebackup -l "repmgr base backup"  -D /var/lib/pgsql/15/data -h 192.168.232.150 -p 5432 -U repmgr -X stream 
INFO: all prerequisites for "standby clone" are met

Confirms prerequisites for standby clone.

[postgres@standby ~]$ /usr/pgsql-15/bin/repmgr -h 192.168.232.150 -U repmgr -d repmgr -f /var/lib/pgsql/repmgr.conf standby clone -F
NOTICE: destination directory "/var/lib/pgsql/15/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=192.168.232.150 user=repmgr dbname=repmgr
DETAIL: current installation size is 30 MB
INFO: replication slot usage not requested;  no replication slot will be set up for this standby
NOTICE: checking for available walsenders on the source node (2 required)
NOTICE: checking replication connections can be made to the source server (2 required)
WARNING: data checksums are not enabled and "wal_log_hints" is "off"
DETAIL: pg_rewind requires "wal_log_hints" to be enabled
WARNING: directory "/var/lib/pgsql/15/data" exists but is not empty
NOTICE: -F/--force provided - deleting existing data directory "/var/lib/pgsql/15/data"
NOTICE: starting backup (using pg_basebackup)...
HINT: this may take some time; consider using the -c/--fast-checkpoint option
INFO: executing:
  /usr/pgsql-15/bin/pg_basebackup -l "repmgr base backup"  -D /var/lib/pgsql/15/data -h 192.168.232.150 -p 5432 -U repmgr -X stream 
NOTICE: standby clone (using pg_basebackup) complete
NOTICE: you can now start your PostgreSQL server
HINT: for example: pg_ctl -D /var/lib/pgsql/15/data start
HINT: after starting the server, you need to re-register this standby with "repmgr standby register --force" to update the existing node record

Completes the standby clone process.

Step 6: Starting the New Standby Server

[postgres@standby ~]$ /usr/pgsql-15/bin/pg_ctl -D /var/lib/pgsql/15/data start
waiting for server to start....2024-06-19 12:09:05.620 IST [4644] LOG:  redirecting log output to logging collector process
2024-06-19 12:09:05.620 IST [4644] HINT:  Future log output will appear in directory "log".
 done
server started

Confirms the server has started.

Step 7: Registering the New Standby

[postgres@standby ~]$ /usr/pgsql-15/bin/repmgr -f /var/lib/pgsql/repmgr.conf standby register -F
INFO: connecting to local node "standby" (ID: 2)
INFO: connecting to primary database
INFO: standby registration complete
NOTICE: standby node "standby" (ID: 2) successfully registered
Standby registration is complete and successful.

Check the cluster status:

[postgres@standby ~]$ /usr/pgsql-15/bin/repmgr -f /var/lib/pgsql/repmgr.conf cluster show
 ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                                               
----+---------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------------------------
 1  | node1   | primary | * running |          | default  | 100      | 3        | host=192.168.232.150 user=repmgr dbname=repmgr password=repmgr connect_timeout=2
 2  | standby | standby |   running | node1    | default  | 100      | 3        | host=192.168.232.151 user=repmgr dbname=repmgr password=repmgr connect_timeout=2

Step 8: Performing a Switchover

Switch roles between primary and standby server.

[postgres@localhost data]$ /usr/pgsql-15/bin/repmgr -f /var/lib/pgsql/repmgr.conf standby switchover --always-promote
NOTICE: executing switchover on node "standby" (ID: 2)
NOTICE: attempting to pause repmgrd on 2 nodes
NOTICE: local node "standby" (ID: 2) will be promoted to primary; current primary "node1" (ID: 1) will be demoted to standby
NOTICE: stopping current primary node "node1" (ID: 1)
NOTICE: issuing CHECKPOINT on node "node1" (ID: 1) 
DETAIL: executing server command "/usr/pgsql-15/bin/pg_ctl  -D '/var/lib/pgsql/15/data' -W -m fast stop"
INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout")
INFO: checking for primary shutdown; 2 of 60 attempts ("shutdown_check_timeout")
NOTICE: current primary has been cleanly shut down at location 0/27000028
NOTICE: promoting standby to primary
DETAIL: promoting server "standby" (ID: 2) using pg_promote()
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
NOTICE: STANDBY PROMOTE successful
DETAIL: server "standby" (ID: 2) was successfully promoted to primary
NOTICE: node "standby" (ID: 2) promoted to primary, node "node1" (ID: 1) demoted to standby
NOTICE: switchover was successful
DETAIL: node "standby" is now primary and node "node1" is attached as standby
NOTICE: STANDBY SWITCHOVER has completed successfully

Standby promoted to primary, and node1 demoted to standby. The switchover was successful.

Step 9: Final Cluster Status Check

[postgres@localhost data]$ /usr/pgsql-15/bin/repmgr -f /var/lib/pgsql/repmgr.conf cluster show
 ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------------------------
 1  | node1   | standby |   running | standby  | default  | 100      | 7        | host=192.168.232.150 user=repmgr dbname=repmgr password=repmgr connect_timeout=2
 2  | standby | primary | * running |          | default  | 100      | 8        | host=192.168.232.151 user=repmgr dbname=repmgr password=repmgr connect_timeout=2

node1 is now the standby and standby is now the primary.

Conclusion

Knowing the importance of having an HA system for their PODs with PostgreSQL, a tool like repmgr will be an asset to any organization. In this blog, we performed automatic failover by detecting the failure of the primary and promoting the most suitable standby without any manual intervention. We also walked through the process of returning the old primary to the cluster as a standby and later promoting it to primary again, which is essential knowledge for DBAs.

Thank you and stay tuned!!

PostgreSQL Replication Monitoring: My Learnings

Introduction:

Ensuring the health and performance of your PostgreSQL database involves keeping a close eye on replication. This process, where data is copied from one database server to another, is vital for data redundancy and high availability. In this blog, we’ll delve into three effective methods for monitoring PostgreSQL replication: using native SELECT statements, the `check_postgres` Perl script, and understanding WAL internals.

Understanding PostgreSQL Replication

Before diving into the monitoring techniques, let’s briefly touch upon what replication in PostgreSQL entails. Replication involves synchronizing data from a primary server (master) to one or more secondary servers (slaves or replicas). This setup ensures data redundancy, high availability, and load balancing.

Using Native SELECT Statements for Replication Monitoring

PostgreSQL provides several system views and functions that can be queried with SELECT statements to monitor replication. Here are some essential queries:

1. Checking Replication Slots

Replication slots ensure that the primary server retains WAL files until they are no longer needed by the standby servers. You can monitor these slots using:

mydb=# select * from pg_replication_slots ;
-[ RECORD 1 ]-------+----------
slot_name           | book1
plugin              | 
slot_type           | physical
datoid              | 
database            | 
temporary           | f
active              | t
active_pid          | 5688
xmin                | 
catalog_xmin        | 
restart_lsn         | 0/3018430
confirmed_flush_lsn | 
wal_status          | reserved
safe_wal_size       | 
two_phase           | f

This query returns information about the replication slots, such as the slot name, database, and the WAL positions.

2. Monitoring Replication Lag

Replication lag indicates the delay between the primary and the standby servers. To monitor this, use:


mydb=# SELECT                            
    client_addr,
    application_name,
    state,
    sync_state,
    pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replication_lag
FROM
    pg_stat_replication;
   client_addr   | application_name |   state   | sync_state | replication_lag 
-----------------+------------------+-----------+------------+-----------------
 192.168.232.151 | walreceiver      | streaming | async      |               0
(1 row)

This query provides details about connected standby servers and the lag in terms of WAL log sequence numbers (LSNs).
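For a more readable view, the byte lag can be pretty-printed, and PostgreSQL 10 and later also expose time-based lag columns in pg_stat_replication. A hedged variant of the query above:

mydb=# SELECT
    client_addr,
    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) AS lag_bytes,
    write_lag,
    flush_lag,
    replay_lag
FROM
    pg_stat_replication;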

3. Checking WAL Receiver Status

The walreceiver process on the standby server fetches WAL data from the primary server. Monitor its status with:

mydb=# select * from pg_stat_wal_receiver ;
-[ RECORD 1 ]---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
pid                   | 3997
status                | streaming
receive_start_lsn     | 0/4000000
receive_start_tli     | 1
written_lsn           | 0/48E96C0
flushed_lsn           | 0/4000000
received_tli          | 1
last_msg_send_time    | 2024-06-12 14:34:42.137068+05:30
last_msg_receipt_time | 2024-06-12 14:34:42.13582+05:30
latest_end_lsn        | 0/48E96C0
latest_end_time       | 2024-06-12 14:33:11.812817+05:30
slot_name             | book1
sender_host           | 192.168.232.150
sender_port           | 5434
conninfo              | user=replica password=******** channel_binding=prefer dbname=replication host=192.168.232.150 port=5434 fallback_application_name=walreceiver sslmode=prefer sslcompression=0 sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres target_session_attrs=any

This query gives insights into the connection status and the latest received WAL position.

Replication Monitoring with the `check_postgres` Perl Script

Monitoring your PostgreSQL replication is crucial for maintaining database performance and reliability. The `check_postgres` Perl script is an excellent tool for this task, offering a robust solution for keeping tabs on your replication status. Let’s dive into how you can use this script to ensure your PostgreSQL databases are running smoothly.

What is `check_postgres`?

`check_postgres` is a Perl script that checks various attributes and metrics of your PostgreSQL database. It integrates seamlessly with monitoring systems like Nagios, Icinga, Zabbix, and others. With its comprehensive set of features, `check_postgres` helps you monitor replication lag, database size, connection statuses, and much more.

For setting up `check_postgres` and advanced configurations, be sure to check out the official check_postgres documentation at: https://bucardo.org/check_postgres/

Here’s a simple command to check replication lag:

check_postgres --action=replication_state --host=<your_host> --dbuser=<your_user>

Replace <your_host> and <your_user> with your actual database host and user. This command will return the replication state of your PostgreSQL server.

To ensure continuous monitoring, you can set up a cron job that runs the check_postgres script at regular intervals. Edit your crontab file with crontab -e and add an entry. For example:

*/5 * * * * /usr/local/bin/check_postgres --action=replication_state --host=<your_host> --dbuser=<your_user> >> /var/log/check_postgres.log 2>&1

This setup ensures that any replication issues are promptly detected, allowing you to take swift corrective action.

Exploring WAL Internals

Understanding the internals of Write-Ahead Logging (WAL) is crucial for effective replication monitoring. WAL is a method used by PostgreSQL to ensure data integrity. Here’s a closer look at its key components:

1. WAL Segments

WAL files are divided into segments, each with a default size of 16MB. These segments store changes made to the database. Monitoring the number of WAL segments can give you insights into database activity and replication.

mydb=# show wal_segment_size ;
-[ RECORD 1 ]----+-----
wal_segment_size | 16MB

2. WAL Archiving

To ensure data can be recovered in the event of a failure, WAL segments can be archived. Configure your PostgreSQL server to archive WAL segments by setting the archive_mode and archive_command parameters in the postgresql.conf file:

```
archive_mode = on
archive_command = 'cp %p /path/to/archive/%f'
```

3. WAL Buffers

WAL buffers are memory areas used to store WAL data before it is written to disk. Monitoring buffer and checkpoint activity can help you tune performance. Use the following query to check the background writer statistics:

mydb=# SELECT * FROM pg_stat_bgwriter;
-[ RECORD 1 ]---------+---------------------------------
checkpoints_timed     | 37
checkpoints_req       | 0
checkpoint_write_time | 213379
checkpoint_sync_time  | 3
buffers_checkpoint    | 2117
buffers_clean         | 0
maxwritten_clean      | 0
buffers_backend       | 5794
buffers_backend_fsync | 0
buffers_alloc         | 3693
stats_reset           | 2024-06-12 10:33:48.003491+05:30

This query returns statistics about the background writer process and checkpoint activity, such as how many buffers were written during checkpoints versus by backends.
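For WAL-buffer-specific counters, PostgreSQL 14 and later provide the pg_stat_wal view. In particular, wal_buffers_full counts how often WAL data had to be written to disk because the buffers were full, a hint that wal_buffers may be undersized:

mydb=# SELECT wal_records, wal_bytes, wal_buffers_full, wal_write, wal_sync FROM pg_stat_wal;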

Conclusion

Monitoring PostgreSQL replication is essential for maintaining a robust and efficient database environment. By utilizing native Select statements, the check_postgres perl script, and understanding WAL internals, you can gain deep insights into your replication status and performance. Regular monitoring and timely alerts can help you prevent issues before they impact your applications.

To ensure your PostgreSQL replication setup remains healthy, consider implementing the methods discussed in this guide. Regularly review your replication status, adjust configurations as needed, and stay informed about the latest PostgreSQL features and best practices.

The DNA of DBA: Mastering Parameter Tuning

In the realm of database administration, the role of a PostgreSQL DBA is crucial for ensuring the optimal performance of databases. One of the core responsibilities is parameter tuning, an essential skill that can significantly impact database efficiency and resource utilization.

The Importance of Parameter Tuning

Parameter tuning involves adjusting the configuration settings of PostgreSQL to match the specific workload and environment. Proper tuning can lead to improved query performance, better resource management, and overall enhanced database stability. This process is vital for:

  • Performance Optimization: Fine-tuning parameters like memory allocation, connection settings, and query planner options can drastically reduce query execution times and increase throughput.
  • Resource Management: Efficient use of CPU, memory, and I/O resources ensures that the database runs smoothly without overloading the system.
  • Cost Reduction: By optimizing performance and resource usage, organizations can reduce their hardware and cloud infrastructure costs.

The Role of DBAs in Parameter Tuning

DBAs need to have a deep understanding of the database’s behavior under different conditions. This includes:

  • Monitoring Performance Metrics: Keeping an eye on key metrics such as query latency, cache hit ratios, and disk I/O.
  • Adjusting Parameters: Tweaking settings like shared_buffers, work_mem, and maintenance_work_mem to suit the workload.
  • Testing and Validation: Continuously testing changes in a staging environment before applying them to production.
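As a small illustration of the adjustments described above, parameters can be inspected and changed without hand-editing postgresql.conf. This is a minimal sketch; the values are placeholders, not recommendations:

-- Inspect current settings
SHOW work_mem;
SHOW shared_buffers;

-- Persist changes to postgresql.auto.conf (illustrative values)
ALTER SYSTEM SET work_mem = '64MB';
ALTER SYSTEM SET maintenance_work_mem = '512MB';

-- Apply reloadable settings; a change to shared_buffers would require a restart
SELECT pg_reload_conf();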

Leveraging AI for Advanced Tuning with DBtune

DBtune offers an AI-enabled tuning engine that automates and enhances the parameter tuning process. Here are some of its key features:

AI-Powered Optimization

DBtune uses machine learning algorithms to analyze your PostgreSQL workload and automatically adjust configuration settings for optimal performance. This results in:

  • Up to 13x Average Query Runtime Performance: Achieve faster transactions and reduced query runtime.
  • Up to 50% reduction in database infrastructure cost: More efficient use of resources translates to lower operational expenses.
  • 25% Increase in Productivity: Automation frees up DBA and developer resources, allowing them to focus on strategic tasks.

Seamless Integration

The platform integrates smoothly with various environments, ensuring that the tuning process is hassle-free. Whether it’s on-premises or cloud-based PostgreSQL instances, DBtune adapts to your setup.

Continuous Learning and Adaptation

DBtune continuously learns from the database’s workload, adapting its tuning strategies to changing conditions. This dynamic approach ensures that your database remains optimally tuned even as workloads evolve.

Comprehensive Reporting

DBtune provides detailed reports on the tuning process and its impact on performance metrics. These insights help DBAs understand the changes made and their benefits, facilitating better decision-making for future optimizations.

Real-World Impact

Businesses using DBtune have reported significant improvements in their database performance. For instance, one company has seen a 34% improvement in latency, showcasing the effectiveness of AI-driven tuning. Another company has seen a 140x improvement in the latency of one query.

Conclusion

The DNA of every successful DBA includes a mastery of PostgreSQL parameter tuning. By understanding and applying the right tuning techniques, DBAs can ensure their databases perform at their best. Leveraging advanced tools like DBtune can further enhance these efforts, bringing AI-driven insights and automation to the tuning process.

I encourage all DBAs and database engineers to explore DBtune’s capabilities and integrate it into their performance optimization strategies. Visit app.dbtune.com, give it a try, and share your feedback to help shape the future of AI-driven database tuning.

Mastering parameter tuning is not just about improving performance; it’s about unlocking the full potential of your PostgreSQL database and driving business success.

The 101s of Postgres Streaming Replication

Introduction:

Embarking on my first customer engagement to set up PostgreSQL Streaming Replication was an insightful journey. As a Database Engineer, this was my first hands-on experience with implementing streaming replication in a real-world environment. The stakes were high, as the customer required a robust and reliable high-availability solution to ensure their critical applications remained operational without interruption.

From the initial planning stages to the final deployment, every step of the process brought new challenges and learning opportunities. The customer’s environment was complex, with multiple databases running on different nodes, each requiring seamless synchronization to maintain data integrity and consistency. My goal was to create a replication setup that not only met their high availability needs but also provided the scalability and performance necessary to support their growing business.

In this blog post, I’ll share my experience setting up PostgreSQL Streaming Replication. Whether you’re a seasoned DBA or new to the world of database replication, I hope my journey will provide valuable insights and practical tips for successfully implementing streaming replication in your own PostgreSQL environment.

In streaming replication, a standby database server is configured to connect to the primary server, which streams WAL (Write-Ahead Logging) records to the standby as they are generated, without waiting for the WAL file to be filled. By default, streaming replication is asynchronous: data is written to the standby server after a transaction has been committed on the primary server. This means that there is a small delay between committing a transaction on the primary server and the changes becoming visible on the standby server.

Step 1: Configure the Primary Server

Open the postgresql.conf file on the primary server and make these changes:

listen_addresses = '*'  
wal_level = replica      
max_wal_senders = 5
hot_standby = on

Step 2: Enable Replication User

Open pg_hba.conf file and add an entry to allow replication connections from the standby server:

host    replication    replication_user    standby_ip/32    md5
#Restart PostgreSQL to apply the changes.

Step 3: Create a database

postgres=# CREATE DATABASE mydb;
CREATE DATABASE

#Switch to the database:
postgres=# \c mydb

Step 4: Create the user

Create a user on the master that the slave will use to connect for streaming the WALs. This user must have the REPLICATION role.

mydb=# CREATE ROLE replica WITH REPLICATION ENCRYPTED PASSWORD 'postgres';
CREATE ROLE

Step 5: Taking pg_basebackup

pg_basebackup helps us stream the data through the WAL sender process from the master to a slave to set up replication.

[postgres@master ~]$ pg_basebackup -h 10.206.0.2 \
                                   -p 5433 -D /var/lib/pgsql/14/data/ \
                                   -U replica  -P -v -R -X stream \
                                   -C -S book1 
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created replication slot "book1"
34874/34874 kB (100%), 1/1 tablespace                                         
pg_basebackup: write-ahead log end point: 0/2000100
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed

Step 6: Check for Standby Signal

Notice that standby.signal is created and the connection settings are appended to postgresql.auto.conf.

[postgres@master ~]$ cd /var/lib/pgsql/14/data
# Check for the postgresql.auto.conf and standby.signal files in the data directory.

[postgres@master data]$ ls -ltr
-rw-------. 1 postgres postgres    225 May 29 02:41 backup_label
-rw-------. 1 postgres postgres 180563 May 29 02:42 backup_manifest
drwx------. 6 postgres postgres     54 May 29 02:42 base
-rw-------. 1 postgres postgres     30 May 29 02:42 current_logfiles
drwx------. 2 postgres postgres   4096 May 29 02:42 global
drwx------. 2 postgres postgres     32 May 29 02:42 log
drwx------. 2 postgres postgres      6 May 29 02:42 pg_commit_ts
drwx------. 2 postgres postgres      6 May 29 02:42 pg_dynshmem
-rw-------. 1 postgres postgres   4929 May 29 02:42 pg_hba.conf
-rw-------. 1 postgres postgres   1636 May 29 02:42 pg_ident.conf
drwx------. 4 postgres postgres     68 May 29 02:42 pg_logical
drwx------. 4 postgres postgres     36 May 29 02:42 pg_multixact
drwx------. 2 postgres postgres      6 May 29 02:42 pg_notify
drwx------. 2 postgres postgres      6 May 29 02:42 pg_replslot
drwx------. 2 postgres postgres      6 May 29 02:42 pg_serial
drwx------. 2 postgres postgres      6 May 29 02:42 pg_snapshots
drwx------. 2 postgres postgres      6 May 29 02:42 pg_stat
drwx------. 2 postgres postgres      6 May 29 02:42 pg_stat_tmp
drwx------. 2 postgres postgres      6 May 29 02:42 pg_subtrans
drwx------. 2 postgres postgres      6 May 29 02:42 pg_tblspc
drwx------. 2 postgres postgres      6 May 29 02:42 pg_twophase
-rw-------. 1 postgres postgres      3 May 29 02:42 PG_VERSION
drwx------. 3 postgres postgres     60 May 29 02:42 pg_wal
drwx------. 2 postgres postgres     18 May 29 02:42 pg_xact
-rw-------. 1 postgres postgres    372 May 29 02:42 postgresql.auto.conf
-rw-------. 1 postgres postgres  28724 May 29 02:42 postgresql.conf
-rw-------. 1 postgres postgres      0 May 29 02:42 standby.signal

Step 7: Check the replication slot

Now connect to the master server; you should be able to see the replication slot called book1 in the pg_replication_slots view, as follows.

mydb=# select * from pg_replication_slots ;
-[ RECORD 1 ]-------+----------
slot_name           | book1
plugin              | 
slot_type           | physical
datoid              | 
database            | 
temporary           | f
active              | t
active_pid          | 3506
xmin                | 
catalog_xmin        | 
restart_lsn         | 0/3018E38
confirmed_flush_lsn | 
wal_status          | reserved
safe_wal_size       | 
two_phase           | f

#Now start PostgreSQL on the slave (standby) server.

Step 8: Check the status on Standby

We can check the status on the standby using the command below.

mydb=# select * from pg_stat_wal_receiver ;
-[ RECORD 1 ]---------+------------------------------------------------------
pid                   | 3487
status                | streaming
receive_start_lsn     | 0/3000000
receive_start_tli     | 1
written_lsn           | 0/3018E38
flushed_lsn           | 0/3018E38
received_tli          | 1
last_msg_send_time    | 2024-05-29 07:15:34.249873+00
last_msg_receipt_time | 2024-05-29 07:15:34.376013+00
latest_end_lsn        | 0/3018E38
latest_end_time       | 2024-05-29 03:32:03.18865+00
slot_name             | book1
sender_host           | 10.206.0.2
sender_port           | 5433
conninfo              | user=replica passfile=/var/lib/pgsql/.pgpass channel_binding=prefer dbname=replication host=10.206.0.2 port=5433 fallback_application_name=walreceiver sslmode=prefer sslcompression=0 sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres target_session_attrs=any
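As a final check on the master, a quick query against pg_stat_replication confirms the standby is attached and streaming (a hedged check; values will differ in your setup):

mydb=# select client_addr, state, sync_state from pg_stat_replication ;

The standby should appear with state streaming and, in this asynchronous setup, sync_state async.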

Conclusion:

Setting up PostgreSQL Streaming Replication for the first time was both a challenging and rewarding experience. This journey taught me the importance of meticulous planning, a thorough understanding of the underlying technology, and the ability to troubleshoot issues as they arise. Here are some key takeaways from my experience:

  1. Understanding the Basics: Gaining a solid grasp of how streaming replication works is crucial. Knowing the differences between synchronous and asynchronous replication, and when to use each, can make a significant impact on your setup’s reliability and performance.
  2. Proper Configuration: The configuration of the primary and standby servers is the backbone of a successful replication setup. Ensuring that parameters like wal_level, max_wal_senders, and archive_mode are correctly set is vital.
  3. Monitoring and Maintenance: Implementing monitoring tools to track replication status and performance is critical. Regular maintenance, such as vacuuming and analyzing databases, helps keep the replication environment healthy.
  4. Testing Failover: Regularly testing failover procedures ensures that the replication setup is reliable and that you can quickly recover in case of a primary server failure. This practice builds confidence in the system’s resilience. I’ll be covering this topic in the next blog post.

My first foray into PostgreSQL Streaming Replication was a significant learning curve, but it ultimately reinforced my skills and confidence in managing high-availability database systems. I hope this blog post has provided you with a comprehensive overview and practical insights into setting up streaming replication. Whether you are a novice or a seasoned DBA, these lessons can guide you through your own replication projects, ensuring a more resilient and reliable PostgreSQL environment.

Happy replicating!

Unleashing PostgreSQL Performance: Exploring the Power of pg_profile

Introduction

In the evolving landscape of database management, optimizing performance is paramount. For enterprises relying on PostgreSQL, a robust monitoring tool can make all the difference. The pg_profile extension can be a game changer in the realm of PostgreSQL performance tuning. Every millisecond counts, and for PostgreSQL users, navigating the complexities of performance tuning can be daunting. Monitoring and fine-tuning slow queries can noticeably improve performance. In this blog, we will delve into the installation and configuration of pg_profile and into generating a statistical report with it.

pg_profile

pg_profile is an extension designed specifically for monitoring and analyzing the performance of PostgreSQL databases. It helps database administrators identify and address performance bottlenecks, optimize queries, and fine-tune configurations to enhance the overall efficiency and responsiveness of PostgreSQL database systems. pg_profile offers key features such as query performance analysis, resource monitoring, index usage analysis, and configuration parameter recommendations.

Installation

Here, we install version 4.6 of pg_profile and unpack it into a folder.

#Installing using wget
[root@localhost ~]# wget https://github.com/zubkov-andrei/pg_profile/releases/download/4.6/pg_profile--4.6.tar.gz

#Unzip into a folder
[root@localhost ~]# tar -xvf pg_profile--4.6.tar.gz --directory /usr/pgsql-16/share/extension
pg_profile--4.6.sql
pg_profile.control
pg_profile--4.5--4.6.sql
[root@localhost ~]# cd /usr/pgsql-16/share/extension
[root@localhost extension]# ls -ltr
total 1680
-rw-r--r--. 1 postgres postgres     193 Apr 27 20:10 pg_profile.control
-rw-r--r--. 1 postgres postgres 1073167 Apr 27 20:10 pg_profile--4.6.sql
-rw-r--r--. 1 postgres postgres  635910 Apr 27 20:10 pg_profile--4.5--4.6.sql

Create extension

The pg_profile extension depends on the plpgsql and dblink extensions. The only mandatory requirement for a server cluster is the ability to connect from the pg_profile database using the provided server connection string. All other requirements are optional, but they can improve the completeness of the gathered statistics. To make these extensions available, install the contrib package:

[root@localhost extension]# yum install postgresql16-contrib

#Login in to psql and create extensions
postgres=# create extension dblink;
CREATE EXTENSION
postgres=# create extension pg_stat_statements;
CREATE EXTENSION
postgres=# create extension pg_profile;
CREATE EXTENSION
postgres=# \dx
                                            List of installed extensions
        Name        | Version |   Schema   |                              Description                               
--------------------+---------+------------+------------------------------------------------------------------------
 dblink             | 1.2     | public     | connect to other PostgreSQL databases from within a database
 pg_profile         | 4.6     | public     | PostgreSQL load profile repository and report builder
 pg_stat_statements | 1.10    | public     | track planning and execution statistics of all SQL statements executed
 plpgsql            | 1.0     | pg_catalog | PL/pgSQL procedural language

Configuration

We need to update a few parameters in the postgresql.conf file:

shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 20000
pg_stat_statements.track = top

#Restart the server

To list the objects in pg_profile, use the command \dx+ pg_profile. It displays all the objects in the extension, including functions, tables, and views.

Generating the statistical report

To generate the statistical report, we need to know the list of servers registered in pg_profile.

#To know the servers list in pg_profile 
postgres=# select * from show_servers() ;
 server_name |          connstr          | enabled | max_sample_age | description 
-------------+---------------------------+---------+----------------+-------------
 local       | dbname=postgres port=5432 | t       |                | 
(1 row)

By default, there is a local server. In order to generate a report, we need to take samples at regular intervals. We can then generate a report between any two samples.

# To take samples


postgres=# select * from take_sample();
 server | result |   elapsed   
--------+--------+-------------
 local  | OK     | 00:00:01.38
(1 row)

#To view samples taken
postgres=# select * from show_samples();
 sample |        sample_time        | sizes_collected | dbstats_reset | clustats_reset | archstats_reset 
--------+---------------------------+-----------------+---------------+----------------+-----------------
      1 | 2024-05-17 00:38:59+05:30 | t               |               |                | 
      2 | 2024-05-17 00:40:01+05:30 | t               |               |                | 
      3 | 2024-05-17 00:40:06+05:30 | t               |               |                | 
      4 | 2024-05-17 00:43:51+05:30 | t               |               |                | 
(4 rows)

Here, we have taken four samples. We can generate the report between any pair of these samples, for example between 1-2, 1-3, 1-4, 2-3, 2-4, or 3-4.
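In practice, samples are usually taken on a schedule rather than by hand. A hedged cron sketch (the 30-minute interval, database, and log path are assumptions to adapt):

#Take a pg_profile sample every 30 minutes
*/30 * * * * psql -d postgres -c "SELECT take_sample()" >> /var/log/pg_profile_sample.log 2>&1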

#To generate a report in html

[postgres@localhost ~]$ psql -Aqtc "SELECT get_report('local',1,2)" -o Report.html
[postgres@localhost ~]$ ls -ltr
total 348
drwx------. 4 postgres postgres     51 May 15 07:25 16
-rw-r--r--. 1 postgres postgres 350993 May 17 00:56 Report.html

On viewing the generated HTML report, we can find SQL query statistics, schema object statistics, user function statistics, and vacuum-related statistics.

Summary

To summarize – in the current landscape of database management , it is important to monitor and optimize the database performance. In this post, we have introduced you to pg_profile, a game-changing tool tailored specifically for PostgreSQL performance monitoring. We’ve delved into its features, installation process, and seamless integration with existing PostgreSQL infrastructures, setting the stage for an immersive exploration of its capabilities.

Thank you and Stay tuned for more…

Mastering pgbench for Database Performance Tuning – Part Ⅱ

Introduction

In the realm of database performance tuning, the pursuit of optimal throughput and responsiveness is a never-ending journey. In our previous blog, we discussed pgbench: setting it up, running it for the first time, and the tips, tricks, and common pitfalls of benchmarking. Building upon that foundation, we now venture into the realm of advanced optimization strategies, delving deeper into the intricacies of pgbench and more of its options. Additionally, we will explore ways to interpret pgbench results effectively, transforming raw metrics into actionable insights that drive tangible performance enhancements.

If you haven’t read our previous blog, read it here.

Exploring pgbench with more options

In our previous blog, we explored the basic options of pgbench, such as -c for the number of clients, -j for the number of threads, -t for the number of transactions, and -T for a time limit. But there are a lot more options that the pgbench utility offers. Let’s explore a few more of these options and see how they can be used.

No vacuum -n

This option skips the vacuum step before running the test. Usually, it is used for a custom test run with a script that does not include the default tables.

#Running test run without -n
[postgres@localhost bin]$ ./pgbench db
pgbench (16.3)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>

#Running test run with -n
[postgres@localhost bin]$ ./pgbench -n db
pgbench (16.3)
transaction type: <builtin: TPC-B (sort of)>

Comparing the two runs: without the -n option, pgbench runs a vacuum first; with -n, the vacuum step is skipped.

Connect -C

This option establishes a new connection for each transaction instead of once per client session. It is used to measure the connection overhead.

#Using -C
[postgres@localhost bin]$ ./pgbench -C db
pgbench (16.3)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
number of transactions per client: 10
number of transactions actually processed: 10/10
number of failed transactions: 0 (0.000%)
latency average = 5.255 ms
average connection time = 2.524 ms
tps = 190.284094 (including reconnection times)

Observe that the tps figure includes the time taken for reconnections. If you don’t use this option, you will see tps without the initial connection time.

Rate -R & Latency limit -L

These two options are mostly used together. Rate (-R) throttles pgbench to a target number of transactions per second, and latency limit (-L), given in milliseconds, counts transactions that exceed that limit; when used with -R, late transactions are skipped entirely.

[postgres@localhost bin]$ ./pgbench -R 10 -L 10 db
pgbench (16.3)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
number of transactions per client: 10
number of transactions actually processed: 10/10
number of failed transactions: 0 (0.000%)
number of transactions skipped: 0 (0.000%)
number of transactions above the 10.0 ms latency limit: 0/10 (0.000%)
latency average = 3.153 ms
latency stddev = 0.480 ms
rate limit schedule lag: avg 0.311 (max 0.518) ms
initial connection time = 2.973 ms
tps = 6.476235 (without initial connection time)

Here, we can see the number of transactions above the specified latency limit and the number of transactions skipped. In this case, no transactions exceeded the given limits.

Report per command -r

This option gives us the stats of each command after the benchmark finishes.

[postgres@localhost bin]$ ./pgbench -r db
pgbench (16.3)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
number of transactions per client: 10
number of transactions actually processed: 10/10
number of failed transactions: 0 (0.000%)
latency average = 1.020 ms
initial connection time = 2.315 ms
tps = 980.199961 (without initial connection time)
statement latencies in milliseconds and failures:
         0.001           0  \set aid random(1, 100000 * :scale)
         0.000           0  \set bid random(1, 1 * :scale)
         0.000           0  \set tid random(1, 10 * :scale)
         0.000           0  \set delta random(-5000, 5000)
         0.029           0  BEGIN;
         0.169           0  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
         0.053           0  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
         0.056           0  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
         0.043           0  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
         0.054           0  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
         0.611           0  END;

It gives us the per-statement latencies and failure counts for the default script used by pgbench.
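These options can also be combined with a custom script supplied via -f. Here is a minimal hedged sketch; the script name is illustrative and it assumes the default pgbench tables already exist:

#bench.sql - a simple read-only custom script; :scale is detected by pgbench
\set aid random(1, 100000 * :scale)
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

#Run without the default vacuum (-n) and with per-statement stats (-r)
[postgres@localhost bin]$ ./pgbench -n -r -f bench.sql -c 10 -T 30 db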

Summary

To summarize, effective benchmarking is crucial for the optimal performance and scalability of databases. By leveraging benchmarking tools such as pgbench, database administrators and developers can simulate real-world workloads, measure performance metrics, and identify areas for optimization. However, successful benchmarking requires careful planning, execution, and analysis. In this blog, we explored a few more options of the pgbench utility. If you haven’t checked our previous blog on pgbench, here is the blog.

Thank you and stay tuned for more…

Guide on Testing pgEdge Platform Multi-Master Replication

Introduction:

In the realm of database management, ensuring the reliability and efficiency of replication systems is crucial. pgEdge Platform’s Multi-Master Replication (MMR) offers a robust solution for maintaining continuous synchronization across multiple servers, enhancing both availability and fault tolerance. But how do you test this complex system effectively? This blog will guide you through the nuances of testing, ensuring that you can confidently deploy a system that stands up to real-world demands.

When diving into the testing and deployment of database systems like pgEdge Platform Multi-Master Replication (MMR), it’s crucial to consider foundational principles that govern the behaviour of distributed systems. One such principle is the CAP Theorem, which offers critical insights into system design and capabilities.

Understanding the CAP Theorem:

The CAP Theorem, proposed by Eric Brewer, states that a distributed database system can only simultaneously provide two out of the following three guarantees:

  • Consistency (C): Every read receives the most recent write or an error.
  • Availability (A): Every request receives a response, without guarantee that it contains the most recent write.
  • Partition tolerance (P): The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.

In the context of MMR, this theorem is especially pertinent because it highlights the trade-offs involved in designing and testing a replication system. Since network partitions are a reality that must be handled, choosing between consistency and availability becomes a strategic decision based on the specific requirements of your application.

Here’s what I feel are the building blocks of testing:

  • Understanding MMR Basics: What is Multi-Master Replication in PostgreSQL and why is it important?
  • Setting Up Your Test Environment: Steps to prepare an optimal testing landscape.
  • Key Metrics for Testing: What to measure and why.
  • Common Challenges and Solutions: Troubleshooting common issues during testing.
  • Automating MMR Tests: Tools and scripts to streamline the process.
  • Real-World Testing Scenarios: Simulating real user transactions to gauge performance.

MMR:

Multi-master (active-active) distributed PostgreSQL allows you to have multiple master databases spread across different locations (multiple nodes), each capable of handling read and write traffic simultaneously.

MMR improves data access times for your applications by using bi-directional replication and conflict resolution.

This is done through pgEdge’s Spock extension that allows you to create a multi-master (active-active) replication solution for your PostgreSQL database.

Prerequisites:

git clone https://github.com/pgEdge/cli.git

Test Strategies:

So, what are the strategies that I follow for MMR testing?

  • Test scripts were written in Perl (Python is also preferred).
  • Test scripts can run individually or by using a schedule.
  • To run a schedule, include the -s flag and the name of the schedule file, like: ./runner -s schedule_name
  • To run an individual test, include the -t flag and the name of a test, like: ./runner -t testname
  • As each test script executes (individually or through a schedule), it returns a pass or fail on the command line.

The test scripts fall into these groups:

a) Setup scripts
b) Node creation
c) Selective replication
d) Cleanup
e) Uninstall pgEdge

Conclusion:

Testing PostgreSQL Multi-Master Replication is essential for any business that relies on continuous data availability and integrity. By following this blog, you can ensure that your PostgreSQL databases are resilient, efficient, and prepared for high-demand scenarios. Remember, the key to successful MMR deployment lies in meticulous planning, comprehensive testing, and continuous optimization. In conclusion, one can seamlessly set up MMR with the pgEdge Platform, which provides solutions for modern applications, and can easily pinpoint failures through script-based testing, which makes the testing process hassle-free.

Mastering pgbench for Database Performance Tuning

Introduction

In today’s world of databases, performance is one of the key aspects that needs to be taken care of irrespective of the size of the database. If you’re in charge of managing a PostgreSQL database, ensuring peak performance is undoubtedly a top priority. Enter pgbench, PostgreSQL’s own benchmarking powerhouse, designed to help you simulate workloads and test different database strategies to optimize efficiency. Whether you’re a seasoned DBA or just starting out, understanding how to leverage pgbench can be a game-changer for your database operations. This comprehensive guide will cover everything you need to know to get started with pgbench, interpret its results, and implement its insights for optimal database performance.

pgbench

pgbench is a benchmarking tool included with PostgreSQL: a simple program for running benchmark tests that simulates a standard workload. By default, pgbench tests are based on the TPC-B benchmark, which executes five SELECT, INSERT, and UPDATE commands per transaction. It is available as part of the PostgreSQL installation. Using pgbench, we can simulate various workloads on PostgreSQL databases by generating multiple client connections and by using predefined scripts or sets of parameters.

Setting up and running my first pgbench test

pgbench should be invoked with the -i (initialize) option to create and populate these tables. Initializing with pgbench -i database_name creates four tables, pgbench_accounts, pgbench_branches, pgbench_history, and pgbench_tellers, dropping any existing tables of those names.

Whenever -i is given again, pgbench checks for previous tables of these names and recreates them. To scale up, use the scale factor option -s: with -s 20, pgbench inserts data representing a workload 20 times larger than the default scale. Each tuple inserted represents a row of data in the database tables, as shown below.
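For example, initializing a database named mydb (name assumed for illustration) at scale factor 20 looks like the following; the exact progress lines vary slightly between PostgreSQL versions:

$ pgbench -i -s 20 mydb
dropping old tables...
creating tables...
generating data (client-side)...
2000000 of 2000000 tuples (100%) done
vacuuming...
creating primary keys...
done in ...

At the default scale of 1, pgbench_accounts holds 100,000 rows, so -s 20 populates it with 2,000,000 rows.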

There are various parameters in pgbench for running benchmark tests. The most common options are -c for the number of clients, -j for the number of threads, -t for the number of transactions per client, -T for a time limit, and -f for specifying a custom script file.

Running pgbench

Simulating the workload with 50 client connections and 100 transactions per client against the database mydb:
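The command for that run, reconstructed from the description above, is:

$ pgbench -c 50 -t 100 mydb

Here -c 50 opens 50 client connections and -t 100 makes each client run 100 transactions, for 5,000 transactions in total.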

Understanding the Output

Let’s analyze the output. Some defaults apply if you did not specify them as input, such as scaling factor = 1 and number of threads = 1.
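The original output is not reproduced here, so below is a rough reconstruction built from the figures quoted in the analysis that follows. The tps line is derived from the reported latency (50 clients / 0.092234 s per transaction), not taken from the original run, and field names differ slightly between pgbench versions:

pgbench (14.x)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 50
number of threads: 1
number of transactions per client: 100
number of transactions actually processed: 5000/5000
number of failed transactions: 0 (0.000%)
latency average = 92.234 ms
initial connection time = 416.143 ms
tps = 542.10 (without initial connection time)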

The number of clients and the number of transactions per client are given as input to pgbench; the number of transactions actually processed is the product of those two. Here, zero transactions failed.

Latency average is the time taken to respond to a given request. It is measured in milliseconds; here it is 92.234 ms.

Initial connection time is the time taken for the client to establish a connection with the server. Here it took 416.143 ms.

Most important is tps, i.e. transactions per second: the number of transactions the system processes in a one-second interval. It varies based on the complexity of the transactions (or queries), system resources, concurrency levels, and database configuration. Higher tps is better. As a sanity check, tps is roughly the number of clients divided by the average latency: 50 / 0.092234 s ≈ 542 here.

Tips and Tricks

  • Understand the workloads – It is important to understand the workload on the application and simulate it accordingly, using customized SQL scripts and by adjusting the scale factor (see the sample script after this list).
  • Control the concurrency level – We can control the concurrency level in a pgbench test using the options -c and -j.
  • Monitor system metrics – Use system monitoring tools such as top to monitor CPU usage, memory utilization, disk I/O, and network activity during pgbench tests.
  • Tune PostgreSQL configuration – Adjust postgresql.conf parameters such as shared_buffers, max_connections, and work_mem based on the results.
  • Run multiple tests – To get an accurate assessment of performance, run multiple pgbench tests.
  • Documentation – Documenting the methodologies and results of your testing activities will help in further analysis.
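As an example of the first tip, here is a small custom script; the file name and the read-only workload are hypothetical, standing in for whatever pattern your application actually generates:

\set aid random(1, 100000 * :scale)
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

Saved as select_only.sql, it can be run for 60 seconds with 10 clients like this:

$ pgbench -c 10 -T 60 -f select_only.sql mydb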

Common pitfalls to avoid in database benchmarking

  • Benchmarking without understanding the workload.
  • Running tests on insufficient CPU, memory, storage, or network resources to support the expected workload.
  • Using incorrect benchmarking tools or methodologies, which leads to misleading results.
  • Not testing under realistic conditions, or failing to replicate the production environment.
  • Relying on the result of a single test rather than the average of multiple tests.
  • Not documenting the tests properly.

Summary

To summarize, effective benchmarking is crucial for the optimal performance and scalability of databases. By leveraging benchmarking tools such as pgbench, database administrators and developers can simulate real-world workloads, measure performance metrics, and identify areas for optimization. However, successful benchmarking requires careful planning, execution, and analysis. In this blog, we have explored some features and results of pgbench, including tips and tricks, as well as common pitfalls to avoid in database benchmarking.

Thank you and stay tuned for more…

Troubleshooting PostgreSQL Replication Setup Errors with repmgr

Introduction

Hey All

Managing a PostgreSQL replication setup involves various challenges, especially when encountering errors during configuration. This blog post explores common errors encountered while setting up replication using repmgr, a popular replication manager for PostgreSQL. We’ll delve into the causes of these errors and provide solutions to overcome them, ensuring a seamless replication setup. We had discussed setting up High Availability in Postgres using repmgr in this post. If you did not get a chance to go through it, now might be a good time for it.

These are some of the errors/issues I faced in my initial work with repmgr and EC2 instances. They might look very basic, but they reinforce the fact that minute details are important.

Error#1

This was the first bump in my otherwise smooth journey with EC2 instances and repmgr.

[postgres@ip-172-31-14-99 ~]$ /usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf primary register
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
INFO: connecting to primary database...
ERROR: connection to database failed
DETAIL:
connection to server at "52.64.241.210", port 5432 failed: timeout expired

DETAIL: attempted to connect using:
  user=repmgr connect_timeout=2 dbname=repmgr host=52.64.241.210 fallback_application_name=repmgr options=-csearch_path=

Solution

We need to edit the inbound rules of the security group and add every PostgreSQL EC2 instance, so that traffic can flow between those instances.
Steps: Instances → Security → Security groups → Edit inbound rules

Add an inbound rule of type PostgreSQL (TCP port 5432) with the other instances (or their security group) as the source, as shown below.
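If you prefer the CLI over the console, the same rule can be added with the AWS CLI; the security group ID below is a placeholder. Using the group itself as the source allows all instances in that group to reach each other on port 5432:

$ aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp \
    --port 5432 \
    --source-group sg-0123456789abcdef0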

Result

[postgres@ip-172-31-14-99 bin]$ /usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf primary register
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
INFO: connecting to primary database...
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
NOTICE: primary node record (ID: 1) registered

# Verifying primary registration
[postgres@ip-172-31-14-99 bin]$ /usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf cluster show
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        | host=52.64.241.210 user=repmgr dbname=repmgr connect_timeout=2

Error#2

This is a basic error that we DBAs deal with on a daily basis: “No entry in pg_hba.conf”. I encountered this error while performing the standby clone dry-run operation.

[postgres@ip-172-31-7-43 ~]$ /usr/pgsql-14/bin/repmgr -h 52.64.241.210 -U repmgr -d repmgr -f /var/lib/pgsql/repmgr.conf standby clone --dry-run


ERROR: connection to database failed
DETAIL:
connection to server at "52.64.241.210", port 5432 failed: FATAL:  no pg_hba.conf entry for replication connection from host "13.210.66.27", user "repmgr", no encryption

Solution

It clearly indicates that we need to make an entry in the pg_hba.conf of the primary server.

cd $PGDATA
vi pg_hba.conf
host    replication     repmgr          {Standby1 IP}/32        md5
host    repmgr          repmgr          {Standby1 IP}/32        md5

# We have to add details of all of the Standby nodes and the Witness node too, if applicable.
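Changes to pg_hba.conf only take effect after a reload, so follow the edit with:

$ psql -c "SELECT pg_reload_conf();"

(or, equivalently, pg_ctl reload -D $PGDATA).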

Note: As we will clone the data from the primary to both of the standby nodes, there is no need to make these changes on the standby nodes. But on the witness node, we need to enter the IPs in the pg_hba.conf file and change the parameters in the postgresql.conf file. (For more details, refer to this post.)

Result

[postgres@ip-172-31-7-43 ~]$ /usr/pgsql-14/bin/repmgr -h 52.64.241.210 -U repmgr -d repmgr -f /var/lib/pgsql/repmgr.conf standby clone --dry-run

NOTICE: standby will attach to upstream node 1
HINT: consider using the -c/--fast-checkpoint option
INFO: would execute:
  pg_basebackup -l "repmgr base backup"  -D /var/lib/pgsql/14/data -h 52.64.241.210 -p 5432 -U repmgr -X stream
INFO: all prerequisites for "standby clone" are met

Error#3

I encountered this error when setting up the witness node. Even after you have made an entry in the pg_hba.conf file, it is possible to get the following error:

[postgres@ip-172-31-8-219 ~]$ /usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf witness register -h 52.64.241.210
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
INFO: connecting to witness node "node4" (ID: 4)
ERROR: unable to connect to witness node "node4" (ID: 4)
DETAIL:
connection to server at "ec2-13-211-191-92.ap-southeast-2.compute.amazonaws.com" (172.31.8.219), port 5432 failed: FATAL:  role "repmgr" is not permitted to log in

HINT: the witness node must be running before it can be registered

Solution

Log in with psql and check the attributes of the user:

[postgres@ip-172-31-8-219 ~]$ psql
psql (14.11)
Type "help" for help.
postgres=# \du+

                                          List of roles
 Role name |                         Attributes                         | Member of | Description
-----------+------------------------------------------------------------+-----------+-------------
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}        |
 repmgr    | Superuser, Cannot login                                    | {}        |

# The output shows that role "repmgr" cannot log in
# Simply alter the role
postgres=# ALTER ROLE repmgr LOGIN;
ALTER ROLE
postgres=# \du+
                                          List of roles
 Role name |                         Attributes                         | Member of | Description
-----------+------------------------------------------------------------+-----------+-------------
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}        |
 repmgr    | Superuser                                                  | {}        |
 

Result

After altering the role, retry registering the witness node.

[postgres@ip-172-31-8-219 ~]$ /usr/pgsql-14/bin/repmgr -f /var/lib/pgsql/repmgr.conf witness register -h 52.64.241.210
WARNING: the following problems were found in the configuration file:
  parameter "cluster" is deprecated and will be ignored
INFO: connecting to witness node "node4" (ID: 4)
INFO: connecting to primary node
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
INFO: witness registration complete
NOTICE: witness node "node4" (ID: 4) successfully registered

Conclusion

In PostgreSQL replication setups, encountering errors is not uncommon, but with the right knowledge and troubleshooting steps, they can be resolved effectively. In this post, we have discussed some of the common errors encountered during the configuration of replication using repmgr. By understanding the root cause of these errors and implementing the provided solutions, administrators can ensure a smooth and reliable PostgreSQL replication setup, facilitating data redundancy and high availability for their applications. These might already be known to many experienced DBAs, but they can serve as a useful guide to novice PostgreSQL enthusiasts and less experienced DBAs, and save precious time. If you have encountered any other issues while building PostgreSQL HA systems using repmgr, please feel free to comment below and reach out to us. As always, we’re happy to help.

Thank you and Stay tuned…