PostgreSQL 15

Guide on Testing pgEdge Platform Multi-Master Replication


In the realm of database management, ensuring the reliability and efficiency of replication systems is crucial. pgEdge Platform’s Multi-Master Replication (MMR) offers a robust solution for maintaining continuous synchronization across multiple servers, enhancing both availability and fault tolerance. But how do you test this complex system effectively? This blog will guide you through the nuances of testing, ensuring that you can confidently deploy a system that stands up to real-world demands.

When diving into the testing and deployment of database systems like pgEdge Platform Multi-Master Replication (MMR), it’s crucial to consider foundational principles that govern the behaviour of distributed systems. One such principle is the CAP Theorem, which offers critical insights into system design and capabilities.

Understanding the CAP Theorem:

The CAP Theorem, proposed by Eric Brewer, states that a distributed database system can only simultaneously provide two out of the following three guarantees:

  • Consistency (C): Every read receives the most recent write or an error.
  • Availability (A): Every request receives a response, without guarantee that it contains the most recent write.
  • Partition tolerance (P): The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.

In the context of MMR, this theorem is especially pertinent because it highlights the trade-offs involved in designing and testing a replication system. Since network partitions are a reality that must be handled, choosing between consistency and availability becomes a strategic decision based on the specific requirements of your application.

Here’s what I feel the building blocks of testing:

  • Understanding MMR Basics: What is Multi-Master Replication in PostgreSQL and why is it important?
  • Setting Up Your Test Environment: Steps to prepare an optimal testing landscape.
  • Key Metrics for Testing : What to measure and why.
  • Common Challenges and Solutions: Troubleshooting common issues during testing.
  • Automating MMR Tests: Tools and scripts to streamline the process.
  • Real-World Testing Scenarios: Simulating real user transactions to gauge performance.


 multi-master (active-active) distributed PostgreSQL allows you to have   multiple master databases spread across different locations (multiple nodes), each capable of handling read and write traffic simultaneously

MMR improves data access times for your applications by using bi-directional replication and conflict resolution.

This is done through pgEdge’s Spock extension that allows you to create a multi-master (active-active) replication solution for your PostgreSQL database.


git clone

Test Strategies:

So,what are the strategies that i follow for MMR testing:

  • Test scripts were written in perl(python is also preferred)
  • Test Scripts can run individually or by using a schedule
  • To run a schedule, include the -s flag and the name of the schedule file – like:  ./runner -s schedule_name
  • 4)To run an individual test, include the -t flag, and the name of a test – like: ./runner -t testname
  • 5)As each test script executes (individually or through a schedule), it returns a pass or fail on the command line. 

a)setup scripts

 b)node creation

c)Selective replication


e)uninstall pgedge


Testing PostgreSQL Multi-Master Replication is essential for any business that relies on continuous data availability and integrity. By following this blog, you can ensure that your PostgreSQL databases are resilient, efficient, and prepared for high-demand scenarios. Remember, the key to successful MMR deployment lies in meticulous planning, comprehensive testing, and continuous optimization. In conclusion, one can seamlessly setup MMR with pgEdge Platform which provides solutions for modern applications and can easily point out the failure through script-based testing, which makes the testing process hassle-free.

PostgreSQL Database Choices : Shared vs Separate for Microservices

In the ever-evolving realm of Microservices, one of the fundamental decisions developers face is whether to adopt a shared database model or opt for separate databases for each service. This blog aims to dissect both approaches, offering insights to help you navigate this critical decision with confidence

Shared Storage Space

Imagine a giant storage room where all your Microservices “live” together. Sounds convenient, right?

Here’s the upside:

Less Cleaning Up: Managing just one storage room means less work

Reliable Data: Special features keep your data dependable, like everyone following clear rules!

But there are some downsides too:

Scheduling Headaches: Services become dependent on each other, making changes tricky and maintenance difficult

One-Size-Fits-None: The same storage room might not be perfect for everyone’s needs (some services might need a lot of storage space, while others might be fine with a smaller area)

Separate Storage Spaces: Freedom with Responsibilities

Now each service has its own little storage space. This offers some perks:

Independent Service Stars: Services can develop, update, and grow on their own schedule

The Perfect Fit: Choose the ideal “storage bin” (database type) for each service (like a big one for user data and a smaller one for post comments)

But freedom comes with chores:

More Spaces, More Cleaning: Managing multiple storage spaces is a lot of maintenance 

Keeping Things Tidy Across Spaces: Ensuring data stays consistent across separate spaces can be messy (like making sure everyone cleans up after themselves)

SQL Example: Shared Storage Space

Let’s see how this works with a social media platform using a shared storage space (database). Imagine separate services for users (UserService) and posts (PostService):

-- Shared Database


-- Using the SocialMediaDB storage

connect "SocialMediaDB";

-- User (UserService)

  "Username" VARCHAR(50),
  "Email" VARCHAR(50),
  "Followers" INT

-- Post  (PostService)

  "PostID" INT,
  "UserID" INT,
  "Content" TEXT,
  "Timestamp" TIMESTAMP,
  FOREIGN KEY ("UserID") REFERENCES "Users" ("UserID") -- posts are linked to users

In this example, the Users and Posts tables live in the same SocialMediaDB storage room. The Posts table has a special key that connects it to the Users table, ensuring data consistency (a post must be linked to a valid user)

Separate Storage Spaces and the Data Pipeline

With separate storage spaces, the PostService needs a way to access user information from the UserService space.

This can involve “data pipelines” to keep things organized:

Calling Neighbors: PostService could call an “API” provided by UserService to get user data

Sharing News: Services could send alerts whenever data changes, allowing others to update their own data accordingly

Finding Your Storage Space Sweet Spot

So, shared or separate? It depends on your project! Here are some tips to find the perfect fit:

Clear Cabinet Rules: Even with a shared storage room, clear rules ensure everyone respects each other’s space (like having personal belongings labeled)

Separate Desks, Shared Utilities: While having separate workspaces is nice (separate schemas), core functionalities shouldn’t be affected (shared core data model)

Needs, Not Trends: Don’t choose an approach based on popularity. Pick what works best for your project (consider factors like data size, how often data is accessed, and your team’s experience)


Choosing between shared and separate storage spaces for your Microservices is a crucial decision. By understanding the advantages and disadvantages of each approach, as well as considering your project’s specific needs, you can make a well-informed choice that sets your project up for success.

PostgreSQL OID Datatype: Impacts and Remedies


PostgreSQL – The World’s Most Advanced Open Source Relational Database, introduced a significant change with version 12: the discontinuation of the Object Identifier (OID) datatype. This move has left many developers and database administrators pondering over its implications and seeking remedies for databases that heavily relied on OID. Fear not! We’re here to dissect the what, why, and how of this transition, providing you with practical examples and remediation strategies.

PostgreSQL has ramped down from the OID datatype starting version 12, marking a significant change for developers and database administrators. This post delves into what this means for your databases and how to smoothly transition from using OID.

For instance, we have been working with a mid-sized tech company that recently updated its legacy systems to comply with the new standards reported a notable improvement in database performance and manageability. The initial apprehension about migrating large objects and updating application code gave way to relief and satisfaction as the process unfolded more smoothly than anticipated. Their experience underscores a common sentiment: while the shift requires effort, the clarity, scalability, and performance gains on the other side are well worth it. Feedback from SMEs highlight the resilience and adaptability of the PostgreSQL community, reminding us that while change can be challenging, it also brings opportunities for growth and improvement.

Why say goodbye to OID?
The decision to discontinue OIDs didn’t come out of the blue. It was motivated by several factors:

  • Performance and Scalability: OIDs are of fixed size, limiting the scalability of databases. As databases grow, the need for more flexible and scalable solutions becomes apparent.
  • Best Practices: Moving away from OIDs encourages the use of explicit primary keys, aligning with modern database design practices.
  • Simplicity and Clarity: Eliminating OIDs simplifies the PostgreSQL system catalog and reduces the cognitive load on developers and database administrators.

Transitioning Away from OID:
If you’re working with older versions of PostgreSQL and relying on OID, transitioning might seem daunting. However, with a few strategic steps, you can ensure a smooth migration. Here’s how:

  1. Identifying OID Usage
    First, assess your database to identify where and how OIDs are used. You can run a query to find all tables that use OIDs:
  2. Migrating to Explicit Primary Keys
    For tables identified with OID usage, the next step is to introduce explicit primary keys. If a table does not have a primary key, you can add one as follows:

You can simply add a new column called oid that is automatically filled:


Using an identity column like this has the advantage that you get an error if you by mistake try to INSERT a value into the column manually. This command adds a new column id that serves as a primary key.

  1. Updating Application Code
    Applications interacting with your database may need updates to work with the new primary key structure. Ensure that your application logic references the explicit primary keys instead of OIDs.
  2. Handling Large Objects (LOs)

For large objects, PostgreSQL provides the lo module, which can replace OID-based large object management. You can migrate large objects using the lo_import and lo_export functions.

Adding to the various reasons for PostgreSQL’s shift away from OIDs is the matter of logical replication. Logical replication, a feature introduced in PostgreSQL 10, allows for the replication of data changes between PostgreSQL databases in a more granular and flexible manner compared to traditional physical replication. However, it’s crucial to note that logical replication does not support the replication of OID values. This limitation underscores a significant challenge for databases relying on OIDs for data identification and relationship management.

Why is this limitation noteworthy? In the age of distributed databases and high availability, logical replication stands as a cornerstone for many PostgreSQL deployment architectures. Its inability to handle OID values seamlessly means that any system relying on OIDs for key data interactions could face hurdles in data consistency and integrity across replicated environments. This constraint further emphasizes the need for transitioning away from OID usage, advocating for a move towards explicit primary keys and more replication-friendly data management strategies.

Getting away from OID datatype in PostgreSQL 12 signifies a shift towards more modern, scalable, and clearer database practices. While the transition may require some upfront work, particularly for legacy systems, the long-term benefits in performance, scalability, and manageability are undeniable. By understanding the reasons behind this change, assessing your current database usage, and implementing the recommended remediation strategies, you can navigate this transition smoothly. Embrace the change, and you’ll find your PostgreSQL databases are all the better for it.

Postgres Materialized Views: A Practical 101 – Part 2


Welcome back to the second installment of our deep dive into the Materialized views! In our previous post, we looked at the basics of materialized views, showcasing how they can help query performance and simplify complex data aggregations with lightning speed. If you haven’t had the chance to explore these insights, we highly recommend taking a moment to catch up on the first part of this blog series. It sets the stage for the advanced strategies we’re about to uncover, ensuring you’re fully equipped to leverage the full power of materialized views in your PostgreSQL environment.

Beyond the Basics: Indexing Strategies

Another internal aspect to consider is the impact of indexing on materialized views. Just like tables, materialized views can have indexes created on them, which are crucial for query performance. PostgreSQL manages these indexes in the same way as table indexes, using the B-tree indexing method by default but also supporting GIN, GiST, and other indexing strategies. These indexes are particularly valuable when materialized views are used for complex queries involving joins, full-text searches, or aggregations, as they can dramatically reduce query execution times.

Crafting the perfect indexing strategy for your PostgreSQL materialized views can optimize your query performance to new heights. Since materialized views are more than just virtual tables—they’re tangible elements residing within your database—they can be optimized with indexes just like tables. Following approaches would help:

Don’t Forget the Primary Key Index: Just like with your regular tables, indexing the primary key of a materialized view keeps your data aligned with the original tables. Remember, unique data in this column is a must, especially if you’re mixing in concurrent refreshes into your strategy.

Pinpoint the Essential Indexes: Dive into the queries hitting your materialized views and identify which columns are the usual suspects in your query conditions. These are your prime candidates for indexing. However, tread lightly—over-indexing can lead to increased storage use and can bog down the refresh rates of your materialized views.

Select the Optimal Index Type: PostgreSQL isn’t a one-trick pony when it comes to indexes. It boasts a variety, from the all-rounder B-tree to the precise hash, and the more specialized GiST and GIN indexes. B-tree indexes are your go-to for those broad range queries, while hash indexes shine in the face of equality checks. Matching the index type to your query patterns is crucial.

Embrace Composite Indexing: When your queries consistently involve multiple columns, a composite index might be your silver bullet. By bundling columns together in a single index, PostgreSQL can streamline operations like joins, filters, and sorting, reducing the need for separate index lookups.

Iterate with Testing and Tuning: The path to indexing nirvana is iterative. Arm yourself with PostgreSQL’s EXPLAIN and ANALYZE commands to peek into your query execution plans. This insight lets you fine-tune your indexes, ensuring they’re always in top form to meet your performance goals.

By incorporating these nuanced strategies into your PostgreSQL environment, you’ll not only boost the performance of your materialized views but also maintain a balanced system where data integrity, query speed, and storage efficiency coexist harmoniously.

Time-series Data and Matview Alternate

One thing we knwo about the dealing with time-series data — it just keeps coming, and before you know it, your database is brimming. This rapid growth can turn data aggregation into a sluggish chore, making timely insights hard to come by. While working on this problem for a customer, we found Continuous Aggregates from Timescale, which turbocharge the process, making data aggregation as swift as a lightning bolt.

Imagine you’re tracking temperature readings every second of the day. You’re drowning in data and just want a simple average temperature for each hour. Traditionally, every query for this average would mean a laborious trudge through the entire dataset, recalculating each time. It’s as time-consuming as it sounds.

Continuous aggregates, however, change the game. They’re a special breed of hypertable designed to be refreshed on the fly, seamlessly integrating new data or updating with changed data in the background. This is like having a diligent assistant who constantly updates your summaries, tracking any changes in your data and refreshing the behind-the-scenes hypertable without any prompt.

These aggregates update themselves—continuously and incrementally, saving you from the heavy lifting. Compared to the traditional PostgreSQL materialized views, which require a full rebuild with each refresh, continuous aggregates are a breath of fresh air. They significantly reduce the maintenance load, allowing you to focus more on leveraging your data and less on database upkeep.

In essence, continuous aggregates offer a streamlined, efficient path through the ever-growing jungle of time-series data, ensuring your data aggregation is not just fast, but lightning fast.

Learn more at:


Materialized views in PostgreSQL offer a robust solution for improving data retrieval times through precomputed query results. By leveraging advanced features like concurrent refreshes and strategic indexing, developers and data analysts can significantly enhance the performance and responsiveness of their applications. Whether it’s simplifying complex data aggregations or ensuring up-to-date data for reports, materialized views provide a flexible and efficient tool in the PostgreSQL arsenal. For Time-series and huge datasets, looking at alternative solutions like Timescale and Materialize is recommended.

Postgres Materialized Views: A Practical 101 – Part 1


In the realm of database management, efficiency and speed are king. PostgreSQL: The World’s Most Advanced Open Source Relational Database, offers a feature that is nothing short of magical for data analysts and developers alike: Materialized Views. Unlike standard views that dynamically calculate results every time they are accessed, materialized views store the result of a query physically and can be refreshed on demand. This means faster query times and a smoother experience with data-heavy applications. But that’s just scratching the surface.

This post is part of a 2 Blog series and will delve into the intricate details of PostgreSQL materialized views, offering practical examples to illustrate how they can transform your data management practices.

What Are Materialized Views?

At its core, a materialized view in PostgreSQL is a snapshot of a query saved physically on disk. This means that instead of recalculating the results of complex queries every time they’re accessed, the database can serve up ready-made results from the materialized view, slashing query times significantly. This is particularly useful for data analysis and reporting on large datasets where efficiency is paramount.

Magic Behind Refreshing Materialized Views
One of the pivotal features of materialized views is their ability to be refreshed. PostgreSQL offers two main ways to refresh a materialized view:

REFRESH MATERIALIZED VIEW: This command refreshes the entire materialized view, essentially rerunning the underlying query to update its content. It’s straightforward but can be resource-intensive for large datasets.

REFRESH MATERIALIZED VIEW CONCURRENTLY: This advanced feature allows the materialized view to be refreshed without locking out access to the view. It means that data can be updated in the background, and the view remains accessible, ensuring that applications relying on the view for data aren’t interrupted. This approach is ideal for high-availability environments where data needs to be as fresh as possible without sacrificing access.

What’s particularly interesting is how PostgreSQL handles the “CONCURRENTLY” option during a refresh. This process involves creating a temporary version of the materialized view, filling it with the latest data, and then swapping it with the existing version in a way that’s almost seamless to the user. This mechanism ensures that the materialized view remains accessible for SELECTs, providing a non-blocking experience.

Behind the scenes, PostgreSQL uses a snapshot mechanism to manage data consistency. When you query a materialized view, you see the data as it existed at the time of the last refresh, regardless of any changes that have occurred in the base tables since. This snapshot capability is crucial for ensuring data consistency and is a fundamental part of PostgreSQL’s MVCC (Multi-Version Concurrency Control) system.

Putting Materialized Views to Work
To truly appreciate the power of PostgreSQL materialized views, let’s dive into a couple of practical examples.

Example: Simplifying Complex Aggregations

Imagine you have a database that stores sales data across multiple regions. Your goal is to quickly access a summary of monthly sales by region, a query that involves complex aggregations across several tables.

Instead of running this aggregation every time you need monthly sales data, you can create a materialized view:

CREATE TABLE monthly_sales (
id bigserial,
region_id integer,
sale_date timestamp,
amount numeric (10,2),
created_at timestamp DEFAULT now(),
created_by text,
updated_at timestamp DEFAULT now(),

CREATE MATERIALIZED VIEW monthly_sales_summary AS
SELECT region_id,
EXTRACT(MONTH FROM sale_date) AS sale_month,
SUM(amount) AS total_sales
FROM monthly_sales
GROUP BY region_id, sale_month;

Now, accessing monthly sales data is as simple as querying the monthly_sales_summary materialized view, a process that is significantly faster than running the original aggregation query.

Materialized views in PostgreSQL offer a robust solution for improving data retrieval times through precomputed query results. By leveraging advanced features like concurrent refreshes and strategic indexing, developers and data analysts can significantly enhance the performance and responsiveness of their applications. Whether it’s simplifying complex data aggregations or ensuring up-to-date data for reports, materialized views provide a flexible and efficient tool in the PostgreSQL arsenal. In the next post, we’ll look at PostgreSQL internals related to Materialized view, how to track changes, and Mat views indexing.

Stay tuned!

Fixing Standby Slot Problems with pg_auto_failover

Dealing with failover in PostgreSQL clusters has always been tricky, but now there’s a solution: pg_auto_failover. This awesome tool makes it super easy to keep an eye on failovers and resolve the fencing issue.

So, how does it work? Well, pg_auto_failover uses something called the Single Standby Architecture. This fancy term just means it keeps your PostgreSQL services going without any hiccups. And the best part? It’s really easy to use! You’ve got just three main nodes: one primary PostgreSQL node, one standby node that’s always ready to jump in, and a smart monitor node that keeps an eye on everything.

We have an existing primary and running with a hot standby replica. We want to configure the pg_auto_failover for this, as a part of that process. We have created a monitoring node and then added the primary and standby to the cluster.

Create monitor:

nohup /usr/pgsql-14/bin/pg_autoctl create monitor --pgctl /usr/pgsql-14/bin/pg_ctl --pgdata /tmp/monitor_node/ --pgport 5433 --auth trust --run  --no-ssl > /dev/null 2>&1 &

Register primary:

pg_autoctl create postgres --name node2 --pgport 5432 --dbname pg_auto_failover --monitor postgres://autoctl_node@ --auth md5 --ssl-self-signed --hostname
-bash-4.2$ pg_autoctl show state
 Name |  Node |          Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
node1 |     1 | |   1: 0/501D530 |   read-write |        wait_primary |             primary
node2 |     8 | |   1: 0/5000000 |    read-only |           secondary |           secondary

Everything went well and the auto failover is showing the correct status. After a few days of smooth sailing with our PostgreSQL setup, we noticed our pg_wal directory was filling up fast. Upon investigation, we discovered that a replication slot created by pg_auto_failover was to blame. Despite being inactive and unused.

Initially, we tried to drop the replication slot (pgautofailover_standby_8) manually created by pg_auto_failover, but it did not allow us to do the same.

postgres=# select * from pg_replication_slots ;
-[ RECORD 1 ]-------+-------------------------
slot_name           | manual_slot
plugin              |
slot_type           | physical
datoid              |
database            |
temporary           | f
active              | t
active_pid          | 8492
xmin                | 739
catalog_xmin        |
restart_lsn         | 0/501D530
confirmed_flush_lsn |
wal_status          | reserved
safe_wal_size       |
two_phase           | f
-[ RECORD 2 ]-------+-------------------------
slot_name           | pgautofailover_standby_8
plugin              |
slot_type           | physical
datoid              |
database            |
temporary           | f
active              | f
active_pid          |
xmin                |
catalog_xmin        |
restart_lsn         | 0/501D448
confirmed_flush_lsn |
wal_status          | reserved
safe_wal_size       |
two_phase           | f

postgres=# select pg_drop_replication_slot('pgautofailover_standby_8');
-[ RECORD 1 ]------------+-
pg_drop_replication_slot |
postgres=#  select * from pg_replication_slots ;
-[ RECORD 1 ]-------+-------------------------
slot_name           | manual_slot
plugin              |
slot_type           | physical
datoid              |
database            |
temporary           | f
active              | t
active_pid          | 8492
xmin                | 739
catalog_xmin        |
restart_lsn         | 0/501D530
confirmed_flush_lsn |
wal_status          | reserved
safe_wal_size       |
two_phase           | f
-[ RECORD 2 ]-------+-------------------------
slot_name           | pgautofailover_standby_8
plugin              |
slot_type           | physical
datoid              |
database            |
temporary           | f
active              | f
active_pid          |
xmin                |
catalog_xmin        |
restart_lsn         | 0/501D448
confirmed_flush_lsn |
wal_status          | reserved
safe_wal_size       |
two_phase           | f

Additionally, we discovered that slots weren’t always removed during failover or switchover events. Instead, they were only deleted when pg_rewind had something to fix. This happens when pg_rewind clears out various folders, including the one where the slot is stored (called pg_replslot/), leading to the slot’s removal.


To resolve our issues, we dismantled the existing disaster recovery setup and reconfigured it with pg_auto_failover. This demonstrates how pg_auto_failover assumes control over the entire replication process.

bash-4.2$ pg_autoctl drop node --destroy
06:02:27 28290 INFO  Removing node with name "node1" in formation "default" from the monitor
06:02:27 28290 INFO  An instance of pg_autoctl is running with PID 5346, waiting for it to stop.
06:02:28 28290 INFO  The pg_autoctl instance with pid 5346 has now terminated.
06:02:28 28290 INFO  This node with id 1 in formation "default" and group 0 has been dropped from the monitor
06:02:28 28290 INFO  Stopping PostgreSQL at "/var/lib/pgsql/14/data"
06:02:28 28290 INFO  /usr/pgsql-14/bin/pg_ctl --pgdata /var/lib/pgsql/14/data --wait stop --mode fast
06:02:28 28290 INFO  /usr/pgsql-14/bin/pg_ctl status -D /var/lib/pgsql/14/data [3]
06:02:28 28290 INFO  pg_ctl: no server running
06:02:28 28290 INFO  pg_ctl stop failed, but PostgreSQL is not running anyway
06:02:28 28290 INFO  Removing "/var/lib/pgsql/14/data"
06:02:28 28290 INFO  Removing "/var/lib/pgsql/.config/pg_autoctl/var/lib/pgsql/14/data/pg_autoctl.cfg"

postgres=#  select * from pg_replication_slots ;
 slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn |
wal_status | safe_wal_size | two_phase
(0 rows)

-bash-4.2$ pg_autoctl create postgres --name node1 --pgport 5432 --dbname pg_auto_failover --monitor postgres://autoctl_node@ --auth trust --ssl-self-signed --pgdata /var/lib/pgsql/14/data

-bash-4.2$ pg_autoctl show state
 Name |  Node |          Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
node1 |     1 | |   1: 0/501D530 |   read-write |        wait_primary |             primary
node2 |     8 | |   1: 0/5000000 |    read-only |           secondary |           secondary

postgres=#  select * from pg_replication_slots ;
-[ RECORD 1 ]-------+-------------------------
slot_name           | pgautofailover_standby_9
plugin              |
slot_type           | physical
datoid              |
database            |
temporary           | f
active              | t
active_pid          | 8492
xmin                | 739
catalog_xmin        |
restart_lsn         | 0/501D530
confirmed_flush_lsn |
wal_status          | reserved
safe_wal_size       |
two_phase           | f

pg_auto_failover is built to handle the failure of any single node in the system. If a failure occurs, the Log Sequence Number (LSN) may reference a potentially altered history. Patroni also manages fail-over in a similar manner.

Despite these challenges, we found that pg_auto_failover effectively handles node failures. It’s designed to tolerate the failure of any single node in the system, ensuring continued operation even if one component encounters issues. However, it’s essential to monitor and manage replication slots properly to prevent unnecessary bloat in the pg_wal directory.

In conclusion, while implementing pg_auto_failover may come with its challenges, the benefits far outweigh the drawbacks. By ensuring seamless fail-over management, pg_auto_failover simplifies the task of maintaining PostgreSQL clusters.

Performance tuning with Apache Spark – Introduction


Welcome back to our ongoing series on Data transformation with Apache Spark! In our previous posts, we’ve covered essential topics like setting up Apache Spark on Ubuntu, integrating data with Spark, and querying datasets using Apache Drill. Now, we’re stepping into the realm of performance tuning—a critical aspect of maximizing the efficiency and throughput of your Spark applications. In this installment, we’ll look at the intricacies of fine-tuning Apache Spark for optimal performance, specifically focusing on optimizing the transformation of data from a Postgres table to Amazon S3.


Apache Spark stands as a powerhouse for processing vast volumes of data. However, harnessing its full potential requires careful calibration of various parameters. In this blog post, we’ll lead you through the glance steps of performance optimization techniques tailored specifically to the task of seamlessly transferring data from a Postgres database to the cloud-based storage solution, Amazon S3.

Data Serialization:

Data serialization is the process of converting objects into a byte stream and vice versa. This is an optimization technique that allows data to be efficiently passed between different nodes in a cluster and stored in a variety of formats.  Spark uses different methods of serialization, we’ll focus on Kryo serialization:

Kryo Serialization:

Kryo serialization is a technique used by Spark to serialize objects more quickly. It’s designed to be fast and efficient

  • Kryo is faster and more compact than Java serialization, often up to 10 times faster. It also has a smaller memory footprint than Java serialization, which is important when shuffling and caching large amounts of data.
  • Kryo is used by default in Spark, but it requires custom registration and is not natively supported to serialize to the disk.
  • You must set a configuration property i.e. org.apache.spark.serializer.KryoSerializer if you want to set the Kryo serializer as part of a Spark job.

Real-Time Example: Optimizing with Kryo Serialization

Consider a Spark application that processes large datasets from a Postgres table, transforming and storing the results in S3. By switching from Java serialization to Kryo serialization, the application can achieve better performance. Here’s how you can enable Kryo serialization in your Spark configuration:

val sparkConf = new SparkConf() sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer") sparkConf.registerKryoClasses(Array(classOf[YourClass])) val spark = SparkSession.builder().config(sparkConf).getOrCreate()

This configuration change can lead to faster processing times, especially when dealing with large datasets.


Spark caching is a feature that helps improve the performance of Spark applications. It stores intermediate results in an accessible place that can be quickly recalled.

  • cache() and persist() are two methods used in this technique
  • cache() stores it in the memory, and persist() stores it in the user-defined storage level.
  •  RDD.Cache()would always store the data in memory. RDD.Persist() allows the storage of some part of data in the memory and some part on the disk.

Real-Time Example: Strategic Caching in Data Transformation

Imagine a scenario where your Spark job involves multiple transformations on data retrieved from a Postgres table before storing it in S3. If certain transformations are reused, caching intermediate datasets can significantly reduce the total execution time. Here’s an example of using caching effectively:

val dataset =, "tableName", connectionProperties) val intermediateDataset = dataset.filter(...).cache()

This intermediate dataset can then be used in multiple actions without the need to recompute it from the original source, thus optimizing the overall performance.

Data Structure Tuning:

Data structure tuning in Apache Spark is a crucial optimization technique that significantly impacts the performance, efficiency, and scalability of Spark applications. The choice of data structure—RDDs (Resilient Distributed Datasets), DataFrames, and Datasets—plays a vital role in how Spark manages and processes data. Each data structure has its own set of features and optimizations that can either leverage Spark’s capabilities to the fullest or become a bottleneck if not used appropriately. Understanding the characteristics and optimal use cases for each data structure allows developers to design Spark applications that are both fast and resource-efficient.

RDDs (Resilient Distributed Datasets) are the fundamental data structure of Spark, providing low-level functionality and fine-grained control over data. They are immutable, distributed collections of objects that enable fault-tolerant, parallel data processing. RDDs are best suited for scenarios where you need full control over data processing and optimization, such as custom partitioning schemes or when performing operations that are not easily expressible with DataFrames or Datasets API.

DataFrames are a distributed collection of data organized into named columns, conceptually similar to tables in a relational database. Built on top of RDDs, DataFrames provide a higher-level API that leverages Spark’s Catalyst optimizer for improved performance and efficiency. DataFrames are ideal for handling structured and semi-structured data, allowing for efficient storage and processing of large datasets. They are particularly well-suited for SQL queries, data aggregation, and columnar operations.

Datasets are a type-safe version of DataFrames that provide the benefits of RDDs (type safety and object-oriented programming) with the optimization advantages of DataFrames. Datasets are particularly useful when you require type safety for compile-time checks or prefer working with strongly typed data structures in Scala or Java. They are also beneficial for custom transformations that are not easily expressed with DataFrame operations.

Practical Example: Data Structure Optimization

Imagine a Spark application that processes large volumes of structured log data stored in a distributed file system. The goal is to perform data cleansing, transformation, and aggregation to generate summary reports.

Initial Approach using RDDs:

val logsRDD = sc.textFile("path/to/logs") val cleanedLogsRDD = logsRDD.filter(_.contains("INFO")).map(transformLogEntry) // Perform aggregation val summaryRDD = cleanedLogsRDD.mapToPair(...).reduceByKey(...)

Optimized Approach using DataFrames:

val logsDF ="path/to/logs") val cleanedLogsDF = logsDF.filter($"level" === "INFO").selectExpr("transformLogEntry(columns)") // Perform aggregation using DataFrame API for better optimization val summaryDF = cleanedLogsDF.groupBy("date").agg(count("message"))

In this example, switching from RDDs to DataFrames allows the application to leverage Spark’s Catalyst optimizer for better performance during filtering, transformation, and aggregation operations. The declarative nature of the DataFrame API not only makes the code more concise but also enables Spark to optimize the execution plan more effectively.

Garbage collection tuning:

Garbage collection is an essential component of any Java-based application, and this is especially true for Apache Spark, which heavily relies on the Java Virtual Machine (JVM) to execute its code. GC is the process of reclaiming memory used by objects that are no longer needed by the application, thus preventing memory leaks and ensuring the efficient use of memory resources. However, aggressive or poorly optimized garbage collection can lead to increased pause times, affecting the overall throughput and latency of Spark applications.

In Java Virtual Machine (JVM)-based systems like Spark, garbage collection can sometimes become a bottleneck. This is particularly true for Spark applications that create and destroy a large number of objects rapidly, such as those involving iterative algorithms, large shuffles, or extensive data transformations. Excessive garbage collection can lead to:

  • Increased Pause Times: GC events can cause the application to pause, waiting for memory to be freed up. These pauses can degrade the performance of Spark applications, especially if they occur frequently or last for long durations.
  • Reduced Throughput: Frequent garbage collection can significantly reduce the throughput of Spark applications, as more CPU cycles are spent collecting garbage rather than executing application logic.
  • Jitter and Latency Variability: Inconsistent GC behaviour can introduce variability in application latency, making performance unpredictable and complicating the tuning of Spark applications.

Tuning Garbage Collection

Consider a Spark application experiencing high latency due to frequent garbage collection pauses. To address this, you might start by switching to the G1 garbage collector and adjusting the heap size based on the application’s memory usage patterns.

val sparkConf = new SparkConf() sparkConf.set("spark.driver.extraJavaOptions", "-XX:+UseG1GC -Xms4g -Xmx4g") sparkConf.set("spark.executor.extraJavaOptions", "-XX:+UseG1GC -Xms4g -Xmx4g") val spark = SparkSession.builder().config(sparkConf).appName("GCTunedApp").getOrCreate()

In this configuration:

  • -XX:+UseG1GC enables the G1 Garbage Collector, which is designed to offer a good balance between throughput and pause times, especially suitable for applications with large heaps.
  • -Xms4g and -Xmx4g set the initial and maximum heap size to 4GB, respectively, providing sufficient memory for the application’s needs while limiting the heap size to avoid excessive GC pauses.

Memory Management:

Apache Spark divides its memory between two main categories: execution memory and storage memory. This division is crucial for balancing the memory usage between computational needs and data storage within an application.

  • Execution Memory: Utilized for computational processes such as joins, shuffles, aggregations, and sorting. This region is dynamic; it can borrow space from storage memory when the demand for execution memory exceeds its initial allocation, provided the storage memory is not being used at its full capacity.
  • Storage Memory: Allocated for caching and persisting RDDs, DataFrames, and Datasets. Unlike execution memory, storage memory cannot borrow space from execution memory. This ensures that cached data remains in memory as long as possible, improving the performance of Spark applications by reducing the need to recompute or fetch data from disk.

Within the total memory allocated to Spark tasks, there’s a shared region where execution and storage memory can coexist. However, there’s a protected sub-region within the storage memory (often referred to as R), designed to safeguard a portion of the cached data from being evicted by execution tasks.

Consider a Spark application that performs complex data transformations requiring extensive shuffling and sorting, alongside caching significant datasets for iterative processing. To optimize the performance of this application, you might adjust both the spark.memory.fraction and spark.executor.memory settings to ensure sufficient memory is allocated to both execution and storage tasks.

val sparkConf = new SparkConf()
// Increase total executor memory to 4GB
sparkConf.set("spark.executor.memory", "4g")
// Increase the memory fraction to 75% of the executor memory
sparkConf.set("spark.memory.fraction", "0.75")
// Apply the configuration to SparkSession
val spark = SparkSession.builder().config(sparkConf).appName("MemoryOptimizedApp").getOrCreate()

In this configuration, increasing the executor memory to 4GB and the memory fraction to 75% ensures that more memory is available for both execution and storage, reducing the need for spilling data to disk and improving overall application performance.


Effective memory management is a cornerstone of optimizing Apache Spark applications for better performance. By carefully configuring memory allocation and leveraging dynamic allocation, you can significantly enhance the efficiency of data transformation processes, such as moving data from a Postgres table to Amazon S3. These strategies ensure that your Spark applications utilize available memory resources optimally, minimizing disk I/O and accelerating processing times for large-scale data analytics tasks.

Using pg_index_watch for PostgreSQL Indexing

Let’s delve into exploring pg_index_watch. In this instalment, I will guide you through the rationale behind its creation and explain its operational nuances.

Meet pg_index_watch – a utility for automagical rebuild of bloated indexes, an absolutely handly tool designed to tackle index bloat on PostgreSQL tables with high velocity of UPDATEs.

Index bloat on frequently updated tables is a common problem in PostgreSQL. Even with autovacuum, which is built-in, index bloat often persists despite different settings. pg_index_watch steps in to address this by automatically rebuilding indexes when necessary, offering a solution where autovacuum falls short.

PostgreSQL 12 brought about the introduction of REINDEX CONCURRENTLY, offering a safer method to rebuild indexes and consequently reduce bloat. However, a lingering question remained:

How do we determine when an index is bloated and if it needs to be rebuilt?

We require a straightforward statistical model to help us gauge the level of index bloat without the necessity of conducting a thorough review of the entire index. 

pg_index_watch tackles this challenge with a unique strategy:

Utilizing PostgreSQL’s capabilities, pg_index_watch leverages two key metrics:

  • The count of rows within the index (accessible via pg_class.reltuples)
  • Size of the index itself

By monitoring this ratio, pg_index_watch can detect when an index is bloated – indicated by a doubling of the ratio from its regular state. This signals that the index is nearing a 100% bloat rate, prompting pg_index_watch to spring into action, initiating a reindexing process without requiring manual intervention from database administrators.

For installing and using pg_index_watch, you’ll need:

  • PostgreSQL version 12.0 or higher
  • Superuser access and Passwordless access to the database as a superuser 


If your server can handle it, consider setting up multiple (i.e., non-zero) max_parallel_maintenance_workers to speed up index rebuilds.

Installation (as a PostgreSQL user): 

Obtain the pg_index_watch code by cloning the repository from GitHub using the command

git clone

Navigate into the pg_index_watch directory. 

Then, execute the SQL script:

to create the necessary table structure, followed by importing the code, including stored procedures, using index_watch_functions.sql via:

psql -1 -d osdb_index_bloat -f index_watch_functions.sql

During the initial launch of pg_index_watch, it’s imperative to prepare for a significant transformation. At this crucial stage, all indexes exceeding 10MB (default setting) undergo a comprehensive rebuilding process. This meticulous approach guarantees that your indexes are finely tuned for peak performance.

However, be mindful that this process may require several hours or even days to complete, particularly for larger databases. We recommend initiating this process manually to ensure a seamless transition.

nohup psql -d osdb_index_bloat -qt -c "CALL index_watch.periodic(TRUE);" >> index_watch.log

Continuous Monitoring: 

Following the initial launch, pg_index_watch seamlessly transitions into continuous monitoring mode. From this point forward, only newly created or bloated indexes trigger alerts, ensuring ongoing database optimisation. Furthermore, pg_index_watch offers the flexibility to utilise current index sizes as a baseline, further enhancing its adaptability to your database’s evolving needs.

Setting Up Cron:

To uphold consistent performance, establishing a cron schedule for regular monitoring is essential. Whether scheduling daily checks at midnight or hourly scans during periods of high activity, cron ensures that pg_index_watch remains vigilant without impeding critical maintenance tasks.

00 * * * * psql -d osdb_index_bloat -AtqXc "select not pg_is_in_recovery();" | grep -qx t || exit; psql -d osdb_index_bloat -qt -c "CALL index_watch.periodic(TRUE);"

In Conclusion

With pg_index_watch by your side, attaining peak PostgreSQL performance becomes a straightforward endeavour. As you embark on this path, have confidence that your database’s well-being and effectiveness are being diligently monitored. Embrace the benefits of continuous monitoring, and witness the remarkable enhancements that pg_index_watch brings to your PostgreSQL databases.

PostgreSQL User Management: Best Practices & Security Nuances

PostgreSQL – The World’s Most Advanced Open Source Relational Database and stands out as a robust and feature-rich solution, offering extensive capabilities for user management. Effective user management is important for ensuring data security, integrity, and accessibility within the platform. Let’s explore the intricacies of PostgreSQL user management, security considerations, and best practices to empower DBAs and use these features effectively.

User Management:

At its core, PostgreSQL employs a role-based access control (RBAC) system for managing users and their privileges. Roles can be categorized into two main types: database roles and login roles. Database roles define permissions within a specific database, while login roles grant access to the PostgreSQL cluster as a whole.

Security Considerations:

PostgreSQL prioritizes security, offering robust mechanisms to safeguard sensitive data and resources. Key security considerations in PostgreSQL user management include:

  • Password Management: Encouraging strong password policies and regularly rotating passwords helps mitigate the risk of unauthorized access. PostgreSQL supports various authentication methods, including latest and greatest scram-sha256 password authentication, certificate-based authentication, LDAP authentication and many more.
  • Role Privileges: Granting roles with minimal privileges necessary for their respective tasks minimizes the potential impact of security breaches. PostgreSQL’s fine-grained privilege system allows granular control over database objects, schemas, and system functions.
  • Encryption: DBAs can leverage PostgreSQL’s built-in encryption features, such as SSL/TLS encryption for secure connections. Transparent Data Encryption (TDE) for data-at-rest encryption, enhances data protection against interception and unauthorized access, although, this feature is not available in PostgreSQL Core, there’re extensions and forks that offer Postgres + TDE.
  • Auditing and Logging: Enabling comprehensive auditing and logging mechanisms provides visibility into user activity, facilitating the detection and investigation of security incidents. PostgreSQL’s logging facilities allow administrators to track authentication attempts, SQL queries, and database modifications.

Internal Nuances:

PostgreSQL’s internal architecture influences user management in several ways, including:

  • pg_catalog Schema: The pg_catalog schema houses system catalogs containing metadata about database objects, roles, and privileges. Understanding the structure and contents of the pg_catalog schema is essential for effective user management and troubleshooting.
  • Understanding Roles: PostgreSQL allows roles to be shared across multiple databases within a cluster (shared roles). DBAs must carefully consider the implications of role scope when designing user management strategies. A well-refined ACL Policy document across the Database deployment is very important.
  • Role Hierarchy: PostgreSQL supports role inheritance, allowing roles to inherit privileges from other roles. This hierarchical relationship simplifies user management by facilitating role composition and delegation of responsibilities.

Best Practices:

To optimize PostgreSQL user management, consider implementing the following best practices:

  • Role Segregation: Segregate user roles based on their functional responsibilities, such as administrators, developers, and analysts, to enforce the principle of least privilege and minimize the risk of unauthorized access.
  • Regular Audits: Conduct regular audits of user roles, privileges, and access patterns to identify potential security vulnerabilities or deviations from established policies. Automated tools and scripts can streamline the auditing process and ensure compliance with security standards.
  • Role Revocation: Immediately revoke access for users or roles that are no longer required or have been compromised. Timely removal of unnecessary privileges reduces the attack surface and mitigates the risk of unauthorized data access or manipulation.
  • Continuous Education: Educate DBAs, DevOps, Developers, and end-users on best practices for user management, password security, and data access control. Regular training sessions and knowledge-sharing initiatives foster a culture of security awareness and accountability.

In conclusion, PostgreSQL user management is a multifaceted aspect of database administration that requires careful consideration of security principles, internal mechanisms, and best practices. By implementing robust security measures, understanding PostgreSQL’s internal nuances, and adhering to established best practices, organizations can effectively manage user access, protect sensitive data, and maintain the integrity of their PostgreSQL databases. With a proactive approach to user management, PostgreSQL administrators can confidently navigate the complexities of database security and ensure the resilience of their database environments.

Open-source Data Engineering with PostgreSQL

Blog-4: Apache Drill Magic across PostgreSQL, Local Parquet, and S3


Welcome back! Following our exploration of data movement between PostgreSQL and Amazon S3 in the previous blog, we now venture into the realm of querying with Apache Drill. In this sequel, we’ll demonstrate the simplicity of querying PostgreSQL tables, local Parquet files, and Parquet files on S3. Join us as we seamlessly connect diverse data sources using the power of Apache Drill and SQL. Let’s continue our data journey.

Configuring the Apache Drill for PostgreSQL

To begin querying PostgreSQL data, the first step is to add the PostgreSQL storage plugin.

  • Open Apache drill web UI using http://localhost:8047/storage. Here we need to add this code as a PostgreSQL plugin providing the necessary details such as database URL, username, and password. Save the configuration, now Apache Drill can recognize PostgreSQL as a data source.
  • An alternate approach is Apache Drill, which also supports adding storage plugins through CLI(command line interface. we can add the plugin to storage-plugins-override.conf file.
  "type": "jdbc",
  "enabled": true,
  "driver": "org.postgresql.Driver",
  "url": "jdbc:postgresql://localhost:5432/spark_db",
  "username": "postgres",
  "password": "postgres",
  "sourceParameters": {
    "minimumIdle": 0,
    "autoCommit": true,
    "connectionTestQuery": "select version() as postgresql_version",
    "dataSource.cachePrepStmts": true,
    "dataSource.prepStmtCacheSize": 250


  • Now we are ready to explore SQL power across PostgreSQL, We can visualize the PostgreSQL across Drill using commands such as SHOW SCHEMASSHOW DATABASES;

  • We can use the specific schema as per our need (example ‘public’) using the command ‘USE postgresql.public’, then we can see the tables present in that database using the ‘SHOW TABLES’ command

  • We can now run the SQL query to read the data in the desired table

Working on Local Parquet Files:

  • Now, let’s move on to querying Parquet files stored locally
  • For querying parquet files we don’t need any specific plugins to be added as it has parquet file support inbuilt
  • The beauty of Drill is that it allows direct querying for extracting data from parquet format and displaying the query output in a  readable format

Working on S3 parquet Files:

  • As we previously mentioned the Parquet files reading doesn’t need any specific configuration; and Drill seamlessly interacts with S3 as well, but a few things have to be updated in the existing configuration file like the AWS Access Key ID, the Secret Key, and the bucket name for Drill to be able to access data stored in Amazon S3. 
  • Here is the query for extracting data from Parquet files in the S3 bucket

Notice the simplicity – no complex configurations are needed. We can effortlessly translate the data to formats with SQL queries using Apache Drill.


In this blog, we delved into the magic of Apache Drill, starting our journey from configuring PostgreSQL storage plugins to minimal configuration updates for S3 interactions, we highlighted the simplicity of querying the PostgreSQL, local Parquet files, and S3-residing Parquet files.

Stay tuned for the final installment in our series! The upcoming blog focuses on optimizing performance using Apache Spark and Apache Drill. Get ready to elevate your data engineering game with insights into enhancing query speed and maximizing efficiency. Until then, happy querying and exploring.