Scheduling Extension

PostgreSQL OID Datatype: Impacts and Remedies

Introduction

PostgreSQL – The World’s Most Advanced Open Source Relational Database, introduced a significant change with version 12: the discontinuation of the Object Identifier (OID) datatype. This move has left many developers and database administrators pondering over its implications and seeking remedies for databases that heavily relied on OID. Fear not! We’re here to dissect the what, why, and how of this transition, providing you with practical examples and remediation strategies.

PostgreSQL has ramped down from the OID datatype starting version 12, marking a significant change for developers and database administrators. This post delves into what this means for your databases and how to smoothly transition from using OID.

For instance, we have been working with a mid-sized tech company that recently updated its legacy systems to comply with the new standards reported a notable improvement in database performance and manageability. The initial apprehension about migrating large objects and updating application code gave way to relief and satisfaction as the process unfolded more smoothly than anticipated. Their experience underscores a common sentiment: while the shift requires effort, the clarity, scalability, and performance gains on the other side are well worth it. Feedback from SMEs highlight the resilience and adaptability of the PostgreSQL community, reminding us that while change can be challenging, it also brings opportunities for growth and improvement.

Why say goodbye to OID?
The decision to discontinue OIDs didn’t come out of the blue. It was motivated by several factors:

  • Performance and Scalability: OIDs are of fixed size, limiting the scalability of databases. As databases grow, the need for more flexible and scalable solutions becomes apparent.
  • Best Practices: Moving away from OIDs encourages the use of explicit primary keys, aligning with modern database design practices.
  • Simplicity and Clarity: Eliminating OIDs simplifies the PostgreSQL system catalog and reduces the cognitive load on developers and database administrators.

Transitioning Away from OID:
If you’re working with older versions of PostgreSQL and relying on OID, transitioning might seem daunting. However, with a few strategic steps, you can ensure a smooth migration. Here’s how:

  1. Identifying OID Usage
    First, assess your database to identify where and how OIDs are used. You can run a query to find all tables that use OIDs:
  2. Migrating to Explicit Primary Keys
    For tables identified with OID usage, the next step is to introduce explicit primary keys. If a table does not have a primary key, you can add one as follows:

You can simply add a new column called oid that is automatically filled:

ALTER TABLE test ADD oid bigint GENERATED ALWAYS AS IDENTITY NOT NULL;

Using an identity column like this has the advantage that you get an error if you by mistake try to INSERT a value into the column manually. This command adds a new column id that serves as a primary key.

  1. Updating Application Code
    Applications interacting with your database may need updates to work with the new primary key structure. Ensure that your application logic references the explicit primary keys instead of OIDs.
  2. Handling Large Objects (LOs)

For large objects, PostgreSQL provides the lo module, which can replace OID-based large object management. You can migrate large objects using the lo_import and lo_export functions.

Adding to the various reasons for PostgreSQL’s shift away from OIDs is the matter of logical replication. Logical replication, a feature introduced in PostgreSQL 10, allows for the replication of data changes between PostgreSQL databases in a more granular and flexible manner compared to traditional physical replication. However, it’s crucial to note that logical replication does not support the replication of OID values. This limitation underscores a significant challenge for databases relying on OIDs for data identification and relationship management.

Why is this limitation noteworthy? In the age of distributed databases and high availability, logical replication stands as a cornerstone for many PostgreSQL deployment architectures. Its inability to handle OID values seamlessly means that any system relying on OIDs for key data interactions could face hurdles in data consistency and integrity across replicated environments. This constraint further emphasizes the need for transitioning away from OID usage, advocating for a move towards explicit primary keys and more replication-friendly data management strategies.

Conclusion:
Getting away from OID datatype in PostgreSQL 12 signifies a shift towards more modern, scalable, and clearer database practices. While the transition may require some upfront work, particularly for legacy systems, the long-term benefits in performance, scalability, and manageability are undeniable. By understanding the reasons behind this change, assessing your current database usage, and implementing the recommended remediation strategies, you can navigate this transition smoothly. Embrace the change, and you’ll find your PostgreSQL databases are all the better for it.

Introducing pg_cron – The Automation task master

The pg_cron extension is a PostgreSQL extension that allows you to schedule regular database tasks using the familiar cron syntax. Any tasks that need to be done at regular intervals, such as data archiving, partitioning, creating or deleting backup sets, etc. can be automated using pg_cron.

Installation the pg_cron extension:

Install the pg_cron extension dependencies: The pg_cron extension depends on the libpq and libcron libraries. Install these dependencies using the package manager of your operating system.

Modify PostgreSQL configuration: Open the postgresql.conf file of your PostgreSQL installation and add the following line to enable the pg_cron extension:

postgres=# alter system set shared_preload_libraries = 'pg_cron';

postgres=# Alter system set cron.databasename = ‘postgres’;

Restart the PostgreSQL service to apply the configuration changes.

Systemctl restart postgresql.service

Create the pg_cron extension:

postgres=# create extension pg_cron;

Verifying pg_cron installation

Below is an example to schedule a vacuum task. The expression used for the scheduler is similar to any other scheduled cron job on the linux operating system.

postgres=# SELECT cron.schedule('2 * * * *', 'vacuum');

schedule 
----------
        2

We can check the output of the schedule 2 as following in the log

2023-05-16 21:32:00.015 IST [60387] LOG:  cron job 2 starting: vacuum
2023-05-16 21:32:00.064 IST [60387] LOG:  cron job 2 COMMAND completed: VACUUM
2023-05-16 21:33:26.019 IST [60381] LOG:  checkpoint starting: time

2023-05-16 21:33:28.333 IST [60381] LOG:  checkpoint complete: wrote 23 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=2.308 s, sync=0.002 s, total=2.315 s; sync files=10, longest=0.001 s, average=0.001 s; distance=107 kB, estimate=464 kB

We can check the history in the cron.job_run_details table as follows.

postgres=# select * from cron.job_run_details ;

jobid | runid | job_pid | database | username |      command       |  status   |  return_message   | 

——-+——-+———+———-+———-+——————–+———–+——————-+-

     3 |    28 |   60541 | postgres | postgres | vacuum             | succeeded | VACUUM           

|      1 |     1 |         | postgres | postgres | select * from emp; | failed    | connection failed |  

We can check all the scheduled cron jobs as following command

postgres=# select * from cron.job;

To unschedule a job using the job id

Conclusion:

You have successfully installed the pg_cron extension in your PostgreSQL database. You can now start scheduling tasks using pg_cron to automate your routine database activities on Postgres by following the examples above. Using pg_cron can help save on some of the developers and DBAs time, which could then be used for something more critical or productive,  and also add a layer of efficiency to the business processes by reducing human dependency.