Automating PostgreSQL Tasks with pg_cron

In the world of database management, automation is key to maintaining efficiency and ensuring that routine tasks are executed consistently. For PostgreSQL users, the pg_cron extension provides a powerful way to schedule jobs directly within the database. In this post, we’ll explore what pg_cron is, how to install it, and how to use it effectively.

What is pg_cron?

pg_cron is an extension for PostgreSQL that allows you to run scheduled jobs using familiar cron syntax. With this, you can automate various tasks, such as database maintenance, data processing, and reporting, all from within your PostgreSQL environment. This eliminates the need for external cron jobs or scripts, making your setup cleaner and more integrated.

Installing pg_cron

Identify the version of pg_cron that is compatible with your PostgreSQL installation. For example, to install the pg_cron (pg_cron_16) extension on PostgreSQL 16, run:

#For RHEL / Fedora Derivatives:
sudo yum install -y pg_cron_16

#For Ubuntu / Debian Derivatives:
sudo apt-get install -y postgresql-16-cron

Setting Up pg_cron

To start the pg_cron background worker when PostgreSQL starts, you need to add pg_cron to shared_preload_libraries in postgresql.conf . Note that pg_cron does not run any jobs while the server is in hot standby mode, but it automatically starts when the server is promoted.

By default, the pg_cron background worker expects its metadata tables to be created in the postgres database. However, you can configure this by setting the cron.database_name configuration parameter in postgresql.conf.

#Parameters need to be change for pg_cron 
shared_preload_libraries = 'pg_cron'
cron.database_name = 'postgres'

After changing the parameter and restarting the PostgreSQL server.

You can create the pg_cron and metadata tables using

postgres=# CREATE EXTENSION pg_cron;
CREATE EXTENSION

After creating extension, grant the usage privileges to user postgres

postgres=# GRANT USAGE ON SCHEMA cron TO postgres;
GRANT

Let’s schedule a vacuum operation on the postgres database at 10 P.M every day.

postgres=# SELECT cron.schedule('0 22 * * *', 'VACUUM');
 schedule 
----------
        1
(1 row)

Let’s see how we can manage the scheduled jobs

postgres=# SELECT * FROM cron.job;
 jobid |  schedule   | command | nodename  | nodeport | database | username | active | jobname
-------+-------------+---------+-----------+----------+----------+----------+--------+---------
     1 | 0 22 * * *  | VACUUM  | localhost |     5432 | postgres | postgres | t      |
(1 row)

To check the logs of the scheduled activities.

postgres=# select * from cron.job_run_details;
-[ RECORD 1 ]--+----------------------------------------------------------------------------------------------------------
jobid | 1
runid | 1
job_pid | 35683
database | postgres
username | postgres
command | VACUUM
status | succeeded
return_message | VACUUM
start_time | 2024-10-01 13:27:29.473088+05:30
end_time | 2024-10-01 13:27:29.47479+05:30

Here, the breakdown of each field.

  • jobid : This is the unique identifier for the scheduled job. Each job scheduled in pg_cron has a distinct jobid.
  • runid : This indicates the unique identifier for this specific execution of the job. If the job is scheduled to run multiple times, each execution will have a different runid.
  • job_pid : This is the process ID of the PostgreSQL backend process that executed the job. It can be useful for tracking the job’s execution in system logs.
  • database : This shows the database context in which the job was run. In this case, the job was executed in the postgres database.
  • username : This indicates the username under which the job was executed. Here, the job ran as the postgres user.
  • command : This is the SQL command that was executed as part of the job. In this instance, it indicates that a VACUUM operation was performed.
  • status : This shows the outcome of the job execution. A status of succeeded indicates that the job completed successfully without errors.
  • return_message : This message reflects the output from the executed command. In this case, it confirms that the VACUUM command was executed.
  • start_time : This is the timestamp when the job started executing. It includes the time zone offset (+05:30), indicating the local time zone.
  • end_time : This is the timestamp when the job finished executing. It also includes the time zone offset.

To unschedule a previously scheduled job in pg_cron, run:

SELECT cron.unschedule(job_id);

Conclusion

pg_cron is a powerful tool that can significantly enhance your PostgreSQL experience by simplifying job scheduling and automation. By integrating routine tasks directly within the database, you can save time and reduce complexity. Whether you’re handling maintenance, data processing, or reporting, pg_cron provides the flexibility and control you need.

Ready to automate your PostgreSQL tasks? Give pg_cron a try, and see how it can streamline your database operations! Contact us today to know more about our database maintenance services.

Happy Automation!!

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top