PgBouncer

PgCat – Connection Pooling Decoded – Part 1

Introduction

PostgreSQL – The World’s Most Advanced Open Source Relational Database holds its ground as a formidable player in the Open-source Databases market. With respect to Connection scalability & Connection concentration layers, optimising connection pooling, load balancing, and replica failover presents significant challenges. To address these complexities, a novel solution has emerged – PgCat, an open-source PostgreSQL proxy.

Introducing PgCat: An All-in-One PostgreSQL Connection Pooling software

PgCat. Think of it as a versatile helper for PostgreSQL developed in Rust – a programming language that emphasizes performance, type safety, and concurrency. It’s not just about managing connections—it’s a whole package that takes care of connection pooling, load balancing, and making sure everything runs smoothly, even when replicas hiccups.

The PgCat Advantage

Efficient Resource Utilisation: PgCat smartly manages a pool of established connections, ensuring that resources are not exhausted by repeatedly opening and closing connections.

Enhanced Performance: By reusing connections, PgCat minimizes the time spent on setting up new connections, resulting in improved response times and reduced latency.

Connection Multiplexing: PgCat allows multiple applications to share a single connection, effectively reducing the overall number of connections and saving resources.

Load Balancing: PgCat seamlessly distributes incoming connections across multiple PostgreSQL servers, preventing bottlenecks and optimizing workload distribution.

Idle Connection Management: PgCat monitors idle connections and reclaims them if they remain unused for a specified period, preventing connection leaks and freeing up resources.

Failover and High Availability:  PgCat supports failover scenarios, automatically redirecting connections to a standby server in case the primary server fails, ensuring uninterrupted service.

Configuration Flexibility: PgCat provides customizable configurations to fine-tune connection pooling based on your application’s unique requirements.

While PgBouncer and PgPool-II offer connection pooling and some related features, PgCat aims to provide a more comprehensive and sophisticated solution. Its focus on enhanced connection management, load balancing, failover support, configuration flexibility, and seamless integration makes it a compelling option for optimising PostgreSQL environments. Here’s a quick summary of the features offered by PgCat in comparison to it’s predecessors

FeaturesPgCatPgBouncerPgPool-II
Load BalancingYesLimited Yes
Replica Failover YesNo Yes
Language SupportAllSpecificAll
Transaction Pooling YesNo Yes
Statement Pooling NoYes Yes
Connection LimitsDynamicFixedFixed
Sharding Support YesNo Yes
Multi-Threaded YesNo Yes
Load Balancing AlgosRandom, Least ConnectionsRound RobinRound Robin, Weighted
Control Commands Yes Yes Yes
SecuritySSL/TLSBasic AuthSSL/TLS, Kerberos
Session Tracking YesNoYes

Architectural changes

The distinct architectural feature of PgCat is its multi-threaded design, distinguishing it from the single-threaded nature of PgBouncer. Its deployment configurations, capacity to accommodate multiple users and pools, and utilisation of advanced load-balancing algorithms (random and least connections) provide administrators with an exceptional level of flexibility.

Sharding and Beyond: Expanding Horizons

Highlighting its adaptability, PgCat seamlessly handles sharded databases. This empowers applications to select particular shards using sharding keys or numbers—a crucial advantage for complex configurations.

A logical culmination:

We’ve found a reliable companion in PgCat, that effortlessly meets our needs for connection management, load distribution, and replica recovery. With its smart Rust-based design, impressive features, and knack for handling many tasks at once, PgCat has transformed the way we manage PostgreSQL.

Anticipate a Remarkable Shift: Brace yourself for Part 2, as exciting revelations await in the next installment of our blog series. In the upcoming segment, we will delve into the practical aspects, delving into the implementation intricacies and unraveling the pivotal parameters that ensure a seamless transition from non-PgCat to the PgCat environment.

Stay tuned…..

Connection Pooling with pgBouncer: A Primer

Imagine that you were running a promotion for your blog or website, and it caught the attention of Sarah Connor, who immediately tweeted a link to your website on her channel. How would you feel about it? Would you feel elated and excited that your website has gone viral and could potentially attract thousands or millions of visitors? Or would this thought make you nervous wondering if your application and database would be able to handle all the visitors and the traffic that is pouring like rain in the Amazon

Well, if you truly are in such a situation, it is probably a good thing, and I promise we will do our part to alleviate your anxiety.

When it comes to scalability, application architects resort to different tools and techniques, and one of the most rudimentary, yet under-utilized techniques is Connection pooling.

In this post, we discuss what Connection Pooling is, and then talk about how we could implement Connection Pooling in a Postgres ecosystem using pgBouncer.

Ready to go on the ride? Strap on …

What is Connection Pooling?

Typically, when a user interacts with an application, the application, in turn,  requests a connection to the database that serves as a conduit for all the exchanges that happen between the application/user and the database. This connection would be maintained as long as the user remains active, and the application automatically disconnects from the database when the user logs out or his session gets timed out.

Now, opening a database connection and closing it each time we need to pull some data out of the database can be a costly affair, and people have come up with the idea of a connection pool as a way to offset some of the costs and gain some performance benefit while at it.

What exactly is a connection pool then?

Well, a connection pool is a set of always-on, shared database connections that are recycled for use across multiple users. You could think of it like a bunch of hotlines – when you need to communicate, you just need to pick up the phone, and you would be instantly connected to the other side.

Since these are typically always-on connections, the overheads associated with establishing a connection to the database, as well as releasing the connection after use, are greatly minimized.

What is pgBouncer?

pgBouncer is an open-source, lightweight, single-binary connection pooler for PostgreSQL. It can pool connections to one or more databases (on possibly different servers) and serve clients over TCP and Unix domain sockets. pgBouncer either creates a new database connection for a client or reuses an existing connection for the same user and database. When the client disconnects, pgBouncer returns the connection to the pool for re-use.

Why do we need pgBouncer?

Here are some reasons why you may need pgBouncer:

  • Improve performance: pgBouncer can help improve database performance by reducing the overhead associated with establishing and closing connections.
  • Handle connection limits: pgBouncer can handle connection limits, which can be useful in situations where the database server has a limited number of connections available.
  • Manage connections: pgBouncer allows you to manage connections to the database, such as setting up idle timeout thresholds or limiting the number of connections per user.
  • Security: pgBouncer can also enhance database security by providing authentication and authorization mechanisms that can help prevent unauthorized access to the database.

How does pgBouncer work?

pgBouncer shares connections in one of three pool modes:

  • Session pooling – When a client connects, a connection is assigned to it as long as it remains connected. When the client disconnects, the connection is placed back into the pool.
  • Transaction pooling – A connection is assigned to a client for the duration of a transaction. When pgBouncer notices the transaction is done, the connection is placed back into the pool. This mode can be used only with applications that do not use features that depend upon a session.
  • Statement pooling – Statement pooling is like transaction pooling, but multi-statement transactions are not allowed. This mode is intended to enforce auto-commit mode on the client and is targeted for PL/Proxy on PostgreSQL.

How to Install pgBouncer?

  • pgBouncer can be installed just like any other Linux packages using apt or yum depending on the Linux distribution.
  • Here’s the command we used to install pgBouncer on an Ubuntu 22.04 system running Postgres 15.0:

sudo apt-get install -y pgbouncer

How to configure pgBouncer?

  1. Create a configuration file: Next, you need to create a configuration file for pgBouncer. The configuration file is usually named pgbouncer.ini (located in the /etc/pgbouncer folder) and contains various settings such as the database server hostname, port, database name, username, and password.
  2. Configure pgBouncer settings: After creating the configuration file, you need to configure pgBouncer settings. This includes setting the maximum number of connections, timeouts, and other parameters that control how pgBouncer operates.

Here is an example configuration file for pgBouncer:

[databases]
template1 = host=127.0.0.1 port=5432 dbname=template1
[pgbouncer]
listen_addr = *
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/user.txt
pool_mode = session
max_client_conn = 100
default_pool_size = 20

This configuration file specifies a single database (template1) with the hostname 127.0.0.1 and port 5432. It also sets the maximum number of client connections to 100 and the default pool size to 20. The auth_file setting points to a file containing the usernames and passwords allowed to connect to pgBouncer.

Reload pgBouncer:

We can use reload command to reload the configuration file without interrupting the active connections.

postgres@localhost:/etc/postgresql/15/main $  psql -p 6432 -U pgbouncer pgbouncer

pgbouncer=# reload;

Restart pgBouncer:

To stop pgBouncer:

pgbouncer -d /etc/pgbouncer/pgbouncer.ini -R -s

To start pgBouncer:

pgbouncer -d /etc/pgbouncer/pgbouncer.ini

When making changes that affect a large number of database users, it is often recommended to stop and start pgBouncer instead of restarting the Postgres database service:

Performance Benefits – Does pgBouncer really help?

As members of the tech community, we took it upon ourselves to validate if using a Connection Pooling application really helps to improve the database performance, and to what extent, and here is what we found:

(We ran a bare-bones pgBench test with 100 clients and 10000 transactions per client to simulate the load)

Connection Management without pgBouncer:

Given below output is the result of the without pgBouncer

postgres@ubuntu-vm:~$ pgbench -c 100 -j 2 -t 10000 osdb_pgb_test
pgbench (15.0 (Ubuntu 15.0-1.pgdg22.04+1))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 50
query mode: simple
number of clients: 100
number of threads: 2
maximum number of tries: 1
number of transactions per client: 10000
number of transactions actually processed: 500000/500000
number of failed transactions: 0 (0.000%)
latency average = 96.766 ms
initial connection time = 526.886 ms
tps = 516.713001 (without initial connection time)

Connection Management with pgBouncer:

Given output is the result of the with pgBouncer

postgres@ubuntu-vm:~$ pgbench -c 100 -j 2 -t 10000 osdb_pgb_test
pgbench (15.0 (Ubuntu 15.0-1.pgdg22.04+1))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 50
query mode: simple
number of clients: 100
number of threads: 2
maximum number of tries: 1
number of transactions per client: 10000
number of transactions actually processed: 500000/500000
number of failed transactions: 0 (0.000%)
latency average = 60.036 ms
initial connection time = 203.134 ms
tps = 832.838500 (without initial connection time)

As you can see from the above test, the initial connection time and average latency are both reduced by about 50%, and the tps increased by more than 50% in the case of pgBouncer which would be enough to validate our claims.

Conclusion:

pgBouncer is a lightweight open-source connection pooling tool for PostgreSQL that can improve the database performance and reduce the server load by optimizing database connections. It provides different pooling methods for different requirements and provides the users the flexibility to configure it as per their business needs.