Welcome to the 16th blog post of PG18 Hacktober!
PostgreSQL 18 is bringing a host of anticipated features, from the time-ordered UUIDs of UUIDv7 to the richer EXPLAIN output of pg_overexplain. But the feature generating the most buzz and promising a significant leap in performance is the introduction of Asynchronous I/O (AIO).
This wasn’t an overnight change; the PostgreSQL team has been carefully working on this complex implementation for over five years. AIO, though often described as a “necessary evil” because it adds complexity, delivers the performance needed for modern, high-demand database operations.
Let’s dive into what Asynchronous I/O is and how PostgreSQL 18 finally brings it to life.
Synchronous vs. Asynchronous: The core difference
To understand AIO, we must first revisit the concepts of synchronous and asynchronous execution.
- Synchronous Execution: Think of a client calling an API. If that call blocks the rest of the program until a result is returned, it’s synchronous. Your process is simply waiting, unable to execute further instructions. In the context of database file I/O (reading from or writing to disk), a synchronous system call such as read() forces the database process to halt while the kernel fetches the data from the storage device into the page cache and then copies it into the user-space buffer.
- Asynchronous Execution: With an asynchronous call, you delegate the job to another party (like the kernel or a worker thread) and immediately move on to the next instruction. Sometime later, you receive a notification or check whether the result is ready. The key is that your main process is unblocked, freeing up valuable CPU time to handle other tasks while the I/O is in flight.
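To make the contrast concrete, here is a minimal Python sketch. It is not how PostgreSQL does it: slow_read is a hypothetical stand-in that sleeps instead of touching a disk, and a thread plays the role of "another party".

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_read(page_no):
    """Hypothetical stand-in for a blocking disk read (sleeps, no real I/O)."""
    time.sleep(0.05)
    return bytes([page_no % 256]) * 8192


def read_synchronously(page_no):
    """The caller is stuck inside slow_read until the data arrives."""
    data = slow_read(page_no)
    other_work = 0  # there was no opportunity to do anything else
    return data, other_work


def read_asynchronously(page_no):
    """The caller hands the read off and keeps running until it completes."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_read, page_no)  # delegate the read
        other_work = 0
        while not future.done():  # check back later instead of blocking
            other_work += 1       # placeholder for useful computation
            time.sleep(0.001)
        return future.result(), other_work
```

Both functions return the same page, but only the asynchronous one reports a non-zero other_work counter, which is the CPU time the blocking version leaves unused.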
The problem with synchronous I/O in PostgreSQL (Pre-v18)
PostgreSQL serves each client connection with a dedicated process, known as a backend. Before version 18, this backend process performed its reads using synchronous I/O.
Imagine a simple sequential scan (like a COUNT(*)). The backend would:
- Issue a read for Page 1.
- Block and wait for the page to be returned by the kernel.
- Process Page 1.
- Issue a read for Page 2.
- Block and wait… and so on.
This blocking behavior makes the backend process an inefficient user of the CPU. While a read is outstanding, the backend does no useful work, and the kernel will typically schedule it off the CPU until the data arrives. Every such wait adds directly to the overall query execution time.
While PostgreSQL 17 introduced improvements like combining reads of adjacent pages into a single system call (vectored I/O through the new streaming read API, governed by io_combine_limit) to mitigate this, the core issue of synchronous blocking remained.
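The page-by-page loop and the combined-read mitigation can be sketched in Python as follows. This is an illustration, not PostgreSQL's code: the function names are made up, and the default of 16 pages per call is an assumption that mirrors a 128kB combine limit with 8kB pages. Both variants still block; the second one just blocks less often.

```python
import os
import tempfile

PAGE_SIZE = 8192  # PostgreSQL's default block size


def make_demo_file(n_pages):
    """Create a throwaway file of n_pages zero-filled pages; returns its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * (n_pages * PAGE_SIZE))
    return path


def scan_page_by_page(fd, n_pages):
    """One blocking read per page. Returns (bytes_seen, system_calls)."""
    total = calls = 0
    for page_no in range(n_pages):
        page = os.pread(fd, PAGE_SIZE, page_no * PAGE_SIZE)  # blocks here
        calls += 1
        total += len(page)  # "process" the page
    return total, calls


def scan_combined(fd, n_pages, combine_limit=16):
    """Still blocking, but fetches up to combine_limit adjacent pages per call."""
    total = calls = 0
    page_no = 0
    while page_no < n_pages:
        batch = min(combine_limit, n_pages - page_no)
        chunk = os.pread(fd, batch * PAGE_SIZE, page_no * PAGE_SIZE)
        calls += 1
        total += len(chunk)
        page_no += batch
    return total, calls
```

Scanning a 64-page file takes 64 read calls in the first variant and 4 in the second, yet the process is still idle during every one of them.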
PostgreSQL 18’s AIO implementation: Three paths
1. Worker pool (default)
- How it works: A backend offloads the I/O request to a shared worker pool.
- Blocking I/O happens in a background worker process, not in the backend.
- Benefit: CPU time in the backend is freed up for computation.
- Platform: Cross-platform, enabled by default (io_method = 'worker').
- Pool Size: Small (io_workers, default: 3), shared by all backends. Future versions may make this dynamic.
This method is portable and safe, though it introduces some inter-process communication overhead.
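As a rough mental model of the worker method, here is a Python sketch of a small shared pool fed by a request queue. It simplifies heavily: threads stand in for PostgreSQL's worker processes, and IoWorkerPool and fake_read are hypothetical names invented for this illustration.

```python
import queue
import threading
from concurrent.futures import Future

N_WORKERS = 3  # mirrors the default pool size mentioned above


def fake_read(page_no):
    """Hypothetical stand-in for the blocking read a worker would perform."""
    return page_no.to_bytes(4, "big") * 4


class IoWorkerPool:
    """A fixed set of workers draining one shared request queue."""

    def __init__(self, n_workers=N_WORKERS):
        self._requests = queue.Queue()
        self._threads = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(n_workers)
        ]
        for t in self._threads:
            t.start()

    def submit(self, page_no):
        """Called by a 'backend': enqueue the request and return immediately."""
        fut = Future()
        self._requests.put((page_no, fut))
        return fut

    def _run(self):
        while True:
            item = self._requests.get()
            if item is None:  # shutdown sentinel
                break
            page_no, fut = item
            fut.set_result(fake_read(page_no))  # the blocking part lives here

    def shutdown(self):
        for _ in self._threads:
            self._requests.put(None)
        for t in self._threads:
            t.join()
```

A caller would submit several page numbers, carry on with other work, and collect each result with fut.result() once it is needed. The queue hand-off is the analogue of the inter-process communication overhead mentioned above.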
2. io_uring (Linux-specific)
- How it works: Uses submission and completion queues shared with the kernel.
- Fewer context switches: No worker processes or blocking reads are needed, because the kernel handles async I/O natively.
- Design Choice: PostgreSQL uses private io_uring instances per backend to avoid contention.
- Requirements: Linux kernel ≥ 5.5, a PostgreSQL build with liburing support (--with-liburing), and io_method = 'io_uring'.
This method typically offers the best performance, but it is only available on modern Linux systems.
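The submission and completion queue idea can be pictured with a toy model in Python. This is emphatically not the io_uring API: ToyRing, Sqe and Cqe are hypothetical names, the "kernel" is an ordinary method that slices an in-memory buffer, and nothing here is actually asynchronous. The sketch only shows the shape of the protocol: queue several requests, submit them in one step, then reap completions and match them to requests by tag.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Sqe:
    """Submission queue entry: a read request tagged with user_data."""
    user_data: int
    offset: int
    length: int


@dataclass
class Cqe:
    """Completion queue entry: the result for the request with that tag."""
    user_data: int
    result: bytes


class ToyRing:
    """Toy model of a submission/completion queue pair (not real io_uring)."""

    def __init__(self, backing):
        self._backing = backing  # pretend this is the file on disk
        self.sq = deque()
        self.cq = deque()

    def prepare_read(self, user_data, offset, length):
        """Queue a request; nothing is executed yet."""
        self.sq.append(Sqe(user_data, offset, length))

    def submit(self):
        """Hand every queued request to the 'kernel' in a single step."""
        submitted = len(self.sq)
        while self.sq:
            sqe = self.sq.popleft()
            data = self._backing[sqe.offset:sqe.offset + sqe.length]
            self.cq.append(Cqe(sqe.user_data, data))
        return submitted

    def reap(self):
        """Collect whatever completions are available."""
        done = list(self.cq)
        self.cq.clear()
        return done
```

In the real interface the two queues live in memory shared with the kernel, which is what lets a backend submit a batch of reads and pick up the results later without a separate blocking call per page.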
3. Sync (legacy fallback)
- Preserves the legacy synchronous blocking behavior.
- Used for compatibility, testing, or when AIO is causing issues.
- io_method = 'sync' is essentially PostgreSQL’s pre-18 behavior.
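Choosing between the three paths comes down to a couple of lines in postgresql.conf. The small Python helper below is hypothetical (render_aio_settings is not part of any PostgreSQL tooling); it only renders the io_method and io_workers settings named in this post and rejects anything else. Note that io_method is read at server start, so changing it requires a restart.

```python
ALLOWED_IO_METHODS = ("worker", "io_uring", "sync")


def render_aio_settings(io_method="worker", io_workers=3):
    """Return postgresql.conf lines for the chosen AIO method.

    Hypothetical helper for illustration; the defaults mirror the
    PostgreSQL 18 defaults described above.
    """
    if io_method not in ALLOWED_IO_METHODS:
        raise ValueError(f"unknown io_method: {io_method!r}")
    lines = [f"io_method = '{io_method}'"]
    if io_method == "worker":
        # The pool size only matters when the worker method is in use.
        lines.append(f"io_workers = {io_workers}")
    return "\n".join(lines) + "\n"
```

Calling render_aio_settings("io_uring") yields a single io_method line, while the default call also emits the worker pool size.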
The road ahead: writes and direct I/O
While PostgreSQL 18’s AIO is a massive step forward, the journey isn’t complete:
- Reads Only: The current AIO implementation only supports reads. All write operations (like those performed by the background writer and the checkpointer) are still synchronous. Support for asynchronous writes is a key focus for future versions.
- Direct I/O: A major long-term goal is to implement Direct I/O, which would allow PostgreSQL to bypass the operating system’s page cache entirely. This eliminates the CPU overhead of copying data from the kernel’s page cache into PostgreSQL’s shared buffers and puts an end to “double caching”, but it’s a huge undertaking. Since every read or write would go straight to disk, with no OS cache or kernel read-ahead to hide the latency, it requires extensive changes to avoid disastrous performance, especially given PostgreSQL’s current pattern of many small I/O operations.
The introduction of Asynchronous I/O marks a pivotal moment for PostgreSQL: a fundamental architectural change that lays the groundwork for performance gains in releases to come. By moving read I/O off the critical path, PostgreSQL is set to become even more responsive and efficient under heavy load.
Limitations (for now)
- Read-only: AIO in v18 only supports reads.
- Writes are still synchronous (e.g., checkpoints, WAL writes).
- No Direct I/O Yet: Skipping the OS page cache is a future goal but will require major changes.
What’s Next?
In Part 2, we’ll explore:
- How to configure and test io_method
- Performance differences between worker and io_uring
- Monitoring with pg_stat_io (available since PG16) and the pg_aios view (new in PG18!)
- Real-world tuning tips from production scenarios
