PG18 Hacktober: 31 Days of New Features : Internationalization LIKE comparisons

In PostgreSQL 18, the LIKE operator becomes more powerful and more compliant with the SQL standard, it now supports use with nondeterministic collations.

If you’re dealing with case-insensitive, accent-insensitive or language-specific comparisons, this change is a game-changer for your search logic.

This post unpacks what changed, why it matters, and how to use it in your multilingual applications.

What’s the problem?

Prior to PostgreSQL 18:

If a column had a nondeterministic collation (e.g. und-x-icu or en-u-ks-level2),

You couldn’t use LIKE, ILIKE, or regex operators (~, ~*)

The query would throw an error like:

“nondeterministic collations are not supported for LIKE comparisons“

This was because nondeterministic collations can’t determine exact character equivalence, making pattern matching ambiguous, or so it seemed.

What changed in PostgreSQL 18?

Now, PostgreSQL 18 allows LIKE and ILIKE comparisons to be performed against columns that use nondeterministic collations, as long as the database can perform the operation meaningfully.

This brings PostgreSQL’s behavior closer to the SQL standard, which does not forbid LIKE on nondeterministic collations, and makes text search far more natural in:

Case-insensitive search
Accent-insensitive text filtering
Multilingual applications

SQL Standard perspective

According to the SQL standard:

The LIKE predicate is a pattern-matching operator that compares a string against a pattern, based on the character collation of the input.

The standard does not require deterministic comparisons, it leaves implementation up to the database system, assuming the collation supports meaningful ordering and comparison.

PostgreSQL 18 now complies with that expectation.

Test Cases:

Test Case 1: Prior to PostgreSQL 18

1. Create Collation

pg_18_hacktober=# CREATE COLLATION case_accent (provider = icu, locale = 'und-u-ks-level1', deterministic = false);
CREATE COLLATION

2. Create Table & Inserting Test Data

pg_18_hacktober=# CREATE TABLE foo ( i text COLLATE case_accent);
CREATE TABLE 

pg_18_hacktober=# INSERT INTO foo VALUES ('kush'), ('KUSH'),('kùsh'),('kùšh');
INSERT 0 4

3. Output

pg_18_hacktober=# SELECT * FROM foo WHERE i = 'kush';
  i   
------
 kush
 KUSH
 kùsh
 kùšh
(4 rows)

pg_18_hacktober=# SELECT * FROM opensource_db WHERE i LIKE  'k%';

ERROR:  nondeterministic collations are not supported for LIKE

Test Case 2: PostgreSQL 18

1. Create Collation

pg_18_hacktober=# CREATE COLLATION case_accent (provider = icu, locale = 'und-u-ks-level1', deterministic = false);
CREATE COLLATION

2. Create Table & Inserting Test Data

pg_18_hacktober=# CREATE TABLE foo ( i text COLLATE case_accent);
CREATE TABLE 
pg_18_hacktober=# INSERT INTO foo VALUES ('kush'), ('KUSH'),('kùsh'),('kùšh');
INSERT 0 4

3. Output

pg_18_hacktober=# SELECT * FROM foo WHERE i = 'kush';
  i   
------
 kush
 KUSH
 kùsh
 kùšh
(4 rows)

pg_18_hacktober=# SELECT * FROM opensource_db WHERE i LIKE  'k%';
  i   
------
 kush
 KUSH
 kùsh
 kùšh
(4 rows)

Summary

PostgreSQL 18 brings a small but mighty change to string matching:

You can now use LIKE on columns with nondeterministic collations, enabling:

Case-insensitive matching
Accent-insensitive filtering
Multilingual-friendly search

This brings PostgreSQL closer to the SQL standard and makes building internationalized apps easier than ever.

What’s Next?

In Day 29 of PG18 Hacktober, we shift from what you can compare to how fast you can do it, thanks to pg_unicode_fast, a brand-new collation available for encoding UTF8 in PostgreSQL 18.

What’s the problem?

Prior to PostgreSQL 18:

What changed in PostgreSQL 18?

SQL Standard perspective

According to the SQL standard:

Test Cases:

Test Case 1: Prior to PostgreSQL 18

1. Create Collation

2. Create Table & Inserting Test Data

3. Output

Test Case 2: PostgreSQL 18

1. Create Collation

2. Create Table & Inserting Test Data

3. Output

Summary

What’s Next?

Leave a Comment