After the previous blog post on PostgreSQL 14 New features for DBAs, I am here to present a few interesting Developer centric new features with this post.
Improved Data types
Multirange datatypes
Multiranges are basically sorted arrays of non-overlapping ranges with set-theoretic operations defined over them.
PostgreSQL 14 now adds support for multi-range types. These are data types that support grouping together non-contiguous ranges. There are both manual and automatic mechanisms for naming multirange types. Once can specify multirange type name using multirange_type_name attribute in CREATE TYPE.
The optional multirange_type_name parameter specifies the name of the corresponding multirange type. If not specified, this name is chosen automatically as follows. If the range type name contains the substring range, then the multirange type name is formed by the replacement of the range substring with multirange in the range type name. Otherwise, the multirange type name is formed by appending a multirange suffix to the range type name.
You can create multi ranges from the existing range types by adding the multirange suffix to the end of the range type name, e.g.:
- int4multirange, int8multirange: ranges of int and bigint range types
- nummultirange: multirange of numeric range types
- tstzmultirange, datemultirange, tsmultirange, : multirange of timestamptz, date, and time range types
Subscripting improved
PostgreSQL 14 now adds support for data type subscripting. Traditional variable-length (varlena) arrays all use array_subscript_handler(), while the existing fixed-length types that support subscripting use raw_array_subscript_handler(). This means that the support for subscripting is for the arbitrary types, not only arrays.
The data types – hstore and jsonb, in particular, would be a great pick:
EXTRACT function
The previous implementation of EXTRACT mapped internally to date_part(), which returned type double precision (since it was implemented long before the numeric type existed). This can lead to imprecise output in some cases, so returning numeric would be preferable.
The following minor changes of behavior result from the new implementation:
- The column name from an isolated EXTRACT call is now “extract” instead of “date_part”.
- Extract from date now rejects inappropriate field names such as HOUR. It was previously mapped internally to extract from timestamp, so it would silently accept everything appropriate for timestamp.
- Return values when extracting fields with possibly fractional values, such as second and epoch, now have the full scale that the value has internally (so, for example, ‘1.000000’ instead of just ‘1’).
JSON
What is JSON?
JSON stands for “JavaScript Object Notification” which is a type of data format popularly used by many applications. This means the data would be transmitted between web applications and servers in such a format. JSON was introduced as an alternative to the XML format. In the “good old days” the data used to get transmitted in XML format which is a heavyweight data type compared to JSON.
Before we get to the examples, here is a list of the major aspects of the operators:
Operator ->
Allows you to select an element based on its name.
Allows you to select an element within an array based on its index.
Can be used sequentially: ::json->'elementL'->'subelementM'->…->'subsubsubelementN'.
Return type is json and the result cannot be used with functions and operators that require a string-based datatype. But the result can be used with operators and functions that require a json datatype.
Operator ->>
Allows you to select an element based on its name.
Allows you to select an element within an array based on its index.
Cannot be used sequentially.
Return type is text and the result can be used with functions and operators that require a string-based datatype. For the same reason, sequential usage of the operator is not supported.
Operator #>
Allows you to select an element based on its path within the main JSON object. The path can consist of element names and array indexes, depending on what is needed.
Can be used sequentially: ::json#>'{elementname1,elementname2,index1,index2}'#>'{elementname3}'.
Return type is json and the result cannot be used with functions and operators that require a string-based datatype. But the result can be used with operators and functions that require a json datatype.
Operator #>>
Allows you to select an element based on its path within the main JSON object. The path can consist of element names and array indexes, depending on what is needed.
Cannot be used sequentially.
Return type is text and the result can be used with functions and operators that require a string-based datatype. For the same reason, sequential usage of the operator is not supported.
Basically, you can create a 'chain' of -> and #> operators, provided that you point to valid elements and indexes. Such a chain can end with each of the four operators. The last one determines whether the result can be used as input for other specific functions.
Note that if you want to use the result in combination with a function or operator that requires a text datatype, you have to use ->> or #>> as the last one in the chain. This might be important if you have a WHERE clause or a sub statement that refers to the result of the json operators. Let us proceed with some examples to see these operators in action.
For example, json subscripting:
SELECT ('{ "postgres": { "release": 14 }}'::jsonb)['postgres']['release'];
SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'] ;
SELECT ('[1, "2", null]'::jsonb)[1];
Another example:
CREATE TABLE cards (
id integer NOT NULL,
board_id integer NOT NULL,
data jsonb
);
In PostgreSQL 13 and early if you wanted to find all cards data with certain criteria you write some queries:
SELECT * FROM cards WHERE data->>'finished' = 'true';
SELECT count(*) FROM cards WHERE data ? 'ingredients';
With PG 14, you can simply use the quotation marks, though, it expects JSON-style string for the comparison.
SELECT * FROM cards
WHERE data['attributes']['status'] = '"finished"';
SQL-standard function body
This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations.
Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement:
CREATE FUNCTION add(a integer, b integer) RETURNS integer
LANGUAGE SQL
RETURN a + b;
or as a block
CREATE PROCEDURE insert_data(a integer, b integer)
LANGUAGE SQL
BEGIN ATOMIC
INSERT INTO tbl VALUES (a);
INSERT INTO tbl VALUES (b);
END;
You may refer to the commit message for more information.
SQL syntax changes
PostgreSQL common table expressions or CTEs – A common table expression is a temporary result set that you can reference within another SQL statement including SELECT, INSERT, UPDATE or DELETE.
Common Table Expressions are temporary in the sense that they only exist during the execution of the query. The following shows the syntax of creating a CTE:
WITH cte_name (column_list) AS (
CTE_query_definition
)
SELECT statement;
Nice things to note:
- It just works! Easy to read the query
- You can reference the CTE several times in the query, and it will be calculated only once
- Using DML statements in the CTE with a RETURNING clause
From PG14 the WITH RECURSIVE a syntactic convenience into a feature that accomplishes things not otherwise possible in standard SQL.
The new syntax additions are SEARCH order and CYLE detection.
When computing a tree traversal using a recursive query, you might want to order the results in either depth-first or breadth-first order.
For example:
WITH RECURSIVE search_tree(column_list) AS (
SELECT statement
) SEARCH DEPTH FIRST BY id SET ordercol
SELECT * FROM search_tree ORDER BY ordercol;
Tomorrow, we will be wrapping up our 9-part blog series with some Cool Little things of PG14. Stay tuned!
Leave a Reply