Enhancing data security with PostgreSQL’s pgcrypto Extension Part – 2

In the previous blog post, we discussed how to use pgcrypto for encryption and decryption to secure sensitive data within your PostgreSQL databases. In this blog, we will explore other important cryptographic features provided by the pgcrypto extension, including hashing and digital signatures. These functions offer additional layers of security for protecting data integrity, authentication, and verification.

What is Hashing?

Hashing is a process that transforms any input (such as passwords, files, or other data) into a fixed-size string of characters, typically represented as a hexadecimal number. The key property of a good hash function is that it is computationally infeasible to reverse (one-way transformation), meaning that you cannot obtain the original input from its hash value.

Hashing is commonly used for storing passwords securely and verifying data integrity. Even if an attacker gains access to the hash, they cannot easily reverse-engineer it to obtain the original data.

How Hashing Works in pgcrypto

pgcrypto offers multiple hashing functions, including popular algorithms such as MD5, SHA-1, SHA-256, and SHA-512. These functions are designed to produce cryptographic hash values from input data, providing a unique fingerprint for that data.

Key Hashing Functions in pgcrypto:

  • One-way function: You can easily compute the hash from the input data, but it’s computationally infeasible to reconstruct the original input from just the hash.
  • Deterministic: The same input will always produce the same hash output.
  • Fast computation: Hashing algorithms are designed to be efficient.
  • Avalanche effect: Even a tiny change in the input data creates a significantly different hash output.
  • Fixed output size: Regardless of input size, the hash output has a fixed length.
  • Collision resistance: It should be extremely difficult to find two different inputs that produce the same hash output.

Hashing with pgcrypto in PostgreSQL

Let’s start by enabling the pgcrypto extension:

CREATE EXTENSION IF NOT EXISTS pgcrypto;

CREATE EXTENSION

The digest() Function

digest(data, type): This function computes the hash of the input data using the specified hash algorithm type (such as SHA-256 or MD5).

The digest() function can be used to generate a hash of a given input, using various hashing algorithms. It can be applied to both text and bytea data types, and the hash is returned as a bytea.

Common hashing algorithms include:

  • MD5 (now considered insecure)
  • SHA-1 (also considered weak)
  • SHA-256, SHA-384, and SHA-512
CREATE TABLE users (                                                                 
    user_id SERIAL PRIMARY KEY,
    username VARCHAR(100),
    password BYTEA                                               
);
CREATE TABLE
Time: 51.945 ms

INSERT INTO users (username, password)
VALUES (
    'Rohit',
     digest('rohit45@', 'md5')
);
INSERT 0 1
Time: 23.677 ms

When we query the table, we'll see the hashed password:
SELECT * FROM users WHERE username = 'Rohit';
 user_id │ username │              password              
─────────┼──────────┼────────────────────────────────────
       1 │ Rohit    │ \xe83f41f2cf366852f40f1bdc96e8300a
(1 row)
Time: 3.194 ms

Password Authentication

To verify a password using these hash functions, you can use the following query:

SELECT (digest('rohit45@', 'md5') = password) AS password_matches 
FROM users 
WHERE username = 'Rohit';
 password_matches 
──────────────────
 t
(1 row)
Time: 0.423 ms

The HMAC Function:
The hmac(data, key, type) function in PostgreSQL is used to generate a Hash-based Message Authentication Code (HMAC), which is typically used for verifying the integrity and authenticity of data. HMAC is not a traditional encryption and decryption mechanism; it’s a method for creating a cryptographic hash from a message and a secret key. The process is one-way (just like hashing), and it’s not designed to be reversible, so it does not provide decryption.

Key Points About HMAC:

  • HMAC is not encryption: It generates a hash based on a secret key and the data, but you cannot decrypt the hash to retrieve the original data.
  • HMAC is useful for integrity and authentication: It ensures the data has not been tampered with, and it proves the authenticity of the sender if the key is known.

Use Case for HMAC:

  • Message authentication: HMAC ensures that the data hasn’t been altered and that the sender was who they claimed to be. It’s often used in API authentication and digital signatures.

HMAC in PostgreSQL:

In PostgreSQL, the hmac() function generates an HMAC using a cryptographic algorithm (md5, sha256, sha512, etc.) and a secret key.

INSERT INTO users (username, password)
VALUES (
    'virat',
    hmac('virat18@'::bytea, 'mysecretkey'::bytea, 'sha256')
);

INSERT INTO users (username, password)
VALUES (
    'msdhoni',
    hmac('msd07@'::bytea, 'mysecretkey'::bytea, 'md5')
);

SELECT * FROM users;
 user_id │ username │                              password                              
─────────┼──────────┼────────────────────────────────────────────────────────────────────
       1 │ virat    │ \x0dfd5b7ffe9e645854491923613c61fa43b6cf64b4d304324c1102e3b9ff02c2
       2 │ msdhoni  │ \x6b61a86fe795c8cac9f7bdabfbbb5c38
(2 rows)
Time: 0.705 ms

Password Authentication

To verify a password using these hash functions, you can use the following query:

SELECT (hmac('virat18@', 'mysecretkey'::bytea, 'md5') = password) AS password_matches 
FROM users 
WHERE username = 'virat';
 password_matches 
──────────────────
 t
(1 row)
Time: 0.871 ms
Featuredigest()hmac()
PurposeComputes a hash of input data.Computes a HMAC using a secret key.
InputData and hash function.Data, secret key, and hash function.
OutputHash value (data integrity).HMAC value (data integrity + authenticity).
Use CaseData integrity checks, unique IDs.Authentication, secure message verification.

Password Hashing Functions

For even better password security, pgcrypto provides specialized functions:

crypt() and gen_salt()

The crypt() function is specifically designed for password hashing with built-in salting:

The gen_salt() function creates a random salt with the specified algorithm:

  • ‘bf’ – Blowfish (most secure, recommended)
  • ‘md5’ – MD5 (legacy)
  • ‘xdes’ – Extended DES (legacy)

To verify a password hashed with crypt():

SELECT crypt('password', gen_salt('bf'));
                            crypt                             
──────────────────────────────────────────────────────────────
 $2a$06$qJwxmgWJCDU8PavtS.AQDuhD87jP.OEspdfcDRyEZ8Asn4Q91qhYm
(1 row)
Time: 15.687 ms

SELECT crypt('password', gen_salt('bf'));
                            crypt                             
──────────────────────────────────────────────────────────────
 $2a$06$68CfObeY0j0oZBAH1eq2w.YRmibYKyjfUsAEkL6x/vZVd7Lm.8YhW
(1 row)
Time: 5.089 ms

Note on Salt and Hash Generation:

The crypt() function generates a new hash each time it’s called, even if the same password is provided. This is because the gen_salt('bf') function generates a random salt each time, ensuring that the hash values differ, which improves security by preventing identical hashes for identical passwords

CREATE TABLE users (
    user_id SERIAL PRIMARY KEY,
    username VARCHAR(100),
    password TEXT
);
CREATE TABLE
Time: 26.376 ms

INSERT INTO users (username, password)
VALUES ('sachin', crypt('sachin10@', gen_salt('bf', 10)));
INSERT 0 1
Time: 50.064 ms

SELECT * FROM users;
 user_id │ username │                           password                           
─────────┼──────────┼──────────────────────────────────────────────────────────────
       1 │ sachin   │ $2a$10$IJL/OG8ycYrRK5hzvnBhM.RbI4/013WJQ2/rMQdTu7.rZC.j0UtKe
(1 row)
Time: 0.384 ms

SELECT (password = crypt('sachin10@', password)) AS password_matches
FROM users
WHERE username = 'sachin';
 password_matches 
──────────────────
 t
(1 row)
Time: 46.207 ms

Salt and Hash Generation

The crypt() function generates a new hash every time it is called, even if the same password is used. This is because gen_salt() generates a random salt each time. The use of salt prevents attackers from using precomputed rainbow tables to crack passwords.

Security Best Practices

  1. Use strong algorithms: Prefer SHA-256 or above for general hashing, and Blowfish (bf) for password hashing
  2. Avoid deprecated algorithms: MD5 and SHA-1 are considered cryptographically weak
  3. Always use salts for passwords: This prevents rainbow table attacks
  4. Use appropriate key lengths: Longer keys provide better security for HMAC
  5. Rotate secret keys periodically: This limits the impact of potential key compromise
FeatureGeneral HashingPassword Hashing
PurposeCreate unique fingerprints for data.Securely store and verify user passwords.
OutputFixed-size hash value.Fixed-size hash value, often with salt.
Common Use CasesData integrity, unique identifiers.User authentication, password storage.
AlgorithmsMD5, SHA-1 (Weak, hence no longer recommended)
SHA-256, SHA-512.(Recommended)
Blowfish (bf), bcrypt, etc.
SaltingTypically not used.Always used to enhance security.
VerificationNot applicable.Includes a verification step for login.

Conclusion

In this blog, we explored how the pgcrypto extension in PostgreSQL can be used for a variety of cryptographic functions such as hashing, HMAC, and password hashing. These functions are crucial for ensuring data integrity, authenticating users, and securely storing sensitive information like passwords. By following best practices—like using strong hashing algorithms, salting passwords, and securely managing secret keys—you can significantly enhance the security of your PostgreSQL database. Whether you are securing user passwords or verifying data integrity, pgcrypto provides powerful tools to help protect your data from unauthorized access.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top