PostgreSQL Extensions: pg_trgm, pgcrypto, pg_stat_statements and Others

Java · Go · Node · Python

What PostgreSQL extensions are, which ones are worth enabling on any project and why: pg_stat_statements, pgcrypto, pg_trgm, btree_gist, citext, pg_partman, pg_repack — explained from scratch.

Published 27 июня 2026 ~8 min read Author: Вадим Викулин

← Back to the section

PostgreSQL is one of the few databases you can extend from the inside. Instead of baking everything into the core, the developers moved a lot of useful functionality into extensions: separate modules that install with a single command and add new functions, data types, and index algorithms.

Let's look at how this mechanism works, which extensions are worth knowing, and where they help.

What an extension is and how to enable it

An extension is a package of SQL objects (functions, types, operators, index methods) that are added to a specific database with a single command:

CREATE EXTENSION IF NOT EXISTS pgcrypto;

The IF NOT EXISTS keyword protects against an error if the extension is already installed. The operation itself is safe and fast.

To see what's already installed:

SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE installed_version IS NOT NULL;

Most standard extensions ship with PostgreSQL and don't require anything extra to be installed on the server — you just need to enable them in the right database.

Three extensions worth enabling on any project

There are three extensions that cost almost nothing in resources but are needed regularly:

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
CREATE EXTENSION IF NOT EXISTS pgcrypto;
CREATE EXTENSION IF NOT EXISTS pg_trgm;

It's best to enable them right away — that way you won't have to remember them at the critical moment.

pg_stat_statements — see which queries are slow

Without this extension, finding a slow query in production is extremely hard: PostgreSQL doesn't keep query history by default. pg_stat_statements fixes this: it accumulates statistics for each unique query — how many times it ran, how much total time it took, how much data it read.

One caveat: the extension requires a single additional setting in postgresql.conf and a cluster restart:

shared_preload_libraries = 'pg_stat_statements'

After the restart, enable the extension and you can start looking at the statistics:

CREATE EXTENSION pg_stat_statements;

-- top 10 queries by total execution time
SELECT query, calls, total_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

This is the first tool to reach for when investigating performance problems.

pgcrypto — UUIDs, password hashes, cryptographic functions

The pgcrypto extension adds cryptographic functions directly to SQL.

Generating UUIDs. Starting with PostgreSQL 13, the gen_random_uuid() function is available without any extension, but for compatibility with older versions it's convenient to have pgcrypto:

SELECT gen_random_uuid();
-- e.g. a9f1a2b3-1c2d-4e5f-8a9b-0c1d2e3f4a5b

Hashing passwords. pgcrypto supports bcrypt — an algorithm specifically designed for storing passwords (deliberately slow, to make brute-forcing harder):

-- store a password
INSERT INTO account (email, password_hash)
VALUES ('user@example.com', crypt('plaintext', gen_salt('bf', 10)));

-- verify the password on login
SELECT id FROM account
WHERE email = 'user@example.com'
  AND password_hash = crypt('plaintext', password_hash);

In practice, hashing is more often done on the application side so the plaintext password never reaches the database. But for small projects the pgcrypto option works just fine.

Data hashes. You can compute SHA-256 and other hashes right in a query:

SELECT digest('some text', 'sha256');

An important caveat: encrypting columns directly in the database via pgp_sym_encrypt is not a great idea. The database then knows the keys, which creates an extra vector for leaks. For storing sensitive data it's better to encrypt on the application side or use disk encryption at the infrastructure level.

pg_trgm — substring search and fuzzy search

A plain LIKE '%ivan%' in PostgreSQL doesn't use indexes — the database is forced to scan every row of the table. On large tables this is very slow.

pg_trgm solves the problem with trigrams: the text is split into three-character fragments, and a special GIN index is built on them. After that, substring search starts using the index and runs fast:

CREATE EXTENSION pg_trgm;

CREATE INDEX ix_customer_name_trgm
    ON customer USING gin (full_name gin_trgm_ops);

-- now this query goes through the index
SELECT * FROM customer WHERE full_name ILIKE '%ivan%';

An additional capability is fuzzy search that tolerates typos. The similarity function returns a number between 0 and 1: the closer to 1, the greater the match:

-- find similar names, even with a typo
SELECT full_name, similarity(full_name, 'ivann') AS score
FROM customer
WHERE similarity(full_name, 'ivann') > 0.4
ORDER BY score DESC
LIMIT 10;

This is useful for autocomplete forms and fuzzy lookups against reference tables.

btree_gist — non-overlap constraint

PostgreSQL has an EXCLUDE mechanism: an integrity constraint that forbids rows from "overlapping" on a given condition. The classic example is bookings: one room cannot be booked twice for an overlapping period.

By default, EXCLUDE works only with geometric and range types via a GiST index. btree_gist adds support for ordinary scalar types (integers, text) to the same index, which lets you combine them with ranges:

CREATE EXTENSION btree_gist;

CREATE TABLE booking (
    id      bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    room_id bigint NOT NULL,
    period  tstzrange NOT NULL,
    EXCLUDE USING gist (room_id WITH =, period WITH &&)
);

This database-level constraint guarantees that for the same room_id there won't be two records with an overlapping period. No application logic needed — the database itself won't let a conflict through.

citext — case-insensitive text

A typical problem with email addresses: a user signed up as User@Example.com but logs in as user@example.com. For the comparison to work correctly, you usually have to write LOWER(email) = LOWER($1) everywhere or force the address into lowercase before storing.

citext (case-insensitive text) is a data type that makes comparisons case-insensitive automatically:

CREATE EXTENSION citext;

CREATE TABLE account (
    id     bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email  citext NOT NULL UNIQUE
);

-- both queries return the same row
SELECT * FROM account WHERE email = 'user@example.com';
SELECT * FROM account WHERE email = 'USER@EXAMPLE.COM';

The unique index is case-insensitive too: User@Example.com and user@example.com are treated as one value.

pgstattuple — how much space is wasted

Over time, "dead" rows accumulate in PostgreSQL tables — deleted or updated records that VACUUM hasn't cleaned up yet. This takes up disk space and slows down queries.

pgstattuple lets you precisely measure how "bloated" a table or index is:

CREATE EXTENSION pgstattuple;

-- table statistics: how many live rows, how many dead, how much free space
SELECT * FROM pgstattuple('orders');

-- index statistics: leaf page density
SELECT * FROM pgstatindex('ix_orders_customer');

-- fast approximate estimate (doesn't read the whole table)
SELECT * FROM pgstattuple_approx('orders');

When dead_tuple_percent is high or an index's avg_leaf_density is low — it's time to run VACUUM or consider a rebuild.

unaccent — search that ignores diacritics

Diacritical marks (accent marks) are umlauts, accents, and similar symbols in European languages: é, ü, ñ. unaccent strips them, reducing the text to basic ASCII:

CREATE EXTENSION unaccent;

SELECT unaccent('Naïve café résumé');
-- result: Naive cafe resume

This is useful when you need a search that finds cafe for the query café and vice versa. The extension is usually used together with full-text search:

SELECT * FROM article
WHERE search_doc @@ plainto_tsquery('russian', unaccent($1));

pg_partman — automating partitions

Partitioning lets you split a large table into physical pieces (partitions) — for example, by month. PostgreSQL supports this natively, but creating new partitions by hand every month is inconvenient.

pg_partman automates the process: it creates future partitions ahead of time and, if needed, drops old ones:

CREATE EXTENSION pg_partman;

-- configure automatic monthly partitioning
SELECT partman.create_parent(
    p_parent_table => 'public.event_log',
    p_control      => 'occurred_at',
    p_type         => 'native',
    p_interval     => '1 month',
    p_premake      => 4   -- create 4 partitions ahead
);

-- run on a schedule (for example, via pg_cron)
SELECT partman.run_maintenance('public.event_log');

Without pg_partman, the same thing requires manual DDL or writing maintenance scripts.

pg_repack — reorganize a table without stopping the database

VACUUM FULL removes table bloat by rebuilding the table completely — but it holds an exclusive lock while doing so. For a large table this means several minutes when nobody can read or write.

pg_repack does the same thing but without the long lock: it builds a new copy of the table in the background and quickly switches over to it at the end:

pg_repack -d mydb -t order_doc        # reorganize the table
pg_repack -d mydb -i ix_order_status  # rebuild the index

This is an external utility installed separately, but it's worth it for databases that can't be taken offline for long.

Common mistakes

Enabling everything "just in case". Each extension registers objects in the database schema and slightly increases the complexity of the environment. Only enable what you actually need.

Using hstore in new code. hstore is a legacy type for storing key-value pairs that predates JSONB. For new code always use jsonb — it's more powerful and better supported.

Using uuid_generate_v4() from uuid-ossp. This extension was the standard before PostgreSQL 13. Nowadays, for UUID v4, use gen_random_uuid() from pgcrypto (or the built-in function in PG 13+). If you need UUID v7 (with time-based ordering), generate it on the application side.

Adding pg_stat_statements without shared_preload_libraries. Without this setting the extension will be created but won't work — it silently collects nothing until you restart with the correct config.

In short

Extensions are enabled with the CREATE EXTENSION IF NOT EXISTS <name> command and add new functions, types, and index algorithms to a specific database.
On any project it's worth enabling pg_stat_statements (see slow queries), pgcrypto (UUIDs, bcrypt), and pg_trgm (indexed substring search).
pg_stat_statements requires shared_preload_libraries in the config and a cluster restart — do it once when starting the project.
btree_gist is needed for EXCLUDE constraints that combine scalar fields with ranges (for example, bookings).
citext removes the need for constant LOWER() when comparing emails and logins.
pg_partman automates the scheduled creation of partitions.
pg_repack reorganizes bloated tables and indexes without locking — unlike VACUUM FULL.
hstore and uuid-ossp are legacy; for new code use jsonb and gen_random_uuid().