PostgreSQL Composite Indexes and the Leftmost Prefix Rule

Java · Go · Node · Python

PostgreSQL composite indexes: why an index exists but does not work, the leftmost prefix rule, column order (equality → range → ORDER BY), covering INCLUDE, and FK without an index.

Published 27 июня 2026 ~8 min read Author: Вадим Викулин

← Back to the section

The most common mistake with indexes is not "there is no index" but "the index exists, yet it does not work for this query." The cause is almost always the wrong column order in a composite index.

How a composite index is built

Imagine a phone book sorted by last name, then by first name within a last name, then by middle name within a first name. Finding all the "Smiths" is easy. Finding all the "Smith Alexanders" is easy too — you just jump to the right "Smiths." But finding all the "Alexanders" without a last name means flipping through the entire book.

A composite B-tree index works exactly the same way. An index (a, b, c) is physically sorted by a, then by b within each a, then by c within that.

This is the leftmost prefix rule: the index is used only if the condition includes the leftmost column (or several consecutive columns from the start).

Query	Uses the index?
`WHERE a = ?`	yes, efficiently
`WHERE a = ? AND b = ?`	yes
`WHERE a = ? AND b = ? AND c = ?`	yes
`WHERE a = ? AND c = ?`	on `a` — yes, `c` is checked by a separate filter
`WHERE b = ?`	no (full table scan)
`WHERE c = ?`	no
`WHERE b = ? AND c = ?`	no

For WHERE b = ? without a, the planner will choose a Seq Scan, because the tree has no entry point by b.

Order in WHERE does not matter

The order of conditions in a WHERE clause does not affect index usage — the PostgreSQL optimizer arranges the conditions in the right order itself.

-- index (a, b, c)
WHERE a = 1 AND b = 2 AND c = 3   -- uses the index
WHERE c = 3 AND b = 2 AND a = 1   -- uses the index in exactly the same way
WHERE b = 2 AND a = 1             -- works like WHERE a = 1 AND b = 2

In EXPLAIN you can see the difference between a condition that goes through the tree and a condition used as a filter:

Index Cond: ((a = 1) AND (b = 2))   ← went through the index tree
Filter:     (c = 3)                  ← checked after reading the rows

Index Cond is navigation through the tree. Filter is an extra check on rows that have already been read.

How to choose the column order

Three rules apply here, and they work together.

Equality columns first

Columns that most queries filter with = go first. This gives the most precise navigation through the tree.

-- If 90% of queries filter by status:
CREATE INDEX ix_orders_status_created ON orders (status, created_at);

The popular advice "put the most selective column first" is misleading. The more accurate rule: the first column is the one that most often has an equality condition, and only after that do you look at selectivity.

Example: an orders table with filters on status (5 values, but in 90% of queries) and customer_id (a million values, but only in 10% of queries). It is better to create (status, created_at) for the main queries and a separate (customer_id) for the rest, than a single complex (customer_id, status, created_at).

Range conditions last

After a range condition (>, <, BETWEEN, LIKE 'prefix%'), the tree stops being used for the following columns.

-- good: range last
CREATE INDEX ix_orders_status_created ON orders (status, created_at);
WHERE status = 'NEW' AND created_at > now() - interval '1 day'
-- Index Cond: (status = 'NEW') AND (created_at > ...)

-- bad: range first, status falls into Filter
CREATE INDEX ix_orders_created_status ON orders (created_at, status);
WHERE status = 'NEW' AND created_at > now() - interval '1 day'
-- Index Cond: (created_at > ...)
-- Filter:     (status = 'NEW')

In the second case PostgreSQL will read every row for the required period and only then filter by status — instead of jumping straight to the right branch of the tree.

Order matches ORDER BY

If a query often sorts its results, it makes sense to reflect that in the index — then PostgreSQL will not perform a separate sort.

CREATE INDEX ix_msg_user_at ON messages (user_id, created_at DESC);

-- the index works for both the filter and the sort — with no extra Sort operation
SELECT * FROM messages WHERE user_id = ? ORDER BY created_at DESC LIMIT 20;

If the index is created with ASC but the query asks for DESC, PostgreSQL can walk the index backwards (Index Scan Backward). This works, but it is slightly slower.

Duplicate indexes — extra overhead

If you already have an index (a, b, c), a separate index (a) is not needed: any query that would use (a) will use (a, b, c) in exactly the same way by the leftmost prefix rule.

Duplicate indexes:

take up extra disk space;
slow down INSERT, UPDATE, DELETE — every operation updates all indexes on the table;
confuse the planner.

You can check for similar indexes like this:

SELECT indexrelname, indrelid::regclass, pg_size_pretty(pg_relation_size(indexrelid))
FROM pg_index
JOIN pg_stat_user_indexes USING (indexrelid)
ORDER BY indrelid, indkey;

Index Only Scan is not always fast

When PostgreSQL uses an Index Only Scan, it seems like "the index worked." But that does not always mean a fast query.

Example: there is an index (status, created_at) and a query:

SELECT count(1) FROM orders WHERE created_at > now() - interval '7 days';

In EXPLAIN it looks like this:

Index Only Scan using ix_orders_status_created on orders
  Filter: (created_at > '2026-06-20')
  Rows Removed by Filter: 8420134
  Heap Fetches: 0

Here created_at is the second column. Without a condition on status, PostgreSQL scans the entire index and applies a Filter. Rows Removed by Filter: 8420134 means 8 million throwaway rows. This is a full scan of the index, not a precise lookup.

For such a query you need a separate index (created_at) or (created_at, status).

One more thing: Heap Fetches > 0 in an Index Only Scan means that PostgreSQL still went to the table for some of the rows — apparently the visibility map is stale. This is fixed by running VACUUM.

Covering index with INCLUDE

An ordinary composite index stores all of its columns in the tree. Sometimes you need to add columns for reading only — without affecting the order in the tree and without bloating the key. That is what INCLUDE is for.

CREATE INDEX ix_orders_customer_inc
    ON orders (customer_id) INCLUDE (status, created_at, total);

-- PostgreSQL can answer the query from the index alone, without touching the table
SELECT customer_id, status, created_at, total
FROM orders
WHERE customer_id = ?;

Columns from INCLUDE:

do not affect the sort order in the tree;
are stored only in the index leaves;
let you avoid an extra table read (heap fetch).

Such an index is called a covering index. This is useful for report queries that read a fixed set of columns by a single condition.

FK without an index — a hidden trap

PostgreSQL does not create an index on a foreign key automatically. This means that when a parent row is deleted, PostgreSQL fully scans the child table — to check whether there are any references.

CREATE TABLE order_item (
    id       bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    order_id bigint NOT NULL REFERENCES order_doc(id) ON DELETE CASCADE
);

-- without this index, DELETE FROM order_doc WHERE id = ? = a full scan of order_item
CREATE INDEX ix_order_item_order ON order_item (order_id);

On a large order_item table without an index, deleting a single parent row can take seconds and block other operations. You should always add an index on the FK column.

Functional index

An ordinary index stores column values as they are. If queries filter by the result of a function — for example, lower(email) for case-insensitive search — you need a functional index.

CREATE INDEX ix_account_email_lower ON account (lower(email));

-- the query must match the expression in the index exactly
SELECT * FROM account WHERE lower(email) = lower('IVAN@EXAMPLE.COM');

The same works with COALESCE, EXTRACT, and computed expressions. The key condition: the query must use the same expression as in the index definition.

In short

A composite index (a, b, c) is a phone book by a → b → c. A search without a = flipping through everything.
It works only on the leftmost prefix: WHERE a works, WHERE b does not.
The order of conditions in WHERE does not matter — the optimizer will rearrange them.
Equality columns first, then range (>, <, BETWEEN). Range first — the remaining columns fall into Filter.
If a query sorts often — align ORDER BY with the order and direction in the index.
(a) when (a, b, c) exists is redundant: it takes up space and slows down writes.
An Index Only Scan with a large Rows Removed by Filter is a full scan of the index, not a precise lookup.
INCLUDE — a covering index: extra columns without bloating the key.
FK without an index → a full scan of the child table when the parent is deleted.
A functional index is needed if you filter by lower(), COALESCE(), etc. — the query must use the same expression.