Covering Index in PostgreSQL: why you need INCLUDE

Java · Go · Node · Python

What a covering index is and why you need the INCLUDE syntax: how PostgreSQL reads data from an index without touching the table, why an index-only scan sometimes still goes to the heap, and how to fix it.

Published 25 мая 2026 updated 27 июня 2026 ~5 min read Author: Вадим Викулин

← Back to the section

An index helps you find the rows you need quickly. But after PostgreSQL has found them in the index, it usually goes one more time — to the table itself — to read the column values. A covering index lets you avoid that: all the data you need is right there in the index, and there's no need to touch the table.

Why PostgreSQL fetches data twice

A standard index in PostgreSQL is built as a tree (B-tree). The leaves of the tree store only the key columns and the row's address in the table (ctid). When PostgreSQL finds a row through the index, it reads that address and goes to the table for the remaining values.

This is called a heap fetch — an access to the heap (the heap is what the row storage in PostgreSQL is called). If the table is large and the data is scattered across many pages, each such access is a separate I/O operation.

An example scenario: a product listing. The query often looks like this:

SELECT name, price FROM product WHERE category_id = 1;

There's an index on category_id, PostgreSQL quickly finds the right rows, but then it goes to the table for name and price. On millions of rows, this is noticeable.

What an index-only scan is

An index-only scan is a mode in which PostgreSQL answers the query entirely from the index, without touching the table. There's one condition: every column the query needs (both in the WHERE and in the SELECT) must be in the index.

The problem with the naive approach: if you add name and price to the index key, they'll end up in every level of the tree, including the internal nodes. The tree becomes physically wider, takes up more space, and gets slower to search.

INCLUDE solves this elegantly: the additional columns are placed only in the leaves of the tree, not in the tree itself.

INCLUDE vs an extended key

Let's look at two versions of the same index:

-- Option 1: name and price go into the key — into every level of the tree
CREATE INDEX product_category_name_price_idx
  ON product (category_id, name, price);

-- Option 2: name and price only in the leaves — the tree stays narrow
CREATE INDEX product_category_idx
  ON product (category_id) INCLUDE (name, price);

The difference is fundamental:

	Key columns `(category_id, name, price)`	Covering `(category_id) INCLUDE (name, price)`
Where the extra columns live	Tree + leaves	Leaves only
Tree size	Larger	Smaller
Affects sorting	Yes — `ORDER BY category_id, name` with no sort	No
Participates in `UNIQUE`	Yes — over the whole set	No — only over the key part
Filtering by `name`	Index Cond	Filter (slower)

Conclusion: if name and price are needed only in the SELECT, and not in the WHERE or the ORDER BY — use INCLUDE. If you also filter or sort by them, add them to the key.

How to check that the index-only scan works

EXPLAIN (ANALYZE, BUFFERS)
SELECT name, price FROM product WHERE category_id = 1;

A good result looks like this:

Index Only Scan using product_category_idx on product
  Index Cond: (category_id = 1)
  Heap Fetches: 0

Heap Fetches: 0 — PostgreSQL didn't go to the table. That's the goal.

Why Heap Fetches can be non-zero

There's an important subtlety here. An index-only scan works only when PostgreSQL is certain that the data in the index is up to date. That certainty comes from a special structure — the visibility map: it marks the table pages in which all rows are visible to any transaction.

The problem: the visibility map is updated by VACUUM. On tables with frequent updates and deletes, autovacuum may not keep up. Then PostgreSQL is forced to peek at the table anyway — to check whether the row is visible to the current transaction. As a result, Heap Fetches ≠ 0, and the whole point of the covering index is lost.

Diagnosis is simple: look at the number in the Heap Fetches line. If it's comparable to the number of rows returned, autovacuum isn't keeping up.

The fix is to make autovacuum more aggressive for that specific table:

ALTER TABLE product SET (
    autovacuum_vacuum_scale_factor = 0.05,
    autovacuum_vacuum_insert_scale_factor = 0.1
);

UNIQUE + INCLUDE — the most common case

A typical task: you need name to be unique within a category, but you always want to return price too when selecting.

Without INCLUDE, you have to choose:

-- Bad option: extend the uniqueness — but then the same name with a different price
-- becomes two different records. The logic is broken.
CREATE UNIQUE INDEX ON product (category_id, name, price);

-- Awkward option: two indexes instead of one — double the write load
CREATE UNIQUE INDEX ON product (category_id, name);
CREATE INDEX        ON product (category_id, name, price);

With INCLUDE, one index does both jobs:

CREATE UNIQUE INDEX product_category_name_uq
  ON product (category_id, name) INCLUDE (price);

Uniqueness is checked over (category_id, name). price is available for an index-only scan and doesn't participate in the uniqueness check. This is the most indisputable use of INCLUDE.

Limitations

INCLUDE is supported by B-tree, GiST, SP-GiST. GIN does not support it — and GIN is used for full-text search and JSONB.
You can't put expressions or functions in INCLUDE: INCLUDE (lower(name)) won't work. Only plain table columns.
If the table is small and fits entirely in PostgreSQL's memory — the heap pages are hot anyway, and the gain from a covering index is minimal.
SELECT * makes an index-only scan pointless: you can't put every column into the index.

In short

A regular index finds rows, but for the column values it goes to the table. INCLUDE lets you put the needed columns in the index leaves and avoid that trip.
INCLUDE places columns only in the leaves of the B-tree — the tree stays narrow and searches stay fast.
Key columns participate in sorting, uniqueness, and filtering. Columns in INCLUDE are only for returning in the SELECT.
The goal is an index-only scan (Heap Fetches: 0). It works when the visibility map marks pages as fully visible.
If Heap Fetches grows — autovacuum isn't keeping up; tune autovacuum_vacuum_scale_factor for the table.
UNIQUE (a, b) INCLUDE (c) is the only way to return c without extending the uniqueness.
GIN indexes don't support INCLUDE.