← Back to the section

You've noticed that a table takes up 10 GB on disk, even though the real data in it is three times smaller. Or PostgreSQL suddenly starts slowing down for no obvious reason. Most likely the culprit is bloat — accumulated dead tuples. Let's look at why this happens and how to deal with it.

Why rows aren't deleted immediately

PostgreSQL uses an approach called MVCC (Multi-Version Concurrency Control). The idea is this: when you run an UPDATE or DELETE, the old row isn't erased right away. It's marked as "dead" and stays in the table file.

Why? Because at that moment another transaction may be reading the old version of the data — and must see it consistently. As long as even one open transaction can still see the old row, PostgreSQL isn't allowed to remove it.

As a result, after intensive work with data, dead tuples accumulate in tables. They are exactly what inflates the table size — this is bloat.

What VACUUM does

VACUUM is a cleanup command. It walks through the table and does three things:

Reclaims space from dead tuples. The space isn't returned to the operating system — the table file doesn't shrink. Instead, the freed pages are recorded in the FSM (Free Space Map), and new rows are written there, so the file doesn't keep growing.

Updates the visibility map. This is a map of pages where all rows are guaranteed to be visible to every transaction. It's needed for an efficient Index Only Scan — without it the planner has to reach into the table itself even for an index-only query.

Prevents XID wraparound. More on this in a separate section below.

VACUUM doesn't block reads and writes — it works with a SHARE UPDATE EXCLUSIVE lock. Only DDL commands like ALTER TABLE will have to wait.

Three forms of the command

-- Plain VACUUM: reclaims dead tuples, doesn't lock the table
VACUUM order_doc;

-- With analyze: also refreshes statistics for the planner
VACUUM ANALYZE order_doc;

-- Full table rewrite: returns space to the OS, but locks everything
VACUUM FULL order_doc;

VACUUM and VACUUM ANALYZE are safe to run at any time.

VACUUM FULL is a fundamentally different operation. It fully rewrites the table from scratch and takes an ACCESS EXCLUSIVE lock while doing so. This means: while VACUUM FULL is running, the table is unavailable for both reads and writes. On large tables this can take hours. In production it's almost never used.

A non-blocking alternative to VACUUM FULL is the pg_repack utility (or pg_squeeze). It rebuilds the table in the background without getting in the way of the application:

pg_repack -d mydb -t order_doc

VACUUM ANALYZE is worth running manually after a bulk UPDATE or DELETE — don't wait for the automation to kick in: the planner can work with stale statistics for a long time.

autovacuum — automatic cleanup

PostgreSQL runs VACUUM automatically through the autovacuum daemon. It watches every table and triggers cleanup once enough dead tuples have accumulated.

The default trigger threshold:

dead tuples > 50 + 0.2 × number of live rows

On a table of one million rows autovacuum will kick in at ~200,000 dead tuples. For most tables that's enough.

But for large and busy tables the default 20% is too much. The setting can be configured right on the table:

ALTER TABLE order_doc SET (
    autovacuum_vacuum_scale_factor = 0.05,
    autovacuum_analyze_scale_factor = 0.05
);

Now autovacuum will trigger at ~50,000 dead tuples instead of 200,000 — there will be less bloat.

How to detect bloat

A quick look through the system statistics:

SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(100.0 * n_dead_tup / NULLIF(n_live_tup, 0), 1) AS dead_pct,
       last_autovacuum,
       last_autoanalyze
FROM pg_stat_user_tables
WHERE n_live_tup > 1000
ORDER BY n_dead_tup DESC
LIMIT 10;

What to look at:

  • dead_pct > 20% — the table is a candidate for a manual VACUUM.
  • last_autovacuum long ago (more than a day on a hot table) — autovacuum isn't keeping up.

For a precise picture there's the pgstattuple extension:

CREATE EXTENSION pgstattuple;
SELECT * FROM pgstattuple('order_doc');
-- dead_tuple_percent > 30% — the table is clearly bloated

Indexes bloat too — separately from tables. This is fixed with the REINDEX CONCURRENTLY command (available since PostgreSQL 12) — it doesn't lock the table while rebuilding.

fillfactor — room for updates

By default PostgreSQL fills pages with data all the way (fillfactor = 100). On an UPDATE the updated row most often doesn't fit on the same page and is written to a new one — and an extra entry appears in the indexes.

If a table is updated heavily, it makes sense to leave some room on the pages:

ALTER TABLE order_doc SET (fillfactor = 85);
VACUUM FULL order_doc;  -- applies the new fillfactor (once, in a maintenance window)

With fillfactor = 85 pages are filled only up to 85%. When an UPDATE arrives, the updated row often fits on the same page — this is called a HOT update (Heap-Only Tuple). The index isn't touched in that case, so the load is lower.

When autovacuum can't keep up

There are several situations where autovacuum is working, but dead tuples still keep accumulating.

A long transaction. MVCC forbids deleting rows that even one open transaction can see. If you have a transaction hanging for hours — bloat will grow regardless of autovacuum.

SELECT pid,
       age(now(), xact_start) AS xact_age,
       state,
       query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_age DESC
LIMIT 10;

If xact_age is more than 5 minutes — it's worth figuring out what's going on there. Typical causes: an unclosed session in the code, a stuck background process.

A good practice is to set the timeout idle_in_transaction_session_timeout = '30s': it automatically closes transactions that were started but are doing nothing.

A stuck replication slot. If a replica or subscriber has fallen far behind, PostgreSQL holds on to the WAL files until they've been read. This also prevents autovacuum from reclaiming old rows.

Prepared transactions. The PREPARE TRANSACTION command creates "dangling" transactions that live until an explicit COMMIT PREPARED or ROLLBACK PREPARED.

SELECT * FROM pg_prepared_xacts;

If something is stuck in there — it blocks cleanup the same way a long ordinary transaction does.

autovacuum is turned off. Sometimes it's disabled before a bulk data load and people forget to turn it back on.

SELECT name, setting FROM pg_settings WHERE name = 'autovacuum';

SELECT relname, reloptions
FROM pg_class
WHERE relkind = 'r' AND reloptions::text LIKE '%autovacuum%';

XID wraparound — a critical situation

PostgreSQL has a 32-bit transaction counter (XID). When it exhausts its ~2.1 billion values and "wraps around," the database can confuse old and new transactions. To prevent this, VACUUM "freezes" old rows — marks them as visible to everyone forever.

To check how close you are to the danger mark:

SELECT datname,
       age(datfrozenxid)         AS xid_age,
       2147483647 - age(datfrozenxid) AS xids_left
FROM pg_database;

If xid_age is more than 1.5 billion — autovacuum isn't keeping up, and you need to intervene manually. In this state PostgreSQL starts throttling writes itself, warning of the danger.

What to do if a table is bloated

A step-by-step diagnostic algorithm:

  1. Check n_dead_tup > 20% of n_live_tup — autovacuum isn't doing its job.
  2. Check last_autovacuum — if it was long ago, look for the cause: long transactions, replication slots, disabled autovacuum.
  3. pgstattuple dead_tuple_percent > 30% — run VACUUM <table> manually. If that doesn't help — lower autovacuum_vacuum_scale_factor for this table.
  4. Default fillfactor on a heavily updated table — change it to 80–90.
  5. Need to return space to the OS — pg_repack during working hours instead of VACUUM FULL.
  6. Index bloat — REINDEX CONCURRENTLY.

Common mistakes

VACUUM FULL in production during working hours. It locks the table completely. Use pg_repack.

autovacuum disabled globally. Sometimes it's disabled to speed up a bulk load, but people forget to bring it back. Without autovacuum, bloat grows uncontrollably.

Default scale_factor 0.2 on large tables. On a table of 10 million rows that's 2 million dead tuples — too many. Lower it to 0.05 for busy tables.

Long transactions without a timeout. A single hanging transaction blocks cleanup for the whole database. Set idle_in_transaction_session_timeout.

Not running VACUUM ANALYZE after a bulk update. The planner will work with stale statistics until autovacuum gets to it on its own — and that can take a while.

In short

  • PostgreSQL doesn't delete rows immediately — old versions accumulate as dead tuples (MVCC).
  • VACUUM reclaims space inside the file, updates the visibility map, and prevents XID wraparound.
  • VACUUM doesn't return space to the operating system — the file doesn't shrink.
  • VACUUM FULL returns space but locks the table completely. In production, use pg_repack instead.
  • autovacuum runs automatically, but on large tables the default scale_factor of 0.2 is too high — lower it to 0.05.
  • Long transactions, stuck replication slots, and prepared transactions get in the way of cleanup.
  • fillfactor 80–90 on heavily updated tables lowers the load through HOT updates.
  • XID wraparound is a critical situation: keep an eye on age(datfrozenxid).
  • Indexes in PostgreSQL — how Index Only Scan and the visibility map work.
  • Transaction isolation levels — MVCC and row visibility.
  • Monitoring PostgreSQL — pg_stat_user_tables and other metrics.