← Back to the section

A database that "loses money on a crash" is not a database — it's a cache with ambitions. Let's start from scratch: what a transaction is, what ACID guarantees, and why choosing an isolation level is not theory but a practical decision with consequences.

What a transaction is and why you need it

Without transactions, any crash in the middle of an operation leaves the data in a broken state. Imagine a money transfer:

UPDATE account SET balance = balance - 100 WHERE id = 1;
-- the server crashes right here
UPDATE account SET balance = balance + 100 WHERE id = 2;

The first line ran, the second did not. The money vanished.

A transaction wraps several operations into a single indivisible action: either all of them apply, or none.

BEGIN;
UPDATE account SET balance = balance - 100 WHERE id = 1;
UPDATE account SET balance = balance + 100 WHERE id = 2;
COMMIT;

Now on any crash PostgreSQL either applies both lines or rolls both back.

ACID — the four transaction guarantees

ACID is an acronym describing exactly what a database guarantees.

A — Atomicity

A transaction is an indivisible action. If something goes wrong before COMMIT, all changes are rolled back automatically.

PostgreSQL provides this through the WAL (Write-Ahead Log) — a log of changes. Every operation is first written to the log, then applied to the data. On a server crash PostgreSQL reads the log and either replays the transaction to the end (if there was a COMMIT) or pretends it never happened.

C — Consistency

After a transaction the database stays in a correct state — all constraints are satisfied: NOT NULL, FOREIGN KEY, UNIQUE, CHECK.

INSERT INTO product (category_id, price, name) VALUES (99, 200, 'Cake');
-- ERROR: insert or update on table "product" violates foreign key constraint
-- The whole transaction is rolled back

It's important to understand: PostgreSQL only checks the constraints declared in the schema. What counts as "correct" data is decided by the architect, not the database.

I — Isolation

Concurrent transactions must not interfere with each other. This is the trickiest letter — in practice there are several isolation levels with different trade-offs between speed and strictness. We'll cover it in detail below.

D — Durability

If COMMIT returned success, the changes survive any failure: a server crash, a power outage, a reboot.

The same WAL: on COMMIT the log is flushed to disk (fsync), and only then does the client get a response. The data is on disk — even if the data pages haven't been written yet.

The synchronous_commit = off parameter speeds up commits but breaks this guarantee: a confirmed transaction can be lost on a crash. Appropriate only for non-critical data — metrics, access logs.

How PostgreSQL keeps transactions from blocking each other's reads — MVCC

The naive way to isolate transactions is locks: while one transaction writes, another can't read. That's slow.

PostgreSQL uses a different approach — MVCC (Multi-Version Concurrency Control). Every row stores several versions. An UPDATE doesn't change the row in place — it creates a new version, and the old one remains.

-- T1: updates the price
BEGIN;
UPDATE product SET price = 180 WHERE id = 3;
-- On disk there are now TWO versions of the row id=3:
--   old: price=150 (visible to transactions that started before T1)
--   new: price=180 (visible only after COMMIT of T1)

A concurrent transaction T2 that started before T1 committed sees the old version. No read locks — each transaction sees its own consistent snapshot of the data.

Old row versions are removed by the background VACUUM process once they become invisible to all active transactions.

The four isolation levels

Full isolation is the ideal, but an expensive one. The SQL standard defines four levels: the higher the level, the stricter the isolation and the larger the overhead.

LevelDirty ReadNon-Repeatable ReadPhantom ReadWrite Skew
Read Uncommittedin PG — noyesyesyes
Read Committed (default)noyesyesyes
Repeatable Readnononoyes
Serializablenononono

The anomalies in the table are concrete "surprises" a transaction may get during concurrent work. Let's go through each.

Read Uncommitted — in PostgreSQL behaves like Read Committed

The standard allows this level to read uncommitted changes (dirty read): a transaction sees data that another hasn't saved yet. PostgreSQL never does this at any level — thanks to MVCC only committed versions are visible. So Read Uncommitted in PostgreSQL = Read Committed.

Read Committed — the default level

Each query inside a transaction sees the data committed as of the start of that query. Between queries the data may change.

Non-repeatable read — the same query returns different values twice:

-- T1
BEGIN;
SELECT price FROM product WHERE id = 3;
-- → 150

-- T2 in the meantime
BEGIN;
UPDATE product SET price = 200 WHERE id = 3;
COMMIT;

-- T1 continues
SELECT price FROM product WHERE id = 3;
-- → 200  ← same row, different value within the same transaction
COMMIT;

Phantom read — the same range returns a different number of rows:

-- T1
BEGIN;
SELECT COUNT(*) FROM product WHERE category_id = 1;
-- → 3

-- T2 inserted a new row and committed

-- T1 continues
SELECT COUNT(*) FROM product WHERE category_id = 1;
-- → 4  ← within the same transaction, more rows
COMMIT;

When it fits: most CRUD services with short transactions — read one row, update it, commit. If all the logic fits into a single query, no anomalies occur.

Repeatable Read — a snapshot for the whole transaction

The transaction takes a snapshot of the data at the first query and sees that snapshot until the end. Non-repeatable read and phantom read disappear.

-- T1
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT price FROM product WHERE id = 3;
-- → 150

-- T2 updated price=200 and committed

-- T1 continues
SELECT price FROM product WHERE id = 3;
-- → 150  ← the snapshot is fixed, T2's change is invisible
COMMIT;

Additionally: if two transactions try to update the same row, the second one gets an error:

-- T2 tries to update a row that T1 has already updated and committed:
-- ERROR: could not serialize access due to concurrent update

The application must catch this error and retry the transaction. Without handling such an error the code will be unstable under load.

Write skew — what Repeatable Read doesn't catch

This is a subtle anomaly: two transactions read the same data, make independent decisions, and together break a rule that each one alone satisfied.

Example: the rule is "the total price of products in the 'Sweets' category doesn't drop below 250". Right now the total is 50 + 70 + 150 = 270.

-- T1: lowers the candy price by 70
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT SUM(price) FROM product WHERE category_id = 1;
-- → 270, slack of 20, can lower by 70
UPDATE product SET price = 80 WHERE id = 3;

-- T2 in parallel: lowers the marmalade price by 40
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT SUM(price) FROM product WHERE category_id = 1;
-- → 270 (T1 hasn't committed yet, T2's snapshot doesn't see it)
UPDATE product SET price = 30 WHERE id = 2;

-- T1 COMMIT → total: 50 + 70 + 80 = 200
-- T2 COMMIT → total: 50 + 30 + 80 = 160  ← violation!

Both transactions changed different rows, so there was no UPDATE conflict — Repeatable Read didn't see the problem.

When it fits: analytical queries, period reports where you need a consistent slice of data as of the start.

Serializable — full isolation

Transactions execute as if they ran strictly one after another, in some order. No anomalies, including write skew.

PostgreSQL implements this through SSI (Serializable Snapshot Isolation) — tracking dependencies between transactions. If PostgreSQL sees that two concurrent transactions form a conflict, one of them is rolled back:

-- The same example: T1 and T2 lower prices in one category in parallel

-- T1 COMMIT → success
-- T2 COMMIT → ERROR: could not serialize access due to read/write dependencies
--                    among transactions

T2 is rolled back, the application retries it — on the retry T2 sees the new data after T1 and makes the right decision.

The cost: PostgreSQL doesn't hold physical locks, but it rolls back transactions more often. The application must be able to retry a transaction on the SQLSTATE 40001 (serialization_failure) error. Read-only transactions on Serializable are almost free.

When it fits: money operations with invariants across several rows, uniqueness checks before insertion, any situation with write skew.

Which level to choose when

TaskLevel
CRUD: read one row, update, commitRead Committed (default)
Period report, need a consistent sliceRepeatable Read
Money transfer, limit check before debitingSerializable
Invariant across several rows, concurrent updatesSerializable
Long analytical query (read-only)Repeatable Read

A few practical rules:

  • Raise the level from the bottom up — each next one is more expensive.
  • If you chose Repeatable Read or Serializable — implement transaction retry on a serialization error. Without it the code is unstable under load.
  • SELECT ... FOR UPDATE is often simpler and clearer than a high isolation level: it locks a specific row, causes no rollbacks, and is easy to read in the logs.

In short

  • ACID — the four transaction guarantees: atomicity (all or nothing), consistency (constraints satisfied), isolation (concurrent transactions don't interfere), durability (data isn't lost after COMMIT).
  • WAL provides atomicity and durability — the log is flushed to disk before the client is acknowledged.
  • MVCC — each transaction sees a consistent snapshot without read locks.
  • Read Committed (default) — a snapshot per query; allows non-repeatable read and phantom read.
  • Repeatable Read — a snapshot for the whole transaction; eliminates non-repeatable and phantom reads, but not write skew.
  • Serializable — full isolation, including write skew; requires handling serialization errors in the application.
  • PostgreSQL never shows a dirty read at any level — MVCC lets you see only committed data.
  • Partitioning and sharding in PostgreSQL — when the database grows and a single table is no longer enough.
  • PG Style Guide — practical rules for working with PostgreSQL.