← Back to the section

PostgreSQL supports two unusual data types that most databases don't have: arrays and range types. They solve specific problems and, applied correctly, remove extra tables or dozens of lines of code. Let's look at when you need them and how to work with them.

Arrays: when a list is part of the record

Usually, if an article has tags, you create a separate article_tag table with a foreign key. That's the right thing in the general case. But sometimes it's unnecessary complexity for the sake of formality.

PostgreSQL lets you store an array right in a column:

CREATE TABLE article (
    id    bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    title text   NOT NULL,
    tags  text[] NOT NULL DEFAULT '{}'
);

INSERT INTO article (title, tags) VALUES ('Title', ARRAY['ddd', 'pg', 'architecture']);

You can filter by this array:

SELECT * FROM article WHERE 'pg' = ANY(tags);          -- has at least this tag
SELECT * FROM article WHERE tags @> ARRAY['ddd'];       -- contains all of the given ones
SELECT * FROM article WHERE tags && ARRAY['ddd', 'pg']; -- overlaps with the set

The @> (contains) and && (overlaps) operators replace a multi-line JOIN with filtering.

When an array fits

An array works well if all of these conditions hold:

  • Simple scalar values — tags, permissions, a list of supported locales.
  • Bounded size — dozens of elements at most, not thousands.
  • Elements have no attributes of their own — nobody edits the "third tag" separately, they only replace the whole set.
  • No references to a specific element from other tables.

When it's better to create a table

If even one of the conditions doesn't hold, an array becomes a problem:

  • An element has its own fields — for example, {name, value, quantity}. That's a table, not an array.
  • You need to reference a specific element — a position in the array is unstable when items are deleted.
  • Thousands of elements — an array isn't built for large volumes.
  • Elements are updated independently from different queries or transactions.

A typical mistake is storing order items in an array:

-- Common mistake: order items in an array
CREATE TABLE order_doc (
    id    bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    items jsonb[]  -- can't reference, can't validate status, can't update a single item
);

-- Correct: a separate table
CREATE TABLE order_doc (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY
);

CREATE TABLE order_item (
    id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    order_id  bigint NOT NULL REFERENCES order_doc(id) ON DELETE CASCADE,
    sku       varchar(50) NOT NULL,
    quantity  integer NOT NULL CHECK (quantity > 0),
    price     numeric(15, 2) NOT NULL
);

An index for searching by array

A regular B-tree index doesn't help with the @> and && operators. You need a GIN index:

CREATE INDEX article_tags_gin ON article USING gin (tags);

After that, queries with @>, &&, and ANY will use the index instead of a full table scan.

How to read an array in code

Drivers in different languages map a PostgreSQL array differently. Below is reading the tags text[] column.

// jOOQ generates String[] for text[] columns.
List<String> tags = dsl
    .select(ARTICLE.TAGS)
    .from(ARTICLE)
    .where(ARTICLE.ID.eq(articleId))
    .fetchOne(r -> Arrays.asList(r.get(ARTICLE.TAGS)));
// pgx maps text[] to []string directly.
var tags []string
err := pool.QueryRow(ctx,
    "SELECT tags FROM article WHERE id = $1", articleID,
).Scan(&tags)
// node-postgres (pg) returns text[] as string[].
const { rows } = await pool.query<{ tags: string[] }>(
    'SELECT tags FROM article WHERE id = $1',
    [articleId],
);
const tags: string[] = rows[0].tags;
# psycopg3 automatically maps text[] to list[str].
async with await psycopg.AsyncConnection.connect(dsn) as conn:
    row = await conn.execute(
        "SELECT tags FROM article WHERE id = %s", (article_id,)
    )
    tags: list[str] = (await row.fetchone())[0]

Range types: an interval as a first-class type

When you need to store a period, you usually set up two columns: valid_from and valid_to. That works, but it breeds verbose code and leaves room for mistakes.

PostgreSQL offers another way: a range type — a data type that is itself an interval.

Built-in range types:

TypeWhat it stores
int4rangeinterval of integers (integer)
int8rangeinterval of bigint
numrangeinterval of numeric
daterangeinterval of dates
tsrangeinterval of timestamp without time zone
tstzrangeinterval of timestamptz

A range fits when the entity itself is an interval: a tariff's validity period, a room booking, a price history, an age restriction.

One column instead of two

-- Two fields — lots of code, easy to get the boundary conditions wrong
CREATE TABLE tariff_v1 (
    id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    valid_from timestamptz NOT NULL,
    valid_to   timestamptz          -- NULL means "no end"
);

-- Range — an interval as a single atomic type
CREATE TABLE tariff_v2 (
    id       bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    validity tstzrange NOT NULL DEFAULT tstzrange(now(), 'infinity', '[)')
);

The query "what is in effect right now" looks different:

-- Two fields — a long WHERE with NULL handling
SELECT * FROM tariff_v1
WHERE valid_from <= now()
  AND (valid_to IS NULL OR valid_to > now());

-- Range — concise
SELECT * FROM tariff_v2
WHERE validity @> now();

The @> operator reads as "contains". PostgreSQL figures out the boundaries itself.

Range type operators

OperatorWhat it does
@>range contains a point or another range
&&two ranges overlap
-|-two ranges are adjacent (follow one another)
<< / >>one range is strictly to the left / right of the other

Open and closed edges

For a range type you can specify whether it includes its boundaries. The notation:

  • [a, b)a is included, b is excluded.
  • [a, b] — both are included.
  • (a, b) — both are excluded.
  • [a, +infinity) — no upper boundary.

For dates and times, the standard is [) — the left boundary is included, the right one is excluded. This matters for correct comparison:

Booking A: [14:00, 16:00)
Booking B: [16:00, 18:00)
→ They don't overlap — that's correct.

If both bookings were [] (closed on both sides), they would overlap at the point 16:00, even though physically they run one after another.

EXCLUDE constraint: guarding against overlaps at the database level

Bookings must not overlap. Usually this is checked in application code. But a check in code doesn't protect against simultaneous requests from two users.

PostgreSQL solves this at the database level with an EXCLUDE constraint. It's a unique capability that other popular databases don't have.

CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE booking (
    id      bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    room_id bigint    NOT NULL,
    period  tstzrange NOT NULL,
    EXCLUDE USING gist (room_id WITH =, period WITH &&)
);

The constraint reads like this: "don't allow two records where room_id is the same AND period overlaps". On an attempt to insert an overlapping booking, PostgreSQL returns an error:

INSERT INTO booking (room_id, period) VALUES (1, '[2026-05-07 14:00, 2026-05-09 12:00)');
-- OK

INSERT INTO booking (room_id, period) VALUES (1, '[2026-05-08 10:00, 2026-05-10 12:00)');
-- ERROR: conflicting key value violates exclusion constraint

The btree_gist extension is needed so that PostgreSQL can include a regular bigint type (for room_id) in the GiST index. Without it, an EXCLUDE with several columns of different types doesn't work.

Important: EXCLUDE protects even against simultaneous inserts from different connections. A check on the application side doesn't guarantee that.

How to read a range column in code

Below is reading the validity tstzrange column.

// jOOQ generates Range<OffsetDateTime> for tstzrange.
var row = dsl
    .select(TARIFF_V2.VALIDITY)
    .from(TARIFF_V2)
    .where(TARIFF_V2.ID.eq(tariffId))
    .fetchOne();

var r = row.get(TARIFF_V2.VALIDITY);
var from = r.lower().toInstant();
var to   = r.upper().toInstant();
// pgx reads tstzrange via pgtype.Range[pgtype.Timestamptz].
var validity pgtype.Range[pgtype.Timestamptz]
err := pool.QueryRow(ctx,
    "SELECT validity FROM tariff_v2 WHERE id = $1", tariffID,
).Scan(&validity)

from := validity.Lower.Time
to   := validity.Upper.Time
// node-postgres returns tstzrange as a string.
// It's more convenient to break it apart in SQL:
const { rows } = await pool.query<{ lower: Date; upper: Date }>(
    `SELECT lower(validity) AS lower, upper(validity) AS upper
       FROM tariff_v2 WHERE id = $1`,
    [tariffId],
);
const { lower, upper } = rows[0];
# psycopg3 maps tstzrange to TimestamptzRange.
from psycopg.types.range import TimestamptzRange

async with await psycopg.AsyncConnection.connect(dsn) as conn:
    row = await conn.execute(
        "SELECT validity FROM tariff_v2 WHERE id = %s", (tariff_id,)
    )
    validity: TimestamptzRange = (await row.fetchone())[0]
    lower = validity.lower
    upper = validity.upper

Multirange (PostgreSQL 14+)

Starting with PostgreSQL 14 there is a multirange — a set of non-overlapping intervals in a single column. It's useful for schedules with several availability windows:

CREATE TABLE schedule (
    id    bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    avail tstzmultirange NOT NULL
);

Instead of storing several rows with different periods — a single column with an ordered set of intervals.

In short

  • An array fits simple sets of scalar values (tags, permissions), when there are few elements and none of them has its own attributes. As soon as an element needs fields or foreign references — it's a table.
  • To search by an array you need a GIN index; the @> and && operators do a full scan without it.
  • A range type replaces a pair of valid_from/valid_to columns and gives convenient operators for working with intervals.
  • The standard for dates and times is the [) boundary: the left edge is included, the right one is excluded. That way two adjacent intervals don't overlap.
  • An EXCLUDE constraint forbids overlapping intervals right in the database — including protection during simultaneous requests. It requires the btree_gist extension.
  • A multirange (PostgreSQL 14+) — several intervals in a single column for schedules with multiple windows.
  • Index types in PostgreSQL — GIN and GiST: when to use which.
  • JSONB in PostgreSQL — when you need JSONB instead of an array.
  • Time and time zones — why tstzrange, not tsrange.