← Back to the section

When an application needs to store coordinates and search by them — "shops near me", "delivery within a zone", "nearest 10 points" — the first impulse is to create two fields lat and lon of type numeric. But then the query "all shops within 5 km of me" cannot run fast: the database can't index pairs of numbers as spatial objects.

PostGIS is a PostgreSQL extension that adds full-fledged geographic types, spatial indexes, and dozens of functions for working with geodata. It's what powers OpenStreetMap, Uber, and most government geo-services.

When you need PostGIS, and when plain numbers are enough

Setting up PostGIS makes sense if you're doing any of the following:

  • searching for objects within a radius ("cafés within 2 km");
  • calculating the distance between points;
  • checking whether a point falls inside a polygon (a district, a delivery zone);
  • finding the N nearest objects;
  • routes, geofences, boundary intersections.

If coordinates are simply stored and shown on a map but don't take part in queries, a pair of numeric(9,6) fields is perfectly enough. PostGIS is overkill here.

Installation

CREATE EXTENSION postgis;

The extension is available in most cloud databases (RDS, Cloud SQL, Supabase) and is installed with a single command. After that, new types and functions become available.

Two types: geography and geometry

PostGIS offers two main types for storing geodata.

geometry — a flat coordinate system. Distances are computed as if on a sheet of paper. It's faster and supports more operations, but gives accurate results only over small areas (a city, a region).

geography — a spherical system that accounts for the shape of the Earth. Slower, but accurate for any distances across the globe. ST_Distance returns meters, not degrees.

For most applications working with real coordinates (latitude/longitude), the right choice is geography(Point, 4326). The number 4326 is the identifier of the WGS84 coordinate system — the very one used by GPS.

geographygeometry
Accuracy over large distancesyesno
Unit of ST_DistancemetersSRID units
Speedslowerfaster
SRID4326 onlyany

Storing points

CREATE TABLE shop (
    id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name      text NOT NULL,
    location  geography(Point, 4326) NOT NULL
);

Inserting a point is done via the ST_GeogFromText function with a WKT string, or via ST_MakePoint:

-- via WKT
INSERT INTO shop (name, location) VALUES (
    'Shop on Nevsky',
    ST_GeogFromText('SRID=4326;POINT(30.3358 59.9343)')
);

-- via ST_MakePoint
INSERT INTO shop (name, location) VALUES (
    'Shop on Nevsky',
    ST_MakePoint(30.3358, 59.9343)::geography
);

A common mistake: mixing up the coordinate order. In WKT and ST_MakePoint it's longitude first, then latitude — that is, POINT(lon lat), not POINT(lat lon). Most tutorials get confused right here.

The spatial index — pointless without it

CREATE INDEX ix_shop_location_gist ON shop USING gist (location);

Without a spatial index (GiST), queries like "find shops within 5 km" run as a full-table scan. On a million rows — seconds. With a GiST index — milliseconds.

A GiST index is mandatory for any table with geographic data that you query against.

The main function for radius search is ST_DWithin. It takes two points and a distance in meters, and it uses the GiST index:

SELECT id, name,
       ST_Distance(location, ST_GeogFromText('SRID=4326;POINT(30.3158 59.9398)')) AS distance_m
FROM shop
WHERE ST_DWithin(
    location,
    ST_GeogFromText('SRID=4326;POINT(30.3158 59.9398)'),
    5000   -- 5 km in meters
)
ORDER BY distance_m
LIMIT 50;

An important point: ST_DWithin in WHERE uses the index. ST_Distance in SELECT only computes a value for the rows already selected. If you write WHERE ST_Distance(...) < 5000 without ST_DWithin, the index won't be used and the query will be slow.

Finding the N nearest points (kNN)

When you need not "everything within a radius" but "the 5 nearest", use the <-> operator in ORDER BY:

SELECT id, name
FROM shop
ORDER BY location <-> ST_GeogFromText('SRID=4326;POINT(30.3158 59.9398)')
LIMIT 10;

This operator also uses the GiST index and works fast even on millions of rows.

Polygons: zones, districts, boundaries

Polygons are stored the same way as points, only the type is geography(Polygon, 4326):

CREATE TABLE district (
    id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name      text NOT NULL,
    boundary  geography(Polygon, 4326) NOT NULL
);

CREATE INDEX ix_district_boundary_gist ON district USING gist (boundary);

-- Insert from GeoJSON
INSERT INTO district (name, boundary) VALUES (
    'Central',
    ST_GeogFromGeoJSON('{"type":"Polygon","coordinates":[[...]]}')
);

Check which district a point falls into:

SELECT name FROM district
WHERE ST_Contains(boundary::geometry, ST_MakePoint(30.3158, 59.9398));

Three main functions for working with polygons:

FunctionWhat it does
ST_Contains(A, B)A fully contains B
ST_Within(A, B)A is located inside B
ST_Intersects(A, B)A and B share at least one common point

Inserting from application code

The most universal way to pass coordinates from an application is a WKT string of the form SRID=4326;POINT(lon lat). Every driver accepts it without extra dependencies.

import org.jooq.DSLContext;
import org.jooq.impl.DSL;
import static com.example.db.Tables.SHOP;

public record GeoPoint(double lat, double lon) {
    public String toWkt() {
        return "SRID=4326;POINT(%f %f)".formatted(lon, lat);
    }
}

void insertShop(DSLContext ctx, String name, GeoPoint pt) {
    ctx.insertInto(SHOP)
       .set(SHOP.NAME, name)
       .set(SHOP.LOCATION,
            DSL.field("ST_GeogFromText({0})", Object.class, pt.toWkt()))
       .execute();
}
import (
    "context"
    "fmt"
    "github.com/jackc/pgx/v5/pgxpool"
)

type GeoPoint struct{ Lat, Lon float64 }

func (p GeoPoint) WKT() string {
    return fmt.Sprintf("SRID=4326;POINT(%f %f)", p.Lon, p.Lat)
}

func insertShop(ctx context.Context, pool *pgxpool.Pool, name string, pt GeoPoint) error {
    _, err := pool.Exec(ctx,
        `INSERT INTO shop (name, location) VALUES ($1, ST_GeogFromText($2))`,
        name, pt.WKT(),
    )
    return err
}
import { Pool } from 'pg';

interface GeoPoint { lat: number; lon: number }

function toWkt(pt: GeoPoint): string {
    return `SRID=4326;POINT(${pt.lon} ${pt.lat})`;
}

async function insertShop(pool: Pool, name: string, pt: GeoPoint): Promise<void> {
    await pool.query(
        `INSERT INTO shop (name, location) VALUES ($1, ST_GeogFromText($2))`,
        [name, toWkt(pt)],
    );
}
import psycopg
from dataclasses import dataclass

@dataclass
class GeoPoint:
    lat: float
    lon: float

    def wkt(self) -> str:
        return f"SRID=4326;POINT({self.lon} {self.lat})"

async def insert_shop(conn: psycopg.AsyncConnection, name: str, pt: GeoPoint) -> None:
    await conn.execute(
        "INSERT INTO shop (name, location) VALUES (%s, ST_GeogFromText(%s))",
        (name, pt.wkt()),
    )

If you need geometric operations on the application side, there are specialized libraries: JTS (org.locationtech.jts) for Java, Shapely for Python, turf.js for Node. They all work with the same formats (WKT, GeoJSON) as PostGIS.

Filtering by several fields

Suppose you need to search only for active shops within a radius. A regular GiST index covers only location, and the is_active = true filter is applied afterward. For such cases there's the btree_gist extension, which lets you create a composite spatial index:

CREATE EXTENSION btree_gist;
CREATE INDEX ix_shop_active_loc ON shop USING gist (is_active, location);

SELECT * FROM shop
WHERE is_active = true
  AND ST_DWithin(location, ST_GeogFromText('SRID=4326;POINT(30.3158 59.9398)'), 5000);

Such an index handles both conditions at once.

Migrating from lat/lon to geography

If a table already stores coordinates in lat and lon fields, you can add geodata without rewriting the application — via a generated column:

ALTER TABLE shop ADD COLUMN location geography(Point, 4326)
GENERATED ALWAYS AS (ST_MakePoint(lon, lat)::geography) STORED;

CREATE INDEX ix_shop_location_gist ON shop USING gist (location);

The old code keeps reading and writing lat/lon. New queries use location for spatial operations. No double-writing — the database computes the column automatically.

Common mistakes

Mixed-up coordinate order. POINT(lon lat) — longitude first, then latitude. This runs counter to intuition (we're used to saying "latitude, longitude"), which is why the mistake is very common. The result: points end up in a completely different place.

ST_Distance without ST_DWithin in WHERE. WHERE ST_Distance(...) < 5000 doesn't use the index. Always use ST_DWithin for filtering, and ST_Distance only in SELECT to display the distance.

geometry instead of geography for GPS coordinates. For small zones it works, but it gives inaccurate results over large distances — and the unit won't be meters. For coordinates in WGS84 format (what GPS gives you), use geography.

No spatial index. Sometimes people forget to create the GiST index and are surprised by slow queries. The index is mandatory right when the table is created.

The Haversine formula in code. Sometimes the distance is computed by hand in the application, even though PostGIS does it more accurately and faster right in the query.

In short

  • PostGIS is needed when data takes part in spatial queries: radius search, point-in-polygon, nearest N points.
  • For GPS coordinates use the type geography(Point, 4326) — it accounts for the shape of the Earth and returns distances in meters.
  • A GiST index is mandatory: without it, spatial queries run hundreds of times slower.
  • For radius search — ST_DWithin in WHERE (uses the index). ST_Distance — only in SELECT.
  • For the N nearest points — the <-> operator in ORDER BY.
  • In WKT and ST_MakePoint — longitude first, then latitude.
  • From an application it's convenient to pass coordinates as a WKT string SRID=4326;POINT(lon lat).
  • Switching from lat/lon to geography can be done via a generated column without changing existing code.
  • Index types in PostgreSQL — GiST and other indexes in detail.
  • Materialized views — cache the results of heavy geo-queries.
  • PostgreSQL extensions — how to install postgis and btree_gist.