Imagine you are building a task management service used by several companies at once. Employees of company A must not see the data of company B. How do you store all of this in a single PostgreSQL?
This is the problem of multi-tenancy — serving several "tenants" on shared infrastructure. There are three classic approaches, and the choice between them is always a trade-off between simplicity, isolation, and cost.
Three approaches and how they differ
| Row-per-tenant | Schema-per-tenant | DB-per-tenant | |
|---|---|---|---|
| Data | shared tables + tenant_id | separate schemas | separate databases |
| Isolation | logical (via RLS) | physical within the cluster | full physical |
| Migrations | once for everyone | separately for each tenant | separately for each tenant |
| Per-tenant backups | hard | medium (pg_dump --schema=) | easy |
| Scale | thousands of tenants | up to hundreds | a few to dozens |
The most common approach in practice is a hybrid: small and free customers are stored in a shared database, while large enterprise customers get their own database.
Row-per-tenant — the simple and scalable option
All customers live in the same tables. Each row gets a tenant_id column that indicates who it belongs to.
CREATE TABLE tenant (
id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name text NOT NULL UNIQUE
);
CREATE TABLE order_doc (
id bigint GENERATED ALWAYS AS IDENTITY,
tenant_id bigint NOT NULL REFERENCES tenant(id),
customer_id bigint NOT NULL,
PRIMARY KEY (tenant_id, id)
);
-- tenant_id as the first column of the index — queries only work on the tenant's data
CREATE INDEX ix_order_tenant_status ON order_doc (tenant_id, status);
The key rule: every query must include WHERE tenant_id = ?. Without it, one customer will see another's data — a critical security vulnerability.
An example of a correct query:
import org.jooq.DSLContext;
import static com.example.generated.tables.OrderDoc.ORDER_DOC;
public class TenantAwareDsl {
private final DSLContext dsl;
private final TenantContext ctx;
public Result<Record> selectOrders() {
return dsl.selectFrom(ORDER_DOC)
.where(ORDER_DOC.TENANT_ID.eq(ctx.currentTenantId()))
.fetch();
}
}
func selectOrders(ctx context.Context, pool *pgxpool.Pool, tenantID int64) (pgx.Rows, error) {
return pool.Query(ctx,
"SELECT * FROM order_doc WHERE tenant_id = $1",
tenantID,
)
}
async function selectOrders(pool: Pool, tenantId: bigint): Promise<QueryResult> {
return pool.query(
'SELECT * FROM order_doc WHERE tenant_id = $1',
[tenantId],
);
}
async def select_orders(conn: asyncpg.Connection, tenant_id: int) -> list[asyncpg.Record]:
return await conn.fetch(
"SELECT * FROM order_doc WHERE tenant_id = $1",
tenant_id,
)
The composite primary key (tenant_id, id) guarantees that all of the table's indexes are automatically segmented by tenant — one customer's queries never scan another's data.
Row-Level Security — a safety net against bugs in code
Filtering with WHERE tenant_id = ? is the first line of defense. But if a developer forgets to add the condition to one of the queries, data leaks.
Row-Level Security (RLS) is a PostgreSQL mechanism that adds filtering right at the database level. Even if the application sends SELECT * FROM order_doc with no conditions, the database returns only the rows of the current tenant.
ALTER TABLE order_doc ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON order_doc
USING (tenant_id = current_setting('app.tenant_id')::bigint);
Before running queries, you need to tell the database which tenant you are working with — through a session variable:
// inside the transaction, before running queries
connection.createStatement().execute(
"SET LOCAL app.tenant_id = " + tenantId
);
// after this, any SELECT/UPDATE is automatically filtered by the database
func setTenant(ctx context.Context, tx pgx.Tx, tenantID int64) error {
_, err := tx.Exec(ctx, fmt.Sprintf("SET LOCAL app.tenant_id = %d", tenantID))
return err
}
async function setTenant(client: PoolClient, tenantId: bigint): Promise<void> {
await client.query(`SET LOCAL app.tenant_id = ${tenantId}`);
}
async def set_tenant(conn: asyncpg.Connection, tenant_id: int) -> None:
await conn.execute(f"SET LOCAL app.tenant_id = {tenant_id}")
Why SET LOCAL and not SET
This is an important nuance that is easy to miss.
-- Dangerous: the variable lives for the entire connection session
SET app.tenant_id = 42;
-- Correct: the variable is reset when the transaction ends
SET LOCAL app.tenant_id = 42;
Applications use a connection pool — the same database connection serves different users one after another. If you set the variable with SET without LOCAL, it persists after the query finishes. The next user who receives that connection from the pool ends up "impersonating" the previous tenant.
SET LOCAL works only within the current transaction and is automatically reset on COMMIT or ROLLBACK.
RLS does not apply to the superuser
RLS policies apply only to regular roles. The PostgreSQL superuser sees all rows without restrictions. That is why, in production, the application must connect as a regular role, and the superuser should be used only for administrative tasks.
Schema-per-tenant — for enterprise requirements
Each tenant gets its own schema in a single database: the tables are named identically but are physically separated.
CREATE SCHEMA tenant_acme;
CREATE SCHEMA tenant_globex;
CREATE TABLE tenant_acme.order_doc (...);
CREATE TABLE tenant_globex.order_doc (...);
Queries against the right schema are made through an explicit reference or through search_path:
-- explicit schema reference
SELECT * FROM tenant_acme.order_doc;
-- or via search_path — set at the start of the transaction
SET LOCAL search_path = tenant_acme, public;
SELECT * FROM order_doc;
// tenantSchema — a validated string from the tenant registry, not from user input
connection.createStatement().execute(
"SET LOCAL search_path = " + tenantSchema + ", public"
);
func setSearchPath(ctx context.Context, tx pgx.Tx, tenantSchema string) error {
_, err := tx.Exec(ctx, fmt.Sprintf("SET LOCAL search_path = %s, public", tenantSchema))
return err
}
async function setSearchPath(client: PoolClient, tenantSchema: string): Promise<void> {
await client.query(`SET LOCAL search_path = ${tenantSchema}, public`);
}
async def set_search_path(conn: asyncpg.Connection, tenant_schema: str) -> None:
await conn.execute(f"SET LOCAL search_path = {tenant_schema}, public")
This approach is chosen when customers have regulatory requirements for data isolation (healthcare, banking) or when different customers need a different table structure.
Scale limit: at 1000 or more schemas PostgreSQL starts to slow down. The system catalogs (pg_class and others) grow, the query planner spends more time, and autovacuum gets overloaded. Schema-per-tenant works well up to a few hundred tenants, but it is not suitable for open sign-up with thousands of customers.
Migration cost: when the table structure changes, the migration must be applied to each schema separately. With 100 schemas this takes 100 times longer than in row-per-tenant.
DB-per-tenant — full isolation for large customers
Each tenant gets a separate database, and sometimes a separate PostgreSQL server too. The application keeps a list of connections and routes queries to the right database:
class TenantRouter {
private final Map<String, HikariDataSource> pools = new ConcurrentHashMap<>();
HikariDataSource poolFor(String tenantId) {
return pools.computeIfAbsent(tenantId, id -> {
HikariConfig cfg = new HikariConfig();
cfg.setJdbcUrl("jdbc:postgresql://host/" + id);
return new HikariDataSource(cfg);
});
}
}
type TenantRouter struct {
pools map[string]*pgxpool.Pool
}
func (r *TenantRouter) query(ctx context.Context, tenantID string, sql string, args ...any) (pgx.Rows, error) {
return r.pools[tenantID].Query(ctx, sql, args...)
}
class TenantRouter {
private pools: Map<string, Pool>;
async query(tenantId: string, sql: string, values?: unknown[]): Promise<QueryResult> {
const pool = this.pools.get(tenantId);
if (!pool) throw new Error(`No pool for tenant: ${tenantId}`);
return pool.query(sql, values);
}
}
class TenantRouter:
def __init__(self, pools: dict[str, asyncpg.Pool]) -> None:
self._pools = pools
async def fetch(self, tenant_id: str, query: str, *args: Any) -> list[asyncpg.Record]:
return await self._pools[tenant_id].fetch(query, *args)
This option suits a small number of large enterprise customers with individual service-level agreements, their own backup requirements, and audit needs.
It is not suitable for mass SaaS with open sign-up: every new tenant is a separate database, separate monitoring, separate backup jobs. Maintaining this with thousands of customers is extremely hard.
Partitioning by tenant_id
Sometimes people suggest partitioning tables by tenant_id — creating a separate partition for each tenant within a single table. This gives physical data locality and lets you drop all of a tenant's data with a single command (DROP PARTITION is instant, versus a DELETE over millions of rows).
But this approach has a serious limitation: 1000 tenants means 1000 partitions, and PostgreSQL starts to slow down already at query planning. Partitioning by tenant_id is justified only for a small number of large tenants (a few to dozens), each storing gigabytes of data.
Common mistakes
Row-per-tenant without WHERE tenant_id — the most dangerous scenario. A query without the filter returns the data of all customers. Fix: check every query and enable RLS as an additional barrier.
SET without LOCAL for session variables — the tenant "leaks" through the connection pool to the next user. Always use SET LOCAL.
Schema-per-tenant with thousands of customers — the approach stops scaling because of the growing system catalogs. For a large number of tenants, choose row-per-tenant.
DB-per-tenant for the whole SaaS — every new sign-up creates a separate database. The operational load grows linearly with the number of customers. A hybrid is usually the better choice.
Superuser as the production role — RLS does not protect against the superuser. The application connects as a regular role.
In short
- Row-per-tenant: everything in the same tables,
tenant_idon every row. Simple, scales to thousands. Every query isWHERE tenant_id = ?. - RLS — an extra barrier at the database level: even if the application forgets the condition, PostgreSQL will not hand over another's data.
SET LOCAL app.tenant_id—LOCALis mandatory, otherwise the variable "leaks" through the connection pool to the next user.- Schema-per-tenant: a separate schema per customer — more isolation, harder migrations, does not scale to thousands of schemas.
- DB-per-tenant: full isolation for a handful of large customers with individual SLAs. Expensive to maintain.
- Hybrid — the most common solution: small customers in a shared database, large ones in a separate one.
- Superuser in production — do not use it: it bypasses RLS.
Further reading
- Partitioning — when and how to split tables into partitions.
- Composite indexes — why
tenant_idis placed as the first column. - Connection pool — connection pools and routing.