Sending a query to the database or calling a neighbouring service feels like one instant action. In reality, before the first byte of useful data is sent, the two machines go through a whole introduction ritual: they establish a connection and, if the channel is secure, also negotiate encryption. That ritual costs time, and if you perform it on every request, the service slows down under load for no obvious reason.
The good news: a connection can be opened once and reused many times. The bad news: reuse has to be managed carefully, otherwise you get the classic errors "too many connections" and "connection pool exhausted". Let's figure out where the cost of a connection comes from, what keep-alive and pools are, and which timeouts keep the service from hanging.
Why opening a connection is expensive
Take a typical HTTPS call. Before the client sends GET /orders/42, two handshakes happen back to back.
First, the TCP handshake — three short messages (SYN, SYN-ACK, ACK) with which the parties agree "I want to talk — go ahead — agreed". That's one full round of the signal there and back, i.e. one RTT (round-trip time). If the server is in the next rack, RTT is a fraction of a millisecond. If it's in another data centre, it's already tens of milliseconds.
Then comes the TLS handshake: the parties present certificates and negotiate encryption keys. That's another one or two round trips (one for a full TLS 1.3 handshake, two for TLS 1.2). An analogy: TCP is dialling and hearing "hello", while TLS is making sure the other end really is who it claims to be and agreeing on a secret language. Only after all this does the actual request go out.
The upshot: an "empty" call to another data centre can spend 50–150 ms just setting up the connection — before the server does anything at all. On one request it's invisible. On a thousand requests per second, each with its own fresh connection, it's a wall. More on the mechanics of the connection itself is in the article on TCP and UDP, and on the encrypting layer in HTTPS and TLS.
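The arithmetic behind that estimate can be sketched in a few lines. This is a simplified model, not a measurement: it assumes one round trip for TCP plus one (TLS 1.3) or two (TLS 1.2) for a full TLS handshake, and it ignores session resumption, DNS lookups, and server processing time. The function names are made up for illustration.

```python
def setup_cost_ms(rtt_ms: float, tls_round_trips: int = 1) -> float:
    """Time spent before the first request byte can be sent.

    One round trip for the TCP handshake plus `tls_round_trips`
    for the TLS handshake (assumed: 1 for TLS 1.3, 2 for TLS 1.2).
    """
    tcp_round_trips = 1
    return rtt_ms * (tcp_round_trips + tls_round_trips)


def handshake_seconds_per_second(rtt_ms: float, requests_per_second: int,
                                 tls_round_trips: int = 1) -> float:
    """Total handshake time accumulated across one second of traffic
    when every request opens a fresh connection."""
    return setup_cost_ms(rtt_ms, tls_round_trips) * requests_per_second / 1000.0


if __name__ == "__main__":
    rtt = 30.0  # hypothetical RTT to another data centre, in ms
    print(setup_cost_ms(rtt, tls_round_trips=1))
    print(setup_cost_ms(rtt, tls_round_trips=2))
    print(handshake_seconds_per_second(rtt, 1000, tls_round_trips=2))
```

With a 30 ms RTT and TLS 1.2, each fresh connection costs 90 ms before any work starts, and a thousand such requests per second add up to 90 seconds of waiting spread across the callers every second.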
Keep-alive: don't hang up
Since the handshake is expensive, it makes sense not to repeat it. That's exactly what keep-alive does: after the response, the connection isn't closed but stays open, and the next request to the same server flies over it immediately, with no new ritual.
An analogy: instead of redialling for every question, you keep the line open and ask questions one after another. The first call was expensive; the rest are free.
In HTTP this is the default behaviour starting from version 1.1: the connection is reused until one of the parties closes it. Newer versions of the protocol go further and run many parallel requests over a single connection — that's covered in the article on HTTP versions. What matters for us now is the idea itself: an open connection is a valuable resource worth holding and reusing, not throwing away after every response.
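To make the effect visible, here is a small self-contained sketch using only the Python standard library. It starts a throwaway local HTTP/1.1 server that counts accepted TCP connections, then sends the same three requests twice: once over a single kept-open connection and once hanging up after every response. The names CountingServer, OkHandler, and connections_opened are invented for this demo, and plain HTTP is used so that no certificates are needed.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingServer(ThreadingHTTPServer):
    """Local test server that counts how many TCP connections it accepted."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.accepted = 0

    def get_request(self):
        request = super().get_request()  # one accept() per new connection
        self.accepted += 1
        return request


class OkHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep connections open between requests

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo output quiet
        pass


def connections_opened(reuse: bool, requests: int = 3) -> int:
    """Send `requests` GETs and return how many connections the server saw."""
    server = CountingServer(("127.0.0.1", 0), OkHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        conn = http.client.HTTPConnection(host, port, timeout=5)
        for _ in range(requests):
            conn.request("GET", "/orders/42")
            conn.getresponse().read()  # drain the body so the line is free again
            if not reuse:
                conn.close()  # hang up; the next request() reconnects
        conn.close()
        return server.accepted
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print("with keep-alive:", connections_opened(reuse=True))
    print("without keep-alive:", connections_opened(reuse=False))
```

With reuse the server should report a single accepted connection for all three requests; without it, one per request. Against a real HTTPS endpoint, each of those extra connections would also pay for a TLS handshake.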
Connection pool: a shared box of open lines
One client reusing one connection is good, but a service has to handle many requests at once. Then several open lines are needed. They're held in a pool — a pre-opened set of connections from which a request takes a free one, uses it, and returns it.
An analogy is a bike rental rack. The bikes (connections) already stand ready. A person comes, takes one, rides, returns it. Nobody has to assemble a bike from scratch each time. If all the bikes are taken — a new person waits until someone returns one.
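The rack can be sketched as a thread-safe queue of ready connections. This is a teaching sketch, not how any particular library implements its pool: SimplePool, PoolExhausted, and the factory argument are hypothetical names, and real pools additionally validate, retire, and replace connections.

```python
import queue


class PoolExhausted(Exception):
    """Raised when no connection frees up within the wait timeout."""


class SimplePool:
    def __init__(self, factory, size: int):
        # `factory` is any callable that opens and returns one connection.
        self._free = queue.Queue(maxsize=size)
        for _ in range(size):
            self._free.put(factory())  # open everything up front

    def acquire(self, wait_timeout: float):
        """Take a free connection, waiting at most `wait_timeout` seconds."""
        try:
            return self._free.get(timeout=wait_timeout)
        except queue.Empty:
            raise PoolExhausted(
                f"no free connection after {wait_timeout}s") from None

    def release(self, conn) -> None:
        """Put the connection back on the rack for the next caller."""
        self._free.put(conn)

    def free_count(self) -> int:
        return self._free.qsize()


if __name__ == "__main__":
    pool = SimplePool(factory=object, size=2)  # object() stands in for a connection
    first = pool.acquire(wait_timeout=0.1)
    second = pool.acquire(wait_timeout=0.1)
    try:
        pool.acquire(wait_timeout=0.1)  # all bikes are taken
    except PoolExhausted as exc:
        print("third caller:", exc)
    pool.release(first)
    print("free after release:", pool.free_count())
```

The wait timeout in acquire matters: without it, a caller facing an empty rack would block indefinitely instead of getting a clear error.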
Pool to the database. The most common case. Every ecosystem has its own standard: HikariCP in Java, the pool inside database/sql in Go, the driver pool (pg) in Node.js, SQLAlchemy or the psycopg pool in Python. The idea is the same: the pool holds open connections to PostgreSQL or another database, and when the code needs to run a query, it takes a ready connection from the pool. The key settings look similar everywhere (example):
pool:
  maximum-pool-size: 10       # max connections we hold
  connection-timeout: 30000   # how long to wait for a free one, ms
  idle-timeout: 600000        # close an idle one after 10 min
  max-lifetime: 1800000       # refresh a connection after 30 min
Pool to neighbouring services. When a service calls another service over HTTP, the HTTP client also holds a pool of connections per host and reuses them via keep-alive. The idea is the same: don't repeat the TCP and TLS handshakes for every call, but run requests over lines that are already open.
Pool size: the golden mean
The main tuning question for a pool is how many connections to hold. And it's easy to get wrong in both directions.
Too few. All connections are busy, new requests queue up and wait for at least one to free up. The user sees latency out of nowhere, while the database sits idle. The symptom: requests spend their time waiting for a connection, not doing work.
Too many. It seems like "I'll give it a bigger pool, it'll be faster". But the database itself has a hard limit on the number of simultaneous connections (in PostgreSQL that's max_connections, 100 by default). Each connection costs memory and a process on the database side. If ten service instances each open 50 connections, that's 500 connections against a limit of 100: the database chokes and starts returning the "too many connections" error (PostgreSQL words it as "sorry, too many clients already"), and it returns it to everyone at once, not just the culprit.
The reality is counterintuitive: a small pool is often faster than a big one. A database with fewer connections competes less for resources and processes requests more evenly. A reasonable starting point for one instance is a handful, at most a couple of dozen connections, not hundreds. The fine points of tuning on the PostgreSQL side are in the PostgreSQL section.
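A quick way to sanity-check a pool size is to multiply it by the number of instances and compare the result with the database limit. The sketch below does just that; the function names and the reserved parameter (headroom for migrations, admin sessions, and the like) are assumptions made for illustration, and the result is an upper bound, not a recommended pool size.

```python
def total_connections(instances: int, pool_size: int) -> int:
    """Connections the database sees when every instance fills its pool."""
    return instances * pool_size


def pool_size_budget(max_connections: int, instances: int,
                     reserved: int = 0) -> int:
    """Largest per-instance pool that keeps the whole fleet under the limit."""
    if instances <= 0:
        raise ValueError("instances must be positive")
    return max(0, (max_connections - reserved) // instances)


if __name__ == "__main__":
    limit = 100  # e.g. max_connections on the database side
    print(total_connections(instances=10, pool_size=50) > limit)
    print(pool_size_budget(limit, instances=10, reserved=10))
```

For ten instances and a limit of 100 with 10 connections held back, the budget comes out at 9 per instance, which is in line with "a handful, at most a couple of dozen". Remember to redo the calculation when autoscaling adds instances.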
Timeouts: don't wait forever
A connection may fail to establish, a server may stall, the network may silently swallow a packet. Without timeouts, a thread waits for a response forever — and such hung threads pile up and at some point take down the whole service. So network calls are given several different timeouts, and it's worth not confusing them:
- Connect timeout: how long to wait for the connection to be established. Server isn't answering the handshake? After N seconds we give up. Usually small, a few seconds.
- Read timeout (a.k.a. socket timeout): how long to wait for a response over an already-open connection. Connected, sent the request, but no data comes back? After N seconds we cut it off.
- Pool wait timeout: how long a request is willing to stand in the queue for a free connection (in the example above that's connection-timeout). Pool exhausted and not freeing up? Don't hang forever, return a clear error right away.
- Idle timeout: how long to keep an idle connection open before closing it. No point tying up resources with a line nobody uses.
The rule is simple: every external call should have connect and read timeouts set. A call without a timeout is a landmine: one day the remote side hangs, and your threads hang along with it. How the system as a whole survives such failures is in the article on reliability.
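Here is a minimal sketch of the first two timeouts at the socket level, using only the Python standard library. The helper names call_with_timeouts and silent_server are invented for the demo; higher-level HTTP clients and database drivers expose the same two knobs under their own parameter names.

```python
import socket


def call_with_timeouts(host: str, port: int, payload: bytes,
                       connect_timeout: float, read_timeout: float) -> bytes:
    # Connect timeout: bounds only the time spent establishing the connection.
    sock = socket.create_connection((host, port), timeout=connect_timeout)
    try:
        # Read timeout: from here on, every blocking call on the open socket
        # (including the wait for the response) gives up after this many seconds.
        sock.settimeout(read_timeout)
        sock.sendall(payload)
        return sock.recv(4096)
    finally:
        sock.close()  # release the socket whether or not the call succeeded


def silent_server() -> socket.socket:
    """Listening socket that completes connections but never answers."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    return srv


if __name__ == "__main__":
    srv = silent_server()
    host, port = srv.getsockname()
    try:
        call_with_timeouts(host, port, b"ping",
                           connect_timeout=1.0, read_timeout=0.5)
    except socket.timeout:
        print("gave up waiting for a response")
    finally:
        srv.close()
```

The connection to the silent server succeeds, so the connect timeout never fires; it is the read timeout that rescues the caller. With only a connect timeout set, this call would block on recv indefinitely.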
Connection leaks
A separate trouble is a connection leak. You took a connection from the pool, did some work, and… forgot to return it. For example, the code failed with an error before the connection was closed, and there's no handling for that case.
An analogy: a person took a bike from the rack and didn't bring it back. Once is nothing. But if everyone does it, the rack empties, and the next customers stand in front of an empty rack. In a service this looks like slow poisoning: at first everything's fine, then requests start waiting longer and longer for a connection, and eventually the pool is exhausted — that very "connection pool exhausted" error. A restart helps for a while, but the leak remains.
The defence is twofold. First, write code so the connection is always returned, no matter what happens (in Java that's try-with-resources, in Go defer, in Python with). In practice, a framework usually returns connections itself as long as you don't manage them manually. Second, many pools can detect "stuck" connections and log a warning when one is held suspiciously long, which is the first sign that something is leaking.
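The difference between leaking and leak-safe code is easy to show with a toy pool. In this sketch a plain queue.Queue stands in for the pool and object() for a connection; the names borrowed, leaky_query, and safe_query are hypothetical. The point is the try/finally inside the context manager, which runs whether the work succeeds or fails.

```python
import queue
from contextlib import contextmanager


@contextmanager
def borrowed(pool: queue.Queue, wait_timeout: float = 1.0):
    """Borrow a connection and guarantee it goes back to the pool."""
    conn = pool.get(timeout=wait_timeout)
    try:
        yield conn
    finally:
        pool.put(conn)  # runs on success and on exception alike


def leaky_query(pool: queue.Queue) -> None:
    conn = pool.get(timeout=1.0)
    raise RuntimeError("query failed")  # the connection is never returned
    pool.put(conn)  # unreachable: this is the leak


def safe_query(pool: queue.Queue) -> None:
    with borrowed(pool) as conn:
        raise RuntimeError("query failed")  # the connection is still returned


if __name__ == "__main__":
    for query in (leaky_query, safe_query):
        pool = queue.Queue()
        pool.put(object())  # a pool of one fake connection
        try:
            query(pool)
        except RuntimeError:
            pass
        print(query.__name__, "left", pool.qsize(), "free connection(s)")
```

After the failing leaky_query the pool is empty, so the next caller would wait until its pool wait timeout expires; after safe_query the connection is back on the rack.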
Where this applies
Connections, pools, and timeouts are the layer that's rarely noticed while it works, and the first to surface when you investigate strange slowdowns and outages under load. Three of the most common symptoms this topic is behind:
- "too many connections": the instances together opened more connections to the database than it allows. Usually fixed by shrinking the pools rather than by raising the database limit.
- "connection pool exhausted" — there are no free connections in the pool. Either the pool is small for the load, or there's a leak somewhere, or queries to the database are too slow and don't release the connection.
- Hanging without a timeout — the remote side is silent, and the call waits forever, piling up busy threads. A classic way to bring down a service through one slow external service.
Where beginners stumble:
- They open a connection per request. Works in tests, falls apart under load: all the time goes into handshakes. Pools and keep-alive exist precisely for this.
- They inflate the pool "just in case". Bigger doesn't mean faster; a database with hundreds of connections works worse than one with a dozen. The bottleneck is usually the database itself, not the pool size.
- They don't set timeouts. A call without connect/read timeouts will one day hang and drag threads down with it. Client defaults are often "infinity" — they must be set explicitly.
- They confuse the kinds of timeouts. Connect, read, and pool wait are about different phases. Setting one and forgetting the rest closes only part of the holes.
- They forget to return the connection. A leak isn't visible right away and shows up as poisoning after hours of running under load.
What to learn next
A connection is a layer built on top of transport, so it makes sense to firm up the foundation first: TCP and UDP explains the handshake itself and why it costs one RTT, and HTTPS and TLS explains where the second, encrypting part of the cost comes from. Next, look at how a single connection is reused for many requests across different HTTP versions. When it comes to the database pool, the tuning details on the server side are in the PostgreSQL section. And how all of this together survives failures and spreads load is in the articles on reliability and load balancers.