The service is up, the business logic is written — but in real operation three problems surface that have nothing to do with the logic itself. One client floods the API with thousands of requests and takes the service down. A user needs to upload an avatar or export a report — and you don't know how to correctly accept and serve a file. It's time to remove an old endpoint, but clients are still on it, and abruptly deleting it will break them. Let's tackle all three in turn.
Rate limiting: how to stop an avalanche of requests
Imagine one client making thousands of requests per second — deliberately or because of a bug. Without limits, the service goes down. Rate limiting is a barrier: we allow, say, 100 requests per minute per client and reject the rest with 429 Too Many Requests.
An important point: rate limiting does not belong inside the request handler. It lives in middleware or in an external layer in front of the service, such as a reverse proxy or API gateway (nginx, API Gateway). The handler should not know about the limits at all.
Middleware with RateLimit-* headers
from fastapi import FastAPI, Request, Response
from fastapi.responses import JSONResponse
import time

app = FastAPI(redirect_slashes=False)

RATE_LIMIT = 100
WINDOW_SECONDS = 60

# Simplified fixed-window counter (in-process; for production use Redis).
# Maps client id -> (requests counted in the window, window start time).
_counters: dict[str, tuple[int, float]] = {}


@app.middleware("http")
async def rate_limit_middleware(request: Request, call_next):
    # request.client can be None, so fall back to a fixed key.
    host = request.client.host if request.client else "unknown"
    client_id = request.headers.get("X-Client-Id", host)
    now = time.time()
    count, window_start = _counters.get(client_id, (0, now))
    # Start a new window once the previous one has fully elapsed.
    if now - window_start >= WINDOW_SECONDS:
        count, window_start = 0, now
    remaining = RATE_LIMIT - count
    reset_at = int(window_start + WINDOW_SECONDS)
    if remaining <= 0:
        # Never advertise a zero-second wait.
        retry_after = max(1, reset_at - int(now))
        return JSONResponse(
            status_code=429,
            media_type="application/problem+json",
            content={
                "type": "urn:problem:order-service:rate-limit-exceeded",
                "status": 429,
                "title": "Too Many Requests",
                "detail": f"Request limit exceeded. Retry in {retry_after} seconds.",
                "code": "RATE_LIMIT_EXCEEDED",
            },
            headers={
                "Retry-After": str(retry_after),
                "RateLimit-Limit": str(RATE_LIMIT),
                "RateLimit-Remaining": "0",
                "RateLimit-Reset": str(reset_at),
            },
        )
    _counters[client_id] = (count + 1, window_start)
    response: Response = await call_next(request)
    # Expose the remaining budget on successful responses as well.
    response.headers["RateLimit-Limit"] = str(RATE_LIMIT)
    response.headers["RateLimit-Remaining"] = str(remaining - 1)
    response.headers["RateLimit-Reset"] = str(reset_at)
    return response
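The window arithmetic in the middleware is easier to reason about when it is pulled out into a plain function that can be tested without HTTP. The sketch below is one possible way to do that; `Decision` and `check_fixed_window` are hypothetical names introduced for illustration, and the logic is the same fixed-window counter, not a production-grade limiter.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    allowed: bool
    remaining: int  # requests left in the window after this one
    reset_at: int   # Unix timestamp at which the window resets


def check_fixed_window(
    counters: dict[str, tuple[int, float]],
    client_id: str,
    now: float,
    limit: int = 100,
    window_seconds: int = 60,
) -> Decision:
    """Record one request for client_id and report whether it is allowed."""
    count, window_start = counters.get(client_id, (0, now))
    if now - window_start >= window_seconds:
        count, window_start = 0, now  # previous window has expired
    reset_at = int(window_start + window_seconds)
    if count >= limit:
        return Decision(False, 0, reset_at)
    counters[client_id] = (count + 1, window_start)
    return Decision(True, limit - count - 1, reset_at)


counters: dict[str, tuple[int, float]] = {}
first = check_fixed_window(counters, "client-a", now=1000.0, limit=2)
second = check_fixed_window(counters, "client-a", now=1001.0, limit=2)
third = check_fixed_window(counters, "client-a", now=1002.0, limit=2)
later = check_fixed_window(counters, "client-a", now=1060.0, limit=2)
print(first.allowed, second.allowed, third.allowed, later.allowed)
# prints: True True False True
```

Passing `now` in as an argument, instead of calling `time.time()` inside, is what makes the window boundaries testable.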
A few details worth understanding:
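- Retry-After: how many seconds to wait before retrying. Without it, the client doesn't know when to knock again and will keep hammering immediately.
- RateLimit-Limit/Remaining/Reset: added not only to the 429 response but to every successful response too. The client sees its remaining budget in advance and can slow itself down.
- RateLimit-Reset: the Unix timestamp of when the window resets. Note that the IETF draft for RateLimit headers defines this value as the number of seconds left in the window; the timestamp form used here mirrors the X-RateLimit-Reset convention of APIs such as GitHub's, so document whichever form you choose.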
What a 429 response looks like
HTTP/1.1 429 Too Many Requests
Content-Type: application/problem+json
Retry-After: 23
RateLimit-Limit: 100
RateLimit-Remaining: 0
RateLimit-Reset: 1750291200

{
  "type": "urn:problem:order-service:rate-limit-exceeded",
  "status": 429,
  "title": "Too Many Requests",
  "detail": "Request limit exceeded. Retry in 23 seconds.",
  "code": "RATE_LIMIT_EXCEEDED"
}
And in successful responses too
HTTP/1.1 200 OK
Content-Type: application/json
RateLimit-Limit: 100
RateLimit-Remaining: 57
RateLimit-Reset: 1750291200

{ "orderId": "ord-9182", "status": "CONFIRMED" }
Thanks to the headers in successful responses, the client knows its remaining budget right now — and can ease off on its own, without waiting for a 429.
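On the client side, "easing off" could look roughly like the sketch below. It assumes the header semantics used in this article (`Retry-After` in whole seconds, `RateLimit-Reset` as a Unix timestamp); `seconds_to_wait` is a hypothetical helper, and spreading the remaining budget evenly over the window is just one possible pacing policy.

```python
from collections.abc import Mapping


def seconds_to_wait(status: int, headers: Mapping[str, str], now: float) -> float:
    """Return how long the client should pause before its next request."""
    if status == 429:
        # Assumes Retry-After carries seconds, as in the examples above.
        return float(headers.get("Retry-After", "1"))
    remaining = int(headers.get("RateLimit-Remaining", "1"))
    reset_at = int(headers.get("RateLimit-Reset", "0"))
    time_left = max(0.0, reset_at - now)
    if remaining <= 0:
        return time_left  # budget is spent: wait for the window to reset
    # Spread the remaining requests evenly over the rest of the window.
    return time_left / remaining


ok_headers = {"RateLimit-Remaining": "57", "RateLimit-Reset": "1750291200"}
print(seconds_to_wait(200, ok_headers, now=1750291177.0))
print(seconds_to_wait(429, {"Retry-After": "23"}, now=1750291177.0))
```

Real HTTP clients usually expose headers through a case-insensitive mapping; with a plain `dict`, the header names must match exactly.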
Describing 429 in OpenAPI
FastAPI is code-first: 429 is declared explicitly via responses in the decorator so that API consumers see it in the documentation.
from fastapi import APIRouter
from pydantic import BaseModel

router = APIRouter(prefix="/api/v1", tags=["Orders"])


class OrderResponse(BaseModel):
    model_config = {"populate_by_name": True}

    orderId: str
    status: str


@router.get(
    "/orders/{order_id}",
    response_model=OrderResponse,
    operation_id="getOrder",
    summary="Get order",
    responses={
        429: {
            "description": "Too Many Requests",
            "headers": {
                "Retry-After": {"schema": {"type": "integer"}},
                "RateLimit-Limit": {"schema": {"type": "integer"}},
                "RateLimit-Remaining": {"schema": {"type": "integer"}},
                "RateLimit-Reset": {"schema": {"type": "integer"}},
            },
            "content": {
                "application/problem+json": {
                    # Assumes a ProblemDetails schema is registered under
                    # components/schemas elsewhere; otherwise this $ref dangles.
                    "schema": {"$ref": "#/components/schemas/ProblemDetails"}
                }
            },
        }
    },
)
async def get_order(order_id: str) -> OrderResponse:
    ...
File uploads: why not Base64 in JSON
A file is binary data. If you pass it as a Base64 string in a JSON body, the size grows by 33% and you lose the ability to stream the data. The right transport is multipart/form-data and the UploadFile type in FastAPI.
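The 33% figure follows from how Base64 works: every 3 bytes of input become 4 ASCII characters of output. A quick check with the standard library, using a 3 MB buffer of zeros as stand-in data:

```python
import base64

raw = bytes(3_000_000)  # 3 MB of binary data; the content does not matter
encoded = base64.b64encode(raw)
overhead = len(encoded) / len(raw) - 1

print(len(raw), len(encoded), f"{overhead:.1%}")
# prints: 3000000 4000000 33.3%
```

The whole encoded string also has to be held in the JSON body, which is why streaming is lost.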
Accepting a file via UploadFile
from fastapi import APIRouter, UploadFile, File, Form
from fastapi.responses import StreamingResponse, Response
from pydantic import BaseModel
from datetime import datetime, timezone

router = APIRouter(prefix="/api/v1", tags=["Orders"])

MAX_UPLOAD_BYTES = 10 * 1024 * 1024
ALLOWED_TYPES = {"application/pdf", "image/png", "image/jpeg"}


class AttachmentResponse(BaseModel):
    attachmentId: str
    fileName: str
    contentType: str
    size: int
    uploadedAt: str


@router.post(
    "/orders/{order_id}/attachments",
    response_model=AttachmentResponse,
    status_code=201,
    operation_id="uploadOrderAttachment",
    summary="Upload an attachment to an order",
    response_model_exclude_none=True,
)
async def upload_order_attachment(
    order_id: str,
    response: Response,  # injected by FastAPI; used to set Location
    file: UploadFile = File(..., description="Max 10 MB. PDF, PNG, JPG"),
    description: str | None = Form(None, max_length=500),
) -> AttachmentResponse:
    # Reject unsupported types before reading the body.
    if file.content_type not in ALLOWED_TYPES:
        raise UnsupportedMediaTypeError()  # application-defined, not shown
    # read() loads the whole upload into memory.
    content = await file.read()
    if len(content) > MAX_UPLOAD_BYTES:
        raise FileTooLargeError()  # application-defined, not shown
    attachment_id = "att-" + order_id[:8]
    response.headers["Location"] = (
        f"/api/v1/orders/{order_id}/attachments/{attachment_id}"
    )
    return AttachmentResponse(
        attachmentId=attachment_id,
        fileName=file.filename or "unnamed",  # filename may be missing
        contentType=file.content_type,
        size=len(content),
        uploadedAt=datetime.now(timezone.utc).isoformat(),
    )
UploadFile gives access to filename, content_type, and the contents via await file.read(). Limits on size and allowed types are checked explicitly in the handler.
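Calling `await file.read()` with no argument buffers the entire upload before the size check runs. A hedged alternative is to read in chunks and stop as soon as the limit is crossed. The sketch below relies only on an object with an async `read(size)` method, which `UploadFile` provides; `read_limited`, `UploadTooLarge`, and `FakeUpload` are hypothetical names, and `FakeUpload` exists only so the example runs without FastAPI.

```python
import asyncio
import io

MAX_BYTES = 10 * 1024 * 1024
CHUNK_SIZE = 64 * 1024


class UploadTooLarge(Exception):
    """Hypothetical error; an exception handler could map it to 413."""


async def read_limited(upload, max_bytes: int = MAX_BYTES) -> bytes:
    """Read an upload chunk by chunk and abort once max_bytes is exceeded."""
    buffer = bytearray()
    while chunk := await upload.read(CHUNK_SIZE):
        buffer.extend(chunk)
        if len(buffer) > max_bytes:
            raise UploadTooLarge(f"upload exceeds {max_bytes} bytes")
    return bytes(buffer)


class FakeUpload:
    """Stand-in for UploadFile: wraps bytes behind an async read(size)."""

    def __init__(self, data: bytes) -> None:
        self._stream = io.BytesIO(data)

    async def read(self, size: int = -1) -> bytes:
        return self._stream.read(size)


small = asyncio.run(read_limited(FakeUpload(b"x" * 100), max_bytes=1024))
print(len(small))
# prints: 100
```

In the handler this would replace the unbounded read with `content = await read_limited(file)`. At most one chunk beyond the limit is ever held in memory.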
What the HTTP request looks like
POST /api/v1/orders/ord-9182/attachments HTTP/1.1
Content-Type: multipart/form-data; boundary=----Boundary7MA4

------Boundary7MA4
Content-Disposition: form-data; name="file"; filename="invoice.pdf"
Content-Type: application/pdf

<binary data>
------Boundary7MA4
Content-Disposition: form-data; name="description"

February invoice
------Boundary7MA4--
A 201 response with metadata
{
  "attachmentId": "att-ord-9182",
  "fileName": "invoice.pdf",
  "contentType": "application/pdf",
  "size": 204800,
  "uploadedAt": "2026-06-19T09:15:00+00:00"
}
The response is 201 Created with a Location header pointing at the new resource and the full resource representation in the body.
Downloading via StreamingResponse
import aiofiles


@router.get(
    "/orders/{order_id}/attachments/{attachment_id}",
    operation_id="downloadOrderAttachment",
    summary="Download an order attachment",
)
async def download_order_attachment(
    order_id: str,
    attachment_id: str,
) -> StreamingResponse:
    # resolve_attachment_path is an application-defined lookup (not shown).
    file_path = resolve_attachment_path(order_id, attachment_id)
    # Hard-coded for the example; normally read from stored metadata.
    file_name = "invoice.pdf"
    content_type = "application/pdf"

    async def file_stream():
        # Yield the file in 64 KiB chunks so it never sits fully in memory.
        async with aiofiles.open(file_path, "rb") as f:
            while chunk := await f.read(65536):
                yield chunk

    return StreamingResponse(
        file_stream(),
        media_type=content_type,
        headers={
            "Content-Disposition": f'attachment; filename="{file_name}"',
        },
    )
The Content-Disposition: attachment; filename="invoice.pdf" header tells the browser or HTTP client to save the response as a file and which name to use. Without it, the client has to guess the name, typically from the last segment of the URL.
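File names that come from users may contain quotes, control characters, or non-ASCII letters, and interpolating them directly into the header can corrupt it. One possible approach, sketched below under the assumption that clients understand the `filename*` parameter from RFC 6266, is to send a sanitized ASCII fallback plus a percent-encoded UTF-8 name. `content_disposition` is a hypothetical helper.

```python
from urllib.parse import quote


def content_disposition(file_name: str) -> str:
    """Build an attachment header value with an ASCII fallback and filename*."""
    # ASCII fallback: non-ASCII characters become "?".
    ascii_name = file_name.encode("ascii", "replace").decode("ascii")
    # Neutralize quotes, backslashes, and control characters such as CR/LF.
    ascii_name = "".join(
        "_" if (ch in '"\\' or not ch.isprintable()) else ch
        for ch in ascii_name
    )
    # Extended parameter: percent-encoded UTF-8 bytes of the original name.
    encoded = quote(file_name, safe="")
    return f'attachment; filename="{ascii_name}"; ' + f"filename*=UTF-8''{encoded}"


print(content_disposition("invoice.pdf"))
print(content_disposition("résumé.pdf"))
```

The result can be passed as the `Content-Disposition` value in the `headers` argument of `StreamingResponse`.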
For small static files, FileResponse is a good fit — it sets Content-Disposition and Content-Length itself:
from fastapi.responses import FileResponse

return FileResponse(
    path=file_path,
    media_type="application/pdf",
    filename="invoice.pdf",
)
Deprecation: how to gracefully retire an old endpoint
Sooner or later the API changes: a /v2 appears, and the old route becomes deprecated. You can't delete it abruptly — clients don't have time to migrate. The right path is to declare deprecation, give a deadline, and only then remove it.
In FastAPI the announcement takes two steps; a third follows once the Sunset date has passed.
Step 1. Mark it in OpenAPI
@router.get(
    "/orders/{order_id}/status",
    operation_id="getOrderStatusLegacy",
    summary="Get order status",
    deprecated=True,
    description=(
        "DEPRECATED: use GET /api/v2/orders/{order_id}. "
        "Will be removed after 2026-12-01."
    ),
)
async def get_order_status_legacy(order_id: str):
    ...
deprecated=True marks the operation in /openapi.json — Swagger UI strikes it through and highlights it with a warning. Developers see this when browsing the documentation.
Step 2. Add HTTP headers
The OpenAPI mark is only visible when browsing the documentation. But clients already using the endpoint don't read the documentation. For them there are HTTP headers in every response. It's convenient to add them via Depends:
from fastapi import Depends, Response

# Kept as module constants: plain str parameters on a dependency would be
# exposed by FastAPI as query parameters that any caller could override.
SUNSET_DATE = "Tue, 01 Dec 2026 00:00:00 GMT"  # 1 December 2026 is a Tuesday
SUCCESSOR_TEMPLATE = "/api/v2/orders/{order_id}"


def deprecation_headers(response: Response, order_id: str) -> None:
    # order_id is resolved from the path, so the Link target is a real URL.
    successor = SUCCESSOR_TEMPLATE.format(order_id=order_id)
    response.headers["Sunset"] = SUNSET_DATE
    response.headers["Deprecation"] = "true"
    response.headers["Link"] = f'<{successor}>; rel="successor-version"'


@router.get(
    "/orders/{order_id}/status",
    operation_id="getOrderStatusLegacy",
    summary="Get order status",
    deprecated=True,
    description=(
        "DEPRECATED: use GET /api/v2/orders/{order_id}. "
        "Will be removed after 2026-12-01."
    ),
    dependencies=[Depends(deprecation_headers)],
)
async def get_order_status_legacy(order_id: str):
    return {"status": "PROCESSING"}

Every response will contain (shown for order ord-9182):

HTTP/1.1 200 OK
Sunset: Tue, 01 Dec 2026 00:00:00 GMT
Deprecation: true
Link: </api/v2/orders/ord-9182>; rel="successor-version"

{ "status": "PROCESSING" }
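- Sunset (RFC 8594): the exact shutoff date in HTTP-date format. Without a date, the client can't plan the migration.
- Deprecation: true is a machine-readable flag for SDKs and monitoring. Note that RFC 9745, which standardizes this header, defines the value as a date (for example @1688169599); the literal true follows earlier drafts, so check what your clients expect.
- Link with rel="successor-version": the client knows where to migrate.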
Step 3. After the Sunset date — 410 Gone
Once the Sunset date arrives, the handler is replaced with 410 Gone. A 404 code isn't right here: 404 means "I don't know of any such thing", while 410 means "I know it, but I removed it deliberately".
from fastapi.responses import JSONResponse
from datetime import datetime, timezone

SUNSET = datetime(2026, 12, 1, tzinfo=timezone.utc)


@router.get(
    "/orders/{order_id}/status",
    operation_id="getOrderStatusRemoved",
    include_in_schema=False,
)
async def get_order_status_gone(order_id: str):
    # From the Sunset moment on, answer 410 and point to the successor.
    if datetime.now(timezone.utc) >= SUNSET:
        return JSONResponse(
            status_code=410,
            media_type="application/problem+json",
            content={
                "type": "urn:problem:order-service:endpoint-removed",
                "status": 410,
                "title": "Gone",
                "detail": (
                    "The endpoint has been removed. "
                    f"Use GET /api/v2/orders/{order_id}."
                ),
                "code": "ENDPOINT_REMOVED",
            },
        )
    # Before the Sunset date the legacy behavior is still served.
    return {"status": "PROCESSING"}
include_in_schema=False — the removed endpoint doesn't appear in /openapi.json, but it still physically accepts requests and responds with 410. The client gets a clear error pointing to the alternative.
The recommended period between announcing the deprecation and the Sunset date is 6 to 12 months. That's enough for large consumers to switch over.
In short
- Rate limiting is implemented in middleware, not in the handler. A 429 response must include Retry-After and RateLimit-Limit/Remaining/Reset.
- RateLimit-* is added not only to 429 but to every successful response, so the client sees its remaining budget in advance.
- Files are accepted via UploadFile + multipart/form-data, not via Base64 in JSON.
- When downloading, always set Content-Disposition: attachment; filename="..."; otherwise the client doesn't know the file name.
- Deprecation = two steps: deprecated=True in the decorator (visible in Swagger) + Sunset/Deprecation/Link headers in the responses (visible to clients).
- A Sunset without a concrete date is meaningless: set the date right away, at least 6 months out.
- After the Sunset date, the endpoint returns 410 Gone with a pointer to the alternative.
What to read next
- Errors, RFC 9457: how to format 429 and 410 as problem+json.
- Headers and tracing: custom headers, Idempotency-Key, traceparent.
- Versioning: how to move from v1 to v2 and when deprecation is inevitable.