
The service is up, the business logic is written — but in real operation three problems surface that have nothing to do with the logic itself. One client floods the API with thousands of requests and takes the service down. A user needs to upload an avatar or export a report — and you don't know how to correctly accept and serve a file. It's time to remove an old endpoint, but clients are still on it, and abruptly deleting it will break them. Let's tackle all three in turn.

Rate limiting: how to stop an avalanche of requests

Imagine one client making thousands of requests per second — deliberately or because of a bug. Without limits, the service goes down. Rate limiting is a barrier: we allow, say, 100 requests per minute per client and reject the rest with 429 Too Many Requests.

An important point: rate limiting is not implemented inside the request handler. It belongs in middleware or in a layer in front of the service (a reverse proxy such as nginx, or an API gateway). The handler should not know about the limits at all.

Middleware with RateLimit-* headers

from fastapi import FastAPI, Request, Response
from fastapi.responses import JSONResponse
import time

app = FastAPI(redirect_slashes=False)

RATE_LIMIT = 100
WINDOW_SECONDS = 60

# Simplified counter (in-process; for production — Redis)
_counters: dict[str, tuple[int, float]] = {}


@app.middleware("http")
async def rate_limit_middleware(request: Request, call_next):
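    # X-Client-Id is client-supplied and easy to spoof; in production, key on an authenticated identity.
    # request.client can be None, so guard the fallback instead of dereferencing it blindly.
    client_host = request.client.host if request.client else "unknown"
    client_id = request.headers.get("X-Client-Id", client_host)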
    now = time.time()

    count, window_start = _counters.get(client_id, (0, now))
    if now - window_start > WINDOW_SECONDS:
        count, window_start = 0, now

    remaining = RATE_LIMIT - count
    reset_at = int(window_start + WINDOW_SECONDS)

    if remaining <= 0:
        # Never advertise a zero-second wait, even at the very edge of the window
        retry_after = max(1, reset_at - int(now))
        return JSONResponse(
            status_code=429,
            media_type="application/problem+json",
            content={
                "type": "urn:problem:order-service:rate-limit-exceeded",
                "status": 429,
                "title": "Too Many Requests",
                "detail": f"Request limit exceeded. Retry in {retry_after} seconds.",
                "code": "RATE_LIMIT_EXCEEDED",
            },
            headers={
                "Retry-After": str(retry_after),
                "RateLimit-Limit": str(RATE_LIMIT),
                "RateLimit-Remaining": "0",
                "RateLimit-Reset": str(reset_at),
            },
        )

    _counters[client_id] = (count + 1, window_start)
    response: Response = await call_next(request)
    response.headers["RateLimit-Limit"] = str(RATE_LIMIT)
    response.headers["RateLimit-Remaining"] = str(remaining - 1)
    response.headers["RateLimit-Reset"] = str(reset_at)
    return response

A few details worth understanding:

  • Retry-After: how many seconds to wait before retrying. Without it, the client doesn't know when to try again and will typically retry immediately.
  • RateLimit-Limit/Remaining/Reset: added not only to the 429 response but to every successful response too. The client sees its remaining budget in advance and can slow itself down.
  • RateLimit-Reset: in this example, the Unix timestamp at which the window resets. The IETF draft for the RateLimit headers defines the reset value as the number of seconds remaining instead, so document which form your API uses.

What a 429 response looks like

HTTP/1.1 429 Too Many Requests
Content-Type: application/problem+json
Retry-After: 23
RateLimit-Limit: 100
RateLimit-Remaining: 0
RateLimit-Reset: 1750291200

{
  "type": "urn:problem:order-service:rate-limit-exceeded",
  "status": 429,
  "title": "Too Many Requests",
  "detail": "Request limit exceeded. Retry in 23 seconds.",
  "code": "RATE_LIMIT_EXCEEDED"
}

And in successful responses too

HTTP/1.1 200 OK
Content-Type: application/json
RateLimit-Limit: 100
RateLimit-Remaining: 57
RateLimit-Reset: 1750291200

{ "orderId": "ord-9182", "status": "CONFIRMED" }

Thanks to the headers in successful responses, the client knows its remaining budget right now — and can ease off on its own, without waiting for a 429.

Describing 429 in OpenAPI

FastAPI is code-first: 429 is declared explicitly via responses in the decorator so that API consumers see it in the documentation.

from fastapi import APIRouter
from pydantic import BaseModel

router = APIRouter(prefix="/api/v1", tags=["Orders"])


class OrderResponse(BaseModel):
    model_config = {"populate_by_name": True}
    orderId: str
    status: str


@router.get(
    "/orders/{order_id}",
    response_model=OrderResponse,
    operation_id="getOrder",
    summary="Get order",
    responses={
        429: {
            "description": "Too Many Requests",
            "headers": {
                "Retry-After": {"schema": {"type": "integer"}},
                "RateLimit-Limit": {"schema": {"type": "integer"}},
                "RateLimit-Remaining": {"schema": {"type": "integer"}},
                "RateLimit-Reset": {"schema": {"type": "integer"}},
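            },
            "content": {
                # The $ref assumes a ProblemDetails schema is registered under
                # components/schemas; otherwise the reference dangles in /openapi.json.
                "application/problem+json": {
                    "schema": {"$ref": "#/components/schemas/ProblemDetails"}
                }
            },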
        }
    },
)
async def get_order(order_id: str) -> OrderResponse:
    ...

File uploads: why not Base64 in JSON

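A file is binary data. If you pass it as a Base64 string in a JSON body, the payload grows by about a third (every 3 bytes become 4 characters), and the server has to parse the whole JSON document before it can touch the file, so you lose the ability to stream the data. The right transport is multipart/form-data, which FastAPI exposes through the UploadFile type.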
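The size overhead is easy to verify with the standard library. The helper name base64_size is hypothetical; it only applies the 3-bytes-to-4-characters rule for padded Base64.

```python
import base64
import math


def base64_size(raw_bytes: int) -> int:
    """Length of the padded Base64 encoding of raw_bytes bytes."""
    return 4 * math.ceil(raw_bytes / 3)


# Cross-check the formula against a real encoding of a small payload
payload = bytes(3000)
assert len(base64.b64encode(payload)) == base64_size(3000)

ten_mb = 10 * 1024 * 1024
overhead = base64_size(ten_mb) / ten_mb - 1
print(f"{ten_mb} bytes -> {base64_size(ten_mb)} characters (+{overhead:.1%})")
```

For a 10 MB file this adds roughly 3.3 MB before JSON string overhead is even counted.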

Accepting a file via UploadFile

from fastapi import APIRouter, UploadFile, File, Form
from fastapi.responses import StreamingResponse, Response
from pydantic import BaseModel
from datetime import datetime, timezone

router = APIRouter(prefix="/api/v1", tags=["Orders"])


class AttachmentResponse(BaseModel):
    attachmentId: str
    fileName: str
    contentType: str
    size: int
    uploadedAt: str


@router.post(
    "/orders/{order_id}/attachments",
    response_model=AttachmentResponse,
    status_code=201,
    operation_id="uploadOrderAttachment",
    summary="Upload an attachment to an order",
    response_model_exclude_none=True,
)
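async def upload_order_attachment(
    order_id: str,
    response: Response,  # injected by FastAPI; used only to set the Location header
    file: UploadFile = File(..., description="Max 10 MB. PDF, PNG, JPG"),
    description: str | None = Form(None, max_length=500),
) -> AttachmentResponse:
    # Check the declared type first so an unsupported file is rejected before it is read.
    # FileTooLargeError and UnsupportedMediaTypeError are application exceptions that are
    # assumed to be defined elsewhere and mapped to 413 and 415 by exception handlers.
    allowed = {"application/pdf", "image/png", "image/jpeg"}
    if file.content_type not in allowed:
        raise UnsupportedMediaTypeError()

    content = await file.read()  # loads the whole upload into memory
    if len(content) > 10 * 1024 * 1024:
        raise FileTooLargeError()

    attachment_id = "att-" + order_id[:8]  # placeholder; a real service would generate an id
    response.headers["Location"] = (
        f"/api/v1/orders/{order_id}/attachments/{attachment_id}"
    )

    return AttachmentResponse(
        attachmentId=attachment_id,
        fileName=file.filename or "unnamed",  # filename is optional in multipart parts
        contentType=file.content_type,
        size=len(content),
        uploadedAt=datetime.now(timezone.utc).isoformat(),
    )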

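UploadFile gives access to filename, content_type, and the contents via await file.read(). Limits on size and allowed types are checked explicitly in the handler. Keep in mind that read() without arguments loads the entire upload into memory, and that both filename and content_type are supplied by the client, so treat them as untrusted input.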
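When the size limit matters, one option is to read the upload in chunks and stop as soon as the limit is exceeded, rather than reading everything and measuring afterwards. The sketch below relies only on an async read(size) method, which UploadFile provides; read_limited and UploadTooLarge are hypothetical names.

```python
from typing import Protocol


class AsyncReadable(Protocol):
    """Anything with an async read(size) method, such as FastAPI's UploadFile."""

    async def read(self, size: int = -1) -> bytes: ...


class UploadTooLarge(Exception):
    """Hypothetical error; an exception handler would map it to 413."""


async def read_limited(
    file: AsyncReadable,
    max_bytes: int,
    chunk_size: int = 64 * 1024,
) -> bytes:
    """Read the upload chunk by chunk, aborting once max_bytes is exceeded."""
    chunks: list[bytes] = []
    total = 0
    while chunk := await file.read(chunk_size):
        total += len(chunk)
        if total > max_bytes:
            # Stop early: at most one chunk beyond the limit has been read
            raise UploadTooLarge(f"upload exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)
```

In the handler this would replace the bare await file.read() call, for example content = await read_limited(file, 10 * 1024 * 1024). The result is still held in memory; writing each chunk to storage inside the loop would avoid that too.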

What the HTTP request looks like

POST /api/v1/orders/ord-9182/attachments
Content-Type: multipart/form-data; boundary=----Boundary7MA4

------Boundary7MA4
Content-Disposition: form-data; name="file"; filename="invoice.pdf"
Content-Type: application/pdf

<binary data>
------Boundary7MA4
Content-Disposition: form-data; name="description"

February invoice
------Boundary7MA4--

A 201 response with metadata

{
  "attachmentId": "att-ord-9182",
  "fileName": "invoice.pdf",
  "contentType": "application/pdf",
  "size": 204800,
  "uploadedAt": "2026-06-19T09:15:00+00:00"
}

The response combines three things: the 201 Created status, a Location header pointing at the new attachment, and the full resource representation in the body.

Downloading via StreamingResponse

import aiofiles

@router.get(
    "/orders/{order_id}/attachments/{attachment_id}",
    operation_id="downloadOrderAttachment",
    summary="Download an order attachment",
)
async def download_order_attachment(
    order_id: str,
    attachment_id: str,
) -> StreamingResponse:
    file_path = resolve_attachment_path(order_id, attachment_id)
    file_name = "invoice.pdf"
    content_type = "application/pdf"

    async def file_stream():
        async with aiofiles.open(file_path, "rb") as f:
            while chunk := await f.read(65536):
                yield chunk

    return StreamingResponse(
        file_stream(),
        media_type=content_type,
        headers={
            "Content-Disposition": f'attachment; filename="{file_name}"',
        },
    )

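The Content-Disposition: attachment; filename="invoice.pdf" header tells the browser to save the response as a file instead of rendering it, and suggests the name to save it under. Without it, browsers and HTTP clients typically fall back to a name derived from the URL.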
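File names that come from users may contain quotes or non-ASCII characters, which a plain f-string would pass into the header unchanged. A possible way to build the value is to send an ASCII fallback in filename and the percent-encoded UTF-8 name in filename*, as described in RFC 6266. The helper name content_disposition is hypothetical.

```python
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Build an attachment Content-Disposition value for an arbitrary file name."""
    # Replace characters that could end the quoted string or inject a header line
    safe = "".join(
        "_" if ch in '"\\' or not ch.isprintable() else ch
        for ch in filename
    )
    # Plain-ASCII fallback for clients that ignore filename*
    fallback = safe.encode("ascii", "replace").decode("ascii")
    # Percent-encoded UTF-8 form for clients that understand filename*
    encoded = quote(filename, safe="")
    return f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"


print(content_disposition("invoice.pdf"))
```

The returned string can be passed as the Content-Disposition value in the headers of StreamingResponse.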

For small static files, FileResponse is a good fit — it sets Content-Disposition and Content-Length itself:

from fastapi.responses import FileResponse

return FileResponse(
    path=file_path,
    media_type="application/pdf",
    filename="invoice.pdf",
)

Deprecation: how to gracefully retire an old endpoint

Sooner or later the API changes: a /v2 appears, and the old route becomes deprecated. You can't delete it abruptly — clients don't have time to migrate. The right path is to declare deprecation, give a deadline, and only then remove it.

In FastAPI this takes two steps while the endpoint is still live, plus a third once the deadline has passed.

Step 1. Mark it in OpenAPI

@router.get(
    "/orders/{order_id}/status",
    operation_id="getOrderStatusLegacy",
    summary="Get order status",
    deprecated=True,
    description=(
        "DEPRECATED: use GET /api/v2/orders/{order_id}. "
        "Will be removed after 2026-12-01."
    ),
)
async def get_order_status_legacy(order_id: str):
    ...

deprecated=True marks the operation in /openapi.json — Swagger UI strikes it through and highlights it with a warning. Developers see this when browsing the documentation.

Step 2. Add HTTP headers

The OpenAPI mark is only visible when browsing the documentation. But clients already using the endpoint don't read the documentation. For them there are HTTP headers in every response. It's convenient to add them via Depends:

from fastapi import Depends, Response

# Fixed at import time. Declaring these as parameters of the dependency would make
# FastAPI expose them as query parameters that any caller could override.
SUNSET_DATE = "Tue, 01 Dec 2026 00:00:00 GMT"  # HTTP-date; 1 December 2026 is a Tuesday
SUCCESSOR_TEMPLATE = "/api/v2/orders/{order_id}"


def deprecation_headers(order_id: str, response: Response) -> None:
    # order_id is resolved from the path of the route that uses this dependency
    successor = SUCCESSOR_TEMPLATE.format(order_id=order_id)
    response.headers["Sunset"] = SUNSET_DATE
    response.headers["Deprecation"] = "true"
    response.headers["Link"] = f'<{successor}>; rel="successor-version"'


@router.get(
    "/orders/{order_id}/status",
    operation_id="getOrderStatusLegacy",
    summary="Get order status",
    deprecated=True,
    description=(
        "DEPRECATED: use GET /api/v2/orders/{order_id}. "
        "Will be removed after 2026-12-01."
    ),
    dependencies=[Depends(deprecation_headers)],
)
async def get_order_status_legacy(order_id: str):
    return {"status": "PROCESSING"}

Every response will contain:

HTTP/1.1 200 OK
Sunset: Tue, 01 Dec 2026 00:00:00 GMT
Deprecation: true
Link: </api/v2/orders/ord-9182>; rel="successor-version"

{ "status": "PROCESSING" }

  • Sunset (RFC 8594): the exact shutoff date in HTTP-date format. Without a date, the client can't plan the migration.
  • Deprecation: a machine-readable flag for SDKs and monitoring. The value true follows the early drafts of the header; RFC 9745 defines the value as a date in structured-field form (for example @1688169599), so check which form your clients expect.
  • Link with rel="successor-version": the client knows where to migrate.

Step 3. After the Sunset date — 410 Gone

Once the Sunset date arrives, the handler is replaced with 410 Gone. A 404 code isn't right here: 404 means "I don't know of any such thing", while 410 means "I know it, but I removed it deliberately".

from fastapi.responses import JSONResponse
from datetime import datetime, timezone


SUNSET = datetime(2026, 12, 1, tzinfo=timezone.utc)


@router.get(
    "/orders/{order_id}/status",
    operation_id="getOrderStatusRemoved",
    include_in_schema=False,
)
async def get_order_status_gone(order_id: str):
    if datetime.now(timezone.utc) >= SUNSET:
        return JSONResponse(
            status_code=410,
            media_type="application/problem+json",
            content={
                "type": "urn:problem:order-service:endpoint-removed",
                "status": 410,
                "title": "Gone",
                "detail": (
                    "The endpoint has been removed. "
                    # f-string so the client sees the concrete replacement URL
                    f"Use GET /api/v2/orders/{order_id}."
                ),
                "code": "ENDPOINT_REMOVED",
            },
        )
    return {"status": "PROCESSING"}

include_in_schema=False — the removed endpoint doesn't appear in /openapi.json, but it still physically accepts requests and responds with 410. The client gets a clear error pointing to the alternative.

A common rule of thumb for the period between announcing the deprecation and the Sunset date is 6 to 12 months, which is usually enough for large consumers to switch over.
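Hand-writing the HTTP-date for the Sunset header makes it easy to get the weekday wrong. A small sketch that derives the value from the announcement date and a notice period, using only the standard library, is shown below. The function name sunset_http_date and the 183-day default (roughly six months) are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def sunset_http_date(announced: datetime, notice_days: int = 183) -> str:
    """Return the Sunset header value for a deprecation announced at `announced`."""
    deadline = announced.astimezone(timezone.utc) + timedelta(days=notice_days)
    # usegmt=True produces the "GMT" suffix required by the HTTP-date format
    return format_datetime(deadline, usegmt=True)


announced = datetime(2026, 6, 1, tzinfo=timezone.utc)
print(sunset_http_date(announced))  # Tue, 01 Dec 2026 00:00:00 GMT
```

The weekday is computed from the date, so it always matches.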

In short

  • Rate limiting is implemented in middleware, not in the handler. A 429 response should include Retry-After and RateLimit-Limit/Remaining/Reset.
  • RateLimit-* is added not only to 429 but to every successful response — the client sees its remaining budget in advance.
  • Files are accepted via UploadFile + multipart/form-data, not via Base64 in JSON.
  • When downloading, always set Content-Disposition: attachment; filename="..." — otherwise the client doesn't know the file name.
  • Deprecation = two steps: deprecated=True in the decorator (visible in Swagger) + Sunset/Deprecation/Link headers in the responses (visible to clients).
  • A Sunset without a concrete date is meaningless: set the date right away, at least 6 months out.
  • After the Sunset date, the endpoint returns 410 Gone with a pointer to the alternative.