Deployment¶
This guide covers how to deploy the QuoinAPI application using Docker.
Docker Deployment¶
The project includes a production-ready Dockerfile and docker-compose.yml
for containerized deployment.
Local Docker Development¶
Run the entire stack (Application + PostgreSQL Database) locally using Docker Compose:
This command:
- Builds the application Docker image
- Starts PostgreSQL container
- Starts the application container
- Configures networking between containers
Access the application at http://localhost:8000 (or via http://api.quoin-api.orb.local if using OrbStack).
Stop Containers¶
To stop and remove all containers:
Production Deployment¶
Building the Production Image¶
Build the Docker image manually for production:
The Dockerfile uses a multi-stage build:
# Stage 1: Builder — install dependencies only
FROM python:3.14-slim-bookworm AS builder
COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv
WORKDIR /app
COPY pyproject.toml uv.lock* README.md ./
RUN uv sync --no-dev --frozen
# Stage 2: Final — lean production image
FROM python:3.14-slim-bookworm
WORKDIR /app
COPY --from=builder /app/.venv /app/.venv
COPY app/ app/
COPY alembic/ alembic/
COPY alembic.ini .
ENV PATH="/app/.venv/bin:$PATH"
ENV PYTHONPATH="/app"
# Non-root user for security
RUN addgroup --system --gid 1001 quoin && \
adduser --system --uid 1001 --ingroup quoin quoin
RUN chown -R quoin:quoin /app
USER quoin
CMD ["fastapi", "run", "app/main.py", "--host", "0.0.0.0", "--port", "8000"]
Running in Production¶
With Docker Compose:
With Docker CLI:
# Run PostgreSQL
docker run -d \
--name postgres \
-e QUOIN_POSTGRES_USER=postgres \
-e QUOIN_POSTGRES_PASSWORD=postgres \
-e QUOIN_POSTGRES_DB=app_db \
-p 5432:5432 \
postgres:17-alpine
# Run Application
docker run -d \
--name quoin-api \
--link postgres:db \
-p 8000:8000 \
-e QUOIN_POSTGRES_HOST=db \
quoin-api:latest
Environment Variables¶
Configure the application using environment variables. See Configuration Guide for all available options.
Production Essentials:
# Application
QUOIN_ENV=production
QUOIN_OTEL_ENABLED=true
# Database
QUOIN_POSTGRES_HOST=db
QUOIN_POSTGRES_PORT=5432
QUOIN_POSTGRES_USER=postgres
QUOIN_POSTGRES_PASSWORD=<strong-password>
QUOIN_POSTGRES_DB=app_db
Security Warning: Never commit
.envfiles with production credentials to version control!
Health Checks¶
The application includes dedicated endpoints for health and readiness monitoring:
Health Probe¶
The /health endpoint checks if the application process is running:
Expected response:
Readiness Probe¶
The /ready endpoint checks if the application is ready to accept traffic (e.g., database is connected):
Expected response (HTTP 200 OK):
It returns an HTTP 503 error if the database is unavailable, or once graceful shutdown has begun (see below) so orchestrators stop routing new traffic to the draining instance.
Use these endpoints for:
- Docker health checks -
HEALTHCHECKdirective - Load balancer probes - Kubernetes liveness/readiness
- Monitoring systems - Uptime tracking
Graceful Shutdown¶
On shutdown the application drains in-flight requests before releasing resources. The sequence in the lifespan handler is:
- The readiness probe flips to 503 (
is_shutting_down), so load balancers and Kubernetes stop routing new traffic to the instance. - In-flight requests are awaited until the in-flight counter reaches
zero, bounded by
QUOIN_SHUTDOWN_DRAIN_TIMEOUT(default30.0seconds;<=0skips the wait). A clean drain logsshutdown_drained; a timeout logsshutdown_drain_timeoutwith the residual count. - The database engine is disposed — only after the drain — so no in-flight request loses its connection mid-query.
The in-flight gauge is maintained by InFlightRequestMiddleware, which
brackets every HTTP request that reaches a handler. The /health and
/ready probe paths are excluded so orchestrator polling never keeps
the gauge from reaching zero. WebSocket connections are outside the
gauge and are not drained.
Relationship to the uvicorn server¶
The server matters here. fastapi run (uvicorn) already drains
connection-level in-flight requests before it runs the lifespan
shutdown: on SIGTERM it stops accepting new connections, waits for
open connections to finish their responses (bounded by
--timeout-graceful-shutdown), and only then triggers the lifespan
shutdown where the steps above execute. In that default setup the
application-level drain is largely a safety net rather than the
primary drain mechanism.
The application-level drain still earns its place:
- Server-agnostic behaviour — the same drain semantics hold under uvicorn, Gunicorn with uvicorn workers, Hypercorn, or any ASGI server, without per-launcher graceful-timeout flags.
- Safe engine disposal ordering — the engine is guaranteed disposed after in-flight work completes, so a request never has its database connection torn out from under it.
- A single, app-controlled timeout —
QUOIN_SHUTDOWN_DRAIN_TIMEOUTis the one knob, with structuredshutdown_drained/shutdown_drain_timeoutlogs for observability. - Explicit readiness signalling — the 503 flip is what actually
removes the instance from a load balancer's rotation; uvicorn's
connection drain does not change what
/readyreports.
Kubernetes wiring¶
For zero-downtime rollouts, give the orchestrator room to react to the readiness flip and the drain:
- Set
terminationGracePeriodSecondsgreater than or equal toQUOIN_SHUTDOWN_DRAIN_TIMEOUTso the pod is not force-killed mid-drain. - Keep the server's
--timeout-graceful-shutdowngreater than or equal toQUOIN_SHUTDOWN_DRAIN_TIMEOUT. If it is shorter, uvicorn cancels in-flight tasks before the lifespan drain runs; cancellation still releases the in-flight counter (via thefinallyinInFlightRequestMiddleware), so the drain reports immediate success — but those requests were terminated, not gracefully finished. - Point the readiness probe at
/ready; once it returns 503 the Endpoints controller removes the pod from Service rotation. Readiness probes poll atperiodSecondsand needfailureThresholdconsecutive failures first, so with the Kubernetes defaults (10s x 3) new traffic can still arrive for up to ~30s after/readybegins failing. SizeterminationGracePeriodSecondsto cover both this probe delay and the drain timeout.
See Also¶
- Release Workflow — Version management and tagging
- Observability — Setting up logs and traces
- Troubleshooting — Common deployment issues
- Dockerfile — Production image configuration