Introduction
As a DevOps lead, you’ve probably faced the dreaded moment when a production deployment brings the site down. Zero‑downtime deployments are no longer a luxury; they’re a baseline expectation for modern services. This checklist walks you through a practical, Docker‑centric workflow that leverages Nginx as a smart reverse proxy, blue‑green releases, and observability hooks. Follow each step, and you’ll be able to push new code without breaking existing traffic.
1. Prepare Your Docker Images
- Immutable builds: Use a multi-stage Dockerfile so the final image contains only runtime artifacts.
- Tagging strategy: Tag images with both a semantic version (v1.3.2) and a version-plus-short-Git-SHA variant (v1.3.2-a1b2c3).
- Scan for vulnerabilities: Run docker scout cves (the successor to the deprecated docker scan) or integrate Trivy into your CI pipeline, as sketched below.
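Wiring the tag-and-scan steps together on a build host might look like this (a sketch; Trivy is assumed to be installed, and the version number and script name are illustrative):
# build-and-scan.sh (illustrative)
GIT_SHA=$(git rev-parse --short HEAD)
docker build -t myapp:v1.3.2 -t myapp:v1.3.2-${GIT_SHA} .
# Fail the build on HIGH/CRITICAL findings
trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:v1.3.2-${GIT_SHA}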
# Dockerfile (Node.js example)
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM node:20-alpine AS runtime
WORKDIR /app
COPY package*.json ./
# Install production dependencies only; devDependencies stay in the builder stage
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
EXPOSE 3000
CMD ["node", "dist/index.js"]
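Before the image goes anywhere near production, a quick local smoke test is worth the thirty seconds (the tag and container name are illustrative; the /healthz endpoint is the one used by the deploy script later on):
docker build -t myapp:v1.3.2-a1b2c3 .
docker run -d --rm -p 3000:3000 --name myapp-smoke myapp:v1.3.2-a1b2c3
curl -sSf http://localhost:3000/healthz
docker stop myapp-smoke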
2. Set Up Nginx as a Traffic Router
Nginx will sit in front of two identical app containers – blue (current) and green (next). By swapping the upstream group, you achieve an instant cut‑over.
# docker-compose.yml (excerpt)
version: "3.8"
services:
  nginx:
    image: nginx:stable-alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - app_blue
      - app_green
  app_blue:
    image: myapp:v1.3.2-a1b2c3
    environment:
      - NODE_ENV=production
    expose:
      - "3000"
  app_green:
    image: myapp:${GREEN_TAG:-latest}  # the deploy script exports GREEN_TAG
    environment:
      - NODE_ENV=production
    expose:
      - "3000"
# nginx.conf (simplified)
worker_processes auto;
events { worker_connections 1024; }

http {
  upstream backend {
    server app_blue:3000 max_fails=3 fail_timeout=30s;
    # The deploy script rewrites this line to point at app_green
  }

  server {
    listen 80;

    location / {
      proxy_pass http://backend;
      proxy_set_header Host $host;
      proxy_set_header X-Real-IP $remote_addr;
    }
  }
}
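Whenever this file changes, validate it before reloading; a broken reload can take out both colors at once:
docker compose exec nginx nginx -t && docker compose exec nginx nginx -s reload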
3. Automate Blue‑Green Swaps
A small Bash helper can drive the swap. It performs three actions:
- Pull the new image and start the green container alongside blue.
- Wait for green to pass its health checks.
- Point the Nginx upstream at green, then stop the blue container.
#!/usr/bin/env bash
set -euo pipefail

NEW_TAG=${1:?usage: swap.sh <tag>}   # e.g. v1.4.0-d4e5f6

# 1️⃣ Pull the new image
docker pull "myapp:${NEW_TAG}"

# 2️⃣ Spin up the green service with the new tag (compose reads GREEN_TAG)
export GREEN_TAG="${NEW_TAG}"
docker compose up -d --no-deps app_green

# 3️⃣ Health-check green directly; traffic is still flowing to blue
healthy=false
for i in {1..30}; do
  if docker compose exec -T app_green wget -qO- http://localhost:3000/healthz 2>/dev/null | grep -q "ok"; then
    healthy=true
    echo "✅ Green is healthy"
    break
  fi
  echo "⏳ Waiting for green…"
  sleep 2
done
if [ "${healthy}" != true ]; then
  echo "❌ Green never became healthy; aborting (blue keeps serving traffic)"
  docker compose stop app_green
  exit 1
fi

# 4️⃣ Point the Nginx upstream at green and reload (nginx.conf is bind-mounted,
#    so editing it on the host is visible inside the container)
sed -i 's/app_blue:3000/app_green:3000/' nginx.conf
docker compose exec nginx nginx -s reload

# 5️⃣ Stop blue now that green serves all traffic
docker compose stop app_blue

# 6️⃣ Record the tag for rollbacks and clean up old images
echo "${NEW_TAG}" > /var/deploy/last_successful_tag
docker image prune -f

echo "🚀 Deploy complete – blue‑green swap successful"
4. Integrate with CI/CD
- Pipeline stage: Build → Scan → Push → Deploy.
- GitHub Actions example:
name: Deploy
on:
  push:
    tags: ["v*.*.*"]

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    env:
      TAG: ${{ github.ref_name }}   # the pushed tag, e.g. v1.4.0
    steps:
      - uses: actions/checkout@v3
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Build image
        run: docker build -t myapp:${TAG} .
      - name: Scan image
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: myapp:${{ env.TAG }}
      - name: Push to registry
        run: |
          echo ${{ secrets.REGISTRY_PASSWORD }} | docker login -u ${{ secrets.REGISTRY_USER }} --password-stdin registry.example.com
          docker tag myapp:${TAG} registry.example.com/myapp:${TAG}
          docker push registry.example.com/myapp:${TAG}
      - name: Trigger swap script on server
        run: ssh deploy@prod "./swap.sh ${TAG}"   # assumes an SSH key is provisioned on the runner
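With this in place, cutting a release is just a matter of pushing a tag:
git tag v1.4.0
git push origin v1.4.0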
5. Observability & Logging
- Structured logs: Output JSON logs from the app; forward them to Loki or Elasticsearch.
- Metrics: Expose a Prometheus /metrics endpoint; scrape both the blue and green containers.
- Health checks: Open-source Nginx only does passive failure detection (max_fails plus proxy_next_upstream to retry a failed request on another server); active health checks require NGINX Plus or an external health-checking sidecar.
- Alerting: Set up an alert if the green container fails its health check three times in a row (a minimal Bash sketch follows).
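Here is that three-strikes rule as a minimal sketch, assuming the app answers on /healthz inside the container; wire the echo up to your real pager or webhook:
# health-watch.sh (illustrative)
fails=0
while true; do
  if docker compose exec -T app_green wget -qO- http://localhost:3000/healthz >/dev/null 2>&1; then
    fails=0
  else
    fails=$((fails + 1))
  fi
  if [ "$fails" -ge 3 ]; then
    echo "ALERT: app_green failed 3 consecutive health checks"   # replace with your pager/webhook
    fails=0
  fi
  sleep 10
done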
6. Rollback Plan
Even with a checklist, things can go sideways. The swap script records each successful deploy in /var/deploy/last_successful_tag, so reversing the swap is a single command:
# Rollback to the previous known-good tag
PREV_TAG=$(cat /var/deploy/last_successful_tag)
./swap.sh "$PREV_TAG"
- Ensure the rollback script also updates the blue container and removes the faulty green instance.
- Verify rollback health before announcing success.
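A quick check through the proxy is usually enough, for example:
curl -sSf http://localhost/healthz && echo "rollback verified"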
7. Security Hardening (Bonus)
- Least-privilege containers: Run as a non-root user (USER node in the Dockerfile).
- TLS termination: Let Nginx handle HTTPS; use Certbot or other Let's Encrypt automation.
- Secret management: Pull secrets from Vault or AWS Secrets Manager at container start; never bake them into images (see the entrypoint sketch below).
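An entrypoint along these lines keeps credentials out of the image (a sketch; it assumes the vault CLI is present in the image, VAULT_ADDR/VAULT_TOKEN are injected at runtime, and secret/myapp/db is a hypothetical path):
#!/usr/bin/env sh
# entrypoint.sh (illustrative sketch)
set -eu
# Fetch the DB password at start-up instead of baking it into the image
DB_PASSWORD="$(vault kv get -field=password secret/myapp/db)"
export DB_PASSWORD
exec node dist/index.js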
Conclusion
Zero‑downtime deployments with Docker and Nginx become repeatable once you codify the steps above. By treating the blue‑green swap as an automated script, wiring it into your CI/CD pipeline, and backing it with solid observability, you can ship features several times a day without frightening your users. If you need help shipping this, the team at https://ramerlabs.com can help.