
Chapter 06: Docker for MLOps🔗

"Docker solves the 'it works on my machine' problem forever."


6.1 What is Docker?🔗

Docker is a platform for packaging applications and their dependencies into lightweight, portable units called containers.

Virtual Machine vs Container🔗

VIRTUAL MACHINE:                    CONTAINER:
┌───────────────────────┐           ┌───────────────────────┐
│  App A  │  App B      │           │  App A  │  App B      │
│─────────┼─────────────│           │─────────┴─────────────│
│ Libs A  │ Libs B      │           │ Libs A  │ Libs B      │
│─────────┼─────────────│           │───────────────────────│
│ Guest OS│ Guest OS    │           │ Docker Engine (thin)  │
│─────────┴─────────────│           │───────────────────────│
│      Hypervisor       │           │       Host OS         │
│───────────────────────│           └───────────────────────┘
│       Host OS         │
└───────────────────────┘

VMs: Heavy (GBs), slow boot   Containers: Lightweight (MBs), instant start

6.2 Core Docker Concepts🔗

Concept      What It Is
──────────   ─────────────────────────────────────────────────
Image        Read-only blueprint (like a recipe)
Container    Running instance of an image (like a cooked meal)
Dockerfile   Instructions to build an image
Registry     Storage for images (Docker Hub, GCR)
Volume       Persistent storage attached to containers
Network      Communication between containers

6.3 Dockerfile for ML Model Serving🔗

Dockerfile layers (each instruction = a layer):

┌────────────────────────────────────────┐
│  COPY . /app              ← your code  │
├────────────────────────────────────────┤
│  RUN pip install -r req.txt ← libs     │
├────────────────────────────────────────┤
│  WORKDIR /app             ← work dir   │
├────────────────────────────────────────┤
│  FROM python:3.10-slim    ← base image │
└────────────────────────────────────────┘

Production ML Dockerfile🔗

# Dockerfile
# ─── Stage 1: Build dependencies ───────────────────────────────
FROM python:3.10-slim AS builder

WORKDIR /app

# Install dependencies first (cached if requirements.txt unchanged)
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# ─── Stage 2: Production image ──────────────────────────────────
FROM python:3.10-slim AS production

WORKDIR /app

# Copy installed packages from builder stage
COPY --from=builder /install /usr/local

# Copy application code
COPY src/ ./src/
COPY models/ ./models/
COPY config/ ./config/

# Non-root user (security best practice)
RUN useradd --create-home appuser
USER appuser

# Expose API port
EXPOSE 8000

# Health check (python:3.10-slim ships without curl, so probe with the stdlib)
HEALTHCHECK --interval=30s --timeout=3s \
  CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1

# Start model server
CMD ["uvicorn", "src.serve:app", "--host", "0.0.0.0", "--port", "8000"]
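Everything in the build directory is sent to the Docker daemon as the build context, whether or not the Dockerfile copies it. A `.dockerignore` file keeps large or sensitive files out of the context; the entries below are a sketch to adapt to your repository:

```
# .dockerignore — illustrative entries; adjust to your repository
.git
__pycache__/
*.pyc
.venv/
data/            # raw training data doesn't belong in the serving image
notebooks/
.env             # never send secrets into the build context
```

This also protects layer caching: files excluded here can change without invalidating any `COPY` layer.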

6.4 ML Model Server (FastAPI)🔗

# src/serve.py
from fastapi import FastAPI
from pydantic import BaseModel
import pickle
import numpy as np

app = FastAPI(title="ML Model API", version="1.0")

# Load model at startup (pickle is only safe for model files you trust)
with open("models/model.pkl", "rb") as f:
    model = pickle.load(f)

class PredictionRequest(BaseModel):
    features: list[float]

class PredictionResponse(BaseModel):
    prediction: float
    confidence: float

@app.get("/health")
def health_check():
    return {"status": "healthy"}

@app.post("/predict", response_model=PredictionResponse)
def predict(request: PredictionRequest):
    features = np.array(request.features).reshape(1, -1)
    prediction = model.predict(features)[0]
    confidence = model.predict_proba(features).max()
    # cast NumPy scalars to plain Python floats for JSON serialization
    return PredictionResponse(prediction=float(prediction), confidence=float(confidence))
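To exercise the running container, a minimal stdlib client is enough. The sketch below assumes the default port mapping from this chapter (`-p 8000:8000`); the URL and feature vector are placeholders:

```python
# client.py — sketch of calling the /predict endpoint with only the stdlib.
# Assumes the container is running with -p 8000:8000 (an assumption, not
# something this chapter's server enforces).
import json
from urllib import request

API_URL = "http://localhost:8000/predict"  # placeholder host/port

def build_payload(features: list[float]) -> bytes:
    """Serialize features into the JSON body PredictionRequest expects."""
    return json.dumps({"features": features}).encode("utf-8")

def predict(features: list[float]) -> dict:
    """POST a feature vector and return the decoded PredictionResponse."""
    req = request.Request(
        API_URL,
        data=build_payload(features),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=5) as resp:
        return json.load(resp)
```

Once the server is up, `predict([5.1, 3.5, 1.4, 0.2])` returns a dict with `prediction` and `confidence` keys.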

6.5 Docker Commands Cheatsheet🔗

# ── Build ──────────────────────────────────────────
# Build image from Dockerfile in current directory
docker build -t my-ml-model:v1 .

# Build with build arguments
docker build --build-arg MODEL_VERSION=v2 -t my-ml-model:v2 .

# ── Run ───────────────────────────────────────────
# Run container (foreground)
docker run -p 8000:8000 my-ml-model:v1

# Run in background (detached)
docker run -d -p 8000:8000 --name ml-server my-ml-model:v1

# Run with volume mount (for model updates)
docker run -d -p 8000:8000 -v $(pwd)/models:/app/models my-ml-model:v1

# ── Inspect ────────────────────────────────────────
docker ps                          # running containers
docker logs ml-server              # view logs
docker exec -it ml-server bash     # shell into container
docker inspect ml-server           # detailed config

# ── Registry ───────────────────────────────────────
docker tag my-ml-model:v1 gcr.io/my-project/ml-model:v1
docker push gcr.io/my-project/ml-model:v1
docker pull gcr.io/my-project/ml-model:v1

# ── Clean up ───────────────────────────────────────
docker stop ml-server
docker rm ml-server
docker rmi my-ml-model:v1
docker system prune -a             # remove ALL unused resources
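`docker run -d` returns as soon as the container starts, but the server inside may still be loading the model. A small readiness poll avoids firing requests too early; this is a generic sketch, and the probe you pass in (e.g. a function that requests `/health` and swallows connection errors) is up to you:

```python
# wait_ready.py — poll a probe callable until it succeeds or a timeout expires.
import time
from typing import Callable

def wait_until_ready(probe: Callable[[], bool],
                     timeout: float = 30.0,
                     interval: float = 1.0) -> bool:
    """Call probe() every `interval` seconds.

    Returns True as soon as probe() succeeds, or False if `timeout`
    seconds elapse without a success.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval)
    return False
```

A probe for this chapter's server could wrap `urllib.request.urlopen("http://localhost:8000/health")` in a try/except and return True on HTTP 200.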

6.6 Docker Compose for Multi-Service ML Stack🔗

# docker-compose.yml
version: '3.8'   # note: Compose v2 ignores the top-level version key

services:

  # ML Model API
  ml-api:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - ./models:/app/models
    environment:
      - MODEL_PATH=/app/models/model.pkl
      - LOG_LEVEL=INFO
    depends_on:
      - redis
    healthcheck:
      # the slim base image has no curl; probe with Python's stdlib instead
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"]
      interval: 30s

  # MLflow Tracking Server
  mlflow:
    image: ghcr.io/mlflow/mlflow:latest
    ports:
      - "5000:5000"
    volumes:
      - mlflow_data:/mlflow
    command: mlflow server --host 0.0.0.0 --port 5000

  # Redis — feature cache
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

  # Prometheus — metrics collection
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./monitoring/prometheus.yml:/etc/prometheus/prometheus.yml

  # Grafana — dashboards
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    depends_on:
      - prometheus

volumes:
  mlflow_data:

# Start the entire stack
docker-compose up -d

# Scale the API to 3 instances
# (first remove the fixed "8000:8000" mapping — only one container can bind a
#  given host port; let Compose assign ports or front the replicas with a proxy)
docker-compose up -d --scale ml-api=3

# Stop everything
docker-compose down

6.7 Docker Layer Caching (Speed Optimization)🔗

# SLOW: copies code first, then installs (no caching)
COPY . /app
RUN pip install -r requirements.txt

# FAST: install deps first (cached unless requirements.txt changes)
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . /app          # ← only this layer rebuilds on code changes
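Docker decides whether a `COPY` layer can be reused by checksumming the instruction and the contents of the copied files. The toy model below is a hypothetical helper, not Docker's actual implementation, but it shows why editing source code invalidates only the final `COPY . /app` layer while the `requirements.txt` layer (and the expensive `pip install` after it) stays cached:

```python
# Toy model of Docker's layer-cache key: instruction text + file contents.
# Same inputs -> identical key -> cache hit; any change -> cache miss.
import hashlib
from pathlib import Path

def layer_cache_key(instruction: str, files: list[Path]) -> str:
    """Hash the instruction string plus the content of every copied file."""
    h = hashlib.sha256(instruction.encode("utf-8"))
    for f in sorted(files):
        h.update(f.read_bytes())
    return h.hexdigest()
```

Editing `src/serve.py` changes the key of `COPY . /app` but leaves the key of `COPY requirements.txt .` untouched, which is exactly the cache behavior the deps-first ordering exploits.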

6.8 Container Registries🔗

┌────────────────────────────────────────────────────────┐
│              CONTAINER REGISTRIES                      │
│                                                        │
│  Docker Hub     → hub.docker.com (public images)       │
│  GCR            → gcr.io (Google Container Registry)   │
│  GAR            → pkg.dev (Google Artifact Registry)   │
│  ECR            → Amazon Elastic Container Registry    │
│  ACR            → Azure Container Registry             │
│  GHCR           → ghcr.io (GitHub Container Registry)  │
└────────────────────────────────────────────────────────┘

Next Chapter → 07: Kubernetes