Docker Best Practices
Overview
Docker containerization provides consistent, reproducible runtime environments across development, testing, and production. A well-crafted Dockerfile creates small, secure images that deploy quickly and run reliably. This guide covers multi-stage builds, layer optimization, security hardening, and local development configurations.
Effective Docker usage balances multiple concerns: image size affects deployment speed and storage costs; layer structure impacts build times and cache efficiency; security configurations prevent vulnerabilities; and base image selection determines maintenance burden. Understanding these tradeoffs enables you to create optimal container images.
The containerization strategy follows the "build once, run anywhere" principle. The same Docker image built in CI/CD deploys to all environments without modification. Environment-specific configuration comes from external sources (environment variables, config maps) rather than being baked into the image.
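For example, the same tagged image can be promoted through every environment, with only the injected configuration differing. A minimal sketch (image name, profile variable, and database variables here are illustrative, not prescribed by this guide):

```shell
# One image, promoted unchanged; only the injected configuration differs
docker run -e SPRING_PROFILES_ACTIVE=dev  -e DATABASE_URL="$DEV_DB"  payment-service:2.1.0
docker run -e SPRING_PROFILES_ACTIVE=prod -e DATABASE_URL="$PROD_DB" payment-service:2.1.0
```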
Core Principles
- Multi-Stage Builds: Separate build and runtime environments
- Small Images: Use minimal base images (Alpine, Distroless)
- Layer Optimization: Order layers from least to most frequently changed
- Security: Run as non-root user, scan for vulnerabilities
- Reproducibility: Pin versions, use specific tags (not latest)
- Build Cache: Leverage Docker layer caching for fast builds
- Health Checks: Include health check endpoints
Multi-Stage Builds
Multi-stage builds separate the build environment from the runtime environment. The build stage includes all tools needed to compile and package the application (Maven, npm, compilers). The runtime stage contains only what's needed to run the compiled application (JRE, Node.js runtime).
Why Multi-Stage Builds Matter
Single-stage builds create bloated images containing both build tools and runtime artifacts. A Java application might need Maven (50+ MB) and the full JDK (200+ MB) to build, but only needs the JRE (50 MB) and the compiled JAR to run. Including build tools in the runtime image wastes space and increases attack surface - an attacker who compromises your container gains access to compilers and build tools they could use for further exploitation.
Multi-stage builds use the FROM instruction multiple times, with each FROM starting a new stage. The COPY --from=builder instruction copies files from earlier stages, allowing you to selectively include only necessary artifacts in the final image. Earlier stages are discarded after the build completes.
For Spring Boot applications, multi-stage builds are particularly effective because Spring Boot's fat JARs contain all dependencies. You can extract the JAR's layers (dependencies, Spring Boot loader, application classes) and copy them separately, enabling Docker to cache the large dependency layer independently from your frequently-changing application code. For more on Spring Boot Docker optimization, see Spring Boot General.
Java Spring Boot Application
# syntax=docker/dockerfile:1
# Stage 1: Build
FROM eclipse-temurin:21-jdk-alpine AS builder
WORKDIR /build
# Copy Gradle wrapper and build files first (layer caching)
COPY gradlew .
COPY gradle gradle
COPY build.gradle settings.gradle ./
# Download dependencies (cached layer - only re-runs when build files change)
RUN ./gradlew dependencies --no-daemon --quiet
# Copy source code
COPY src src
# Build application and extract layers for optimal caching
RUN ./gradlew bootJar --no-daemon && \
java -Djarmode=layertools -jar build/libs/*.jar extract --destination build/extracted
# Stage 2: Runtime
FROM eclipse-temurin:21-jre-alpine
# Create non-root user
RUN addgroup -g 1001 appuser && \
adduser -D -u 1001 -G appuser appuser
WORKDIR /app
# Copy layered JAR contents (dependencies change rarely, application code changes often)
COPY --from=builder --chown=appuser:appuser /build/extracted/dependencies/ ./
COPY --from=builder --chown=appuser:appuser /build/extracted/spring-boot-loader/ ./
COPY --from=builder --chown=appuser:appuser /build/extracted/snapshot-dependencies/ ./
COPY --from=builder --chown=appuser:appuser /build/extracted/application/ ./
USER appuser
EXPOSE 8080
HEALTHCHECK --interval=30s --timeout=3s --start-period=40s --retries=3 \
CMD wget --no-verbose --tries=1 --spider http://localhost:8080/actuator/health || exit 1
ENTRYPOINT ["java", "org.springframework.boot.loader.launch.JarLauncher"]
Node.js/TypeScript Application
# syntax=docker/dockerfile:1
# Stage 1: Build
FROM node:22-alpine AS builder
WORKDIR /build
# Copy dependency files
COPY package*.json ./
COPY tsconfig.json ./
# Install all dependencies (devDependencies are needed to compile TypeScript)
RUN npm ci && \
npm cache clean --force
# Copy source code
COPY src src
# Build TypeScript, then drop devDependencies so only runtime deps are copied
RUN npm run build && \
npm prune --omit=dev
# Stage 2: Runtime
FROM node:22-alpine
# Install dumb-init for proper signal handling
RUN apk add --no-cache dumb-init
# Create non-root user
RUN addgroup -g 1001 appuser && \
adduser -D -u 1001 -G appuser appuser
WORKDIR /app
# Copy dependencies and built code
COPY --from=builder --chown=appuser:appuser /build/node_modules ./node_modules
COPY --from=builder --chown=appuser:appuser /build/dist ./dist
COPY package*.json ./
# Switch to non-root user
USER appuser
# Expose port
EXPOSE 3000
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
CMD node -e "require('http').get('http://localhost:3000/health', (r) => {process.exit(r.statusCode === 200 ? 0 : 1)})"
# Run with dumb-init
ENTRYPOINT ["dumb-init", "--"]
CMD ["node", "dist/index.js"]
Layer Optimization
Docker builds images as a series of layers, each layer representing a Dockerfile instruction (RUN, COPY, ADD). Docker caches these layers and reuses them in subsequent builds if the instruction and its inputs haven't changed. Proper layer ordering dramatically reduces build times.
How Docker Layer Caching Works
When Docker builds an image, it checks each instruction against its cache. If an instruction and its context (files being copied, commands being run) match a cached layer, Docker reuses the cache. However, once a layer changes, all subsequent layers are invalidated - Docker must rebuild them even if they haven't changed.
This invalidation behavior makes layer ordering critical. If you copy all source code in one instruction and then install dependencies, every code change invalidates the dependency installation layer, forcing a complete re-download of all dependencies. Reordering to install dependencies first means code changes don't affect the dependency layer - it stays cached.
For Gradle projects, copying build.gradle and settings.gradle first and running the dependencies task creates a dependency layer that only invalidates when dependencies change. Most builds only change source code, so this layer remains cached. Similarly, for npm projects, copying package.json and running npm ci caches the node_modules layer.
Order Matters
# BAD: Source code copied before dependencies
FROM eclipse-temurin:21-jdk-alpine
WORKDIR /app
# Any file change invalidates this layer and everything after it
COPY . .
RUN ./gradlew bootJar --no-daemon
# GOOD: Dependencies cached separately
FROM eclipse-temurin:21-jdk-alpine
WORKDIR /app
# Layer 1: Dependencies (rarely changes)
COPY gradlew .
COPY gradle gradle
COPY build.gradle settings.gradle ./
RUN ./gradlew dependencies --no-daemon --quiet
# Layer 2: Source code (changes frequently)
COPY src src
RUN ./gradlew bootJar --no-daemon
The good example creates two separate layers. The first layer (build files + dependency resolution) changes infrequently — only when you add/update dependencies. The second layer (source code + compilation) changes frequently — every commit. This separation means most builds reuse the expensive dependency download layer.
Minimize Layers
# BAD: Too many layers
RUN apk update
RUN apk add curl
RUN apk add wget
RUN apk add bash
# GOOD: Combine into a single layer
# --no-cache fetches a fresh index and skips the local package cache,
# so no separate apk update or cache cleanup is needed
RUN apk add --no-cache \
curl \
wget \
bash
Image Size Reduction
Use Minimal Base Images
# BAD: Large base image (~500MB)
FROM openjdk:21
# GOOD: Alpine-based image (~150MB)
FROM eclipse-temurin:21-jre-alpine
# BETTER: Distroless image (~50MB)
FROM gcr.io/distroless/java21-debian12
Remove Build Tools in Runtime Image
# Multi-stage build removes Maven/npm from final image
FROM maven:3.9-eclipse-temurin-21-alpine AS builder
# ... build steps
FROM eclipse-temurin:21-jre-alpine
# Only JRE, no Maven = smaller image
COPY --from=builder /build/target/*.jar app.jar
Clean Up Temporary Files
RUN apt-get update && \
apt-get install -y --no-install-recommends curl && \
# Clean up in same layer
apt-get clean && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
Security Best Practices
Container security prevents unauthorized access and limits damage from successful exploits. Security configurations follow the principle of least privilege - containers run with minimal permissions necessary for their function.
Understanding Container Security
By default, Docker containers run as root (UID 0). If an attacker exploits a vulnerability in your application and gains code execution, they have root access within the container. While container isolation provides some protection, root access enables privilege escalation attacks that could break out of the container.
Running as a non-root user limits exploit impact. Even if an attacker gains code execution, they're constrained by the non-root user's limited permissions. They can't install system packages, modify system files, or bind to privileged ports (< 1024).
Creating a dedicated user (appuser) with a fixed UID (1001) ensures consistent permissions across environments. Some orchestration platforms (like Kubernetes with the restricted Pod Security Standard) enforce non-root containers, so following this practice prevents deployment failures.
For comprehensive security guidance including input validation, authentication, and encryption, see Security Best Practices.
Non-Root User
# Create user and group
RUN addgroup -g 1001 appuser && \
adduser -D -u 1001 -G appuser appuser
# Set ownership while copying
COPY --from=builder --chown=appuser:appuser /build/app.jar app.jar
# Switch to non-root
USER appuser
# Verify
RUN whoami # Should output: appuser
The --chown flag in COPY ensures copied files are owned by appuser rather than root. Without this, the application running as appuser might not be able to read its own files.
Scan for Vulnerabilities
# Trivy scanner
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
aquasec/trivy:latest image payment-service:latest
# Fail on high/critical vulnerabilities
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
aquasec/trivy:latest image --exit-code 1 --severity HIGH,CRITICAL \
payment-service:latest
Read-Only Filesystem
# Application should not write to filesystem
FROM eclipse-temurin:21-jre-alpine
USER appuser
# Run with read-only filesystem
# docker run --read-only payment-service:latest
Drop Capabilities
# docker-compose.yml
services:
payment-service:
image: payment-service:latest
cap_drop:
- ALL
cap_add:
- NET_BIND_SERVICE # Only if needed
security_opt:
- no-new-privileges:true
Dockerfile Best Practices
Pin Versions
# BAD: Using latest
FROM node:latest
RUN npm install express
# GOOD: Specific versions
FROM node:22.11.0-alpine3.20
RUN npm install express@4.18.2
Use .dockerignore
# .dockerignore
.git
.gitignore
.gitlab-ci.yml
README.md
docs/
target/
node_modules/
.env
.env.local
*.log
.DS_Store
.idea/
.vscode/
Metadata Labels
LABEL org.opencontainers.image.title="Payment Service" \
org.opencontainers.image.description="Core payment processing service" \
org.opencontainers.image.version="2.1.0" \
org.opencontainers.image.vendor="Example Bank" \
org.opencontainers.image.source="https://gitlab.com/org/payment-service" \
maintainer="platform-team@example.com"
Docker Compose
Docker Compose orchestrates multi-container applications for local development. It defines all required services (application, database, cache, message queue) in a single YAML file and starts them with one command.
Why Use Docker Compose
Local development often requires multiple services: your application needs a database, cache, and perhaps other microservices it depends on. Without Docker Compose, developers manually start each service, manage port conflicts, and maintain individual configuration files. Docker Compose automates this complexity.
The depends_on directive with health checks ensures services start in the correct order. Your application waits for PostgreSQL to be healthy (accepting connections) before starting, preventing startup failures from connection errors.
Service names act as hostnames within the Docker network. Your application connects to postgres:5432 rather than localhost:5432 - Docker's DNS resolves the postgres service name to the appropriate container IP. This mirrors production environments where services communicate via service discovery rather than hard-coded IPs.
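You can verify this name resolution against a running stack. A sketch, assuming the service and network names used in the Compose file in this section (getent is available in Alpine/BusyBox-based images):

```shell
# Query Docker's embedded DNS from inside the payment-service container
docker compose exec payment-service getent hosts postgres
# The postgres name resolves to that container's IP on the shared network
```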
Development Environment
# docker-compose.yml
version: '3.8'
services:
payment-service:
build:
context: .
dockerfile: Dockerfile
target: builder # Use builder stage for hot reload
ports:
- "8080:8080"
environment:
- SPRING_PROFILES_ACTIVE=dev
- DATABASE_URL=jdbc:postgresql://postgres:5432/payments
- REDIS_URL=redis://redis:6379
depends_on:
postgres:
condition: service_healthy
redis:
condition: service_started
volumes:
- ./src:/build/src # Hot reload (the builder stage's WORKDIR is /build)
networks:
- payment-network
postgres:
image: postgres:16-alpine
environment:
POSTGRES_DB: payments
POSTGRES_USER: paymentuser
POSTGRES_PASSWORD: paymentpass
ports:
- "5432:5432"
volumes:
- postgres-data:/var/lib/postgresql/data
- ./scripts/init-db.sql:/docker-entrypoint-initdb.d/init.sql
healthcheck:
test: ["CMD-SHELL", "pg_isready -U paymentuser"]
interval: 10s
timeout: 5s
retries: 5
networks:
- payment-network
redis:
image: redis:7-alpine
ports:
- "6379:6379"
volumes:
- redis-data:/data
command: redis-server --appendonly yes
networks:
- payment-network
jaeger:
image: jaegertracing/all-in-one:latest
ports:
- "16686:16686" # Jaeger UI
- "14250:14250" # Collector
environment:
- COLLECTOR_OTLP_ENABLED=true
networks:
- payment-network
volumes:
postgres-data:
redis-data:
networks:
payment-network:
driver: bridge
Production-Like Environment
# docker-compose.prod.yml
version: '3.8'
services:
payment-service:
image: registry.example.com/payment-service:${VERSION:-latest}
deploy:
replicas: 2
resources:
limits:
cpus: '1.0'
memory: 512M
reservations:
cpus: '0.5'
memory: 256M
restart_policy:
condition: on-failure
delay: 5s
max_attempts: 3
environment:
- SPRING_PROFILES_ACTIVE=prod
- DATABASE_URL=${DATABASE_URL}
- REDIS_URL=${REDIS_URL}
- JWT_SECRET=${JWT_SECRET}
ports:
- "8080:8080"
healthcheck:
test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:8080/actuator/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
logging:
driver: "json-file"
options:
max-size: "10m"
max-file: "3"
Health Checks
Health checks enable Docker and orchestration platforms (Kubernetes) to monitor container health and restart unhealthy containers automatically. The health check command runs periodically; if it fails repeatedly, the container is marked unhealthy and restarted.
Understanding Health Check Parameters
The --interval parameter controls how frequently the health check runs (30 seconds means Docker checks health every 30 seconds). The --timeout limits how long the check can run before being considered failed. The --retries determines how many consecutive failures trigger an unhealthy status.
The --start-period provides a grace period during application startup. Spring Boot applications typically take 30-60 seconds to start, during which health checks would fail. The start period prevents false positives during legitimate startup time.
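Taken together, these parameters determine how quickly a failing container is flagged. A rough worked example, assuming the 30-second interval mentioned above and the 3 retries used in the Compose health check later in this guide:

```shell
INTERVAL=30   # seconds between health checks
RETRIES=3     # consecutive failures before the container is marked unhealthy
WORST_CASE=$(( INTERVAL * RETRIES ))
echo "unhealthy after up to ${WORST_CASE}s of consecutive failures"   # 90s
```

In practice the timeout adds to each failed check, so treat this as a lower bound when tuning alerting.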
Spring Boot's actuator provides a /actuator/health endpoint that checks not just that the application is running, but that it can connect to required dependencies (database, cache, external services). This comprehensive check detects degraded states where the application is running but non-functional. For more on Spring Boot actuator configuration, see Spring Boot Observability.
Spring Boot Health Check
HEALTHCHECK --interval=30s --timeout=3s --start-period=40s --retries=3 \
CMD wget --no-verbose --tries=1 --spider \
http://localhost:8080/actuator/health || exit 1
The wget --spider flag performs a HEAD request without downloading the response body - checking that the endpoint returns HTTP 200 without wasting bandwidth on the response content.
Node.js Health Check
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
CMD node -e "require('http').get('http://localhost:3000/health', (r) => { \
process.exit(r.statusCode === 200 ? 0 : 1) \
})"
Custom Health Check Script
COPY healthcheck.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/healthcheck.sh
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
CMD ["/usr/local/bin/healthcheck.sh"]
#!/bin/sh
# healthcheck.sh
set -e
# Check HTTP endpoint
if ! wget --no-verbose --tries=1 --spider http://localhost:8080/health; then
exit 1
fi
# Check database connection
if ! nc -z postgres 5432; then
exit 1
fi
exit 0
Build Optimization
BuildKit
# Enable BuildKit
export DOCKER_BUILDKIT=1
# Build with BuildKit
docker build --progress=plain -t payment-service:latest .
# Use BuildKit secrets to avoid baking credentials into image layers
docker build --secret id=npmrc,src=$HOME/.npmrc \
-t payment-service:latest .
Cache Mounts (BuildKit)
# syntax=docker/dockerfile:1
FROM eclipse-temurin:21-jdk-alpine
WORKDIR /build
COPY gradlew .
COPY gradle gradle
COPY build.gradle settings.gradle ./
# Use cache mount for Gradle dependencies
RUN --mount=type=cache,target=/root/.gradle \
./gradlew dependencies --no-daemon --quiet
COPY src src
RUN --mount=type=cache,target=/root/.gradle \
./gradlew bootJar --no-daemon
Image Tagging Strategy
Semantic Versioning
# Build with multiple tags
docker build \
-t payment-service:2.1.0 \
-t payment-service:2.1 \
-t payment-service:2 \
-t payment-service:latest \
.
# Push all tags
docker push payment-service:2.1.0
docker push payment-service:2.1
docker push payment-service:2
docker push payment-service:latest
Git-Based Tagging
# Tag with git commit SHA
docker build -t payment-service:${CI_COMMIT_SHA} .
docker build -t payment-service:${CI_COMMIT_BRANCH}-${CI_COMMIT_SHORT_SHA} .
# Tag with git tag
docker build -t payment-service:${CI_COMMIT_TAG} .
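The CI_COMMIT_* values are GitLab predefined variables; outside a pipeline you can derive equivalent tags yourself. A sketch with hypothetical stand-in values (GitLab's CI_COMMIT_SHORT_SHA is the first eight characters of the full SHA, and slashes in branch names must be replaced because '/' is not valid in an image tag):

```shell
# Hypothetical stand-ins for GitLab's predefined CI variables
CI_COMMIT_SHA="f4b2c1a9d8e7f6a5b4c3d2e1f0a9b8c7d6e5f4a3"
CI_COMMIT_BRANCH="feature/retry-logic"

# Short SHA: first 8 characters of the full commit SHA
CI_COMMIT_SHORT_SHA="${CI_COMMIT_SHA:0:8}"

# '/' is not allowed in an image tag - replace with '-'
SAFE_BRANCH="${CI_COMMIT_BRANCH//\//-}"

echo "${SAFE_BRANCH}-${CI_COMMIT_SHORT_SHA}"
```

The printed value is what you would pass to docker build -t for a branch build.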
Environment-Specific Builds
Build Arguments
FROM eclipse-temurin:21-jre-alpine
# ARGs must be declared after FROM to be visible in build instructions
ARG ENVIRONMENT=production
ARG VERSION=unknown
# Use build args
RUN echo "Building for ${ENVIRONMENT} environment, version ${VERSION}"
LABEL environment=${ENVIRONMENT} \
version=${VERSION}
# Build with args
docker build \
--build-arg ENVIRONMENT=production \
--build-arg VERSION=2.1.0 \
-t payment-service:2.1.0 \
.
Further Reading
Internal Documentation
- CI/CD Pipelines - GitLab CI/CD with Docker
- Repository Structure - Dockerfile location
- Spring Boot General - Spring Boot Docker
- Performance Optimization - Container optimization
Summary
Key Takeaways
- Multi-stage builds - Separate build and runtime for smaller images
- Minimal base images - Use Alpine or Distroless (< 200MB)
- Layer optimization - Order from least to most frequently changed
- Non-root user - Security best practice, prevent privilege escalation
- Pin versions - Reproducible builds, avoid the latest tag
- Health checks - Enable container health monitoring
- .dockerignore - Exclude unnecessary files from context
- BuildKit - Enable for faster builds and cache mounts
- Security scans - Trivy for vulnerability detection
- Docker Compose - Local development with dependencies
Next Steps: Review CI/CD Pipelines for Docker image building in GitLab CI and Spring Boot General for Spring Boot container best practices.