Kubernetes Best Practices

Overview

Kubernetes orchestrates containerized applications across clusters of machines, providing automated deployment, scaling, and management. A well-architected Kubernetes configuration ensures applications run reliably, scale efficiently, and recover automatically from failures. This guide covers deployment strategies, resource management, health probes, configuration handling, service mesh patterns, ingress routing, observability, and security hardening.

Kubernetes abstracts infrastructure complexity, allowing you to declare your desired state (3 replicas of payment service, each with 512MB memory) and letting Kubernetes maintain that state. If a pod crashes, Kubernetes restarts it. If a node fails, Kubernetes reschedules pods to healthy nodes. If traffic increases, Kubernetes scales pods horizontally.
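
That desired state is written down declaratively. A minimal sketch of the example above (image name and registry are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
spec:
  replicas: 3                  # desired state: 3 replicas
  selector:
    matchLabels:
      app: payment-service
  template:
    metadata:
      labels:
        app: payment-service
    spec:
      containers:
        - name: payment-service
          image: registry.example.com/payment-service:2.1.0
          resources:
            requests:
              memory: 512Mi    # desired state: 512 MiB per replica
```

Apply it with kubectl apply and the Deployment controller continuously reconciles reality toward these three replicas.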

The control plane (API server, scheduler, controller manager, etcd) manages the cluster state. The API server is the central point for all operations - kubectl commands, CI/CD deployments, and internal component communication all go through it. The scheduler assigns Pods to nodes based on resource availability and constraints. The controller manager runs control loops that watch cluster state and make changes to move current state toward desired state (the ReplicaSet controller, for example, ensures the correct number of Pod replicas exist). etcd stores all cluster state persistently; if etcd data is lost, the cluster loses knowledge of all its resources.

Worker nodes run the actual application Pods. The kubelet on each node watches for Pods assigned to its node, starts containers via the container runtime (containerd, CRI-O), and reports Pod status back to the API server. Kube-proxy maintains network rules enabling Pod-to-Pod communication and Service load balancing.

The platform's declarative approach uses YAML manifests to define resources (Deployments, Services, ConfigMaps, Ingresses). Keeping these manifests in version control alongside application code enables GitOps workflows, where infrastructure changes follow the same review and deployment process as code changes. For comprehensive Docker container preparation, see Docker Best Practices.


Core Principles

  1. Declarative Configuration: Define desired state, let Kubernetes converge
  2. Resource Limits: Set requests and limits for CPU/memory
  3. Health Probes: Use liveness, readiness, and startup probes
  4. Rolling Updates: Zero-downtime deployments with gradual rollout
  5. Configuration Separation: Externalize config via ConfigMaps and Secrets
  6. Observability: Expose metrics, logs, and traces
  7. Security: Pod Security Standards, Network Policies, RBAC
  8. High Availability: Multi-replica deployments across availability zones

Deployment Strategies

Kubernetes Deployment resources manage ReplicaSets, which in turn manage Pods. Deployment strategies control how updates roll out to running Pods - balancing between deployment speed, resource consumption, and risk mitigation.

Understanding Deployment Strategies

The deployment strategy determines how Kubernetes replaces old Pods with new ones during updates. The wrong strategy can cause downtime (Recreate), consume excessive resources (duplicate all Pods in blue-green), or require complex configuration (canary with manual verification).

Rolling updates provide the optimal balance for most use cases: gradual replacement of old Pods with new ones, maintaining application availability throughout the update. Kubernetes replaces a configurable number of Pods at a time (controlled by maxUnavailable and maxSurge), verifying each new Pod is ready before proceeding. This strategy works well for stateless services where running multiple versions simultaneously is acceptable.

Blue-green deployments maintain two complete environments - blue (current) and green (new). Traffic routes to blue while green is tested. Once validated, traffic switches to green instantly. This enables immediate rollback (switch back to blue) but requires double the resources.

Canary deployments gradually shift traffic from the old version to the new version, monitoring metrics at each step. Start with 10% traffic to the new version, monitor for issues, increase to 50%, then 100%. This detects problems early with minimal user impact but requires robust monitoring and metrics.

Recreate strategy terminates all old Pods before creating new ones, causing downtime. Only use this for stateful applications that cannot tolerate multiple versions running simultaneously (database schema migrations, singleton services).

The maxUnavailable parameter defines how many Pods can be unavailable during the update (0 means all Pods stay available). The maxSurge parameter defines how many extra Pods Kubernetes creates temporarily (1 means one extra Pod beyond desired replicas). Together, these control rollout speed and resource consumption.

For more on container health checks referenced in these strategies, see Docker Best Practices.

Rolling Update (Default)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
  namespace: banking
  labels:
    app: payment-service
    version: v2.1.0
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # 1 pod can be unavailable during update
      maxSurge: 1         # 1 extra pod during rollout
  selector:
    matchLabels:
      app: payment-service
  template:
    metadata:
      labels:
        app: payment-service
        version: v2.1.0
    spec:
      containers:
        - name: payment-service
          image: registry.example.com/payment-service:2.1.0
          ports:
            - containerPort: 8080
              protocol: TCP
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: 1000m
              memory: 1Gi
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 30
            timeoutSeconds: 5
            failureThreshold: 3

How it works: Kubernetes gradually replaces Pods. With maxUnavailable: 1, at least 2 of 3 Pods remain available during the update (at most one is being replaced). With maxSurge: 1, Kubernetes creates a 4th Pod before terminating an old one, ensuring 3 Pods always handle traffic. Once the new Pod passes readiness checks, Kubernetes terminates an old Pod and repeats until all Pods are updated.

Blue-Green Deployment

# Blue deployment (current production)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service-blue
  namespace: banking
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payment-service
      version: blue
  template:
    metadata:
      labels:
        app: payment-service
        version: blue
    spec:
      containers:
        - name: payment-service
          image: registry.example.com/payment-service:2.0.0
---
# Green deployment (new version)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service-green
  namespace: banking
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payment-service
      version: green
  template:
    metadata:
      labels:
        app: payment-service
        version: green
    spec:
      containers:
        - name: payment-service
          image: registry.example.com/payment-service:2.1.0
---
# Service initially points to blue
apiVersion: v1
kind: Service
metadata:
  name: payment-service
  namespace: banking
spec:
  selector:
    app: payment-service
    version: blue   # Switch to 'green' to cutover
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP

How it works: Both blue (old) and green (new) deployments run simultaneously. The Service routes traffic to blue Pods initially. After verifying green Pods work correctly, update the Service selector from version: blue to version: green, instantly switching all traffic. If issues arise, revert by changing the selector back to blue. This strategy requires double the resources (6 total Pods instead of 3) but enables instant rollback.
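
The cutover itself can be expressed as a small patch file. A sketch (the filename is illustrative), applied with something like `kubectl patch service payment-service -n banking --patch-file cutover.yaml`:

```yaml
# cutover.yaml - switch the Service selector from blue to green
spec:
  selector:
    app: payment-service
    version: green
```

Reverting is the same patch with version: blue, which is why rollback is near-instant.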

Canary Deployment

# Stable deployment (90% traffic)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service-stable
  namespace: banking
spec:
  replicas: 9
  selector:
    matchLabels:
      app: payment-service
      track: stable
  template:
    metadata:
      labels:
        app: payment-service
        track: stable
        version: v2.0.0
    spec:
      containers:
        - name: payment-service
          image: registry.example.com/payment-service:2.0.0
---
# Canary deployment (10% traffic)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service-canary
  namespace: banking
spec:
  replicas: 1
  selector:
    matchLabels:
      app: payment-service
      track: canary
  template:
    metadata:
      labels:
        app: payment-service
        track: canary
        version: v2.1.0
    spec:
      containers:
        - name: payment-service
          image: registry.example.com/payment-service:2.1.0
---
# Service routes to both
apiVersion: v1
kind: Service
metadata:
  name: payment-service
  namespace: banking
spec:
  selector:
    app: payment-service   # Matches both stable and canary
  ports:
    - port: 80
      targetPort: 8080

How it works: The Service selector matches both stable and canary Pods. With 9 stable Pods and 1 canary Pod, approximately 10% of traffic routes to the canary (Kubernetes load balances across all matching Pods). Monitor canary metrics (error rate, latency) and gradually increase canary replicas while decreasing stable replicas until fully rolled out. This strategy detects issues with minimal user impact.

For more sophisticated canary routing (percentage-based, header-based), use a service mesh like Istio, covered in Service Mesh Patterns.

Recreate Strategy

apiVersion: apps/v1
kind: Deployment
metadata:
  name: database-migration-job
  namespace: banking
spec:
  replicas: 1
  strategy:
    type: Recreate   # Terminate all old Pods before creating new ones
  selector:
    matchLabels:
      app: migration-runner
  template:
    metadata:
      labels:
        app: migration-runner
    spec:
      containers:
        - name: migrator
          image: registry.example.com/migration-runner:2.1.0

How it works: Kubernetes terminates all old Pods simultaneously, waits for termination to complete, then creates new Pods. This causes downtime (no Pods run during the transition) but guarantees only one version runs at a time. Use this strategy for stateful applications that cannot run multiple versions concurrently (e.g., database migration tools).


Resource Management

Kubernetes schedules Pods onto nodes based on resource requests. Resource limits constrain how much CPU and memory a Pod can consume, preventing resource exhaustion. Proper resource configuration ensures efficient cluster utilization and application stability.

Understanding Requests vs Limits

Requests specify the minimum resources guaranteed to a Pod. Kubernetes uses requests for scheduling - a Pod with cpu: 500m only schedules to nodes with at least 500 millicores available. The Pod receives this amount of CPU during contention; if other Pods are idle, it can use more.

Limits specify the maximum resources a Pod can consume. If a Pod exceeds its memory limit, Kubernetes terminates it (OOMKilled). If it exceeds its CPU limit, Kubernetes throttles it (slows down execution) but doesn't terminate it.

The ratio between limits and requests determines resource overcommitment. If every Pod sets requests: 500m and limits: 1000m, a node can be scheduled to capacity based on requests while the sum of limits exceeds its physical resources. This overcommitment improves utilization (most Pods don't use their maximum constantly) but risks contention if many Pods hit their limits simultaneously.
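
To keep overcommitment bounded without trusting every manifest, a namespace-level LimitRange can supply defaults and caps for containers that omit them. A sketch with illustrative values:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults
  namespace: banking
spec:
  limits:
    - type: Container
      defaultRequest:   # applied when a container sets no requests
        cpu: 250m
        memory: 256Mi
      default:          # applied when a container sets no limits
        cpu: 500m
        memory: 512Mi
      max:              # hard per-container ceiling
        cpu: 2000m
        memory: 2Gi
```

Containers that declare their own values keep them (up to max); only unspecified fields receive the defaults.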

For Spring Boot applications, resource requirements vary based on traffic patterns and JVM heap size. A typical Spring Boot service needs 512MB-1GB memory for heap plus off-heap memory for threads, metaspace, and native libraries. See Spring Boot General for JVM tuning in containers.

Resource Requests and Limits

apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payment-service
  template:
    metadata:
      labels:
        app: payment-service
    spec:
      containers:
        - name: payment-service
          image: registry.example.com/payment-service:2.1.0
          resources:
            requests:
              cpu: 500m       # 0.5 CPU cores
              memory: 512Mi   # 512 MiB
            limits:
              cpu: 1000m      # 1 CPU core
              memory: 1Gi     # 1 GiB
          env:
            - name: JAVA_OPTS
              value: "-Xmx768m -Xms512m -XX:MaxMetaspaceSize=256m"

Resource units: CPU is measured in millicores (1000m = 1 core). Memory uses binary units (Mi, Gi; 1Mi = 1024 KiB) or decimal units (M, G; 1M = 1000 kB). Use binary units (Mi, Gi) for consistency.

Setting appropriate values: Start with conservative estimates based on profiling. Monitor actual usage in production and adjust. Under-provisioning causes OOMKilled Pods and slow performance. Over-provisioning wastes cluster resources and increases costs.
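
At the namespace level, a ResourceQuota caps aggregate consumption so one team's over-provisioning cannot starve the rest of the cluster. A sketch with illustrative values:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: banking-quota
  namespace: banking
spec:
  hard:
    requests.cpu: "10"       # sum of all Pod CPU requests in the namespace
    requests.memory: 20Gi
    limits.cpu: "20"         # sum of all Pod CPU limits
    limits.memory: 40Gi
    pods: "50"               # maximum Pod count
```

Once a quota exists for requests/limits, every Pod in the namespace must declare them (or inherit defaults from a LimitRange), or admission is rejected.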

Quality of Service Classes

# Guaranteed QoS: requests = limits
apiVersion: v1
kind: Pod
metadata:
  name: payment-service-guaranteed
spec:
  containers:
    - name: payment-service
      image: registry.example.com/payment-service:2.1.0
      resources:
        requests:
          cpu: 500m
          memory: 512Mi
        limits:
          cpu: 500m       # Same as request
          memory: 512Mi   # Same as request
---
# Burstable QoS: limits > requests
apiVersion: v1
kind: Pod
metadata:
  name: payment-service-burstable
spec:
  containers:
    - name: payment-service
      image: registry.example.com/payment-service:2.1.0
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
        limits:
          cpu: 1000m      # Higher than request
          memory: 1Gi     # Higher than request
---
# BestEffort QoS: no requests or limits
apiVersion: v1
kind: Pod
metadata:
  name: payment-service-besteffort
spec:
  containers:
    - name: payment-service
      image: registry.example.com/payment-service:2.1.0
      resources: {}       # No requests or limits

QoS implications: When nodes run out of resources, Kubernetes evicts Pods to reclaim resources. Eviction priority: BestEffort Pods first, then Burstable Pods exceeding requests, then Guaranteed Pods. Critical services should use Guaranteed QoS to minimize eviction risk.

Horizontal Pod Autoscaling (HPA)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: payment-service-hpa
  namespace: banking
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payment-service
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # Target 70% CPU
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80   # Target 80% memory
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "1000"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # Wait 5 min before scaling down
      policies:
        - type: Percent
          value: 50                     # Max 50% of Pods removed per period
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60    # Wait 1 min before scaling up
      policies:
        - type: Percent
          value: 100                    # Max double Pods per period
          periodSeconds: 60

How autoscaling works: The HPA controller queries metrics every 15 seconds. When average CPU utilization exceeds 70%, it calculates the number of replicas needed to bring utilization to 70% and scales up. The behavior section prevents flapping (rapid scaling up and down) by adding stabilization windows and limiting scaling velocity.

Custom metrics: Beyond CPU and memory, HPA supports custom metrics from application metrics servers (Prometheus). The http_requests_per_second metric scales based on application load rather than resource consumption, better reflecting actual demand.

Vertical Pod Autoscaling (VPA)

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: payment-service-vpa
  namespace: banking
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payment-service
  updatePolicy:
    updateMode: Auto   # Auto, Initial, "Off", Recreate
  resourcePolicy:
    containerPolicies:
      - containerName: payment-service
        minAllowed:
          cpu: 250m
          memory: 256Mi
        maxAllowed:
          cpu: 2000m
          memory: 2Gi
        controlledResources:
          - cpu
          - memory

How VPA works: VPA monitors resource usage and recommends (or automatically applies) resource request adjustments. If a Pod consistently uses 800MB memory but requests only 512MB, VPA increases the request to match actual usage. This prevents OOMKilled Pods while avoiding over-provisioning.

VPA limitations: VPA cannot update running Pods' resources (except in Kubernetes 1.27+ with InPlacePodVerticalScaling feature gate). Instead, it recreates Pods with new resource values, causing brief unavailability. For most use cases, HPA (horizontal scaling) is preferred over VPA.

Note: Do not use HPA and VPA together on the same resource (CPU or memory) as they can conflict. Use HPA for CPU-based scaling and VPA for memory, or use only one autoscaler.
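
A common compromise is running VPA purely as an advisor: updateMode: "Off" computes recommendations (readable via kubectl describe vpa) without evicting or mutating Pods, leaving HPA free to scale horizontally. A sketch:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: payment-service-vpa-advisor   # illustrative name
  namespace: banking
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payment-service
  updatePolicy:
    updateMode: "Off"   # recommend only; quoted so YAML doesn't parse it as a boolean
```

Operators review the recommendations periodically and fold them back into the Deployment's requests by hand or via CI.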

Cluster Autoscaling

Cluster autoscaling adds/removes nodes based on Pod scheduling needs. When Pods cannot schedule due to insufficient node resources, the cluster autoscaler provisions new nodes. When nodes are underutilized, it drains and removes them.

Configuration is cloud-provider-specific (AWS Cluster Autoscaler, GKE Node Auto-Provisioning, Azure Cluster Autoscaler). The autoscaler respects Pod Disruption Budgets (PDBs) during node drains and avoids removing nodes hosting Pods with local storage or strict affinity rules.

# Pod Disruption Budget to protect during node drains
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payment-service-pdb
  namespace: banking
spec:
  minAvailable: 2   # Always keep at least 2 Pods running
  selector:
    matchLabels:
      app: payment-service

Why PDBs matter: Without PDBs, cluster autoscaler might drain nodes aggressively, terminating all replicas of a service simultaneously during scale-down. PDBs ensure at least minAvailable Pods stay running during voluntary disruptions (node drains, evictions), maintaining application availability.
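
The budget can also be expressed as maxUnavailable, including as a percentage, which tracks replica counts automatically as the Deployment scales:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payment-service-pdb
  namespace: banking
spec:
  maxUnavailable: 25%   # at most a quarter of Pods down during voluntary disruptions
  selector:
    matchLabels:
      app: payment-service
```

With 3 replicas this permits zero voluntary disruptions at a time once rounding is applied; with 8 replicas it permits two, so the same manifest stays sensible as HPA scales the service.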


Health Checks

Kubernetes uses health probes to determine Pod health and readiness to serve traffic. Properly configured probes enable automatic recovery from failures and prevent broken Pods from receiving traffic.

Understanding Probe Types

Liveness probes determine if a Pod is alive. If liveness checks fail repeatedly, Kubernetes restarts the Pod. Use liveness probes to detect deadlocks, infinite loops, or other unrecoverable states where the application is running but non-functional. Be conservative with liveness probes - overly aggressive checks cause restart loops.

Readiness probes determine if a Pod can serve traffic. If readiness checks fail, Kubernetes removes the Pod from Service endpoints (stops routing traffic to it) but doesn't restart the Pod. Use readiness probes to handle temporary issues like slow startup, database connection loss, or dependency unavailability.

Startup probes protect slow-starting containers. While the startup probe is running, Kubernetes doesn't execute liveness or readiness probes. This prevents premature restarts of applications with long initialization times (Spring Boot applications loading large contexts, JVM warming up).

For Spring Boot applications, the Actuator provides separate endpoints for liveness (/actuator/health/liveness) and readiness (/actuator/health/readiness). The readiness endpoint checks database connectivity, cache availability, and dependent services. See Spring Boot Observability for detailed Actuator configuration.

Liveness Probe

apiVersion: v1
kind: Pod
metadata:
  name: payment-service
spec:
  containers:
    - name: payment-service
      image: registry.example.com/payment-service:2.1.0
      livenessProbe:
        httpGet:
          path: /actuator/health/liveness
          port: 8080
          scheme: HTTP
        initialDelaySeconds: 60   # Wait 60s after startup
        periodSeconds: 30         # Check every 30s
        timeoutSeconds: 5         # Timeout after 5s
        failureThreshold: 3       # Restart after 3 failures
        successThreshold: 1       # Mark healthy after 1 success

Configuration rationale: The initialDelaySeconds: 60 allows the Spring Boot application to complete startup before liveness checks begin. The periodSeconds: 30 balances responsiveness (detecting failures within 90 seconds: 30s × 3 failures) with overhead (health check load). The failureThreshold: 3 prevents flapping from transient failures.

Liveness probe pitfalls: Never use dependency checks (database connectivity) in liveness probes. If the database is temporarily unavailable, liveness probe failures restart all Pods, exacerbating the problem. Liveness probes should only detect unrecoverable application states, not transient dependency issues. Use readiness probes for dependency checks.

Readiness Probe

apiVersion: v1
kind: Pod
metadata:
  name: payment-service
spec:
  containers:
    - name: payment-service
      image: registry.example.com/payment-service:2.1.0
      readinessProbe:
        httpGet:
          path: /actuator/health/readiness
          port: 8080
          scheme: HTTP
        initialDelaySeconds: 30   # Start checking after 30s
        periodSeconds: 10         # Check every 10s
        timeoutSeconds: 5         # Timeout after 5s
        failureThreshold: 3       # Mark unready after 3 failures
        successThreshold: 1       # Mark ready after 1 success

Configuration rationale: The periodSeconds: 10 makes readiness checks more frequent than liveness checks (10s vs 30s), enabling faster traffic routing decisions. When a Pod becomes unready (database connection lost), Kubernetes removes it from Service endpoints within 30 seconds (10s × 3 failures), preventing broken Pods from receiving traffic.

Readiness probe content: The /actuator/health/readiness endpoint should check all dependencies required to serve traffic: database connectivity, cache availability, external API reachability. If any dependency is unavailable, return 503 Service Unavailable, marking the Pod unready.
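
In Spring Boot, which health indicators feed the readiness group is configurable via Actuator health groups. A sketch (the db and redis entries assume those auto-configured indicators exist in the application):

```properties
# Expose /actuator/health/liveness and /actuator/health/readiness
management.endpoint.health.probes.enabled=true
# Liveness reflects only internal application state
management.endpoint.health.group.liveness.include=livenessState
# Readiness additionally checks dependencies needed to serve traffic
management.endpoint.health.group.readiness.include=readinessState,db,redis
```

This keeps dependency checks out of the liveness group, matching the probe guidance above.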

Startup Probe

apiVersion: v1
kind: Pod
metadata:
  name: payment-service
spec:
  containers:
    - name: payment-service
      image: registry.example.com/payment-service:2.1.0
      startupProbe:
        httpGet:
          path: /actuator/health/liveness
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 10
        timeoutSeconds: 5
        failureThreshold: 30   # Allow 300s (10s × 30) for startup
      livenessProbe:
        httpGet:
          path: /actuator/health/liveness
          port: 8080
        periodSeconds: 30
        timeoutSeconds: 5
        failureThreshold: 3
      readinessProbe:
        httpGet:
          path: /actuator/health/readiness
          port: 8080
        periodSeconds: 10
        timeoutSeconds: 5
        failureThreshold: 3

How startup probes work: Kubernetes runs the startup probe first. Liveness and readiness probes don't start until the startup probe succeeds. With failureThreshold: 30 and periodSeconds: 10, the application has up to 300 seconds (5 minutes) to start. Once the startup probe succeeds, liveness and readiness probes take over.

When to use startup probes: Use startup probes for applications with variable or long startup times. Without startup probes, you must set high initialDelaySeconds on liveness probes to accommodate worst-case startup time, delaying failure detection for already-running Pods. Startup probes allow aggressive liveness probe settings while still supporting slow startup.

TCP and Exec Probes

# TCP probe (database, cache)
apiVersion: v1
kind: Pod
metadata:
  name: postgres
spec:
  containers:
    - name: postgres
      image: postgres:16-alpine
      livenessProbe:
        tcpSocket:
          port: 5432
        periodSeconds: 30
        failureThreshold: 3
---
# Exec probe (custom script)
apiVersion: v1
kind: Pod
metadata:
  name: payment-service
spec:
  containers:
    - name: payment-service
      image: registry.example.com/payment-service:2.1.0
      livenessProbe:
        exec:
          command:
            - /bin/sh
            - -c
            - "wget -q -O- http://localhost:8080/actuator/health/liveness | grep UP"
        periodSeconds: 30
        failureThreshold: 3

TCP probes check if a port is accepting connections. Use TCP probes for services without HTTP endpoints (databases, caches, message queues). TCP probes only verify the port is open, not that the service is functional, so they're less thorough than HTTP probes.

Exec probes run commands inside the container and check the exit code (0 = success, non-zero = failure). Use exec probes for complex health checks not exposed via HTTP. However, exec probes have higher overhead (spawning processes) than HTTP or TCP probes.
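
For gRPC services, Kubernetes also offers native gRPC probes (beta and enabled by default since 1.24, stable in 1.27), avoiding exec wrappers like grpc_health_probe. A sketch with an illustrative service and image:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ledger-service          # hypothetical gRPC service
spec:
  containers:
    - name: ledger-service
      image: registry.example.com/ledger-service:1.0.0   # hypothetical image
      livenessProbe:
        grpc:
          port: 9090            # must implement the grpc.health.v1.Health service
        periodSeconds: 30
        failureThreshold: 3
```

The kubelet calls the standard gRPC health-checking protocol directly, so the container needs no HTTP endpoint or shell tooling.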


Configuration Management

Kubernetes separates configuration from application code using ConfigMaps (non-sensitive data) and Secrets (sensitive data). This separation enables the same container image to run in multiple environments (dev, staging, production) with different configurations.

Understanding ConfigMaps and Secrets

ConfigMaps store configuration as key-value pairs or files. Applications consume ConfigMaps as environment variables, command-line arguments, or mounted files. Changing a ConfigMap requires restarting Pods to pick up new values (unless the application reloads configuration dynamically).

Secrets store sensitive data (passwords, API keys, certificates) base64-encoded. While base64 is not encryption, Secrets enable integration with encryption-at-rest (etcd encryption) and external secret management systems (Sealed Secrets, External Secrets Operator, Vault). Never store unencrypted secrets in Git repositories.

For comprehensive secrets handling strategies including rotation and cloud provider integration, see Secrets Management.

ConfigMap from Literal Values

apiVersion: v1
kind: ConfigMap
metadata:
  name: payment-service-config
  namespace: banking
data:
  DATABASE_HOST: postgres.banking.svc.cluster.local
  DATABASE_PORT: "5432"
  DATABASE_NAME: payments
  REDIS_HOST: redis.banking.svc.cluster.local
  REDIS_PORT: "6379"
  LOG_LEVEL: INFO
  FEATURE_FLAG_NEW_PAYMENT_FLOW: "true"
---
# Consuming as environment variables
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
spec:
  selector:
    matchLabels:
      app: payment-service
  template:
    metadata:
      labels:
        app: payment-service
    spec:
      containers:
        - name: payment-service
          image: registry.example.com/payment-service:2.1.0
          envFrom:
            - configMapRef:
                name: payment-service-config

How it works: The envFrom field injects all ConfigMap key-value pairs as environment variables. The payment service reads DATABASE_HOST, DATABASE_PORT, etc., from the environment. This approach works well for simple configuration but doesn't support dynamic reloading - changing the ConfigMap requires Pod restarts.
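
When only a few keys are needed, or variable names must be remapped, individual keys can be referenced instead of importing the whole ConfigMap. A sketch of the container's env section:

```yaml
env:
  - name: SPRING_DATASOURCE_HOST     # remapped variable name (illustrative)
    valueFrom:
      configMapKeyRef:
        name: payment-service-config
        key: DATABASE_HOST
  - name: LOG_LEVEL
    valueFrom:
      configMapKeyRef:
        name: payment-service-config
        key: LOG_LEVEL
        optional: true               # Pod still starts if the key is absent
```

This keeps the container's environment explicit, at the cost of more verbose manifests than envFrom.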

ConfigMap from File

apiVersion: v1
kind: ConfigMap
metadata:
  name: payment-service-config
  namespace: banking
data:
  application.properties: |
    server.port=8080
    spring.datasource.url=jdbc:postgresql://${DATABASE_HOST}:${DATABASE_PORT}/${DATABASE_NAME}
    spring.datasource.username=${DATABASE_USER}
    spring.datasource.password=${DATABASE_PASSWORD}
    spring.data.redis.host=${REDIS_HOST}
    spring.data.redis.port=${REDIS_PORT}
    logging.level.root=${LOG_LEVEL}
    payment.new-flow.enabled=${FEATURE_FLAG_NEW_PAYMENT_FLOW}
---
# Mounting as file
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
spec:
  selector:
    matchLabels:
      app: payment-service
  template:
    metadata:
      labels:
        app: payment-service
    spec:
      containers:
        - name: payment-service
          image: registry.example.com/payment-service:2.1.0
          volumeMounts:
            - name: config
              mountPath: /app/config
              readOnly: true
          command: ["java"]
          args:
            - "-jar"
            - "app.jar"
            - "--spring.config.location=file:/app/config/application.properties"
      volumes:
        - name: config
          configMap:
            name: payment-service-config

How it works: The ConfigMap contains the entire application.properties file. The volume mount makes it available at /app/config/application.properties. Spring Boot loads the configuration file at startup. This approach supports complex configuration files (YAML, JSON, XML) and dynamic reloading if the application watches for file changes.

Secrets Management

apiVersion: v1
kind: Secret
metadata:
  name: payment-service-secrets
  namespace: banking
type: Opaque
stringData:   # Use stringData for plain text (Kubernetes base64 encodes it)
  DATABASE_PASSWORD: "p@ssw0rd123"
  JWT_SECRET: "super-secret-jwt-key-change-in-production"
  API_KEY: "sk_live_abc123xyz789"
---
# Consuming secrets
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
spec:
  selector:
    matchLabels:
      app: payment-service
  template:
    metadata:
      labels:
        app: payment-service
    spec:
      containers:
        - name: payment-service
          image: registry.example.com/payment-service:2.1.0
          env:
            - name: DATABASE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: payment-service-secrets
                  key: DATABASE_PASSWORD
            - name: JWT_SECRET
              valueFrom:
                secretKeyRef:
                  name: payment-service-secrets
                  key: JWT_SECRET
            - name: API_KEY
              valueFrom:
                secretKeyRef:
                  name: payment-service-secrets
                  key: API_KEY

Security consideration: Never commit Secrets to Git with literal values. Use Sealed Secrets (encrypted Secrets safe for Git) or External Secrets Operator (fetches from Vault, AWS Secrets Manager) in production. The example above shows the Secret structure but should be created via CI/CD pipeline or external secret management.

External Secrets Operator

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: payment-service-secrets
  namespace: banking
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: SecretStore
  target:
    name: payment-service-secrets   # Creates this Secret
    creationPolicy: Owner
  data:
    - secretKey: DATABASE_PASSWORD
      remoteRef:
        key: banking/payment-service/database
        property: password
    - secretKey: JWT_SECRET
      remoteRef:
        key: banking/payment-service/jwt
        property: secret
---
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: vault-backend
  namespace: banking
spec:
  provider:
    vault:
      server: "https://vault.example.com"
      path: "secret"
      version: "v2"
      auth:
        kubernetes:
          mountPath: "kubernetes"
          role: "payment-service"
          serviceAccountRef:
            name: payment-service

How it works: The External Secrets Operator syncs secrets from external systems (Vault, AWS Secrets Manager, Azure Key Vault) into Kubernetes Secrets. The refreshInterval: 1h keeps Secrets updated automatically. The application consumes the created Secret normally, unaware of the external source. This pattern centralizes secret management and enables secret rotation without modifying Kubernetes configurations.

Sealed Secrets

# Original secret (don't commit to Git)
apiVersion: v1
kind: Secret
metadata:
  name: payment-service-secrets
  namespace: banking
stringData:
  DATABASE_PASSWORD: "p@ssw0rd123"
---
# Sealed secret (safe to commit to Git), produced with:
#   kubeseal --format yaml < secret.yaml > sealed-secret.yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: payment-service-secrets
  namespace: banking
spec:
  encryptedData:
    DATABASE_PASSWORD: AgBy3i4OJSWK+PiTySYZZA9rO43cGDEq...
  template:
    metadata:
      name: payment-service-secrets
      namespace: banking
    type: Opaque

How it works: The Sealed Secrets controller runs in the cluster with a private key. You encrypt Secrets using the public key (kubeseal), creating SealedSecrets safe for Git. When you apply the SealedSecret, the controller decrypts it and creates a regular Secret. Only the cluster's controller can decrypt, so encrypted SealedSecrets are safe in version control.


Service Mesh Patterns

Service meshes provide advanced traffic management, security, and observability for microservices without changing application code. The mesh injects a sidecar proxy (Envoy) into each Pod, intercepting all network traffic and applying policies.

Understanding Service Mesh Benefits

Service meshes address challenges in microservice architectures: mutual TLS (mTLS) between services requires certificate management; retries and circuit breakers require implementing resilience patterns; observability requires instrumenting every service; canary deployments require complex routing logic.

A service mesh handles these concerns at the infrastructure layer. The sidecar proxies establish mTLS automatically (no application changes), retry failed requests based on configured policies, emit metrics for every request (latency, error rate, traffic volume), and route traffic based on rules (percentage-based canaries, header-based routing).

The tradeoff is complexity - service meshes add operational overhead (managing the mesh control plane) and latency (every request passes through two proxies: client sidecar → server sidecar). For architectures with few microservices (< 5), a service mesh's complexity may outweigh its benefits. For larger microservice architectures (10+), the mesh simplifies cross-cutting concerns.
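
With Istio, opting a namespace into the mesh is a single label; the injection webhook then adds the Envoy sidecar to every new Pod created in it:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: banking
  labels:
    istio-injection: enabled   # sidecar injected into new Pods in this namespace
```

Existing Pods must be restarted (for example via a rolling restart of their Deployments) to pick up the sidecar.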

Istio Traffic Management

# Virtual Service for canary deployment
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payment-service
  namespace: banking
spec:
  hosts:
    - payment-service
  http:
    - match:
        - headers:
            x-canary:
              exact: "true"
      route:
        - destination:
            host: payment-service
            subset: v2-1-0
          weight: 100
    - route:
        - destination:
            host: payment-service
            subset: v2-0-0
          weight: 90
        - destination:
            host: payment-service
            subset: v2-1-0
          weight: 10
---
# Destination Rule defining subsets
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: payment-service
  namespace: banking
spec:
  host: payment-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 50
        http2MaxRequests: 100
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
  subsets:
    - name: v2-0-0
      labels:
        version: v2.0.0
    - name: v2-1-0
      labels:
        version: v2.1.0

How it works: The VirtualService defines routing rules. Requests with the x-canary: true header route 100% to v2.1.0 (for testing). Other requests split 90/10 between v2.0.0 and v2.1.0 (canary rollout). The DestinationRule configures connection pooling (max 100 concurrent connections) and outlier detection (eject unhealthy Pods after 5 consecutive errors for 30 seconds).

Progressive canary: Start with 10% traffic to v2.1.0, monitor error rates and latency for 30 minutes, increase to 50%, monitor again, then 100%. If issues arise at any stage, revert the VirtualService to route 100% to v2.0.0.
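Each stage of the rollout is just an edit to the VirtualService weights. A rollback, for instance, reduces to a single route sending all traffic to the stable subset (a sketch reusing the subsets defined above):

# Rollback: route 100% of traffic back to the stable v2.0.0 subset
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payment-service
  namespace: banking
spec:
  hosts:
    - payment-service
  http:
    - route:
        - destination:
            host: payment-service
            subset: v2-0-0
          weight: 100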

Mutual TLS (mTLS)

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: banking
spec:
  mtls:
    mode: STRICT  # STRICT, PERMISSIVE, DISABLE
---
# Authorization Policy
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: payment-service-authz
  namespace: banking
spec:
  selector:
    matchLabels:
      app: payment-service
  action: ALLOW
  rules:
    - from:
        - source:
            namespaces: ["banking", "api-gateway"]
            principals: ["cluster.local/ns/banking/sa/account-service"]
      to:
        - operation:
            methods: ["POST"]
            paths: ["/api/v1/payments"]

How it works: PeerAuthentication with mode: STRICT requires mTLS for all service-to-service communication in the banking namespace. Services without valid certificates cannot communicate. The AuthorizationPolicy restricts access: only the account-service (identified by its ServiceAccount principal) from the banking or api-gateway namespace can POST to /api/v1/payments. This provides zero-trust security - even inside the cluster, services must authenticate and are authorized based on identity.
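When enabling mTLS on an existing cluster, a common approach is to start in PERMISSIVE mode (both plaintext and mTLS accepted) and switch to STRICT once every workload carries a sidecar. A workload-scoped override during migration might look like this (a sketch; the workload name is hypothetical):

# Temporarily allow plaintext for a legacy workload during mTLS migration
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: legacy-batch-permissive
  namespace: banking
spec:
  selector:
    matchLabels:
      app: legacy-batch
  mtls:
    mode: PERMISSIVE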

Traffic Splitting and A/B Testing

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payment-service
spec:
  hosts:
    - payment-service
  http:
    - match:
        - headers:
            user-segment:
              exact: "premium"
      route:
        - destination:
            host: payment-service
            subset: v2-1-0-premium
          headers:
            response:
              add:
                x-variant: "premium"
    - match:
        - headers:
            user-segment:
              exact: "standard"
      route:
        - destination:
            host: payment-service
            subset: v2-1-0-standard
          headers:
            response:
              add:
                x-variant: "standard"
    - route:
        - destination:
            host: payment-service
            subset: v2-0-0

How it works: Requests with user-segment: premium route to the v2-1-0-premium subset (perhaps with different rate limits or features). Requests with user-segment: standard route to v2-1-0-standard. All other requests route to v2.0.0. The headers.response.add configuration adds an x-variant header to responses, enabling client-side analytics to correlate user experience with variant.

A/B testing: The application (or API gateway) sets the user-segment header based on user properties (subscription tier, geographic region, random assignment). The service mesh routes traffic accordingly. Application metrics (conversion rate, revenue per user) are tagged with variant, enabling statistical analysis of variant performance.


Ingress Controllers and Routing

Ingress controllers expose HTTP(S) routes from outside the cluster to Services inside the cluster. Ingress provides load balancing, TLS termination, and name-based virtual hosting.

Understanding Ingress

Kubernetes Services expose applications inside the cluster, but external clients cannot reach them (except via NodePort or LoadBalancer Services, which have limitations). Ingress provides a Layer 7 (HTTP) entry point, routing requests based on hostnames and paths to backend Services.

An Ingress controller (NGINX, Traefik, HAProxy, cloud-provider controllers) watches Ingress resources and configures itself to route traffic accordingly. The controller runs as a Deployment in the cluster, typically exposed via a LoadBalancer Service (cloud) or NodePort (on-premises).

Ingress enables multiple services to share a single load balancer IP (reducing cloud costs) and provides centralized TLS termination (certificates managed in one place rather than per-service).

NGINX Ingress

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: banking-api-ingress
  namespace: banking
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$2
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/limit-connections: "100"  # max concurrent connections per client IP
    nginx.ingress.kubernetes.io/limit-rps: "10"           # max requests per second per client IP
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.example.com
      secretName: api-tls-cert
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /payments(/|$)(.*)
            pathType: ImplementationSpecific
            backend:
              service:
                name: payment-service
                port:
                  number: 80
          - path: /accounts(/|$)(.*)
            pathType: ImplementationSpecific
            backend:
              service:
                name: account-service
                port:
                  number: 80

How it works: Requests to https://api.example.com/payments/... route to the payment-service. The rewrite-target: /$2 annotation rewrites /payments/123 to /123 before forwarding to the backend (the payment-service sees /123, not /payments/123), using the second capture group from the path regex. The ssl-redirect: "true" annotation redirects HTTP to HTTPS. The rate-limiting annotations throttle each client IP: limit-rps allows at most 10 requests per second, and the 100 value caps concurrent connections.

TLS termination: The tls section references api-tls-cert Secret containing the TLS certificate and private key. Cert-manager (indicated by the cert-manager.io/cluster-issuer annotation) automatically obtains and renews certificates from Let's Encrypt, storing them in the Secret. The Ingress controller terminates TLS and forwards unencrypted traffic to backend Services over the cluster network (secure because cluster networking is isolated).
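The letsencrypt-prod issuer referenced by the annotation is defined separately. A typical cert-manager ClusterIssuer using the ACME HTTP-01 challenge looks roughly like this (a sketch; the email address and Secret name are placeholders):

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: platform-team@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-account-key  # stores the ACME account private key
    solvers:
      - http01:
          ingress:
            class: nginx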

Path-Based Routing

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  namespace: banking
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /v1/payments
            pathType: Prefix
            backend:
              service:
                name: payment-service-v1
                port:
                  number: 80
          - path: /v2/payments
            pathType: Prefix
            backend:
              service:
                name: payment-service-v2
                port:
                  number: 80
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-gateway
                port:
                  number: 80

How it works: Requests to /v1/payments/... route to payment-service-v1. Requests to /v2/payments/... route to payment-service-v2. All other requests (/) route to api-gateway. Path matching is prefix-based: /v1/payments matches /v1/payments/123/status. Most controllers match the longest prefix first, so the catch-all / only receives requests no specific path matches; listing specific paths before the catch-all still makes the intent explicit.

API versioning: This pattern supports running multiple API versions simultaneously, enabling gradual client migration from v1 to v2. Clients specify the version in the path (/v1/payments vs /v2/payments), and Kubernetes routes to the appropriate backend.

Host-Based Routing

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: acme-bank-ingress
  namespace: banking
  annotations:
    # The Ingress spec cannot set request headers; ingress-nginx does it via a snippet
    nginx.ingress.kubernetes.io/configuration-snippet: |
      proxy_set_header X-Tenant-ID "acme";
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - acme-bank.example.com
      secretName: wildcard-tls-cert
  rules:
    - host: acme-bank.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: banking-app
                port:
                  number: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: globex-bank-ingress
  namespace: banking
  annotations:
    nginx.ingress.kubernetes.io/configuration-snippet: |
      proxy_set_header X-Tenant-ID "globex";
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - globex-bank.example.com
      secretName: wildcard-tls-cert
  rules:
    - host: globex-bank.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: banking-app
                port:
                  number: 80

How it works: Requests to acme-bank.example.com route to banking-app with an X-Tenant-ID: acme request header added at the ingress layer. Requests to globex-bank.example.com route to the same banking-app, but with X-Tenant-ID: globex. The application uses the tenant ID to query the correct tenant's data. This multi-tenancy pattern serves multiple tenants from a single application deployment, identified by hostname. Note that the standard Ingress API cannot modify request headers, so header injection relies on controller-specific configuration (with ingress-nginx, the configuration-snippet annotation).


Monitoring and Logging

Kubernetes generates vast amounts of operational data: container logs, resource metrics, cluster events, API server audit logs. Effective monitoring and logging enable troubleshooting, performance optimization, and capacity planning.

Understanding Kubernetes Observability

Observability in Kubernetes operates at multiple levels: container logs (stdout/stderr from applications), metrics (CPU, memory, request rate, latency), events (Pod scheduled, container restarted, volume mounted), and traces (distributed request flow through microservices).

The metrics-server provides basic resource metrics for HPA and kubectl top. Prometheus scrapes application and system metrics for detailed monitoring and alerting. The ELK stack (Elasticsearch, Logstash, Kibana) or Loki aggregates logs from all containers. Jaeger or Zipkin collects distributed traces showing request flow.

For comprehensive observability strategies including structured logging and distributed tracing, see Observability Overview.

Prometheus Metrics

apiVersion: v1
kind: Service
metadata:
  name: payment-service
  namespace: banking
  labels:
    app: payment-service
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/actuator/prometheus"
spec:
  selector:
    app: payment-service
  ports:
    - name: http  # named port referenced by the ServiceMonitor below
      port: 80
      targetPort: 8080
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: payment-service
  namespace: banking
  labels:
    app: payment-service
spec:
  selector:
    matchLabels:
      app: payment-service
  endpoints:
    - port: http
      path: /actuator/prometheus
      interval: 30s
      scrapeTimeout: 10s

How it works: The prometheus.io/* annotations tell Prometheus to scrape metrics from the payment-service at /actuator/prometheus on port 8080 every 30 seconds. The ServiceMonitor (Prometheus Operator CRD) creates the Prometheus scrape configuration automatically. Spring Boot Actuator exposes JVM metrics (heap usage, GC), application metrics (request count, error rate), and custom business metrics (payments processed).

Key metrics to monitor: Request rate (http_server_requests_seconds_count), error rate (http_server_requests_seconds_count{status="5xx"}), request duration (http_server_requests_seconds_sum / http_server_requests_seconds_count), JVM heap usage (jvm_memory_used_bytes{area="heap"}), GC pauses (jvm_gc_pause_seconds). These metrics enable SLO tracking and alert configuration.
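With the Prometheus Operator, these metrics feed alert rules. A sketch of a PrometheusRule that fires when the 5xx error rate exceeds 1% (the threshold, durations, and labels are illustrative):

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: payment-service-alerts
  namespace: banking
spec:
  groups:
    - name: payment-service
      rules:
        - alert: PaymentServiceHighErrorRate
          # Fraction of requests returning 5xx over the last 5 minutes
          expr: |
            sum(rate(http_server_requests_seconds_count{service="payment-service", status=~"5.."}[5m]))
              / sum(rate(http_server_requests_seconds_count{service="payment-service"}[5m])) > 0.01
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "payment-service 5xx error rate above 1% for 5 minutes"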

Logging with Fluentd and Loki

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
  namespace: logging
data:
  fluent.conf: |
    <source>
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      read_from_head true
      <parse>
        @type json
        time_key time
        time_format %Y-%m-%dT%H:%M:%S.%NZ
      </parse>
    </source>

    <filter kubernetes.**>
      @type kubernetes_metadata
      @id filter_kube_metadata
    </filter>

    <match kubernetes.var.log.containers.**banking**.log>
      @type loki
      url http://loki:3100
      extra_labels {"env":"production"}
      <label>
        namespace $.kubernetes.namespace_name
        pod $.kubernetes.pod_name
        container $.kubernetes.container_name
      </label>
    </match>
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccountName: fluentd
      containers:
        - name: fluentd
          image: fluent/fluentd-kubernetes-daemonset:v1-debian-loki
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: config
              mountPath: /fluentd/etc
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: config
          configMap:
            name: fluentd-config

How it works: Fluentd runs as a DaemonSet (one Pod per node), tailing container logs from /var/log/containers/*.log. The kubernetes_metadata filter enriches logs with Kubernetes metadata (namespace, pod name, labels). Logs from the banking namespace are forwarded to Loki with labels (namespace, pod, container) for querying. Loki stores logs efficiently and provides a query language (LogQL) similar to Prometheus.
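The serviceAccountName: fluentd referenced by the DaemonSet needs RBAC permissions so the kubernetes_metadata filter can look up Pod metadata from the API server. A minimal sketch:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: logging
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
rules:
  - apiGroups: [""]
    resources: ["pods", "namespaces"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluentd
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluentd
subjects:
  - kind: ServiceAccount
    name: fluentd
    namespace: logging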

Structured logging: Applications should output JSON-formatted logs to stdout. JSON logs are machine-parseable, enabling filtering, aggregation, and correlation. For example: {"timestamp":"2025-01-08T10:30:00Z","level":"ERROR","message":"Payment failed","paymentId":"123","error":"Insufficient funds"}. Loki can query {namespace="banking"} | json | paymentId="123" to find all logs for payment 123.

Distributed Tracing with Jaeger

apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
spec:
  template:
    spec:
      containers:
        - name: payment-service
          image: registry.example.com/payment-service:2.1.0
          env:
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: "http://jaeger-collector:4317"
            - name: OTEL_SERVICE_NAME
              value: "payment-service"
            - name: OTEL_TRACES_SAMPLER
              value: "parentbased_traceidratio"
            - name: OTEL_TRACES_SAMPLER_ARG
              value: "0.1"  # Sample 10% of traces

How it works: The application uses OpenTelemetry to instrument HTTP requests, database queries, and external API calls, creating spans (units of work). Each request gets a trace ID, propagated across service calls via HTTP headers. Spans are exported to the Jaeger collector at jaeger-collector:4317. The sampler configuration (10%) reduces overhead - only 10% of traces are collected, which is sufficient for identifying performance bottlenecks.

Analyzing traces: Jaeger UI shows the complete request flow: API gateway → payment-service → database query → account-service → payment-service → response. Each span includes duration, enabling identification of slow operations (e.g., a database query taking 500ms). Traces correlate logs via trace IDs - logs can include traceId fields, allowing you to view all logs for a specific request.


Security

Kubernetes security operates at multiple layers: cluster access control (RBAC), network isolation (Network Policies), Pod permissions (Pod Security Standards), secret management, and container image security.

Understanding Kubernetes Security Model

By default, Kubernetes is permissive: Pods can communicate with all other Pods, containers run as root, and ServiceAccounts have minimal permissions. Production environments must harden these defaults.

Principle of least privilege guides Kubernetes security: Pods should run with minimal required permissions, ServiceAccounts should have minimal RBAC permissions, and network policies should deny traffic by default and explicitly allow necessary communication.

For comprehensive security coverage including authentication, authorization, and encryption, see Security Overview.

Pod Security Standards

apiVersion: v1
kind: Namespace
metadata:
  name: banking
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
---
# Pod adhering to restricted policy
apiVersion: v1
kind: Pod
metadata:
  name: payment-service
  namespace: banking
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1001
    fsGroup: 1001
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: payment-service
      image: registry.example.com/payment-service:2.1.0
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
        readOnlyRootFilesystem: true
      volumeMounts:
        - name: tmp
          mountPath: /tmp
  volumes:
    - name: tmp
      emptyDir: {}

How it works: The namespace labels enforce the restricted Pod Security Standard, the most restrictive level. Pods in this namespace must run as non-root (runAsNonRoot: true), cannot escalate privileges (allowPrivilegeEscalation: false), must drop all capabilities (capabilities.drop: ALL), and use the default seccomp profile. The readOnlyRootFilesystem: true prevents the container from modifying its filesystem (except mounted volumes like the emptyDir for /tmp).

Security benefits: If an attacker exploits the application and gains code execution, they run as a non-root user (UID 1001) with no capabilities and cannot modify the filesystem. This dramatically limits the attack surface. For writable storage requirements, mount emptyDir or PersistentVolumeClaims at specific paths.

Network Policies

# Default deny all ingress and egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: banking
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Allow payment-service to receive from API gateway
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payment-service-ingress
  namespace: banking
spec:
  podSelector:
    matchLabels:
      app: payment-service
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: api-gateway
          podSelector:
            matchLabels:
              app: api-gateway
        - podSelector:
            matchLabels:
              app: account-service
      ports:
        - protocol: TCP
          port: 8080
---
# Allow payment-service to reach database and external APIs
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payment-service-egress
  namespace: banking
spec:
  podSelector:
    matchLabels:
      app: payment-service
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
    - to:
        - podSelector:
            matchLabels:
              app: redis
      ports:
        - protocol: TCP
          port: 6379
    - to:
        - namespaceSelector:
            matchLabels:
              name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.0.0.0/8
              - 172.16.0.0/12
              - 192.168.0.0/16
      ports:
        - protocol: TCP
          port: 443

How it works: The default-deny-all policy blocks all ingress and egress traffic in the banking namespace (zero-trust baseline). Subsequent policies explicitly allow required traffic: payment-service accepts connections from api-gateway and account-service on port 8080; payment-service connects to postgres (5432), redis (6379), DNS (53), and external HTTPS (443) but not to private IP ranges (preventing access to metadata services or other internal services).

Network segmentation: Network policies create microsegmentation - each service only communicates with necessary dependencies. If an attacker compromises the payment-service, they cannot reach the database admin interface or other unrelated services. This "defense in depth" limits lateral movement.

RBAC (Role-Based Access Control)

# ServiceAccount for payment-service
apiVersion: v1
kind: ServiceAccount
metadata:
  name: payment-service
  namespace: banking
---
# Role granting minimal permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: payment-service-role
  namespace: banking
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    # resourceNames cannot restrict "list", so grant only "get"
    verbs: ["get"]
    resourceNames: ["payment-service-config"]
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get"]
    resourceNames: ["payment-service-secrets"]
---
# RoleBinding attaching Role to ServiceAccount
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: payment-service-rolebinding
  namespace: banking
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: payment-service-role
subjects:
  - kind: ServiceAccount
    name: payment-service
    namespace: banking
---
# Deployment using ServiceAccount
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
  namespace: banking
spec:
  template:
    spec:
      serviceAccountName: payment-service
      automountServiceAccountToken: true

How it works: The payment-service Pod uses the payment-service ServiceAccount. This ServiceAccount has permission to read only the payment-service-config ConfigMap and payment-service-secrets Secret, nothing else. If the application is compromised, the attacker cannot list all Secrets, modify Deployments, or access other resources - the ServiceAccount's limited permissions constrain them.

Default ServiceAccount: By default, Pods use the default ServiceAccount in their namespace. Grant it no RBAC permissions and set automountServiceAccountToken: false on it so its token is never mounted into Pods. Create dedicated ServiceAccounts for each application with minimal required permissions.
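Hardening the default ServiceAccount is a one-line change per namespace (sketch):

# Prevent Pods using the default ServiceAccount from receiving an API token
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: banking
automountServiceAccountToken: false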

Admission Controllers

# OPA Gatekeeper constraint template
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels

        violation[{"msg": msg}] {
          provided := {label | input.review.object.metadata.labels[label]}
          required := {label | label := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("Missing required labels: %v", [missing])
        }
---
# Constraint requiring labels
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-app-and-version-labels
spec:
  match:
    kinds:
      - apiGroups: ["apps"]
        kinds: ["Deployment"]
    namespaces: ["banking"]
  parameters:
    labels: ["app", "version"]

How it works: OPA Gatekeeper is a validating admission controller that runs policies (written in Rego) on all API requests. This policy requires all Deployments in the banking namespace to have app and version labels. Attempts to create Deployments without these labels are rejected. Admission controllers enforce organizational policies (required labels, image registries, resource limits, security contexts) automatically.

Common policies: Enforce container image sources (only from approved registries), require resource limits on all containers, disallow privileged containers, require specific labels for cost attribution, enforce naming conventions.
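Restricting images to approved registries follows the same pattern. Assuming the K8sAllowedRepos template from the Gatekeeper policy library is installed, a constraint might look like this (the registry prefix is illustrative):

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos
metadata:
  name: approved-registries-only
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaces: ["banking"]
  parameters:
    repos:
      - "registry.example.com/"  # only images with this prefix are admitted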


Further Reading

Internal Documentation

External Resources


Summary

Key Takeaways

  1. Deployment strategies - Rolling updates for zero-downtime, blue-green for instant rollback, canary for gradual rollout
  2. Resource management - Set requests for scheduling, limits for safety; use HPA for scaling
  3. Health probes - Liveness detects crashes, readiness controls traffic, startup protects slow initialization
  4. Configuration separation - ConfigMaps for config, Secrets for sensitive data, external secret management for production
  5. Service mesh - Istio/Linkerd for mTLS, traffic splitting, observability without code changes
  6. Ingress - Centralized HTTP routing, TLS termination, path/host-based routing
  7. Observability - Prometheus for metrics, Loki/ELK for logs, Jaeger for traces
  8. Security - Pod Security Standards, Network Policies, RBAC, admission controllers
  9. High availability - Multi-replica deployments, Pod Disruption Budgets, anti-affinity rules
  10. GitOps - Version control Kubernetes manifests, automate deployments via CI/CD

Next Steps: Review Docker Best Practices for container image optimization, CI/CD Pipelines for Kubernetes deployment automation, and Microservices Architecture for service design patterns.