
Spring Boot Observability

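This guide covers Spring Boot-specific configuration for the three pillars of observability: logs, metrics, and traces. For foundational concepts, strategies, and guidance on when to use each pillar, see the Logging, Metrics, and Distributed Tracing guides referenced in the sections below.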


Core Dependencies

All observability features require Spring Boot Actuator as the foundation:

// build.gradle
dependencies {
    // Required for all observability features
    implementation 'org.springframework.boot:spring-boot-starter-actuator'

    // Structured logging (JSON format)
    implementation 'net.logstash.logback:logstash-logback-encoder:8.0'

    // Metrics - Prometheus registry
    implementation 'io.micrometer:micrometer-registry-prometheus'

    // Distributed tracing - OpenTelemetry
    implementation 'io.micrometer:micrometer-tracing-bridge-otel'
    implementation 'io.opentelemetry:opentelemetry-exporter-otlp'
}

Actuator Configuration

Spring Boot Actuator exposes observability endpoints. Configure which endpoints to expose:

# application.yml
management:
  endpoints:
    web:
      exposure:
        # Production: expose only the endpoints you need
        include: health,info,metrics,prometheus

  endpoint:
    health:
      show-details: when_authorized # Protect health details

  prometheus:
    metrics:
      export:
        # Spring Boot 3.x key; 2.x used management.metrics.export.prometheus.enabled
        enabled: true

  metrics:
    tags:
      # Add to all metrics for filtering in multi-app environments
      application: ${spring.application.name}
      environment: ${spring.profiles.active:local}

Endpoints exposed:

  • /actuator/health - Health checks for load balancers and Kubernetes
  • /actuator/prometheus - Metrics in Prometheus format
  • /actuator/metrics - Metrics in JSON format (for debugging)
  • /actuator/info - Application information

Structured Logging Configuration

Spring Boot uses Logback by default. Configure it for JSON output to enable centralized log aggregation.

For structured logging concepts and why JSON format matters, see Logging: Structured Logging.

Logback Configuration

src/main/resources/logback-spring.xml:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <!-- Logback does not read Spring properties directly; expose the one we need -->
    <springProperty scope="context" name="appName"
                    source="spring.application.name" defaultValue="application"/>

    <!-- JSON format for centralized logging systems (ELK, Splunk) -->
    <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="net.logstash.logback.encoder.LogstashEncoder">
            <!-- Include these MDC keys in every log entry (other MDC keys are omitted) -->
            <includeMdcKeyName>correlationId</includeMdcKeyName>
            <includeMdcKeyName>userId</includeMdcKeyName>
            <includeMdcKeyName>traceId</includeMdcKeyName>
            <includeMdcKeyName>spanId</includeMdcKeyName>

            <!-- Static metadata for all logs; Logback defaults use the ":-" separator -->
            <customFields>
                {"application":"${appName}",
                 "environment":"${ENVIRONMENT:-local}"}
            </customFields>
        </encoder>
    </appender>

    <!-- Async appender for high-throughput applications -->
    <appender name="ASYNC_CONSOLE" class="ch.qos.logback.classic.AsyncAppender">
        <queueSize>512</queueSize>
        <neverBlock>true</neverBlock>
        <appender-ref ref="CONSOLE"/>
    </appender>

    <!-- Production: use async, lower levels for noisy loggers -->
    <springProfile name="prod">
        <logger name="org.springframework" level="WARN"/>
        <logger name="org.hibernate" level="WARN"/>
        <root level="INFO">
            <appender-ref ref="ASYNC_CONSOLE"/>
        </root>
    </springProfile>

    <!-- Development: direct console, more verbose -->
    <springProfile name="!prod">
        <root level="INFO">
            <appender-ref ref="CONSOLE"/>
        </root>
    </springProfile>
</configuration>
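
With neverBlock set to true, the async appender gives up on an event instead of stalling the calling thread once its queue is full. The sketch below illustrates that trade-off with a plain bounded queue. DroppingBuffer is a hypothetical stand-in written for this guide, not Logback's implementation, and the capacity is arbitrary.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a non-blocking log buffer; not Logback's code. */
final class DroppingBuffer {

    private final BlockingQueue<String> queue;
    private final AtomicLong dropped = new AtomicLong();

    DroppingBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Never blocks: returns false and counts a drop when the queue is full. */
    boolean append(String event) {
        boolean accepted = queue.offer(event);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    /** Removes and returns the oldest buffered event, or null if the buffer is empty. */
    String poll() {
        return queue.poll();
    }

    /** Number of events rejected because the buffer was full. */
    long droppedCount() {
        return dropped.get();
    }
}
```

The practical consequence is that log loss under load is silent unless you watch for it, so size the queue for your peak rate. Logback's AsyncAppender also has a discardingThreshold setting that, by default, discards INFO-and-below events when the queue is nearly full.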

Correlation ID Filter

Correlation IDs track requests across services. See Logging: Correlation IDs for why this matters.

import org.slf4j.MDC;
import org.springframework.core.Ordered;
import org.springframework.core.annotation.Order;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.util.UUID;

@Component
@Order(Ordered.HIGHEST_PRECEDENCE)
public class CorrelationIdFilter extends OncePerRequestFilter {

    private static final String CORRELATION_ID_HEADER = "X-Correlation-ID";
    private static final String MDC_KEY = "correlationId";

    @Override
    protected void doFilterInternal(
            HttpServletRequest request,
            HttpServletResponse response,
            FilterChain filterChain) throws ServletException, IOException {

        // Accept existing ID from upstream service or generate new one
        String correlationId = request.getHeader(CORRELATION_ID_HEADER);
        if (correlationId == null || correlationId.isBlank()) {
            correlationId = UUID.randomUUID().toString();
        }

        // Store in MDC - automatically included in all logs
        MDC.put(MDC_KEY, correlationId);

        // Return in response for client-side debugging
        response.setHeader(CORRELATION_ID_HEADER, correlationId);

        try {
            filterChain.doFilter(request, response);
        } finally {
            // Critical: prevent MDC pollution in thread pools
            MDC.remove(MDC_KEY);
        }
    }
}
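
The filter accepts whatever value the caller sends in X-Correlation-ID, and that value ends up in every log line and in the response header. It may be worth validating it first. The following is a minimal sketch of one way to do that. The CorrelationIds helper and the allowed format (1 to 64 characters from a conservative alphabet) are assumptions for illustration, not part of Spring or SLF4J.

```java
import java.util.UUID;
import java.util.regex.Pattern;

/** Hypothetical helper: not part of Spring or SLF4J. */
final class CorrelationIds {

    // Assumed policy: 1-64 characters, letters, digits, dot, underscore, hyphen.
    private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9._-]{1,64}");

    private CorrelationIds() {
    }

    /** Returns the incoming header value if it passes validation, otherwise a fresh UUID. */
    static String resolve(String headerValue) {
        if (headerValue != null && ALLOWED.matcher(headerValue).matches()) {
            return headerValue;
        }
        return UUID.randomUUID().toString();
    }
}
```

In the filter, a call such as CorrelationIds.resolve(request.getHeader(CORRELATION_ID_HEADER)) could stand in for the null/blank check. Rejecting control characters and oversized values keeps a caller from injecting line breaks or very long strings into the log stream.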

Propagating Correlation IDs to Downstream Services

When calling other services, propagate the correlation ID via HTTP headers:

import org.slf4j.MDC;
import org.springframework.boot.web.client.RestTemplateBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;

@Configuration
public class RestTemplateConfig {

    @Bean
    public RestTemplate restTemplate(RestTemplateBuilder builder) {
        return builder
                .interceptors((request, body, execution) -> {
                    // Copy the current request's correlation ID onto the outgoing call
                    String correlationId = MDC.get("correlationId");
                    if (correlationId != null) {
                        request.getHeaders().add("X-Correlation-ID", correlationId);
                    }
                    return execution.execute(request, body);
                })
                .build();
    }
}

Metrics with Micrometer

Micrometer provides the metrics API for Spring Boot. For metric types (Counter, Gauge, Timer) and when to use each, see Metrics: Metric Types.

Configuration

# application.yml
management:
  metrics:
    distribution:
      # Enable histogram buckets for percentile calculation
      percentiles-histogram:
        http.server.requests: true
      # Pre-configured percentiles
      percentiles:
        http.server.requests: 0.5, 0.95, 0.99

Auto-Configured Metrics

Spring Boot automatically provides:

Metric                         Description
http.server.requests           HTTP request count and latency
jvm.memory.used                JVM memory usage by area
jvm.gc.pause                   Garbage collection pause times
hikaricp.connections.active    Database connection pool usage
spring.data.repository.*       Repository method execution

Custom Business Metrics

Inject MeterRegistry to create custom metrics:

import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.springframework.stereotype.Service;

@Service
public class PaymentService {

    private final MeterRegistry meterRegistry;
    private final PaymentRepository repository;

    public PaymentService(MeterRegistry meterRegistry, PaymentRepository repository) {
        this.meterRegistry = meterRegistry;
        this.repository = repository;
    }

    public Payment createPayment(PaymentRequest request) {
        Timer.Sample sample = Timer.start(meterRegistry);

        try {
            Payment payment = buildPayment(request);
            payment = repository.save(payment);

            // Counter with tags for filtering. The tag keys match the failure
            // counter below: Prometheus expects one set of tag keys per metric name.
            meterRegistry.counter("payments.created",
                    "currency", request.currency(),
                    "status", "success",
                    "error", "none"
            ).increment();

            // Timer records duration
            sample.stop(meterRegistry.timer("payments.processing.time",
                    "outcome", "success"
            ));

            return payment;

        } catch (Exception ex) {
            meterRegistry.counter("payments.created",
                    "currency", request.currency(),
                    "status", "failure",
                    "error", ex.getClass().getSimpleName()
            ).increment();

            sample.stop(meterRegistry.timer("payments.processing.time",
                    "outcome", "failure"
            ));

            throw ex;
        }
    }

    private Payment buildPayment(PaymentRequest request) {
        // Build payment entity
        return new Payment();
    }
}
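
The failure counter tags each increment with the exception's simple class name. Every distinct tag value becomes its own time series, so an open-ended set of exception types can grow the metric without bound. A minimal sketch of one way to keep the tag bounded follows. The ErrorTags helper and its allow-list are hypothetical and would need adapting to the exceptions your service actually throws.

```java
import java.util.Set;

/** Hypothetical helper for keeping a metric tag low-cardinality. */
final class ErrorTags {

    // Assumed allow-list; pick the exception names that matter for your dashboards.
    private static final Set<String> KNOWN = Set.of(
            "IllegalArgumentException",
            "IllegalStateException",
            "TimeoutException");

    private ErrorTags() {
    }

    /** Returns "none" for no error, the simple class name if allow-listed, else "other". */
    static String of(Throwable error) {
        if (error == null) {
            return "none";
        }
        String name = error.getClass().getSimpleName();
        return KNOWN.contains(name) ? name : "other";
    }
}
```

A call such as ErrorTags.of(ex) could then replace ex.getClass().getSimpleName() as the tag value, and ErrorTags.of(null) gives the success path a value for the same tag key.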

Querying Metrics

Example Prometheus queries for the metrics above:

# Payment creation rate (per second)
rate(payments_created_total[5m])

# Error rate as percentage (aggregate both sides so the label sets match)
sum(rate(payments_created_total{status="failure"}[5m]))
  / sum(rate(payments_created_total[5m])) * 100

# 95th percentile processing time
# (needs histogram buckets enabled for payments.processing.time)
histogram_quantile(0.95, sum by (le) (rate(payments_processing_time_seconds_bucket[5m])))

# Database connection pool utilization
hikaricp_connections_active / hikaricp_connections_max * 100
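
The percentile query estimates a value from bucket counts rather than from raw samples, which is why the result depends on where the bucket boundaries sit. The sketch below shows the general idea: find the bucket that contains the target rank, then interpolate linearly inside it. BucketQuantile is a hypothetical illustration of the technique, not Prometheus's implementation, and it ignores edge cases that the real function handles.

```java
/** Hypothetical illustration of quantile estimation from cumulative histogram buckets. */
final class BucketQuantile {

    private BucketQuantile() {
    }

    /**
     * Estimates a quantile by linear interpolation inside the matching bucket.
     *
     * @param q                quantile in [0, 1]
     * @param upperBounds      ascending bucket upper bounds; the last may be +Infinity
     * @param cumulativeCounts observations at or below the matching upper bound
     */
    static double estimate(double q, double[] upperBounds, long[] cumulativeCounts) {
        if (q < 0 || q > 1 || upperBounds.length == 0
                || upperBounds.length != cumulativeCounts.length) {
            throw new IllegalArgumentException("invalid quantile or bucket arrays");
        }
        long total = cumulativeCounts[cumulativeCounts.length - 1];
        if (total == 0) {
            return Double.NaN; // no observations, nothing to estimate
        }
        double rank = q * total;
        for (int i = 0; i < upperBounds.length; i++) {
            if (cumulativeCounts[i] >= rank) {
                // The first bucket is assumed to start at zero.
                double lower = i == 0 ? 0.0 : upperBounds[i - 1];
                long below = i == 0 ? 0 : cumulativeCounts[i - 1];
                if (Double.isInfinite(upperBounds[i])) {
                    return lower; // cannot interpolate into the open-ended bucket
                }
                long inBucket = cumulativeCounts[i] - below;
                if (inBucket == 0) {
                    return upperBounds[i];
                }
                return lower + (upperBounds[i] - lower) * (rank - below) / inBucket;
            }
        }
        return upperBounds[upperBounds.length - 1];
    }
}
```

For example, with upper bounds 0.1, 0.5, 1.0 and +Infinity and cumulative counts 50, 90, 100 and 100, the 95th percentile falls halfway through the 0.5 to 1.0 bucket, giving an estimate of 0.75 seconds. The estimate can never be more precise than the bucket it lands in.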

Distributed Tracing with OpenTelemetry

Spring Boot 3.x uses Micrometer Tracing with OpenTelemetry. For tracing concepts (traces, spans, context propagation), see Distributed Tracing.

Configuration

# application.yml
management:
  tracing:
    sampling:
      probability: 0.1 # Sample 10% of requests

  otlp:
    tracing:
      endpoint: http://jaeger:4318/v1/traces

Sampling considerations:

  • Production: 0.05-0.1 (5-10%) reduces overhead while maintaining visibility
  • Development: 1.0 (100%) for full visibility
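  • Errors: the probability sampler decides when a trace starts, before the outcome is known, so capturing 100% of failed requests requires tail-based sampling (for example in an OpenTelemetry Collector) or a custom sampler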
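
A probability sampler usually derives its decision from the trace ID rather than from a random draw per service, so that every service handling the same trace reaches the same verdict. The following is a simplified sketch of that idea. RatioSampler is a hypothetical class written for this guide, not the OpenTelemetry or Micrometer implementation, and it assumes hex trace IDs of at least 16 characters.

```java
/** Hypothetical ratio sampler; a simplified sketch, not the OpenTelemetry implementation. */
final class RatioSampler {

    private final boolean always;
    private final long threshold;

    RatioSampler(double probability) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be in [0, 1]");
        }
        this.always = probability == 1.0;
        // Scale the probability onto the non-negative long range.
        this.threshold = (long) (probability * Long.MAX_VALUE);
    }

    /** Decides from the last 16 hex characters of the trace ID, so the result is repeatable. */
    boolean shouldSample(String traceIdHex) {
        if (traceIdHex == null || traceIdHex.length() < 16) {
            throw new IllegalArgumentException("trace ID must have at least 16 hex characters");
        }
        if (always) {
            return true;
        }
        String tail = traceIdHex.substring(traceIdHex.length() - 16);
        // Shift right by one bit so the value is always non-negative.
        long value = Long.parseUnsignedLong(tail, 16) >>> 1;
        return value < threshold;
    }
}
```

Because the decision depends only on the trace ID, it is made before the request has succeeded or failed. That is the reason error-biased sampling has to happen later in the pipeline, once the whole trace is available.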

Auto-Instrumented Spans

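With tracing on the classpath, spans are created for the integrations below. Coverage depends on version and setup: HTTP server requests and client calls work out of the box (clients must be built from the auto-configured builders), while JDBC needs an additional instrumentation library such as Datasource Micrometer and Spring Kafka observation has to be enabled explicitly: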

  • HTTP requests and responses
  • JDBC database queries
  • RestTemplate/WebClient outgoing calls
  • Spring Data repository methods
  • Kafka producer/consumer
  • Redis operations

Custom Spans

For business operations not covered by auto-instrumentation:

import io.micrometer.tracing.Span;
import io.micrometer.tracing.Tracer;
import org.springframework.stereotype.Service;

@Service
public class PaymentService {

    private final Tracer tracer;
    private final PaymentRepository repository;
    private final PaymentGatewayClient gatewayClient;

    public PaymentService(Tracer tracer,
                          PaymentRepository repository,
                          PaymentGatewayClient gatewayClient) {
        this.tracer = tracer;
        this.repository = repository;
        this.gatewayClient = gatewayClient;
    }

    public Payment createPayment(PaymentRequest request) {
        // Create span for business operation
        Span span = tracer.nextSpan().name("payment.create");

        try (Tracer.SpanInScope ws = tracer.withSpan(span.start())) {
            // Add business context as span tags
            span.tag("customer.id", request.customerId());
            span.tag("payment.amount", String.valueOf(request.amount()));
            span.tag("payment.currency", request.currency());

            // buildPayment(...) is the same helper used in the other examples
            Payment payment = buildPayment(request);
            payment = repository.save(payment);

            // Nested span for external call
            Span gatewaySpan = tracer.nextSpan().name("payment.gateway.call");
            try (Tracer.SpanInScope gatewayScope = tracer.withSpan(gatewaySpan.start())) {
                GatewayResult result = gatewayClient.process(payment);
                payment.setStatus(result.status());

                gatewaySpan.tag("gateway.txn.id", result.transactionId());
            } catch (Exception ex) {
                gatewaySpan.error(ex);
                throw ex;
            } finally {
                gatewaySpan.end();
            }

            span.tag("payment.id", payment.getId());
            return payment;

        } catch (Exception ex) {
            span.error(ex);
            throw ex;
        } finally {
            span.end();
        }
    }
}

Correlating Traces with Logs

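Micrometer Tracing puts the current trace ID and span ID into the MDC under the keys traceId and spanId whenever tracing is enabled, so Spring Boot 3.x needs no extra bridge configuration. The log encoder only has to include those keys: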

<!-- logback-spring.xml - MDC keys automatically populated -->
<includeMdcKeyName>traceId</includeMdcKeyName>
<includeMdcKeyName>spanId</includeMdcKeyName>

Now you can search logs by trace ID to see all related log entries across services.


Health Checks

Health checks tell load balancers and Kubernetes whether your service can handle traffic.

Built-in Health Indicators

Spring Boot auto-configures health indicators for:

  • Database (DataSource)
  • Redis
  • Elasticsearch
  • RabbitMQ (Kafka has no built-in indicator; add a custom one if needed)
  • Disk space

Custom Health Indicators

Add health checks for dependencies without built-in support:

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
public class PaymentGatewayHealthIndicator implements HealthIndicator {

    private final PaymentGatewayClient gatewayClient;

    public PaymentGatewayHealthIndicator(PaymentGatewayClient gatewayClient) {
        this.gatewayClient = gatewayClient;
    }

    @Override
    public Health health() {
        try {
            // Lightweight health check, not full operation
            gatewayClient.ping();

            return Health.up()
                    .withDetail("gateway", "reachable")
                    .build();

        } catch (Exception ex) {
            return Health.down()
                    .withDetail("gateway", "unreachable")
                    .withDetail("error", ex.getMessage())
                    .build();
        }
    }
}

Liveness vs Readiness

For Kubernetes, separate liveness from readiness:

# application.yml
management:
  endpoint:
    health:
      probes:
        enabled: true
      group:
        liveness:
          include: livenessState
        readiness:
          include: readinessState, db, paymentGateway

  • Liveness (/actuator/health/liveness): Is the app running? If DOWN, Kubernetes restarts the pod.
  • Readiness (/actuator/health/readiness): Can the app handle traffic? If DOWN, traffic stops routing to this pod.
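
A health group reports a single status derived from all the indicators it includes, and the most severe status wins. The sketch below illustrates that rule for the readiness group above. HealthAggregation is a hypothetical class, not Spring Boot's aggregator, and the severity order (DOWN, OUT_OF_SERVICE, UP, UNKNOWN) is assumed to match the framework default.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of worst-status-wins aggregation; not Spring Boot's classes. */
final class HealthAggregation {

    // Assumed severity order, most severe first.
    private static final List<String> ORDER =
            List.of("DOWN", "OUT_OF_SERVICE", "UP", "UNKNOWN");

    private HealthAggregation() {
    }

    /** Returns the most severe status among the components, or UNKNOWN if there are none. */
    static String aggregate(Map<String, String> componentStatuses) {
        String worst = "UNKNOWN";
        for (String status : componentStatuses.values()) {
            int index = ORDER.indexOf(status);
            // Statuses outside the assumed order are ignored in this sketch.
            if (index >= 0 && index < ORDER.indexOf(worst)) {
                worst = status;
            }
        }
        return worst;
    }
}
```

This is why the contents of the readiness group matter: a single unreachable dependency such as paymentGateway is enough to take the pod out of rotation, even when the application and its database are healthy.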

Integration Example

Complete observability for a payment service:

import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import io.micrometer.tracing.Span;
import io.micrometer.tracing.Tracer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;
import org.springframework.stereotype.Service;

@Service
public class PaymentService {

    private static final Logger log = LoggerFactory.getLogger(PaymentService.class);

    private final MeterRegistry meterRegistry;
    private final Tracer tracer;
    private final PaymentRepository repository;
    private final PaymentGatewayClient gatewayClient;

    public PaymentService(
            MeterRegistry meterRegistry,
            Tracer tracer,
            PaymentRepository repository,
            PaymentGatewayClient gatewayClient) {
        this.meterRegistry = meterRegistry;
        this.tracer = tracer;
        this.repository = repository;
        this.gatewayClient = gatewayClient;
    }

    public Payment createPayment(PaymentRequest request) {
        // Add business context to MDC for all logs in this operation
        // (add customerId to the encoder's includeMdcKeyName list to emit it)
        MDC.put("customerId", request.customerId());

        Timer.Sample timerSample = Timer.start(meterRegistry);
        Span span = tracer.nextSpan().name("payment.create");

        try (Tracer.SpanInScope ws = tracer.withSpan(span.start())) {
            span.tag("customer.id", request.customerId());
            span.tag("payment.amount", String.valueOf(request.amount()));

            log.info("Creating payment: amount={} {}",
                    request.amount(), request.currency());

            Payment payment = buildPayment(request);
            payment = repository.save(payment);

            log.info("Calling payment gateway for paymentId={}", payment.getId());
            GatewayResult result = gatewayClient.process(payment);
            payment.setStatus(result.status());
            payment = repository.save(payment);

            // Record metrics; tag keys match the failure counter below
            meterRegistry.counter("payments.created",
                    "currency", request.currency(),
                    "status", "success",
                    "error", "none"
            ).increment();

            timerSample.stop(meterRegistry.timer("payments.processing.time",
                    "outcome", "success"
            ));

            log.info("Payment created: paymentId={} status={}",
                    payment.getId(), payment.getStatus());

            span.tag("payment.id", payment.getId());
            return payment;

        } catch (Exception ex) {
            log.error("Payment failed: error={}", ex.getMessage(), ex);

            meterRegistry.counter("payments.created",
                    "currency", request.currency(),
                    "status", "failure",
                    "error", ex.getClass().getSimpleName()
            ).increment();

            timerSample.stop(meterRegistry.timer("payments.processing.time",
                    "outcome", "failure"
            ));

            span.error(ex);
            throw ex;

        } finally {
            span.end();
            MDC.remove("customerId");
        }
    }

    private Payment buildPayment(PaymentRequest request) {
        // Build payment entity
        return new Payment();
    }
}

What this demonstrates:

  • Structured logging with MDC context
  • Correlation ID automatically included (from filter)
  • Custom metrics for business events
  • Custom tracing span with business attributes
  • Error handling recorded in all three pillars
  • Proper cleanup of MDC and spans

Production Checklist

Before deploying, verify:

Logging:

  • JSON format enabled (LogstashEncoder)
  • Correlation ID filter configured
  • MDC cleanup in finally blocks
  • No sensitive data logged (PII, passwords, tokens)
  • Async appender for high-throughput services

Metrics:

  • Prometheus endpoint exposed (/actuator/prometheus)
  • Custom metrics for business events
  • Histogram percentiles configured
  • Appropriate tags (no high-cardinality values)

Tracing:

  • Sampling rate configured (5-10% for production)
  • OpenTelemetry exporter endpoint configured
  • Custom spans for critical business operations
  • Trace ID/Span ID included in logs

Health Checks:

  • Custom health indicators for all dependencies
  • Liveness/readiness separation for Kubernetes
  • Health details protected (when_authorized)

Summary

Spring Boot provides excellent observability support through Actuator and Micrometer. This guide covered:

  • Actuator endpoints for health, metrics, and Prometheus-format scraping
  • Logback configuration for JSON structured logging
  • Correlation ID filter for request tracing across services
  • Micrometer metrics for business and technical measurements
  • OpenTelemetry tracing for distributed request visualization
  • Health indicators for dependency monitoring
