Microservices Architecture Patterns

Microservices architecture structures an application as a collection of loosely coupled, independently deployable services. Each service encapsulates a specific business capability and can be developed, deployed, and scaled independently. This architectural style emerged to address the limitations of monolithic applications, particularly in terms of scalability, deployment flexibility, and organizational alignment.

The fundamental principle is that each microservice owns its domain logic and data, communicating with other services through well-defined interfaces. This independence enables teams to make technology choices appropriate to their service's needs, deploy changes without coordinating with other teams, and scale individual components based on demand.

Service Decomposition Strategies

Decomposing a system into microservices requires careful analysis of business capabilities, domain boundaries, and operational constraints. The goal is to identify service boundaries that maximize cohesion within services while minimizing coupling between them.

Decomposition by Business Capability

Business capability decomposition aligns services with what the organization does, rather than how it does it. A business capability represents a stable aspect of the business that provides value. For example, in a banking context, capabilities include customer management, account management, payment processing, and transaction history.

This approach creates services that are resilient to changes in implementation details. When business processes change, the underlying capabilities often remain stable, meaning service boundaries don't need to shift. Each service team can develop deep domain expertise in their capability area.

The diagram below visualizes how different business capabilities in a banking application map to independent microservices. Each service owns a specific business capability and communicates with others through well-defined interfaces. Notice how services are organized around business functions (what the system does) rather than technical concerns (how it's built).

This service decomposition matters because it aligns technical architecture with business organization. When the payment processing team needs to implement a new payment method, they modify only the Payment Service without coordinating changes across multiple services. The boundaries between services reflect natural seams in the business domain, making the system easier to understand and evolve as business requirements change.

Identify business capabilities by analyzing the organization's value streams and business model. Work with domain experts to map out what the business does at a high level, ignoring implementation details. Each capability should represent a coherent area of business functionality that could be understood and managed as a unit.

Decomposition by Subdomain (Domain-Driven Design)

Domain-Driven Design (DDD) provides a structured approach to service decomposition through bounded contexts and subdomains. A subdomain represents a specific area of the overall business domain. Bounded contexts define explicit boundaries within which a particular domain model applies.

Subdomains are categorized as core (key differentiators), supporting (necessary but not differentiating), or generic (common functionality). Core domains receive the most investment and attention, while generic domains might use off-the-shelf solutions.

// Core Domain: Payment Processing
@Service
public class PaymentProcessor {
    private final PaymentRepository paymentRepository;
    private final FraudDetectionService fraudDetectionService;
    private final AccountServiceClient accountServiceClient;

    /**
     * Process a payment with fraud checks and account validation.
     * This represents core business logic unique to the organization.
     */
    public PaymentResult processPayment(PaymentRequest request) {
        // Fraud detection is part of core domain logic
        FraudAssessment assessment = fraudDetectionService.assessRisk(request);
        if (assessment.isHighRisk()) {
            return PaymentResult.rejected("Fraud risk too high");
        }

        // Account validation via inter-service call
        AccountBalance balance = accountServiceClient.getBalance(request.getAccountId());
        if (balance.isInsufficient(request.getAmount())) {
            return PaymentResult.rejected("Insufficient funds");
        }

        // Process the payment
        Payment payment = new Payment(request);
        payment.markAsProcessed();
        paymentRepository.save(payment);

        return PaymentResult.success(payment.getId());
    }
}

Bounded contexts help manage complexity by defining clear boundaries where a particular model is valid. The term "payment" might mean different things in the payment processing context versus the accounting context. Each bounded context can use the same term with different meanings, as long as the boundaries are explicit.

When mapping bounded contexts to services, aim for one service per bounded context where practical. This ensures each service has a cohesive model without conflicting definitions. Services communicate across bounded context boundaries through anti-corruption layers that translate between different models.
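As a concrete sketch of an anti-corruption layer, the translator below maps a hypothetical external payment provider's representation into the local bounded context's model. All type names, field names, and state strings here are illustrative assumptions, not any real provider's API:

```java
import java.math.BigDecimal;

// Hypothetical external model as returned by a third-party payment provider
record ExternalPayment(String ref, long amountMinorUnits, String currencyCode, String state) {}

// Local bounded-context model with its own vocabulary
enum PaymentStatus { PENDING, COMPLETED, FAILED }

record LocalPayment(String paymentId, BigDecimal amount, String currency, PaymentStatus status) {}

// Anti-corruption layer: translates the external model into local terms so the
// provider's vocabulary never leaks into the payment bounded context
class PaymentTranslator {
    LocalPayment toLocal(ExternalPayment ext) {
        PaymentStatus status = switch (ext.state()) {
            case "SETTLED" -> PaymentStatus.COMPLETED;
            case "DECLINED" -> PaymentStatus.FAILED;
            default -> PaymentStatus.PENDING;
        };
        // The provider reports amounts in minor units (e.g. cents); convert locally
        return new LocalPayment(ext.ref(), BigDecimal.valueOf(ext.amountMinorUnits(), 2),
                ext.currencyCode(), status);
    }
}
```

Because all translation lives in one place, a change in the provider's model touches only the translator, not the domain code behind it.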

Decomposition by Transaction Boundaries

Some operations require strong consistency guarantees that are difficult to maintain across service boundaries. Identifying transaction boundaries helps determine which functionality must remain together in a single service to maintain ACID properties.

Consider an account transfer operation: debiting one account and crediting another must happen atomically. If these accounts are managed by different services, you need distributed transaction coordination or compensation mechanisms (see Saga Pattern below).

// Keep strongly consistent operations within a service boundary
@Service
public class AccountService {
    private final AccountRepository accountRepository;

    @Transactional
    public TransferResult transferBetweenAccounts(
            String fromAccountId,
            String toAccountId,
            Money amount) {

        Account fromAccount = accountRepository.findById(fromAccountId)
            .orElseThrow(() -> new AccountNotFoundException(fromAccountId));
        Account toAccount = accountRepository.findById(toAccountId)
            .orElseThrow(() -> new AccountNotFoundException(toAccountId));

        // Both operations happen in the same transaction - atomic
        fromAccount.debit(amount);
        toAccount.credit(amount);

        accountRepository.saveAll(List.of(fromAccount, toAccount));

        return TransferResult.success();
    }
}

Transaction boundaries often align with aggregate boundaries in DDD. An aggregate is a cluster of domain objects that can be treated as a single unit for data changes. Transactions should not span multiple aggregates, and therefore should not span multiple services.

When strong consistency is required across service boundaries, consider whether those boundaries are correct. If operations frequently require distributed transactions, it may indicate the services are too fine-grained or boundaries are drawn in the wrong places.

Inter-Service Communication

Microservices must communicate to fulfill business operations. The choice between synchronous and asynchronous communication, and the protocols used, significantly impacts system characteristics like latency, coupling, and resilience.

The following diagram illustrates the two primary communication patterns in microservice architectures. Synchronous communication (REST/gRPC) creates tight temporal coupling where both services must be available simultaneously, while asynchronous messaging decouples services in time, allowing them to operate independently and handle varying load patterns.

This communication pattern choice fundamentally affects system behavior. Synchronous calls provide immediate feedback but create failure cascades - if Account Service is down, Payment Service cannot complete requests. Asynchronous messaging enables resilience through temporal decoupling - Payment Service publishes events and continues operating even if Notification Service is unavailable. The trade-off is eventual consistency: notifications may be delayed, but the system remains operational.

Synchronous Communication: REST and gRPC

Synchronous communication creates request-response interactions where the caller waits for a response. This model is intuitive and appropriate when immediate responses are needed.

REST over HTTP provides a widely-understood, language-agnostic communication mechanism. It leverages standard HTTP methods (GET, POST, PUT, DELETE) and status codes. REST APIs are discoverable through standard documentation formats like OpenAPI, and can be tested with common tools. See API Design Guidelines for REST best practices.

// REST client using Spring's RestClient (Spring Framework 6.1+)
@Service
public class AccountServiceClient {
    private final RestClient restClient;

    public AccountServiceClient(RestClient.Builder builder) {
        this.restClient = builder
            .baseUrl("http://account-service")
            .build();
    }

    /**
     * Synchronously fetch the account balance from the Account Service.
     * The caller blocks until a response is received or a timeout occurs.
     */
    public AccountBalance getBalance(String accountId) {
        return restClient.get()
            .uri("/accounts/{id}/balance", accountId)
            .retrieve()
            .body(AccountBalance.class);
    }
}

gRPC uses Protocol Buffers for efficient binary serialization and HTTP/2 for transport. It provides strongly-typed contracts, built-in code generation, and better performance than JSON over HTTP. gRPC is particularly suitable for internal service-to-service communication where both ends are under your control.

// account_service.proto - strongly typed contract
syntax = "proto3";

package com.example.account;

service AccountService {
  rpc GetBalance (GetBalanceRequest) returns (GetBalanceResponse);
  rpc CreateAccount (CreateAccountRequest) returns (CreateAccountResponse);
}

message GetBalanceRequest {
  string account_id = 1;
}

message GetBalanceResponse {
  string account_id = 1;
  int64 balance_cents = 2;
  string currency = 3;
}

The primary drawback of synchronous communication is temporal coupling: both services must be available simultaneously for the interaction to succeed. If the downstream service is unavailable, the request fails immediately. This creates cascading failures where one service's unavailability affects all its callers. Implement circuit breakers and fallbacks to mitigate this (see Resilience Patterns).

Asynchronous Communication: Messaging

Asynchronous messaging decouples services in time - the sender doesn't wait for the receiver to process the message. This improves resilience because services can operate independently, and provides natural load leveling as messages queue when consumers are slow.

Message Queues (like RabbitMQ, AWS SQS) provide point-to-point communication. A message is delivered to exactly one consumer. This is suitable for task distribution and work queuing.

// Publishing to a queue (Spring AMQP with RabbitMQ)
@Service
public class PaymentEventPublisher {
    private final RabbitTemplate rabbitTemplate;

    public void publishPaymentProcessed(Payment payment) {
        PaymentProcessedEvent event = new PaymentProcessedEvent(
            payment.getId(),
            payment.getAmount(),
            payment.getTimestamp()
        );

        // Fire and forget - publisher doesn't wait for processing
        rabbitTemplate.convertAndSend(
            "payment.exchange",
            "payment.processed",
            event
        );
    }
}

// Consuming from a queue
@Component
public class PaymentEventListener {
    private final TransactionService transactionService;

    @RabbitListener(queues = "transaction.payment-processed")
    public void handlePaymentProcessed(PaymentProcessedEvent event) {
        // Processed asynchronously - the payment service doesn't know when this executes
        transactionService.recordTransaction(event);
    }
}

Event Streaming platforms (like Apache Kafka, AWS Kinesis) provide pub-sub semantics with durability. Multiple consumers can independently read the same event stream. Events are retained for a configured period, enabling new consumers to replay historical events.

Kafka's partition-based architecture provides ordering guarantees within a partition and enables parallel processing across partitions. Choose partition keys carefully to ensure related events end up in the same partition.

// Publishing to Kafka
@Service
public class AccountEventPublisher {
    private final KafkaTemplate<String, AccountEvent> kafkaTemplate;

    public void publishAccountCreated(Account account) {
        AccountCreatedEvent event = new AccountCreatedEvent(
            account.getId(),
            account.getCustomerId(),
            account.getType()
        );

        // Use account ID as partition key for ordering guarantees
        kafkaTemplate.send("account-events", account.getId(), event);
    }
}

// Multiple services can consume the same events independently
@Service
public class AccountEventConsumer {

    @KafkaListener(topics = "account-events", groupId = "notification-service")
    public void handleAccountEvent(AccountEvent event) {
        // Notification service processes account events
        // Payment service might also consume the same events independently
    }
}
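The ordering guarantee relies on key-to-partition mapping being deterministic. As an illustration only (Kafka's real default partitioner hashes the key bytes with murmur2, not String.hashCode), the idea reduces to:

```java
// Illustrative sketch of deterministic key-to-partition mapping; Kafka's
// actual default partitioner uses a murmur2 hash of the serialized key
class PartitionSelector {
    private final int numPartitions;

    PartitionSelector(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    // The same key always yields the same partition, so all events for one
    // account land in one partition and stay mutually ordered
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}
```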

Asynchronous communication introduces complexity around message ordering, delivery guarantees, and eventual consistency. Messages might be delivered out of order, multiple times, or in rare cases not at all. Design consumers to be idempotent and handle duplicate messages gracefully.
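One common way to make a consumer idempotent is to track already-processed event IDs and skip duplicates. A minimal in-memory sketch (illustrative types; a real service would persist the seen-ID set in its database, in the same transaction as the side effect):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical event carrying a unique id assigned by the producer
record PaymentProcessedEvent(String eventId, String paymentId, long amountCents) {}

// Idempotent consumer: a redelivered event is detected by its id and skipped,
// so processing a message twice has the same effect as processing it once
class IdempotentTransactionRecorder {
    private final Set<String> processedEventIds = ConcurrentHashMap.newKeySet();
    private long recordedTotalCents = 0;

    void handle(PaymentProcessedEvent event) {
        if (!processedEventIds.add(event.eventId())) {
            return; // duplicate delivery - already handled
        }
        recordedTotalCents += event.amountCents();
    }

    long recordedTotalCents() { return recordedTotalCents; }
}
```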

Choosing Communication Patterns

Use synchronous REST/gRPC when:

  • You need immediate responses (user-facing operations)
  • The operation is naturally request-response (queries)
  • Simplicity is more important than resilience
  • The caller needs to know the operation succeeded before proceeding

Use asynchronous messaging when:

  • Operations can complete eventually without immediate confirmation
  • You need to decouple services for resilience
  • Multiple services need to react to the same event
  • Load leveling is beneficial (queuing during traffic spikes)

Many systems use hybrid approaches: synchronous for reads and critical user-facing operations, asynchronous for writes and cross-service notifications.

Data Management Patterns

Data management is one of the most challenging aspects of microservices. Each service should own its data to maintain independence, but business operations often require data from multiple services.

Database Per Service

The database-per-service pattern ensures each microservice has its own database schema, which no other service accesses directly. This provides loose coupling: services can change their data model without affecting others, and each service can choose the database technology best suited to its needs.

This independence comes at a cost: you cannot use database joins across services or rely on database-level referential integrity. Queries that previously joined tables now require multiple service calls or data denormalization.

// Without database-per-service: join tables directly
// SELECT c.name, a.balance
// FROM customers c JOIN accounts a ON c.id = a.customer_id

// With database-per-service: coordinate across services
@Service
public class CustomerAccountService {
    private final CustomerServiceClient customerClient;
    private final AccountServiceClient accountClient;

    public CustomerWithAccounts getCustomerWithAccounts(String customerId) {
        // Must make multiple service calls
        Customer customer = customerClient.getCustomer(customerId);
        List<Account> accounts = accountClient.getAccountsByCustomer(customerId);

        return new CustomerWithAccounts(customer, accounts);
    }
}

Consider denormalizing data across service boundaries when queries frequently need data from multiple services. For example, the transaction service might store customer names alongside transactions rather than fetching them on every query. Keep denormalized data synchronized by subscribing to events from the source service.
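A minimal sketch of that synchronization, assuming a hypothetical CustomerRenamedEvent published by the customer service (names and types are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical event published by the customer service when a name changes
record CustomerRenamedEvent(String customerId, String newName) {}

// The transaction service keeps a local copy of customer names so its queries
// don't need a synchronous call to the customer service
class CustomerNameCache {
    private final Map<String, String> namesByCustomerId = new HashMap<>();

    // The event subscription keeps the denormalized copy eventually consistent
    void onCustomerRenamed(CustomerRenamedEvent event) {
        namesByCustomerId.put(event.customerId(), event.newName());
    }

    String nameFor(String customerId) {
        return namesByCustomerId.getOrDefault(customerId, "unknown");
    }
}
```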

Saga Pattern for Distributed Transactions

Sagas manage distributed transactions across multiple services without requiring traditional two-phase commit. A saga is a sequence of local transactions where each service performs its transaction and publishes an event. If any step fails, compensating transactions undo previous steps.

Choreography-based sagas have each service listen for events and decide what to do next. There's no central coordinator - the saga emerges from services reacting to events.

// Payment Service - initiates the saga
@Service
public class PaymentService {
    private final PaymentRepository paymentRepository;
    private final EventPublisher eventPublisher;

    public Payment initiatePayment(PaymentRequest request) {
        Payment payment = new Payment(request);
        payment.setStatus(PaymentStatus.PENDING);
        paymentRepository.save(payment);

        // Publish event to continue saga
        eventPublisher.publish(new PaymentCreatedEvent(
            payment.getId(),
            payment.getAccountId(),
            payment.getAmount()
        ));

        return payment;
    }

    // Listen for success or failure from downstream services
    @EventHandler
    public void handleAccountDebited(AccountDebitedEvent event) {
        Payment payment = paymentRepository.findById(event.getPaymentId())
            .orElseThrow();
        payment.setStatus(PaymentStatus.COMPLETED);
        paymentRepository.save(payment);
    }

    @EventHandler
    public void handleDebitFailed(DebitFailedEvent event) {
        // Compensating transaction - mark payment as failed
        Payment payment = paymentRepository.findById(event.getPaymentId())
            .orElseThrow();
        payment.setStatus(PaymentStatus.FAILED);
        payment.setFailureReason(event.getReason());
        paymentRepository.save(payment);
    }
}

Choreography is simple and decentralized but can be hard to understand and debug. The flow isn't explicit anywhere - you must trace through multiple services to understand the complete saga.

Orchestration-based sagas use a central orchestrator that explicitly manages the saga flow. The orchestrator tells each service what to do and handles compensations on failure.

// Saga orchestrator coordinates the entire flow
@Service
public class PaymentSagaOrchestrator {
    private final PaymentService paymentService;
    private final AccountService accountService;
    private final NotificationService notificationService;
    private final SagaRepository sagaRepository;

    public void executePaymentSaga(PaymentRequest request) {
        SagaInstance saga = new SagaInstance(request);
        sagaRepository.save(saga);

        try {
            // Step 1: Create payment
            Payment payment = paymentService.createPayment(request);
            saga.recordStep("payment-created", payment.getId());

            // Step 2: Debit account
            accountService.debitAccount(request.getAccountId(), request.getAmount());
            saga.recordStep("account-debited");

            // Step 3: Send notification
            notificationService.sendPaymentConfirmation(payment.getId());
            saga.markCompleted();

        } catch (InsufficientFundsException e) {
            // Compensate: reverse payment
            saga.markFailed();
            paymentService.cancelPayment(saga.getPaymentId());

        } catch (Exception e) {
            // Compensate: reverse account debit and payment
            saga.markFailed();
            accountService.creditAccount(request.getAccountId(), request.getAmount());
            paymentService.cancelPayment(saga.getPaymentId());
        }

        sagaRepository.save(saga);
    }
}

Orchestration makes the flow explicit and easier to understand, but creates a central component that must be highly available. The orchestrator becomes a potential bottleneck and single point of failure.

Sagas provide eventual consistency - at any point in time, some services might have completed their steps while others haven't. Design systems to handle intermediate states gracefully. For example, a payment might be marked "processing" while the saga executes.

Event Sourcing

Event sourcing stores state changes as a sequence of events rather than storing current state. The current state is derived by replaying events from the beginning. This provides a complete audit trail and enables temporal queries.

// Traditional state storage
public class Account {
    private String id;
    private BigDecimal balance; // Current state only

    public void debit(BigDecimal amount) {
        this.balance = this.balance.subtract(amount);
        // Previous state lost
    }
}

// Event sourcing - store events
public class AccountEventStore {
    private final EventStore eventStore;

    public AccountEventStore(EventStore eventStore) {
        this.eventStore = eventStore;
    }

    public void debit(String accountId, BigDecimal amount) {
        AccountDebitedEvent event = new AccountDebitedEvent(
            accountId,
            amount,
            Instant.now()
        );

        // Store the event
        eventStore.append(accountId, event);
    }

    public BigDecimal getCurrentBalance(String accountId) {
        // Rebuild state by replaying all events
        List<AccountEvent> events = eventStore.getEvents(accountId);
        BigDecimal balance = BigDecimal.ZERO;

        for (AccountEvent event : events) {
            if (event instanceof AccountCreditedEvent credited) {
                balance = balance.add(credited.getAmount());
            } else if (event instanceof AccountDebitedEvent debited) {
                balance = balance.subtract(debited.getAmount());
            }
        }

        return balance;
    }
}

Event sourcing pairs well with CQRS (Command Query Responsibility Segregation) where write and read models are separate. Commands append events to the event store. Read models are projections built by consuming events, optimized for specific queries.

// Write model - appends events
@Service
public class AccountCommandService {
    private final EventStore eventStore;

    public void creditAccount(String accountId, BigDecimal amount) {
        AccountCreditedEvent event = new AccountCreditedEvent(accountId, amount);
        eventStore.append(accountId, event);
    }
}

// Read model - optimized projection for queries
@Service
public class AccountQueryService {
    private final AccountBalanceRepository balanceRepository;

    // Updated by consuming events from event store
    @EventHandler
    public void handleAccountCredited(AccountCreditedEvent event) {
        AccountBalance balance = balanceRepository.findById(event.getAccountId())
            .orElse(new AccountBalance(event.getAccountId(), BigDecimal.ZERO));
        balance.add(event.getAmount());
        balanceRepository.save(balance);
    }

    // Fast query from denormalized read model
    public BigDecimal getBalance(String accountId) {
        return balanceRepository.findById(accountId)
            .map(AccountBalance::getAmount)
            .orElse(BigDecimal.ZERO);
    }
}

Event sourcing increases complexity: replaying thousands of events to rebuild state is slow, requiring snapshots for performance. Events become your API contract - they cannot be changed without migration strategies. However, it provides powerful capabilities for auditing, debugging (replay events to reproduce issues), and temporal queries (what was the state at a specific time?).
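A sketch of the snapshot optimization: persist the state as of some event sequence number, then rebuild the current balance from that snapshot plus only the events appended after it. Types here are simplified stand-ins for the event classes above:

```java
import java.math.BigDecimal;
import java.util.List;

// Simplified stand-ins for the account event types
sealed interface AccountEvent permits Credited, Debited {}
record Credited(BigDecimal amount) implements AccountEvent {}
record Debited(BigDecimal amount) implements AccountEvent {}

// Snapshot: the balance as of a known event sequence number
record Snapshot(long lastEventSeq, BigDecimal balance) {}

class SnapshottingReplayer {
    // Replay starts from the snapshot instead of from event zero,
    // so cost is proportional to events since the snapshot
    BigDecimal replay(Snapshot snapshot, List<AccountEvent> eventsAfterSnapshot) {
        BigDecimal balance = snapshot.balance();
        for (AccountEvent event : eventsAfterSnapshot) {
            if (event instanceof Credited c) {
                balance = balance.add(c.amount());
            } else if (event instanceof Debited d) {
                balance = balance.subtract(d.amount());
            }
        }
        return balance;
    }
}
```

Snapshots are a pure optimization: the event log remains the source of truth, and a snapshot can always be rebuilt from it.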

CQRS (Command Query Responsibility Segregation)

CQRS separates read and write operations into different models. Commands change state but don't return data. Queries return data but don't change state. This separation enables optimizing each model independently.

The write model focuses on business invariants and validation. The read model focuses on query performance and denormalization. This is particularly valuable when read and write patterns differ significantly - for example, complex writes but simple reads, or vice versa.

// Command side - enforces business rules
@Service
public class AccountCommandService {
    private final AccountRepository accountRepository;
    private final EventPublisher eventPublisher;

    public void openAccount(OpenAccountCommand command) {
        // Validate business rules
        if (command.getInitialDeposit().compareTo(MINIMUM_DEPOSIT) < 0) {
            throw new InsufficientDepositException();
        }

        Account account = new Account(command.getCustomerId(), command.getInitialDeposit());
        accountRepository.save(account);

        // Publish event for read model to consume
        eventPublisher.publish(new AccountOpenedEvent(
            account.getId(),
            account.getCustomerId(),
            account.getBalance()
        ));
    }
}

// Query side - optimized for reads
@Service
public class AccountQueryService {
    private final AccountReadModelRepository readModelRepository;

    // Denormalized read model including customer name
    public AccountSummary getAccountSummary(String accountId) {
        return readModelRepository.findById(accountId)
            .orElseThrow();
    }

    // Read model updated by consuming events
    @EventHandler
    public void handleAccountOpened(AccountOpenedEvent event) {
        // Could join with customer service data, cache, denormalize
        AccountSummary summary = new AccountSummary(
            event.getAccountId(),
            event.getCustomerId(),
            event.getBalance()
        );
        readModelRepository.save(summary);
    }
}

CQRS introduces eventual consistency between command and query sides. After executing a command, the read model might not immediately reflect the change. This is acceptable in many domains but requires careful UX design - for example, showing "Processing..." states or redirecting to confirmation pages rather than immediately refreshing lists.

Service Discovery and Load Balancing

In dynamic environments where service instances scale up and down, discovering service locations and distributing load becomes essential.

Client-side discovery has clients query a service registry (like Consul, Eureka) to find available instances, then choose one to call. This gives clients full control over load balancing but requires registry integration in every client.

// Using Spring Cloud LoadBalancer with Eureka
@Configuration
public class ServiceConfiguration {

    @Bean
    @LoadBalanced // Enable client-side load balancing
    public RestClient.Builder restClientBuilder() {
        return RestClient.builder();
    }
}

@Service
public class AccountServiceClient {
    private final RestClient restClient;

    public AccountServiceClient(RestClient.Builder builder) {
        // Service name instead of host:port
        this.restClient = builder.baseUrl("http://account-service").build();
    }

    public Account getAccount(String id) {
        // Load balancer picks an instance from service registry
        return restClient.get()
            .uri("/accounts/{id}", id)
            .retrieve()
            .body(Account.class);
    }
}

Server-side discovery uses a load balancer or API gateway that queries the registry and forwards requests. Clients call the load balancer, which handles instance selection. This simplifies clients but adds a network hop and potential bottleneck.

In Kubernetes environments, service discovery is built-in. Services are accessed by name, and kube-proxy handles routing to available pods. This leverages the platform's native capabilities rather than introducing separate service registry infrastructure.
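For example, a minimal Service manifest (resource names assumed) lets other pods reach the account service at http://account-service via cluster DNS:

```yaml
# Hypothetical Service for the account service; kube-proxy balances
# connections across all pods matching the selector
apiVersion: v1
kind: Service
metadata:
  name: account-service
spec:
  selector:
    app: account-service   # routes to pods carrying this label
  ports:
    - port: 80             # port clients connect to
      targetPort: 8080     # container port on each pod
```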

API Gateway Pattern

An API gateway provides a single entry point for clients, routing requests to appropriate backend services. It handles cross-cutting concerns like authentication, rate limiting, and request routing.

The API Gateway pattern centralizes client interactions with microservices, providing a unified interface that shields clients from internal service complexity. Rather than mobile and web applications making direct calls to dozens of microservices, they interact with a single gateway that routes requests, aggregates responses, and handles authentication.

The gateway serves as a critical architectural component that simplifies client development and provides operational benefits. When you need to implement rate limiting to prevent API abuse, you configure it once at the gateway rather than in every microservice. When authentication requirements change, you update the gateway instead of coordinating changes across all services. This centralization reduces complexity but creates a potential single point of failure - if the gateway goes down, all client requests fail, making high availability critical.

API gateways implement the Backends for Frontends (BFF) pattern where different client types (mobile, web, partner APIs) have dedicated gateways optimized for their needs. Mobile apps might need smaller payloads and different data aggregations than web applications.

// Spring Cloud Gateway configuration
@Configuration
public class GatewayConfiguration {

    @Bean
    public RouteLocator customRouteLocator(RouteLocatorBuilder builder) {
        return builder.routes()
            // Route to customer service
            .route("customer-service", r -> r
                .path("/api/customers/**")
                .filters(f -> f
                    .rewritePath("/api/customers/(?<segment>.*)", "/customers/${segment}")
                    .addRequestHeader("X-Request-Source", "gateway"))
                .uri("lb://customer-service"))

            // Route to account service with rate limiting
            .route("account-service", r -> r
                .path("/api/accounts/**")
                .filters(f -> f
                    .requestRateLimiter(config -> config
                        .setRateLimiter(redisRateLimiter())))
                .uri("lb://account-service"))

            .build();
    }
}

Gateways can aggregate responses from multiple services, reducing client complexity and round trips. However, aggregation logic in the gateway can become complex and may require updates when backend services change.

// Aggregating data from multiple services in the gateway
@RestController
public class CustomerAggregationController {
    private final CustomerServiceClient customerClient;
    private final AccountServiceClient accountClient;

    @GetMapping("/api/customer-dashboard/{customerId}")
    public CustomerDashboard getDashboard(@PathVariable String customerId) {
        // Parallel calls to multiple services
        CompletableFuture<Customer> customerFuture =
            CompletableFuture.supplyAsync(() -> customerClient.getCustomer(customerId));
        CompletableFuture<List<Account>> accountsFuture =
            CompletableFuture.supplyAsync(() -> accountClient.getAccounts(customerId));

        // Wait for both and aggregate
        return new CustomerDashboard(
            customerFuture.join(),
            accountsFuture.join()
        );
    }
}

Be cautious about putting too much logic in the gateway. It should focus on routing, authentication, and protocol translation. Complex business logic belongs in services, not the gateway.

Circuit Breakers and Resilience

When services call other services, failures propagate. Circuit breakers prevent cascading failures by detecting when a downstream service is failing and temporarily blocking requests to it.

A circuit breaker has three states:

  • Closed: Normal operation, requests pass through
  • Open: Too many failures detected, requests immediately fail without calling downstream service
  • Half-Open: After a timeout, allows a few test requests to check if service has recovered

// Using Resilience4j circuit breaker
@Service
public class PaymentServiceClient {
    private final RestClient restClient;
    private final Queue<PaymentRequest> paymentQueue;

    @CircuitBreaker(name = "payment-service", fallbackMethod = "fallbackProcessPayment")
    public PaymentResult processPayment(PaymentRequest request) {
        return restClient.post()
            .uri("/payments")
            .body(request)
            .retrieve()
            .body(PaymentResult.class);
    }

    /**
     * Fallback method called when the circuit is open or the request fails.
     * Provides degraded functionality rather than complete failure.
     */
    private PaymentResult fallbackProcessPayment(PaymentRequest request, Exception e) {
        // Queue for later processing
        paymentQueue.add(request);
        return PaymentResult.queued("Payment queued for processing");
    }
}

Circuit breakers should be configured based on service characteristics. Fast-failing services might use shorter timeout windows, while services with variable response times need more lenient thresholds.

# Resilience4j configuration
resilience4j:
  circuitbreaker:
    instances:
      payment-service:
        sliding-window-size: 10
        failure-rate-threshold: 50 # Open circuit if 50% of requests fail
        wait-duration-in-open-state: 10s
        permitted-number-of-calls-in-half-open-state: 3
        slow-call-duration-threshold: 2s
        slow-call-rate-threshold: 50 # Consider slow calls as failures

Beyond circuit breakers, implement retry logic for transient failures, timeouts to prevent indefinite blocking, and bulkheads to isolate resources. See Spring Boot Resilience Patterns for detailed implementation guidance.
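As an illustration of retrying transient failures with exponential backoff (a sketch only; in a Spring service you would typically use Resilience4j's Retry or Spring Retry rather than hand-rolling this):

```java
import java.util.function.Supplier;

// Minimal retry-with-exponential-backoff sketch for transient failures
class Retry {
    static <T> T withRetries(Supplier<T> call, int maxAttempts, long initialBackoffMillis) {
        RuntimeException lastFailure = null;
        long backoff = initialBackoffMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                lastFailure = e; // remember the failure; retry if attempts remain
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(backoff); // back off before the next attempt
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw lastFailure;
                    }
                    backoff *= 2; // double the wait each time
                }
            }
        }
        throw lastFailure; // all attempts exhausted
    }
}
```

Cap the total number of attempts and combine retries with timeouts; unbounded retries against an already-struggling service only amplify the load.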

Distributed Tracing and Observability

Understanding request flows across multiple services requires distributed tracing. Each request is assigned a trace ID that propagates through all service calls, creating a complete picture of the request path.

Implement tracing with OpenTelemetry or similar standards. Traces contain spans - units of work with start time, duration, and metadata. Analyzing traces reveals performance bottlenecks, timeout sources, and error propagation paths.

// OpenTelemetry automatic instrumentation
// Most frameworks auto-instrument common operations
// Manual instrumentation for custom operations:

@Service
public class PaymentProcessor {
    private final Tracer tracer;

    public Payment processPayment(PaymentRequest request) {
        Span span = tracer.spanBuilder("process-payment")
            .setAttribute("payment.amount", request.getAmount().toString())
            .setAttribute("payment.customer_id", request.getCustomerId())
            .startSpan();

        try (Scope scope = span.makeCurrent()) {
            // Processing logic
            Payment payment = doProcess(request);
            span.setStatus(StatusCode.OK);
            return payment;

        } catch (Exception e) {
            span.setStatus(StatusCode.ERROR, e.getMessage());
            span.recordException(e);
            throw e;

        } finally {
            span.end();
        }
    }
}

Combine tracing with structured logging and metrics. Logs should include trace IDs for correlation. Metrics track aggregate statistics (request rate, error rate, latency percentiles). Together, these provide comprehensive observability into system behavior.
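The trace-ID correlation mentioned above can be sketched in plain Java: each log line carries the current trace ID so a log aggregator can join logs to spans. In a real service this is usually done through the logging framework's context mechanism (for example SLF4J's MDC) rather than manual formatting; the class here is purely illustrative.

```java
// Sketch of trace-ID correlated logging. A thread-local holds the trace ID
// assigned to the current request, and every formatted log line includes it.
public class TraceLogger {
    private static final ThreadLocal<String> currentTraceId = new ThreadLocal<>();

    public static void setTraceId(String traceId) {
        currentTraceId.set(traceId);
    }

    public static String format(String level, String message) {
        String traceId = currentTraceId.get();
        // "-" marks log lines emitted outside any traced request
        return String.format("traceId=%s level=%s msg=%s",
                traceId == null ? "-" : traceId, level, message);
    }
}
```

With this in place, searching the log store for a single trace ID returns every log line the request produced across that service, which can then be lined up against the spans in the trace.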

Deployment Patterns

Microservices enable deployment flexibility, but also introduce deployment complexity. Choose deployment strategies that balance risk, speed, and resource usage.

Blue-Green Deployment

Blue-green deployment maintains two identical production environments. The current version runs in "blue" while the new version deploys to "green." After testing green, traffic switches over. If issues arise, switching back to blue is instant.

This requires double the infrastructure capacity but provides instant rollback. Databases need special care: both environments can share a database only if schema changes are backward compatible, so that the old version keeps working against the new schema.
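The traffic switch is often implemented at the routing layer: a router points at whichever environment is live, and cutover is a one-line change. As one hedged sketch, a Kubernetes Service can select pods by a color label (names and labels here are illustrative):

```yaml
# The Service routes traffic to whichever deployment its selector matches.
apiVersion: v1
kind: Service
metadata:
  name: payment-service
spec:
  selector:
    app: payment-service
    color: blue   # change to "green" to cut over; change back to roll back
  ports:
    - port: 80
      targetPort: 8080
```

Because the blue and green deployments both stay running, flipping the selector back restores the previous version without a redeploy.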

Canary Deployment

Canary deployment gradually rolls out changes to a subset of users while monitoring for issues. Start with a small percentage (1-5%), then gradually increase if metrics look healthy.

// Feature flag controlling canary rollout
@Service
public class PaymentService {
    private final FeatureFlags featureFlags;
    private final PaymentProcessor v1Processor;
    private final PaymentProcessor v2Processor;

    public PaymentResult processPayment(PaymentRequest request) {
        // Route a small percentage of traffic to the new implementation
        if (featureFlags.isEnabled("payment-processor-v2", request.getCustomerId())) {
            return v2Processor.process(request);
        } else {
            return v1Processor.process(request);
        }
    }
}

Monitor error rates, latency, and business metrics for canary traffic compared to baseline. Automatically roll back if metrics degrade beyond thresholds.
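An automated rollback gate can be as simple as comparing the canary's error rate against the baseline and tripping when degradation exceeds a tolerance factor. The sketch below shows the decision logic only; the class name, thresholds, and the surrounding metric collection are assumptions, not part of any particular deployment tool.

```java
// Sketch of an automated canary gate: roll back when the canary's error
// rate exceeds the baseline's by more than a configured tolerance factor.
public class CanaryGate {
    public static boolean shouldRollback(double canaryErrorRate,
                                         double baselineErrorRate,
                                         double tolerance) {
        // A clean baseline means any canary error is a regression
        if (baselineErrorRate == 0.0) {
            return canaryErrorRate > 0.0;
        }
        return canaryErrorRate > baselineErrorRate * tolerance;
    }
}
```

For example, with a tolerance of 2.0, a canary error rate of 10% against a 2% baseline triggers rollback, while 3% does not. Real canary analysis tools apply the same idea across many metrics at once, with statistical tests rather than a single ratio.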

Feature Flags

Feature flags decouple deployment from release. Code ships to production with new features hidden behind flags. Enable features for specific users, gradually, or on demand.

This enables trunk-based development, where all changes merge continuously to the main branch. Long-lived feature branches are avoided, and incomplete features remain behind flags until they are ready.

// Using feature flags for gradual rollout
@Service
public class AccountService {
    private final FeatureFlagService featureFlags;

    public AccountBalance getBalance(String accountId, String userId) {
        if (featureFlags.isEnabledFor("real-time-balance", userId)) {
            return calculateRealTimeBalance(accountId);
        } else {
            return getCachedBalance(accountId);
        }
    }
}

Feature flags create technical debt if not cleaned up. Old flags accumulate, creating complexity and confusion. Establish processes for flag removal once features are fully rolled out. See Feature Flags and Toggles for comprehensive guidance.

Common Pitfalls

Distributed Monolith

A distributed monolith has microservices that are tightly coupled, deployed together, and share databases. This combines the complexity of distributed systems with the inflexibility of monoliths.

Warning signs:

  • Services frequently deploy together
  • Database shared across services
  • Changes in one service require changes in many others
  • Services communicate through shared database tables

Avoid this by respecting service boundaries, enforcing API contracts, and maintaining database-per-service. If services consistently need to deploy together, consider whether they should actually be one service.

Chatty Services

Excessive inter-service communication creates latency and fragility. A single user request triggering dozens of service calls indicates poor service boundaries or missing data aggregation.

// Chatty anti-pattern - N+1 problem across services
public List<CustomerWithBalance> getCustomersWithBalances() {
    List<Customer> customers = customerService.getAllCustomers(); // 1 call

    return customers.stream()
        .map(customer -> {
            // N additional calls - one per customer
            BigDecimal balance = accountService.getBalance(customer.getAccountId());
            return new CustomerWithBalance(customer, balance);
        })
        .toList();
}

// Better: batch operation
public List<CustomerWithBalance> getCustomersWithBalances() {
    List<Customer> customers = customerService.getAllCustomers();
    List<String> accountIds = customers.stream()
        .map(Customer::getAccountId)
        .toList();

    // Single batch call
    Map<String, BigDecimal> balances = accountService.getBalances(accountIds);

    return customers.stream()
        .map(customer -> new CustomerWithBalance(
            customer,
            balances.get(customer.getAccountId())))
        .toList();
}

Redesign APIs to support batch operations. Consider data aggregation at the gateway layer. Evaluate whether services have the right boundaries if they constantly need each other's data.

Ignoring Network Fallibility

The network will fail. Services will be unavailable. Requests will time out. Designs that assume perfect network reliability create fragile systems.

Implement timeouts on all external calls. Use circuit breakers to prevent cascading failures. Design for graceful degradation - provide reduced functionality rather than complete failure. Cache data where appropriate to serve stale content when services are unavailable.

// Network-aware design with timeouts and fallbacks
@Service
public class CustomerDashboardService {

    @Retry(maxAttempts = 3, backoff = @Backoff(delay = 100))
    @CircuitBreaker(name = "account-service")
    @TimeLimiter(name = "account-service")
    public CompletableFuture<List<Account>> getAccounts(String customerId) {
        return CompletableFuture.supplyAsync(() ->
            accountService.getAccounts(customerId));
    }

    public CustomerDashboard getDashboard(String customerId) {
        Customer customer = customerService.getCustomer(customerId);

        try {
            List<Account> accounts = getAccounts(customerId)
                .orTimeout(2, TimeUnit.SECONDS)
                .exceptionally(ex -> {
                    // Fallback to cached data
                    return accountCache.get(customerId);
                })
                .join();

            return new CustomerDashboard(customer, accounts);
        } catch (Exception e) {
            // Degrade gracefully
            return new CustomerDashboard(customer, emptyList());
        }
    }
}

Further Reading

  • Books:

    • "Building Microservices" by Sam Newman - Comprehensive guide to microservices architecture
    • "Monolith to Microservices" by Sam Newman - Migration strategies and patterns
    • "Microservices Patterns" by Chris Richardson - Catalog of proven patterns with examples
  • Domain-Driven Design:

    • "Domain-Driven Design" by Eric Evans - Foundational text on strategic and tactical DDD
    • "Implementing Domain-Driven Design" by Vaughn Vernon - Practical DDD implementation
  • Distributed Systems:

    • "Designing Data-Intensive Applications" by Martin Kleppmann - Deep dive into distributed data management
    • "Release It!" by Michael Nygard - Production-ready design patterns and resilience
  • Standards & Specifications: