AWS API Gateway

Overview

AWS API Gateway is a fully managed service that enables developers to create, publish, maintain, monitor, and secure APIs at any scale. It acts as a "front door" for applications to access data, business logic, or functionality from backend services such as Lambda functions, EC2 instances, or other AWS services.

API Gateway handles the heavy lifting of API management - request routing, authorization, rate limiting, caching, monitoring, and versioning - allowing teams to focus on building business logic rather than infrastructure. For serverless applications, API Gateway provides the HTTP interface that triggers Lambda functions, enabling event-driven architectures without managing servers.

Choosing the correct API Gateway type and configuration determines API performance, security, and cost-effectiveness. This guide covers the three API Gateway types (REST API, HTTP API, WebSocket API), authorization mechanisms, throttling strategies, caching patterns, and integration options. For general API design principles independent of API Gateway, see API Design.

The choice between API Gateway types involves trade-offs between features, cost, and performance. REST APIs provide the most features (request validation, transformation, caching, usage plans) but at higher cost and latency. HTTP APIs offer better performance and lower cost but fewer features. WebSocket APIs enable real-time bidirectional communication for use cases like chat applications and live dashboards.

Core Principles

Choose the right API type - HTTP API for simple proxies and cost optimization, REST API for advanced features, WebSocket API for real-time bidirectional communication
Secure all endpoints - Never expose public APIs without authentication; use Lambda authorizers, IAM, Cognito, or JWT validation
Implement throttling - Protect backend services from traffic spikes using throttling and usage plans
Enable caching strategically - Cache GET requests to reduce latency and backend load, but understand cache invalidation implications
Monitor and alert - Track 4XX/5XX errors, latency, and cache hit ratios to detect issues before they impact users

API Gateway Types Comparison

AWS offers three distinct API Gateway types, each designed for specific use cases. Understanding the capabilities and limitations of each helps select the appropriate type for your requirements.

REST API

REST API is the original API Gateway offering with the most comprehensive feature set. It provides request validation, request/response transformation using Velocity Template Language (VTL), API caching, usage plans with API keys, and extensive monitoring capabilities.

When to use REST API:

You need request validation before invoking backend services
You require request/response transformation using VTL templates
API caching is critical for performance
You need usage plans and API keys for partner integrations
You require AWS WAF integration for security
Private APIs accessible only within VPC are needed

REST API features:

Request validation against OpenAPI schemas
Request and response mapping using VTL
API Gateway-level caching (0.5GB to 237GB)
Usage plans and API keys for rate limiting per client
AWS WAF integration for DDoS protection and IP filtering
Private endpoints accessible only from VPC
Mock integrations for testing
Canary deployments for gradual rollouts

Cost: $3.50 per million API calls (first 333 million calls per month)

Latency: Higher than HTTP API due to additional processing (validation, transformation)

HTTP API

HTTP API is a newer, simplified offering optimized for building low-latency, cost-effective APIs. It provides core API Gateway functionality - routing, authorization, CORS - without advanced features like VTL transformation or caching.

When to use HTTP API:

You're building a simple proxy to Lambda or HTTP backends
Cost optimization is important (71% cheaper than REST API)
Low latency is critical (up to 60% lower than REST API)
JWT, IAM, or Lambda authorizers cover your authorization needs
You don't need request validation, transformation, or caching
You're building microservices with service-to-service communication

HTTP API features:

Native JWT validation (Cognito, Auth0, Okta, custom)
IAM authorization
Automatic deployments (no manual stage management)
Built-in CORS configuration
OpenID Connect and OAuth 2.0 integration
VPC Link for private integrations
Lambda and HTTP proxy integrations

Cost: $1.00 per million API calls (first 300 million calls per month)

Latency: Lower than REST API (AWS cites up to 60% lower latency; actual gains depend on the workload)

Limitations (compared to REST API):

No request/response transformation (VTL)
No API Gateway caching
No usage plans or API keys
No AWS WAF integration
No request validation
No private APIs (VPC endpoints)
No mock integrations

WebSocket API

WebSocket API enables building real-time, bidirectional communication applications where the server can push data to connected clients without polling. WebSocket connections persist, allowing low-latency message exchange ideal for chat applications, live dashboards, gaming, and collaborative editing.

When to use WebSocket API:

Real-time bidirectional communication (chat, notifications, live updates)
Broadcasting updates to multiple connected clients
Low-latency message delivery (avoiding HTTP polling overhead)
Persistent connections for streaming data
Collaborative applications (document editing, whiteboards)
Gaming and live sports applications
IoT device communication requiring server-initiated messages

WebSocket API features:

Persistent connections (up to 2 hours per connection; idle connections close after 10 minutes)
Server-to-client message push
Route selection based on message content
Connection management (connect, disconnect, default routes)
Lambda, HTTP, and AWS service integrations
IAM and Lambda authorizers for connection authentication

Cost: $1.00 per million messages + $0.25 per million connection minutes

Connection limits: 500 new connections per second, 120,000 concurrent connections per region

API Type Decision Matrix

Feature	REST API	HTTP API	WebSocket API
Cost (per million)	$3.50	$1.00	$1.00 + connection minutes
Latency	Higher	Lower (up to 60% per AWS)	Very low (persistent)
Authorization	Lambda, IAM, Cognito, API keys	JWT, IAM, Lambda	Lambda, IAM
Request Validation	Yes	No	No
Request/Response Transform	Yes (VTL)	No	No
Caching	Yes (0.5-237 GB)	No	No
Usage Plans & API Keys	Yes	No	No
AWS WAF	Yes	No	No
Private APIs (VPC)	Yes	No	No
CORS	Manual configuration	Built-in	N/A
WebSocket Support	No	No	Yes
Max Timeout	29 seconds (integration)	30 seconds (integration)	2 hours (connection), 10 minutes (idle)
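
To make the cost row concrete, the sketch below estimates monthly request charges from the first-tier prices quoted in this guide. It assumes every request falls in the first pricing tier and ignores data transfer, caching, and WebSocket connection-minute charges. The helper name `estimateMonthlyCost` and the 50-million-request workload are illustrative assumptions, not part of any AWS SDK.

```typescript
// First-tier request prices (USD per million requests) as quoted in this guide.
const PRICE_PER_MILLION = {
  rest: 3.5,
  http: 1.0,
} as const;

type PricedApiType = keyof typeof PRICE_PER_MILLION;

// Hypothetical helper: request charges only, first pricing tier only.
function estimateMonthlyCost(apiType: PricedApiType, requestsPerMonth: number): number {
  return (requestsPerMonth / 1_000_000) * PRICE_PER_MILLION[apiType];
}

const requestsPerMonth = 50_000_000; // illustrative workload
const restCost = estimateMonthlyCost('rest', requestsPerMonth);
const httpCost = estimateMonthlyCost('http', requestsPerMonth);
const savingsPercent = Math.round((1 - httpCost / restCost) * 100);

console.log(`REST: $${restCost}, HTTP: $${httpCost}, savings: ${savingsPercent}%`);
// prints "REST: $175, HTTP: $50, savings: 71%"
```

The 71% figure is simply the ratio of the two list prices, so it holds only while both APIs stay within their first pricing tier.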

Decision guidelines:

Choose REST API when you need advanced features that justify higher cost and latency: request validation, VTL transformation, caching, usage plans, or WAF integration. REST APIs suit partner integrations requiring API keys, complex transformation logic, or strict schema validation.

Choose HTTP API for straightforward proxy use cases prioritizing cost and performance. HTTP APIs work well for microservices, serverless backends, and internal APIs where JWT or IAM authentication suffices. Most new APIs should start with HTTP API unless specific REST API features are required.

Choose WebSocket API for real-time bidirectional communication. Use when clients need to receive server-initiated messages without polling, or when connection persistence reduces latency compared to repeated HTTP requests.
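
The guidelines above can be read as a small decision function. The following is a minimal sketch under that reading; the `ApiRequirements` shape and `recommendApiType` are hypothetical names introduced here, and a real decision would also weigh cost, latency targets, and team familiarity.

```typescript
// Hypothetical requirements checklist derived from the decision matrix.
interface ApiRequirements {
  serverPush: boolean;          // server must send messages without client polling
  requestValidation: boolean;
  vtlTransformation: boolean;
  caching: boolean;
  usagePlansOrApiKeys: boolean;
  waf: boolean;
  privateApi: boolean;
}

type GatewayType = 'REST' | 'HTTP' | 'WEBSOCKET';

function recommendApiType(req: ApiRequirements): GatewayType {
  // Real-time bidirectional communication is only offered by WebSocket API.
  if (req.serverPush) return 'WEBSOCKET';

  // Any REST-only feature forces REST API; otherwise start with HTTP API.
  const needsRestOnlyFeature =
    req.requestValidation ||
    req.vtlTransformation ||
    req.caching ||
    req.usagePlansOrApiKeys ||
    req.waf ||
    req.privateApi;

  return needsRestOnlyFeature ? 'REST' : 'HTTP';
}

const simpleProxy: ApiRequirements = {
  serverPush: false,
  requestValidation: false,
  vtlTransformation: false,
  caching: false,
  usagePlansOrApiKeys: false,
  waf: false,
  privateApi: false,
};

const internalProxyChoice = recommendApiType(simpleProxy);
const partnerApiChoice = recommendApiType({ ...simpleProxy, usagePlansOrApiKeys: true });
const chatChoice = recommendApiType({ ...simpleProxy, serverPush: true });

console.log(internalProxyChoice, partnerApiChoice, chatChoice);
```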

For API design principles applicable to all types, see API Design, API Contracts, and API Versioning.

REST API Deep Dive

REST API provides the most comprehensive feature set for building sophisticated API architectures. Understanding its capabilities enables leveraging advanced features while avoiding unnecessary complexity.

Request Validation

API Gateway can validate incoming requests against OpenAPI (Swagger) schemas before invoking backend integrations. This prevents invalid requests from reaching backend services, reducing compute costs and improving security.

# OpenAPI 3.0 schema with request validation
openapi: 3.0.0
info:
  title: Order API
  version: 1.0.0
paths:
  /orders:
    post:
      summary: Create order
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required:
                - customerId
                - items
              properties:
                customerId:
                  type: string
                  pattern: '^[A-Z0-9]{10}$'
                  description: Customer ID (10 alphanumeric characters)
                items:
                  type: array
                  minItems: 1
                  maxItems: 50
                  items:
                    type: object
                    required:
                      - productId
                      - quantity
                    properties:
                      productId:
                        type: string
                      quantity:
                        type: integer
                        minimum: 1
                        maximum: 100
                totalAmount:
                  type: number
                  minimum: 0.01
                  maximum: 100000
      responses:
        '201':
          description: Order created
        '400':
          description: Invalid request

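How validation works: With a request validator enabled on the method, API Gateway checks that required path parameters, query string parameters, and headers are present and non-blank, and validates the request body against the JSON Schema model derived from the OpenAPI definition. If validation fails, API Gateway returns a 400 Bad Request immediately without invoking the backend, so malformed requests never consume Lambda invocations.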

Validation benefits:

Cost reduction: Invalid requests rejected before Lambda invocation
Improved security: Rejects malformed input before it reaches backend code, reducing the attack surface (backends should still guard against injection)
Faster feedback: Clients receive immediate validation errors
Backend simplification: Backend code assumes valid input, reducing validation logic

Validation limitations: Schema validation is structural, not business logic. API Gateway validates that quantity is an integer between 1-100 but cannot validate that product IDs exist in your database. Business validation still occurs in backend code.
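
To illustrate the difference between structural and business validation, the sketch below re-implements the structural rules of the order schema in plain TypeScript. It is an approximation for illustration only: API Gateway performs this check itself, and `validateOrderStructure` is a hypothetical name. Note that an order referencing a product ID that does not exist still passes, because existence is a business rule.

```typescript
// Structural rules mirrored from the order schema in this section.
const CUSTOMER_ID_PATTERN = /^[A-Z0-9]{10}$/;

// Hypothetical helper: returns a list of structural violations (empty when valid).
function validateOrderStructure(body: unknown): string[] {
  if (typeof body !== 'object' || body === null) {
    return ['body must be a JSON object'];
  }
  const order = body as Record<string, unknown>;
  const errors: string[] = [];

  const customerId = order.customerId;
  if (typeof customerId !== 'string' || !CUSTOMER_ID_PATTERN.test(customerId)) {
    errors.push('customerId must match ^[A-Z0-9]{10}$');
  }

  const items = order.items;
  if (!Array.isArray(items) || items.length < 1 || items.length > 50) {
    errors.push('items must contain between 1 and 50 entries');
  } else {
    items.forEach((item: unknown, index: number) => {
      const entry = (typeof item === 'object' && item !== null ? item : {}) as Record<string, unknown>;
      if (typeof entry.productId !== 'string') {
        errors.push(`items[${index}].productId must be a string`);
      }
      const quantity = entry.quantity;
      if (typeof quantity !== 'number' || !Number.isInteger(quantity) || quantity < 1 || quantity > 100) {
        errors.push(`items[${index}].quantity must be an integer from 1 to 100`);
      }
    });
  }

  const total = order.totalAmount;
  if (total !== undefined && (typeof total !== 'number' || total < 0.01 || total > 100000)) {
    errors.push('totalAmount must be a number from 0.01 to 100000');
  }
  return errors;
}

// Structurally valid, even though the product may not exist in any catalog.
const structurallyValid = validateOrderStructure({
  customerId: 'CUST000001',
  items: [{ productId: 'DOES-NOT-EXIST', quantity: 2 }],
});

// Two structural violations: customerId pattern and quantity range.
const structurallyInvalid = validateOrderStructure({
  customerId: 'short',
  items: [{ productId: 'P-1', quantity: 0 }],
});

console.log(structurallyValid.length, structurallyInvalid.length); // prints "0 2"
```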

Request and Response Transformation

Velocity Template Language (VTL) enables transforming requests and responses without changing backend code. This allows adapting legacy backends to modern API contracts or simplifying client integrations.

## Transform incoming JSON to backend format
#set($inputRoot = $input.path('$'))
{
  "operation": "CREATE_ORDER",
  "payload": {
    "customer": {
      "id": "$inputRoot.customerId"
    },
    "orderLines": [
      #foreach($item in $inputRoot.items)
      {
        "sku": "$item.productId",
        "qty": $item.quantity,
        "price": $item.unitPrice
      }#if($foreach.hasNext),#end
      #end
    ],
    "total": $inputRoot.totalAmount,
    "timestamp": $context.requestTimeEpoch
  }
}

Common transformation use cases:

Adapting legacy systems: Transform modern REST API to legacy SOAP or RPC format
Header manipulation: Add authentication headers, remove internal headers
Data enrichment: Add request ID, timestamp, or client IP to requests
Response filtering: Remove sensitive fields from responses
Error normalization: Transform backend errors to consistent API error format

VTL context variables:

$input: Access request body and path parameters
$context: Request context (request ID, timestamp, source IP, stage)
$stageVariables: Environment-specific configuration values
$util: Utility functions (URL encoding, Base64, escaping)

## Response transformation - normalize error format
#set($inputRoot = $input.path('$'))
{
  "error": {
    "code": "$inputRoot.errorCode",
    "message": "$inputRoot.errorMessage",
    "requestId": "$context.requestId",
    "timestamp": "$context.requestTimeEpoch"
  }
}

VTL considerations: VTL adds complexity and makes APIs harder to test and debug. Use transformation sparingly - prefer making backends API-compatible over using VTL. HTTP API's lack of VTL encourages better backend design. Only use VTL when adapting existing systems that cannot change or for simple enrichment like adding request IDs.
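
For comparison, here is a minimal sketch of the same request mapping expressed in backend code instead of VTL. The field names follow the mapping template in this section (including `unitPrice`, which the template reads but the validation schema does not declare); `toBackendFormat` and the two interfaces are hypothetical names. A plain function like this can be unit tested directly, which is the main argument for keeping transformation out of the gateway.

```typescript
interface IncomingOrder {
  customerId: string;
  items: { productId: string; quantity: number; unitPrice: number }[];
  totalAmount: number;
}

interface BackendOrder {
  operation: 'CREATE_ORDER';
  payload: {
    customer: { id: string };
    orderLines: { sku: string; qty: number; price: number }[];
    total: number;
    timestamp: number;
  };
}

// Hypothetical equivalent of the VTL request template.
// requestTimeEpoch plays the role of $context.requestTimeEpoch.
function toBackendFormat(order: IncomingOrder, requestTimeEpoch: number): BackendOrder {
  return {
    operation: 'CREATE_ORDER',
    payload: {
      customer: { id: order.customerId },
      orderLines: order.items.map((item) => ({
        sku: item.productId,
        qty: item.quantity,
        price: item.unitPrice,
      })),
      total: order.totalAmount,
      timestamp: requestTimeEpoch,
    },
  };
}

const backendOrder = toBackendFormat(
  {
    customerId: 'CUST000001',
    items: [{ productId: 'SKU-1', quantity: 2, unitPrice: 9.99 }],
    totalAmount: 19.98,
  },
  1700000000000
);

console.log(JSON.stringify(backendOrder, null, 2));
```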

Integration Types

API Gateway supports multiple integration types for connecting APIs to backend services:

Lambda Proxy Integration (most common):

Passes entire request to Lambda as JSON object
Lambda has full access to headers, query params, body, request context
Lambda returns response with statusCode, headers, and body
Simplest integration type - no VTL mapping required
Recommended for most serverless APIs

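// Lambda handler for proxy integration
// (orderService is an application-level dependency, not shown here)
import type { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';
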
export const handler = async (event: APIGatewayProxyEvent):
    Promise<APIGatewayProxyResult> => {

  console.log('Request:', JSON.stringify(event, null, 2));

  try {
    // Access all request data
    const body = JSON.parse(event.body || '{}');
    const customerId = body.customerId;
    const authHeader = event.headers.Authorization;
    const queryParam = event.queryStringParameters?.filter;

    // Business logic
    const order = await orderService.createOrder({
      customerId,
      items: body.items,
      totalAmount: body.totalAmount
    });

    // Return full response
    return {
      statusCode: 201,
      headers: {
        'Content-Type': 'application/json',
        'X-Request-Id': event.requestContext.requestId
      },
      body: JSON.stringify({
        orderId: order.id,
        status: 'created',
        createdAt: order.createdAt
      })
    };

  } catch (error) {
    console.error('Error creating order:', error);

    return {
      statusCode: 500,
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        error: 'Internal server error',
        requestId: event.requestContext.requestId
      })
    };
  }
};

Lambda Integration (non-proxy):

Request transformed via VTL before Lambda invocation
Lambda receives only transformed data
Lambda returns data that's transformed via VTL to response
More complex - requires VTL mapping templates
Use when you need transformation or have existing Lambda functions expecting specific format

HTTP Proxy Integration:

Forwards request directly to HTTP backend
No transformation - direct pass-through
Backend receives original headers, query params, body
Use for proxying to existing HTTP APIs or microservices

HTTP Integration (non-proxy):

Transform request via VTL before HTTP call
Transform response via VTL before returning to client
Use when adapting HTTP backend to different API contract

AWS Service Integration:

Directly call AWS services (DynamoDB, S3, SQS, Kinesis) without Lambda
Reduces costs by eliminating Lambda invocations
Requires IAM execution role with service permissions
Use for simple operations (write to SQS, read from DynamoDB, upload to S3)

# AWS Service integration - direct SQS publish
x-amazon-apigateway-integration:
  type: aws
  uri: arn:aws:apigateway:us-east-1:sqs:path/123456789012/order-queue
  httpMethod: POST
  credentials: arn:aws:iam::123456789012:role/APIGatewayToSQS
  requestParameters:
    integration.request.header.Content-Type: "'application/x-www-form-urlencoded'"
  requestTemplates:
    application/json: |
      Action=SendMessage&MessageBody=$util.urlEncode($input.body)
  responses:
    default:
      statusCode: 200

Mock Integration:

Returns static response without calling any backend
Use for API prototyping, testing, or health check endpoints
Useful during development before backend implementation

Integration selection:

Lambda Proxy: Default choice for serverless APIs - simplest and most flexible
HTTP Proxy: Proxy to existing HTTP backends or microservices
AWS Service: Cost optimization for simple operations (SQS publish, DynamoDB put)
Non-proxy (Lambda/HTTP): Only when VTL transformation is required

For Lambda best practices and cold start optimization, see AWS Compute.

HTTP API Deep Dive

HTTP API offers a streamlined experience focused on performance and cost-effectiveness. Understanding its capabilities and limitations helps determine when it's appropriate.

Built-in JWT Authorization

HTTP API provides native JWT validation without Lambda authorizers. API Gateway validates JWT tokens from Cognito, Auth0, Okta, or custom identity providers, rejecting invalid requests before invoking backends.

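// Lambda handler for HTTP API with JWT
import type {
  APIGatewayProxyEventV2WithJWTAuthorizer,
  APIGatewayProxyResultV2
} from 'aws-lambda';

export const handler = async (event: APIGatewayProxyEventV2WithJWTAuthorizer):
    Promise<APIGatewayProxyResultV2> => {

  // JWT already validated by API Gateway
  // Claims available in event.requestContext.authorizer.jwt.claims
  const claims = event.requestContext.authorizer.jwt.claims;
  const userId = claims.sub;
  const email = claims.email;
  // Claim values are not always strings, so guard before splitting
  const scopes = typeof claims.scope === 'string' ? claims.scope.split(' ') : [];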

  // Check required scope
  if (!scopes.includes('orders:write')) {
    return {
      statusCode: 403,
      body: JSON.stringify({ error: 'Insufficient permissions' })
    };
  }

  // Business logic with authenticated user context
  const order = await orderService.createOrder({
    userId,
    email,
    items: JSON.parse(event.body || '{}').items
  });

  return {
    statusCode: 201,
    body: JSON.stringify(order)
  };
};

JWT configuration (via CloudFormation):

OrderAPI:
  Type: AWS::ApiGatewayV2::Api
  Properties:
    Name: OrderAPI
    ProtocolType: HTTP

# JWT authorizer for Cognito
CognitoAuthorizer:
  Type: AWS::ApiGatewayV2::Authorizer
  Properties:
    ApiId: !Ref OrderAPI
    AuthorizerType: JWT
    IdentitySource:
      - $request.header.Authorization
    JwtConfiguration:
      Audience:
        - 1a2b3c4d5e6f7g8h9i0j
      Issuer: https://cognito-idp.us-east-1.amazonaws.com/us-east-1_AbCdEfGhI

# Route with authorization
CreateOrderRoute:
  Type: AWS::ApiGatewayV2::Route
  Properties:
    ApiId: !Ref OrderAPI
    RouteKey: POST /orders
    AuthorizationType: JWT
    AuthorizerId: !Ref CognitoAuthorizer
    Target: !Sub integrations/${CreateOrderIntegration}

JWT validation process:

Client includes JWT in Authorization header: Bearer eyJhbGc...
API Gateway downloads JWKS (JSON Web Key Set) from issuer
API Gateway validates JWT signature, expiration, and audience
If valid, request forwarded to backend with claims in context
If invalid, API Gateway returns 401 Unauthorized immediately
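
The claim checks in this process can be sketched as follows. This is a simplified illustration, not API Gateway's implementation: signature verification against the JWKS is deliberately omitted, only the `iss`, `aud`, and `exp` claims are examined, and `checkClaims` is a hypothetical name. The issuer and audience values are the placeholders from the CloudFormation example.

```typescript
interface JwtClaims {
  iss: string;
  aud: string | string[];
  exp: number; // expiry, in seconds since the Unix epoch
}

interface JwtAuthorizerConfig {
  issuer: string;
  audience: string[];
}

interface ClaimCheck {
  valid: boolean;
  reason?: string;
}

// Hypothetical sketch of the claim checks only (no signature verification).
function checkClaims(
  claims: JwtClaims,
  config: JwtAuthorizerConfig,
  nowEpochSeconds: number
): ClaimCheck {
  if (claims.iss !== config.issuer) {
    return { valid: false, reason: 'issuer mismatch' };
  }
  const tokenAudiences = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
  if (!tokenAudiences.some((aud) => config.audience.includes(aud))) {
    return { valid: false, reason: 'audience mismatch' };
  }
  if (claims.exp <= nowEpochSeconds) {
    return { valid: false, reason: 'token expired' };
  }
  return { valid: true };
}

const authorizerConfig: JwtAuthorizerConfig = {
  issuer: 'https://cognito-idp.us-east-1.amazonaws.com/us-east-1_AbCdEfGhI',
  audience: ['1a2b3c4d5e6f7g8h9i0j'],
};

const now = 1_700_000_000;
const goodToken: JwtClaims = {
  iss: authorizerConfig.issuer,
  aud: '1a2b3c4d5e6f7g8h9i0j',
  exp: now + 3600,
};

const accepted = checkClaims(goodToken, authorizerConfig, now);
const expired = checkClaims({ ...goodToken, exp: now - 1 }, authorizerConfig, now);
const wrongAudience = checkClaims({ ...goodToken, aud: 'other-client' }, authorizerConfig, now);

console.log(accepted, expired, wrongAudience);
```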

Benefits over Lambda authorizers:

No cold starts: No Lambda invocation for authorization
Lower latency: Built-in validation faster than Lambda authorizer
Lower cost: No Lambda charges for authorization
Simplified code: No custom authorizer logic to maintain

Limitations:

Cannot add custom logic during authorization (role lookups, premium checks)
Cannot modify request/response during authorization
Only JWT - no support for API keys, custom tokens, or other auth schemes

For custom authorization logic, use Lambda authorizers even with HTTP API (see Lambda Authorizers).

Automatic Deployments

Unlike REST API, which requires an explicit deployment for every change, HTTP API supports automatic deployments. With auto-deploy enabled on a stage, changes go live on that stage (for example, $default) as soon as they are saved.

REST API deployment (manual):

# REST API requires explicit deployment
aws apigateway create-deployment \
  --rest-api-id abc123 \
  --stage-name prod \
  --description "Deploy new order endpoint"

HTTP API deployment (automatic):

Changes deploy automatically on save
No manual deployment command needed
Default stage ($default) updated immediately
Optional custom stages for multi-environment support

Custom stages for HTTP API:

# Development stage
DevStage:
  Type: AWS::ApiGatewayV2::Stage
  Properties:
    ApiId: !Ref OrderAPI
    StageName: dev
    AutoDeploy: true
    DefaultRouteSettings:
      ThrottlingBurstLimit: 100
      ThrottlingRateLimit: 50

# Production stage
ProdStage:
  Type: AWS::ApiGatewayV2::Stage
  Properties:
    ApiId: !Ref OrderAPI
    StageName: prod
    AutoDeploy: true
    DefaultRouteSettings:
      ThrottlingBurstLimit: 5000
      ThrottlingRateLimit: 2000

Automatic deployment benefits:

Faster iteration during development
No manual deployment step in CI/CD pipelines
Consistent behavior across environments

Considerations: With auto-deploy enabled, every change is live as soon as it is saved. For production, consider a custom stage with AutoDeploy disabled, and promote changes through explicit deployments as part of a controlled release process.

WebSocket API

WebSocket API enables building real-time bidirectional communication where servers push messages to connected clients. Unlike HTTP APIs requiring clients to poll for updates, WebSocket maintains persistent connections allowing instant server-initiated messages.

Connection Lifecycle

WebSocket connections go through three phases: connect, message exchange, and disconnect. Each phase maps to API Gateway routes that trigger backend logic.

$connect route: Invoked when client establishes WebSocket connection. Use to authenticate, store connection ID, and associate with user identity.

// $connect route handler
export const connectHandler = async (
  event: APIGatewayWebSocketEvent
): Promise<APIGatewayProxyResult> => {

  const connectionId = event.requestContext.connectionId;

  // Extract authentication token from query string
  // wss://api-id.execute-api.region.amazonaws.com/prod?token=xyz
  const token = event.queryStringParameters?.token;

  if (!token) {
    return { statusCode: 401, body: 'Unauthorized' };
  }

  try {
    // Validate token and get user identity
    const user = await authService.validateToken(token);

    // Store connection in DynamoDB
    await dynamodb.put({
      TableName: 'WebSocketConnections',
      Item: {
        connectionId,
        userId: user.id,
        email: user.email,
        connectedAt: Date.now(),
        ttl: Math.floor(Date.now() / 1000) + 7200 // 2 hour TTL
      }
    }).promise();

    console.log(`Connection established: ${connectionId} for user ${user.id}`);

    return { statusCode: 200, body: 'Connected' };

  } catch (error) {
    console.error('Connection failed:', error);
    return { statusCode: 401, body: 'Unauthorized' };
  }
};

Custom routes: Handle specific message types based on route selection expression. Routes determine which Lambda function processes each message.

// Custom route: "sendmessage"
export const sendMessageHandler = async (
  event: APIGatewayWebSocketEvent
): Promise<APIGatewayProxyResult> => {

  const connectionId = event.requestContext.connectionId;
  const body = JSON.parse(event.body || '{}');

  // Get sender's identity from connection
  const connection = await dynamodb.get({
    TableName: 'WebSocketConnections',
    Key: { connectionId }
  }).promise();

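  const sender = connection.Item;
  if (!sender) {
    // Connection record missing (for example, TTL expired or $connect write failed)
    return { statusCode: 403, body: 'Unknown connection' };
  }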

  // Business logic - broadcast message to all connections
  const message = {
    type: 'message',
    from: sender.email,
    text: body.message,
    timestamp: Date.now()
  };

  // Get all active connections (a single scan returns at most 1 MB of data;
  // paginate with LastEvaluatedKey for larger tables)
  const connections = await dynamodb.scan({
    TableName: 'WebSocketConnections'
  }).promise();

  // Push message to all connected clients
  const apiGateway = new ApiGatewayManagementApi({
    endpoint: `${event.requestContext.domainName}/${event.requestContext.stage}`
  });

  await Promise.all(
    connections.Items?.map(async (conn) => {
      try {
        await apiGateway.postToConnection({
          ConnectionId: conn.connectionId,
          Data: JSON.stringify(message)
        }).promise();

        console.log(`Message sent to ${conn.connectionId}`);

      } catch (error) {
        // Connection stale - remove from database
        if (error.statusCode === 410) {
          await dynamodb.delete({
            TableName: 'WebSocketConnections',
            Key: { connectionId: conn.connectionId }
          }).promise();
        }
      }
    }) || []
  );

  return { statusCode: 200, body: 'Message sent' };
};

$disconnect route: Invoked when client closes connection or connection times out. Use to clean up resources and remove connection from database.

// $disconnect route handler
export const disconnectHandler = async (
  event: APIGatewayWebSocketEvent
): Promise<APIGatewayProxyResult> => {

  const connectionId = event.requestContext.connectionId;

  try {
    // Remove connection from database
    await dynamodb.delete({
      TableName: 'WebSocketConnections',
      Key: { connectionId }
    }).promise();

    console.log(`Connection closed: ${connectionId}`);

    return { statusCode: 200, body: 'Disconnected' };

  } catch (error) {
    console.error('Disconnect cleanup failed:', error);
    // Return success anyway - connection is closed
    return { statusCode: 200, body: 'Disconnected' };
  }
};

Route Selection

Route selection expression determines which route handles each message. The expression evaluates message content to select the appropriate route.

# WebSocket API with route selection
WebSocketAPI:
  Type: AWS::ApiGatewayV2::Api
  Properties:
    Name: ChatAPI
    ProtocolType: WEBSOCKET
    RouteSelectionExpression: $request.body.action

# Routes
ConnectRoute:
  Type: AWS::ApiGatewayV2::Route
  Properties:
    ApiId: !Ref WebSocketAPI
    RouteKey: $connect
    Target: !Sub integrations/${ConnectIntegration}

SendMessageRoute:
  Type: AWS::ApiGatewayV2::Route
  Properties:
    ApiId: !Ref WebSocketAPI
    RouteKey: sendmessage
    Target: !Sub integrations/${SendMessageIntegration}

JoinRoomRoute:
  Type: AWS::ApiGatewayV2::Route
  Properties:
    ApiId: !Ref WebSocketAPI
    RouteKey: joinroom
    Target: !Sub integrations/${JoinRoomIntegration}

DefaultRoute:
  Type: AWS::ApiGatewayV2::Route
  Properties:
    ApiId: !Ref WebSocketAPI
    RouteKey: $default
    Target: !Sub integrations/${DefaultIntegration}

DisconnectRoute:
  Type: AWS::ApiGatewayV2::Route
  Properties:
    ApiId: !Ref WebSocketAPI
    RouteKey: $disconnect
    Target: !Sub integrations/${DisconnectIntegration}

Client messages specify action in message body:

// Client sends message with action
const ws = new WebSocket('wss://abc123.execute-api.us-east-1.amazonaws.com/prod');

ws.onopen = () => {
  // Send message to "sendmessage" route
  ws.send(JSON.stringify({
    action: 'sendmessage',
    message: 'Hello, everyone!'
  }));

  // Send message to "joinroom" route
  ws.send(JSON.stringify({
    action: 'joinroom',
    roomId: 'room-123'
  }));
};

ws.onmessage = (event) => {
  const message = JSON.parse(event.data);
  console.log('Received:', message);
};

$default route: Handles messages that don't match any custom route. Use for error handling or routing logic.
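
The routing behavior described here can be approximated in a few lines. The sketch below is a simplified model of evaluating `$request.body.action` against the configured route keys, assuming JSON messages; `selectRoute` is a hypothetical name, and API Gateway performs this step itself before invoking any integration.

```typescript
// Custom route keys from the CloudFormation example.
const customRoutes = new Set<string>(['sendmessage', 'joinroom']);

// Hypothetical model of route selection for the expression $request.body.action.
function selectRoute(rawMessage: string, routeKeys: Set<string>): string {
  try {
    const parsed: unknown = JSON.parse(rawMessage);
    if (typeof parsed === 'object' && parsed !== null) {
      const action = (parsed as Record<string, unknown>).action;
      if (typeof action === 'string' && routeKeys.has(action)) {
        return action;
      }
    }
  } catch {
    // Non-JSON payloads cannot be evaluated and fall through to $default.
  }
  return '$default';
}

const matchedRoute = selectRoute('{"action":"sendmessage","message":"Hello"}', customRoutes);
const unknownAction = selectRoute('{"action":"deleteeverything"}', customRoutes);
const notJson = selectRoute('plain text', customRoutes);

console.log(matchedRoute, unknownAction, notJson);
// prints "sendmessage $default $default"
```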

Broadcasting and Fan-Out

WebSocket APIs enable server-initiated messages to connected clients. Common pattern: fan-out messages to multiple connections based on subscriptions or room membership.

/**
 * Broadcast message to all connections in a room.
 */
export const broadcastToRoom = async (
  roomId: string,
  message: any,
  apiGatewayEndpoint: string
): Promise<void> => {

  // Query connections subscribed to room
  const result = await dynamodb.query({
    TableName: 'WebSocketConnections',
    IndexName: 'RoomIdIndex',
    KeyConditionExpression: 'roomId = :roomId',
    ExpressionAttributeValues: {
      ':roomId': roomId
    }
  }).promise();

  const connections = result.Items || [];

  const apiGateway = new ApiGatewayManagementApi({
    endpoint: apiGatewayEndpoint
  });

  // Send to all connections in parallel
  const results = await Promise.allSettled(
    connections.map(conn =>
      apiGateway.postToConnection({
        ConnectionId: conn.connectionId,
        Data: JSON.stringify(message)
      }).promise()
    )
  );

  // Clean up stale connections
  const staleConnections = results
    .map((result, index) => ({ result, conn: connections[index] }))
    .filter(({ result }) =>
      result.status === 'rejected' &&
      result.reason?.statusCode === 410
    )
    .map(({ conn }) => conn);

  if (staleConnections.length > 0) {
    await Promise.all(
      staleConnections.map(conn =>
        dynamodb.delete({
          TableName: 'WebSocketConnections',
          Key: { connectionId: conn.connectionId }
        }).promise()
      )
    );

    console.log(`Cleaned up ${staleConnections.length} stale connections`);
  }
};

WebSocket use cases:

Chat applications: Real-time message delivery to room participants
Live dashboards: Push metrics updates to monitoring dashboards
Collaborative editing: Broadcast document changes to all editors
Gaming: Real-time game state synchronization
IoT notifications: Push alerts to connected devices
Live sports scores: Instant score updates to all viewers

WebSocket limitations:

10-minute idle timeout and 2-hour maximum connection duration (clients must reconnect after either limit)
128 KB maximum message size
500 new connections per second rate limit
No built-in message ordering guarantees
Manual connection management required
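
Because of the message size limit, it can help to check a payload before calling postToConnection. The sketch below is one way to do that, assuming the 128 KB limit applies to the UTF-8 encoded payload and equals 131,072 bytes; `utf8ByteLength` and `fitsInOneMessage` are hypothetical helpers, and the byte count is computed by hand to keep the example free of runtime-specific APIs.

```typescript
const MAX_MESSAGE_BYTES = 128 * 1024; // assumed to mean 131,072 bytes

// Counts the bytes a string occupies when encoded as UTF-8.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (const char of text) {
    const codePoint = char.codePointAt(0) ?? 0;
    if (codePoint <= 0x7f) bytes += 1;
    else if (codePoint <= 0x7ff) bytes += 2;
    else if (codePoint <= 0xffff) bytes += 3;
    else bytes += 4;
  }
  return bytes;
}

// Hypothetical guard to run before sending a message to a connection.
function fitsInOneMessage(payload: object): boolean {
  return utf8ByteLength(JSON.stringify(payload)) <= MAX_MESSAGE_BYTES;
}

const smallMessage = { type: 'message', text: 'Hello, everyone!' };
const oversizedMessage = { type: 'message', text: 'a'.repeat(MAX_MESSAGE_BYTES) };

console.log(fitsInOneMessage(smallMessage));     // prints "true"
console.log(fitsInOneMessage(oversizedMessage)); // prints "false"
```

Payloads that fail the check need to be split into several messages or delivered by reference, for example by sending a URL to the full content.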

For real-time patterns and connection management, see Real-Time Communication.

Authorization and Security

Securing API Gateway endpoints is critical. Never expose public APIs without authentication. API Gateway provides multiple authorization mechanisms suitable for different use cases.

Lambda Authorizers

Lambda authorizers (formerly custom authorizers) execute custom Lambda functions to validate requests. This enables any authentication strategy: validating custom tokens, querying databases, checking API keys, or implementing complex authorization logic.

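Token-based authorizer: Receives the caller's token from a single configured header (here, Authorization) together with the method ARN, and returns an IAM policy. Request-based authorizers instead receive request parameters such as headers and query strings.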

/**
 * Lambda authorizer for custom token validation.
 */
export const handler = async (
  event: APIGatewayTokenAuthorizerEvent
): Promise<APIGatewayAuthorizerResult> => {

  const token = event.authorizationToken; // "Bearer token-value"

  try {
    // Validate token (check database, verify JWT, call auth service)
    const user = await authService.validateToken(token.replace('Bearer ', ''));

    // Check user permissions
    const hasAccess = await permissionService.canAccessResource(
      user.id,
      event.methodArn
    );

    if (!hasAccess) {
      throw new Error('Insufficient permissions');
    }

    // Generate IAM policy allowing access
    return {
      principalId: user.id,
      policyDocument: {
        Version: '2012-10-17',
        Statement: [
          {
            Action: 'execute-api:Invoke',
            Effect: 'Allow',
            Resource: event.methodArn
          }
        ]
      },
      // Context passed to backend Lambda
      context: {
        userId: user.id,
        email: user.email,
        role: user.role,
        tenantId: user.tenantId
      }
    };

  } catch (error) {
    console.error('Authorization failed:', error);

    // Return explicit deny
    return {
      principalId: 'unauthorized',
      policyDocument: {
        Version: '2012-10-17',
        Statement: [
          {
            Action: 'execute-api:Invoke',
            Effect: 'Deny',
            Resource: event.methodArn
          }
        ]
      }
    };
  }
};

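Context propagation: Context returned by the authorizer is passed to the backend Lambda via event.requestContext.authorizer. Context values must be strings, numbers, or booleans; nested objects and arrays are not supported.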

// Backend Lambda receives authorizer context
export const handler = async (event: APIGatewayProxyEvent) => {
  // Access user context from authorizer
  const userId = event.requestContext.authorizer?.userId;
  const email = event.requestContext.authorizer?.email;
  const role = event.requestContext.authorizer?.role;

  // Use context in business logic
  const orders = await orderService.getOrdersForUser(userId);

  return {
    statusCode: 200,
    body: JSON.stringify(orders)
  };
};

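Authorizer caching: API Gateway caches authorizer responses to reduce Lambda invocations. For TOKEN authorizers the cache key is the token value, and the cached policy is reused for every method that token calls within the TTL. A policy scoped to a single event.methodArn can therefore cause unexpected denials on other routes, so return a policy that covers every resource the caller may access when caching is enabled.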

# Configure authorizer with caching
CustomAuthorizer:
  Type: AWS::ApiGateway::Authorizer
  Properties:
    Name: TokenAuthorizer
    Type: TOKEN
    IdentitySource: method.request.header.Authorization
    AuthorizerUri: !Sub arn:aws:apigateway:${AWS::Region}:lambda:path/2015-03-31/functions/${AuthorizerFunction.Arn}/invocations
    AuthorizerResultTtlInSeconds: 300  # Cache for 5 minutes
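
When caching is enabled, a common approach is to widen the policy resource from the single method ARN to the whole stage, since the cached policy is reused for other methods called with the same token. The sketch below derives such a wildcard from a method ARN of the form `arn:aws:execute-api:{region}:{account}:{apiId}/{stage}/{verb}/{path}`; `toStageWildcardArn` is a hypothetical helper, and granting the whole stage is an assumption that only fits callers allowed to reach every route.

```typescript
// Hypothetical helper: widen a method ARN to all methods and paths in its stage.
function toStageWildcardArn(methodArn: string): string {
  // Expected shape: arn:aws:execute-api:{region}:{account}:{apiId}/{stage}/{verb}/{path...}
  const segments = methodArn.split('/');
  if (segments.length < 3) {
    throw new Error(`Unexpected method ARN format: ${methodArn}`);
  }
  const [apiArn, stage] = segments;
  return `${apiArn}/${stage}/*/*`;
}

const methodArn = 'arn:aws:execute-api:us-east-1:123456789012:abc123/prod/POST/orders';
const stageWildcard = toStageWildcardArn(methodArn);

console.log(stageWildcard);
// prints "arn:aws:execute-api:us-east-1:123456789012:abc123/prod/*/*"
```

The result would be used as the Resource of the Allow statement in place of event.methodArn. Authorizers that enforce per-route permissions should instead list each permitted resource explicitly.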

Cache TTL considerations:

0 seconds: No caching - authorizer invoked for every request (expensive but immediate permission changes)
300 seconds (5 minutes): Balance between cost and freshness (recommended default)
3600 seconds (1 hour): Aggressive caching for rarely-changing permissions (high cost savings but stale permissions)

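Cache invalidation: Cached policies persist until the TTL expires. If permissions change (for example, a user is revoked), the cached decision stays in effect until then. For faster revocation, use a short TTL or short-lived tokens (a new token value is a new cache key); the cache can also be flushed with aws apigateway flush-stage-authorizers-cache.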
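
One consequence of caching is worth keeping in mind: the cached policy is reused for later requests that present the same token, so a policy scoped to a single event.methodArn can end up denying calls to other methods until the TTL expires. The following is a minimal sketch of one way around that, assuming the standard method ARN layout; the helper name toStageWildcardArn is hypothetical and not part of any AWS library.

```typescript
// Hypothetical helper: widen a method ARN to every method and path in the same stage,
// so that a cached policy stays valid for other routes called with the same token.
function toStageWildcardArn(methodArn: string): string {
  // Method ARN layout: arn:aws:execute-api:{region}:{account}:{apiId}/{stage}/{METHOD}/{path}
  const [arnPrefix, stage] = methodArn.split('/');
  if (!arnPrefix || !stage) {
    throw new Error(`Unexpected method ARN: ${methodArn}`);
  }
  return `${arnPrefix}/${stage}/*`;
}

const widened = toStageWildcardArn(
  'arn:aws:execute-api:us-east-1:123456789012:abc123/prod/POST/orders'
);
console.log(widened);
```

Widening the resource trades granularity for cache safety. If per-route decisions are required, the policy can instead list each permitted route explicitly.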

IAM Authorization

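IAM authorization uses AWS Signature Version 4 (SigV4) signing. Clients sign requests with AWS credentials (access key and secret key, plus a session token for temporary credentials). API Gateway verifies the signature and then evaluates the caller's IAM policies to decide whether execute-api:Invoke is allowed.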

When to use IAM authorization:

Service-to-service communication between AWS services
Internal APIs accessed by applications running on EC2/ECS/Lambda
Developer tools and SDKs with AWS credentials
APIs accessed by AWS SDK (boto3, AWS SDK for Java)

# API Gateway method with IAM authorization
OrdersMethod:
  Type: AWS::ApiGateway::Method
  Properties:
    RestApiId: !Ref OrderAPI
    ResourceId: !Ref OrdersResource
    HttpMethod: POST
    AuthorizationType: AWS_IAM
    Integration:
      Type: AWS_PROXY
      IntegrationHttpMethod: POST
      Uri: !Sub arn:aws:apigateway:${AWS::Region}:lambda:path/2015-03-31/functions/${CreateOrderFunction.Arn}/invocations

IAM policy for API access:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "execute-api:Invoke",
      "Resource": "arn:aws:execute-api:us-east-1:123456789012:abc123/prod/POST/orders"
    }
  ]
}
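
The Resource string in the policy above follows the pattern region, account, API ID, stage, HTTP method, and path. As a rough sketch of how those parts fit together, the hypothetical helper below (buildExecuteApiArn is an illustrative name, not an AWS SDK function) assembles the ARN from its components:

```typescript
// Parts of an execute-api resource ARN; '*' can be used as a wildcard for stage, method, or path.
interface ExecuteApiResource {
  region: string;
  accountId: string;
  apiId: string;
  stage: string;
  method: string; // HTTP verb or '*'
  path: string;   // resource path without the leading slash, e.g. 'orders'
}

// Hypothetical helper that assembles the Resource string used in an IAM policy.
function buildExecuteApiArn(r: ExecuteApiResource): string {
  return `arn:aws:execute-api:${r.region}:${r.accountId}:${r.apiId}/${r.stage}/${r.method}/${r.path}`;
}

const createOrderArn = buildExecuteApiArn({
  region: 'us-east-1',
  accountId: '123456789012',
  apiId: 'abc123',
  stage: 'prod',
  method: 'POST',
  path: 'orders'
});
console.log(createOrderArn);
```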

Signing requests with AWS SDK:

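// Sign requests with SigV4 using the AWS SDK for Java 2.x signer.
// A plain HTTP client does not sign requests on its own.
// (Imports from software.amazon.awssdk.auth, .http, and .regions omitted;
// toJson, send, and parseResponse are application helpers.)
public class OrderApiClient {

    private static final String ENDPOINT = "https://abc123.execute-api.us-east-1.amazonaws.com/prod";

    private final Aws4Signer signer = Aws4Signer.create();
    private final AwsCredentialsProvider credentialsProvider = DefaultCredentialsProvider.create();

    public Order createOrder(CreateOrderRequest request) {
        byte[] body = toJson(request).getBytes(StandardCharsets.UTF_8);

        SdkHttpFullRequest unsigned = SdkHttpFullRequest.builder()
            .method(SdkHttpMethod.POST)
            .uri(URI.create(ENDPOINT + "/orders"))
            .putHeader("Content-Type", "application/json")
            .contentStreamProvider(() -> new ByteArrayInputStream(body))
            .build();

        // Credentials come from the environment
        // (instance profile, ECS task role, Lambda execution role)
        Aws4SignerParams params = Aws4SignerParams.builder()
            .awsCredentials(credentialsProvider.resolveCredentials())
            .signingName("execute-api")
            .signingRegion(Region.US_EAST_1)
            .build();

        // The signed request carries the SigV4 Authorization header
        SdkHttpFullRequest signed = signer.sign(unsigned, params);

        // Send the signed request with the HTTP client of your choice
        return parseResponse(send(signed));
    }
}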

IAM authorization benefits:

No additional authorization Lambda costs
Native AWS credential management
Fine-grained permissions via IAM policies
Integration with AWS Organizations and SCPs

IAM authorization limitations:

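Only works for authenticated AWS principals (users, roles)
Poorly suited to public APIs called by web/mobile clients, unless they obtain temporary AWS credentials (for example through Cognito identity pools)
Requires an AWS SDK signer or a manual SigV4 signing implementation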

For IAM best practices and policy design, see AWS IAM.

Cognito User Pools

Amazon Cognito User Pools provide managed user authentication with JWT tokens. API Gateway validates Cognito JWTs natively.

# REST API with Cognito authorizer
CognitoAuthorizer:
  Type: AWS::ApiGateway::Authorizer
  Properties:
    Name: CognitoAuthorizer
    Type: COGNITO_USER_POOLS
    IdentitySource: method.request.header.Authorization
    RestApiId: !Ref OrderAPI
    ProviderARNs:
      - !GetAtt UserPool.Arn

OrdersMethod:
  Type: AWS::ApiGateway::Method
  Properties:
    RestApiId: !Ref OrderAPI
    ResourceId: !Ref OrdersResource
    HttpMethod: POST
    AuthorizationType: COGNITO_USER_POOLS
    AuthorizerId: !Ref CognitoAuthorizer

Client authentication flow:

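// Client authenticates with Cognito
import { CognitoUser, CognitoUserPool, AuthenticationDetails } from 'amazon-cognito-identity-js';

// Pool identifiers are placeholders; use your own user pool and app client IDs
const userPool = new CognitoUserPool({
  UserPoolId: 'us-east-1_EXAMPLE',
  ClientId: 'example-app-client-id'
});
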
const authenticationDetails = new AuthenticationDetails({
  Username: email,
  Password: password
});

const cognitoUser = new CognitoUser({
  Username: email,
  Pool: userPool
});

cognitoUser.authenticateUser(authenticationDetails, {
  onSuccess: (result) => {
    const idToken = result.getIdToken().getJwtToken();

    // Use ID token in API requests
    fetch('https://api.example.com/orders', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${idToken}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify(orderData)
    });
  },
  onFailure: (err) => {
    console.error('Authentication failed:', err);
  }
});

Cognito benefits:

Managed user authentication (sign-up, sign-in, password reset)
Built-in JWT validation - no authorizer Lambda required
User pools with groups for role-based access
Multi-factor authentication (MFA) support
Social identity provider integration (Google, Facebook, Amazon)

For authentication and authorization patterns, see Authentication and Authorization.

Throttling and Usage Plans

Throttling protects backend services from traffic spikes and prevents individual clients from consuming all capacity. API Gateway provides multiple throttling mechanisms at different levels.

Throttling Levels

Account limits (AWS defaults):

Regional steady-state rate: 10,000 requests per second
Burst capacity: 5,000 requests (token bucket)
Applies across all APIs in the region
Shared quota - one API can consume entire limit
Contact AWS Support to increase

API/Stage limits:

Set per-stage throttling limits
Applies to all methods in the stage
Overrides account limits (can be lower, not higher)

# Stage-level throttling
ProdStage:
  Type: AWS::ApiGateway::Stage
  Properties:
    RestApiId: !Ref OrderAPI
    StageName: prod
    DeploymentId: !Ref Deployment
    MethodSettings:
      - ResourcePath: /*
        HttpMethod: '*'
        ThrottlingRateLimit: 2000    # 2000 RPS steady-state
        ThrottlingBurstLimit: 1000   # 1000 burst capacity

Method limits:

Set throttling per HTTP method
Most granular control
Useful for protecting expensive operations

# Method-level throttling
MethodSettings:
  - ResourcePath: /orders
    HttpMethod: POST
    ThrottlingRateLimit: 100
    ThrottlingBurstLimit: 50
  - ResourcePath: /orders
    HttpMethod: GET
    ThrottlingRateLimit: 1000
    ThrottlingBurstLimit: 500

Token bucket algorithm: API Gateway uses a token bucket algorithm to handle bursts.

Bucket capacity = burst limit
Bucket refills at rate = steady-state limit per second
Each request consumes one token
When the bucket is empty, requests are throttled with 429 Too Many Requests

Example: 1,000 RPS steady-state, 500 burst capacity

At most 500 requests can be admitted at the same instant (a full bucket)
Across the first second, up to about 1,500 requests can be admitted if arrivals are spread out (500 initial tokens + 1,000 refilled)
Bucket refills at 1,000 tokens/second, never beyond 500 tokens
Sustained traffic above 1,000 RPS is throttled once the burst is depleted
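
The example can be checked with a small simulation. This is a generic token bucket sketch, not API Gateway's actual implementation; the class and function names are illustrative, and the 1 ms tick size and the 2,000 RPS offered load are assumptions chosen to keep the arithmetic in whole numbers.

```typescript
// Minimal token bucket: `burst` is the bucket capacity, `ratePerSecond` the refill rate.
class TokenBucket {
  private tokens: number;
  private readonly ratePerSecond: number;
  private readonly burst: number;

  constructor(ratePerSecond: number, burst: number) {
    this.ratePerSecond = ratePerSecond;
    this.burst = burst;
    this.tokens = burst; // the bucket starts full
  }

  // Add tokens for the elapsed time, never exceeding the burst capacity.
  refill(elapsedMs: number): void {
    const added = (this.ratePerSecond * elapsedMs) / 1000;
    this.tokens = Math.min(this.burst, this.tokens + added);
  }

  // Admit up to `requests` arrivals, one token each; returns the number accepted.
  admit(requests: number): number {
    const accepted = Math.min(requests, Math.floor(this.tokens));
    this.tokens -= accepted;
    return accepted;
  }
}

// Offer a constant load in 1 ms ticks and count the accepted requests in each second.
function acceptedPerSecond(bucket: TokenBucket, requestsPerMs: number, seconds: number): number[] {
  const totals: number[] = [];
  for (let s = 0; s < seconds; s++) {
    let accepted = 0;
    for (let ms = 0; ms < 1000; ms++) {
      accepted += bucket.admit(requestsPerMs);
      bucket.refill(1);
    }
    totals.push(accepted);
  }
  return totals;
}

// 1,000 RPS rate, 500 burst, offered load of 2,000 RPS (2 requests per millisecond)
const perSecond = acceptedPerSecond(new TokenBucket(1000, 500), 2, 2);
console.log(perSecond); // first second: 1499 accepted, second second: 1000 accepted
```

The first second admits close to rate plus burst; once the initial tokens are spent, throughput settles at the steady-state rate.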

Usage Plans and API Keys

Usage plans associate API keys with throttling and quota limits, enabling different tiers for partners or customers. A key is only checked on methods that set ApiKeyRequired: true.

# Usage plan for premium tier
PremiumUsagePlan:
  Type: AWS::ApiGateway::UsagePlan
  Properties:
    UsagePlanName: Premium
    Description: Premium tier with higher limits
    ApiStages:
      - ApiId: !Ref OrderAPI
        Stage: !Ref ProdStage
    Throttle:
      RateLimit: 1000      # 1000 RPS
      BurstLimit: 2000     # 2000 burst
    Quota:
      Limit: 1000000       # 1 million requests
      Period: MONTH        # Per month

# Usage plan for standard tier
StandardUsagePlan:
  Type: AWS::ApiGateway::UsagePlan
  Properties:
    UsagePlanName: Standard
    Description: Standard tier
    ApiStages:
      - ApiId: !Ref OrderAPI
        Stage: !Ref ProdStage
    Throttle:
      RateLimit: 100
      BurstLimit: 200
    Quota:
      Limit: 100000
      Period: MONTH

# API key for customer
CustomerApiKey:
  Type: AWS::ApiGateway::ApiKey
  Properties:
    Name: customer-xyz-api-key
    Enabled: true

# Associate API key with usage plan
PremiumUsagePlanKey:
  Type: AWS::ApiGateway::UsagePlanKey
  Properties:
    KeyId: !Ref CustomerApiKey
    KeyType: API_KEY
    UsagePlanId: !Ref PremiumUsagePlan

Client request with API key:

curl -X POST https://api.example.com/orders \
  -H "x-api-key: AbCdEf123456" \
  -H "Content-Type: application/json" \
  -d '{"customerId": "123", "items": [...]}'

Usage plans enable:

Tiered access for partners (basic, premium, enterprise)
Monthly quotas preventing overuse
Per-customer rate limiting
Monitoring and billing per API key
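
When choosing plan limits, it can help to check how the rate limit and the quota interact. The arithmetic below is a sketch using the Premium and Standard numbers from the plans above; the function name is illustrative, and a 30-day month is assumed for the average-rate figure.

```typescript
// Seconds until a quota is used up at a constant request rate.
function secondsToExhaustQuota(quotaLimit: number, requestsPerSecond: number): number {
  if (requestsPerSecond <= 0) {
    throw new Error('requestsPerSecond must be positive');
  }
  return quotaLimit / requestsPerSecond;
}

// Premium plan: 1,000,000 requests per month at a 1,000 RPS rate limit
const premiumSeconds = secondsToExhaustQuota(1_000_000, 1000); // 1000 seconds, under 17 minutes

// Standard plan: 100,000 requests per month at a 100 RPS rate limit
const standardSeconds = secondsToExhaustQuota(100_000, 100); // 1000 seconds

// Average rate that spreads the Premium quota evenly over a 30-day month
const sustainableRps = 1_000_000 / (30 * 24 * 60 * 60); // roughly 0.39 RPS

console.log({ premiumSeconds, standardSeconds, sustainableRps });
```

A client running at the full rate limit would use up either monthly quota in under 17 minutes, so the quota, not the rate limit, is the binding constraint for sustained traffic in these plans.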

For rate limiting patterns and algorithms, see Rate Limiting.

Caching

API Gateway caching reduces backend load and improves response times by storing responses at the API Gateway layer. When caching is enabled, API Gateway returns cached responses for identical requests without invoking backends.

Cache Configuration

# Enable caching for production stage
ProdStage:
  Type: AWS::ApiGateway::Stage
  Properties:
    RestApiId: !Ref OrderAPI
    StageName: prod
    DeploymentId: !Ref Deployment
    CacheClusterEnabled: true
    CacheClusterSize: '0.5'  # 0.5 GB (smallest)
    MethodSettings:
      # Cache GET requests for individual orders
      - ResourcePath: /orders/{orderId}
        HttpMethod: GET
        CachingEnabled: true
        CacheTtlInSeconds: 300  # 5 minutes
        CacheDataEncrypted: true
      # Don't cache POST/PUT/DELETE
      - ResourcePath: /orders
        HttpMethod: POST
        CachingEnabled: false

Cache cluster sizes:

0.5 GB: $0.020/hour
1.6 GB: $0.038/hour
6.1 GB: $0.200/hour
13.5 GB: $0.250/hour
28.4 GB: $0.500/hour
58.2 GB: $1.000/hour
118 GB: $1.900/hour
237 GB: $3.800/hour
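
Hourly prices are easier to compare with backend savings when converted to a monthly figure. The sketch below uses the prices listed above and assumes 730 hours per month (8,760 hours / 12); actual prices vary by region, and the names are illustrative.

```typescript
// Hourly prices (USD) copied from the list above; actual prices vary by region.
const hourlyPriceByCacheSizeGb: Record<string, number> = {
  '0.5': 0.02,
  '1.6': 0.038,
  '6.1': 0.2,
  '13.5': 0.25,
  '28.4': 0.5,
  '58.2': 1.0,
  '118': 1.9,
  '237': 3.8
};

// Approximate monthly cost, assuming 730 hours per month.
function monthlyCacheCostUsd(sizeGb: string): number {
  const hourly = hourlyPriceByCacheSizeGb[sizeGb];
  if (hourly === undefined) {
    throw new Error(`Unknown cache size: ${sizeGb}`);
  }
  return Math.round(hourly * 730 * 100) / 100; // round to cents
}

console.log(monthlyCacheCostUsd('0.5')); // 14.6
console.log(monthlyCacheCostUsd('6.1')); // 146
console.log(monthlyCacheCostUsd('237')); // 2774
```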

When to enable caching:

Read-heavy APIs with frequently accessed resources
Expensive backend operations (complex queries, external API calls)
Relatively static data (product catalogs, reference data)
High-traffic endpoints

When NOT to cache:

Write operations (POST, PUT, DELETE)
User-specific data requiring authentication context
Real-time data where freshness is critical
Rapidly changing data

Cache Keys

Cache keys determine when responses are served from cache vs. invoking backends. API Gateway constructs cache keys from request parameters.

Default cache key: HTTP method + resource path

GET /orders -> Cache key: GET-/orders
GET /orders/123 -> Cache key: GET-/orders/123

Cache key parameters: Include query strings, headers, or path parameters in cache key.

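# Fragment of the AWS::ApiGateway::Method properties for GET /orders.
# Cache key parameters are set on the integration and must also be
# declared as request parameters (false = not required).
RequestParameters:
  method.request.querystring.customerId: false
  method.request.querystring.status: false
Integration:
  # Include query parameters in cache key
  CacheKeyParameters:
    - method.request.querystring.customerId
    - method.request.querystring.status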

Example:

GET /orders?customerId=123&status=pending -> Cache key: GET-/orders-customerId:123-status:pending
GET /orders?customerId=456&status=pending -> Cache key: GET-/orders-customerId:456-status:pending
GET /orders?customerId=123&status=shipped -> Cache key: GET-/orders-customerId:123-status:shipped

Each unique cache key has a separate cache entry. Too many cache key parameters reduce the cache hit rate.
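
The flip side is that parameters left out of the cache key are ignored during lookup. The sketch below models this with the same illustrative key notation used above; it is not API Gateway's internal key format, and the function name and the page parameter are hypothetical.

```typescript
// Illustrative model of cache-key construction: only parameters listed in
// `keyParams` contribute to the key; all other query parameters are ignored.
function cacheKey(
  method: string,
  path: string,
  query: Record<string, string>,
  keyParams: string[]
): string {
  const parts = keyParams
    .filter((name) => name in query)
    .map((name) => `${name}:${query[name]}`);
  return [`${method}-${path}`, ...parts].join('-');
}

const keyParams = ['customerId', 'status'];
const first = cacheKey('GET', '/orders', { customerId: '123', status: 'pending' }, keyParams);
const second = cacheKey('GET', '/orders', { customerId: '123', status: 'pending', page: '2' }, keyParams);

console.log(first);            // GET-/orders-customerId:123-status:pending
console.log(first === second); // true: `page` is not a key parameter, so both requests share one entry
```

Any parameter that changes the response should therefore be part of the key, otherwise callers can receive a cached response for a different request.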

Cache Invalidation

Cached responses persist until TTL expires. For immediate invalidation:

Manual invalidation:

# Invalidate entire cache for stage
aws apigateway flush-stage-cache \
  --rest-api-id abc123 \
  --stage-name prod

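Per-request cache bypass: A client can send a Cache-Control: max-age=0 header to skip the cached entry and refresh it from the backend. Restrict this to callers granted execute-api:InvalidateCache, otherwise any client can force backend calls: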

curl -X GET https://api.example.com/orders/123 \
  -H "Cache-Control: max-age=0" \
  -H "Authorization: Bearer token"

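Downstream caching: Cache-Control headers returned by the backend are passed through to clients and intermediaries such as browsers and CDNs. They do not change how long API Gateway keeps the entry; that is governed by CacheTtlInSeconds:

// Lambda response with cache control
return {
  statusCode: 200,
  headers: {
    'Cache-Control': 'max-age=60',  // Clients and CDNs may reuse the response for 60 seconds
    'Content-Type': 'application/json'
  },
  body: JSON.stringify(order)
};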

Cache considerations:

Stale data risk: Cached responses may be outdated until TTL expires
Cache hit ratio: Monitor hit ratio - low ratios indicate caching isn't effective
Cost: Cache clusters incur hourly charges - ensure savings justify cost
Security: Ensure cached responses don't leak data across users

For caching strategies and patterns, see Caching.

Monitoring and Logging

Effective monitoring detects issues, provides insights into API usage, and enables capacity planning.

CloudWatch Metrics

API Gateway publishes metrics to CloudWatch automatically:

Key metrics:

Count: Total number of API requests
4XXError: Client errors (400-499 status codes)
5XXError: Server errors (500-599 status codes)
Latency: Time between receiving request and returning response
IntegrationLatency: Time between API Gateway relaying request to backend and receiving response
CacheHitCount: Number of requests served from cache
CacheMissCount: Number of requests not served from cache

/**
 * Alarms on built-in API Gateway metrics (AWS SDK for JavaScript v2).
 */
import { CloudWatch } from 'aws-sdk';

const cloudwatch = new CloudWatch();

// Create alarm for high error rate
await cloudwatch.putMetricAlarm({
  AlarmName: 'HighAPIErrorRate',
  ComparisonOperator: 'GreaterThanThreshold',
  EvaluationPeriods: 2,
  MetricName: '5XXError',
  Namespace: 'AWS/ApiGateway',
  Period: 300,
  Statistic: 'Sum',
  Threshold: 50,
  ActionsEnabled: true,
  AlarmActions: ['arn:aws:sns:us-east-1:123456789012:api-alerts'],
  AlarmDescription: 'Alert when API 5XX errors exceed 50 in 5 minutes',
  Dimensions: [
    { Name: 'ApiName', Value: 'OrderAPI' },
    { Name: 'Stage', Value: 'prod' }
  ]
}).promise();

// Alarm for high latency
await cloudwatch.putMetricAlarm({
  AlarmName: 'HighAPILatency',
  ComparisonOperator: 'GreaterThanThreshold',
  EvaluationPeriods: 2,
  MetricName: 'Latency',
  Namespace: 'AWS/ApiGateway',
  Period: 300,
  Statistic: 'Average',
  Threshold: 1000,  // 1 second (Latency is reported in milliseconds)
  ActionsEnabled: true,
  AlarmActions: ['arn:aws:sns:us-east-1:123456789012:api-alerts'],
  AlarmDescription: 'Alert when average API latency exceeds 1 second',
  // Without dimensions the alarm matches no published metric
  Dimensions: [
    { Name: 'ApiName', Value: 'OrderAPI' },
    { Name: 'Stage', Value: 'prod' }
  ]
}).promise();

Access Logging

Access logs capture detailed request/response information for each API call.

# Enable access logging
ProdStage:
  Type: AWS::ApiGateway::Stage
  Properties:
    RestApiId: !Ref OrderAPI
    StageName: prod
    DeploymentId: !Ref Deployment
    AccessLogSetting:
      DestinationArn: !GetAtt ApiAccessLogGroup.Arn
      # Lines are kept at one indentation level so the folded scalar
      # collapses into a single-line JSON log format
      Format: >-
        {
        "requestId":"$context.requestId",
        "ip":"$context.identity.sourceIp",
        "caller":"$context.identity.caller",
        "user":"$context.identity.user",
        "requestTime":"$context.requestTime",
        "httpMethod":"$context.httpMethod",
        "resourcePath":"$context.resourcePath",
        "status":"$context.status",
        "protocol":"$context.protocol",
        "responseLength":"$context.responseLength",
        "latency":"$context.responseLatency",
        "integrationLatency":"$context.integrationLatency",
        "integrationStatus":"$context.integrationStatus",
        "errorMessage":"$context.error.message",
        "errorType":"$context.error.responseType"
        }

Log analysis with CloudWatch Logs Insights (run each query separately):

# Find slow requests
fields @timestamp, requestId, httpMethod, resourcePath, latency
| filter latency > 1000
| sort latency desc
| limit 20

# Error rate by endpoint
fields httpMethod, resourcePath, status
| filter status >= 400
| stats count() by httpMethod, resourcePath, status

# Top clients by request volume
fields ip
| stats count() as requestCount by ip
| sort requestCount desc
| limit 10

For comprehensive observability patterns, see AWS Observability, Logging, and Metrics.

Anti-Patterns

Common mistakes that lead to unreliable, insecure, or expensive APIs:

1. Public APIs Without Authentication

Problem: Exposing APIs without authentication allows unauthorized access, abuse, and potential data breaches.

Solution: Always require authentication. Use Lambda authorizers, JWT validation, IAM, or Cognito. API keys identify callers for usage plans and throttling; treat them as a complement to authentication, not a replacement for it.

2. Not Implementing Throttling

Problem: Without throttling, traffic spikes overwhelm backends, causing cascading failures. A single abusive client can consume all capacity.

Solution: Configure throttling at stage and method levels. Use usage plans for per-client limits. Set burst capacity appropriate for the workload.

3. Overly Complex VTL Transformations

Problem: Complex VTL templates are difficult to test, debug, and maintain. Logic errors in VTL cause production issues without clear visibility.

Solution: Minimize VTL usage. Prefer making backends API-compatible over VTL transformation. If transformation is needed, keep the VTL simple or move complex logic into Lambda.

4. Not Enabling CORS

Problem: Browser-based applications fail to call APIs without proper CORS headers, causing "blocked by CORS policy" errors.

Solution: Enable CORS in API Gateway. For REST API, configure OPTIONS method with CORS headers. For HTTP API, use built-in CORS configuration.

5. Caching User-Specific Data

Problem: Caching personalized responses leaks data across users. User A sees User B's data from cache.

Solution: Only cache generic, non-user-specific data. For user-specific endpoints, either disable caching or include user identity in cache key (reduces cache hit rate).

6. Not Monitoring Cache Hit Ratio

Problem: Paying for cache clusters with low hit ratios wastes money. Low hit ratios indicate caching isn't effective.

Solution: Monitor CacheHitCount and CacheMissCount. Calculate hit ratio = hits / (hits + misses). Target >80% hit ratio. Disable caching if consistently below 50%.
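
The hit ratio formula can be sketched as a small helper. The thresholds below simply encode the rule of thumb stated above (target above 80%, reconsider below 50%); the function and type names are illustrative, and the inputs are assumed to be sums of CacheHitCount and CacheMissCount over the same period.

```typescript
type CacheVerdict = 'effective' | 'review' | 'consider-disabling';

// hit ratio = hits / (hits + misses); defined as 0 when there was no traffic
function cacheHitRatio(hits: number, misses: number): number {
  const total = hits + misses;
  return total === 0 ? 0 : hits / total;
}

// Thresholds follow the rule of thumb: target > 80%, reconsider below 50%.
function assessCache(hits: number, misses: number): CacheVerdict {
  const ratio = cacheHitRatio(hits, misses);
  if (ratio > 0.8) return 'effective';
  if (ratio >= 0.5) return 'review';
  return 'consider-disabling';
}

console.log(cacheHitRatio(9000, 1000)); // 0.9
console.log(assessCache(9000, 1000));   // effective
console.log(assessCache(600, 400));     // review
console.log(assessCache(300, 700));     // consider-disabling
```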

7. Long Lambda Authorizer TTL with Frequently Changing Permissions

Problem: Cached authorization decisions persist after permissions are revoked, allowing unauthorized access until the TTL expires.

Solution: Use a short TTL (0-5 minutes) when permissions change frequently. Include a permission version in the token so that a permission change produces a new token value and therefore a fresh authorizer call.

8. Not Using Lambda Proxy Integration

Problem: Non-proxy integrations require complex VTL mapping templates for request/response transformation, increasing complexity and debugging difficulty.

Solution: Use Lambda Proxy integration by default. Lambda receives full request context and returns full response. Use non-proxy only when required.

9. Exposing Internal Error Details

Problem: Returning detailed error messages (stack traces, database errors) to clients exposes internal architecture and potential security vulnerabilities.

Solution: Return generic error messages to clients. Log detailed errors internally. Use consistent error format across all endpoints (see API Design).

10. Not Using Custom Domain Names

Problem: Default API Gateway URLs (https://abc123.execute-api.us-east-1.amazonaws.com) are not user-friendly, and they change if the API is recreated or moved to another region or account, breaking clients that hard-coded them.

Solution: Use custom domain names (https://api.example.com) with ACM certificates. This lets you move or replace the underlying API without changing the client-facing URL.

Related Topics

API Design - RESTful API design principles, resource modeling, error handling
API Contracts - OpenAPI specifications, contract-first development
API Versioning - Versioning strategies for evolving APIs
API Patterns - Common API patterns and anti-patterns
AWS Compute - Lambda integration, cold starts, performance optimization
AWS IAM - IAM policies for API Gateway access control
Authentication - Authentication patterns and best practices
Authorization - Authorization strategies and RBAC
Rate Limiting - Rate limiting algorithms and patterns
Caching - Caching strategies and invalidation patterns
AWS Observability - Monitoring, logging, and tracing
Real-Time Communication - WebSocket patterns and use cases
