
AWS CLI and Automation

Using the AWS Command Line Interface for automation, scripting, and CI/CD pipeline integration.

Overview

The AWS CLI is a unified command-line tool for managing AWS services. It enables automation of deployment workflows, infrastructure operations, and debugging tasks that would be tedious through the AWS Console. This guide covers CLI fundamentals, CI/CD integration patterns, and practical automation scripts for common operations.

While the AWS SDK (covered in AWS SDK Integration) is preferred for application code, the CLI excels at scripting, CI/CD pipelines, and ad-hoc operations. Understanding when to use each tool is essential for efficient AWS automation.


Core Principles

  • Credential security: Never hardcode credentials; use IAM roles, OIDC federation, or AWS SSO
  • Idempotency: Scripts should safely run multiple times without unintended side effects
  • Error handling: Check exit codes and handle failures gracefully with retries where appropriate
  • Output parsing: Use --query and jq for reliable JSON parsing instead of regex on text output
  • Automation-first: Design CLI scripts to run unattended in CI/CD pipelines without manual intervention
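The idempotency principle above can be sketched with a check-before-mutate "ensure" pattern. This is a local sketch only: resource_exists and resource_create are stand-ins for real AWS calls (for example s3api head-bucket / s3api create-bucket).

```shell
#!/bin/bash
# Hypothetical "ensure" pattern: probe current state before mutating, so the
# script is safe to run any number of times. The two helpers are local
# stand-ins for real AWS CLI calls.
MARKER="/tmp/demo-resource"
rm -f "$MARKER"                      # start from a clean state for the demo

resource_exists() { [ -e "$MARKER" ]; }
resource_create() { touch "$MARKER"; }

ensure_resource() {
  if resource_exists; then
    echo "already exists, skipping"
  else
    resource_create
    echo "created"
  fi
}

ensure_resource   # first run creates
ensure_resource   # second run is a no-op
```

The same shape applies to real resources: describe/head first, create or update only when the observed state differs from the desired state.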

Installation and Configuration

Installing AWS CLI v2

AWS CLI v2 is the current version with improved performance and user experience over v1. Always use v2 for new projects.

Linux/macOS:

# Download and install
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install

# Verify installation
aws --version
# aws-cli/2.15.30 Python/3.11.8 Linux/5.15.0-1047-aws exe/x86_64.ubuntu.22

Windows:

# Download MSI installer from https://awscli.amazonaws.com/AWSCLIV2.msi
# Run installer, then verify
aws --version

Docker (for CI/CD):

FROM amazon/aws-cli:2.15.30
# Pre-installed AWS CLI in container

Basic Configuration

The AWS CLI uses a configuration file (~/.aws/config) and credentials file (~/.aws/credentials) to store settings.

Configure interactively:

aws configure
# AWS Access Key ID: AKIAIOSFODNN7EXAMPLE
# AWS Secret Access Key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
# Default region name: us-east-1
# Default output format: json

This creates two files:

~/.aws/credentials:

[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

~/.aws/config:

[default]
region = us-east-1
output = json

Important: The interactive configuration approach stores long-term credentials in plaintext files. This is acceptable for local development by human users, but never use this for applications or CI/CD. Use IAM roles and temporary credentials instead (covered below).

Named Profiles

Profiles enable managing multiple AWS accounts or environments from the same machine.

Configure multiple profiles:

aws configure --profile production
aws configure --profile staging

~/.aws/config:

[default]
region = us-east-1

[profile production]
region = us-east-1
output = json

[profile staging]
region = us-west-2
output = json

Use specific profile:

# Specify profile per command
aws s3 ls --profile production

# Or set environment variable for session
export AWS_PROFILE=production
aws s3 ls # Uses production profile

For multi-account strategies and cross-account access patterns, see AWS IAM.

Credential Precedence

The AWS CLI searches for credentials in this order (highest priority first):

  1. Command line options: --profile, --region
  2. Environment variables: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN
  3. Web identity token: From environment variables (used by EKS pods via IRSA)
  4. Shared credentials file: ~/.aws/credentials
  5. Shared config file: ~/.aws/config
  6. Container credentials: ECS task role (from AWS_CONTAINER_CREDENTIALS_RELATIVE_URI)
  7. Instance metadata: EC2 instance profile (from IMDS)

This precedence chain enables secure credential management: local development uses ~/.aws/credentials, CI/CD uses environment variables or OIDC, and production services use IAM roles via instance profiles or task roles.

Understanding credential precedence helps debug authentication issues: if a script works locally but fails in CI/CD, it's often because the credential source changed (e.g., from credentials file to environment variables).
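When debugging such a mismatch, a quick first step is to see which credential sources are even present in the current environment. A small sketch (the variable names are the real ones the CLI reads; the reporting function itself is ours):

```shell
# Report which credential-related sources are visible in this shell; the
# precedence chain above decides which one the CLI actually uses.
credential_source_status() {
  for v in AWS_PROFILE AWS_ACCESS_KEY_ID AWS_SESSION_TOKEN \
           AWS_CONTAINER_CREDENTIALS_RELATIVE_URI; do
    if [ -n "${!v:-}" ]; then
      echo "$v: set"
    else
      echo "$v: unset"
    fi
  done
  [ -f "$HOME/.aws/credentials" ] && echo "~/.aws/credentials: present"
  return 0
}

credential_source_status
```

Combine this with aws sts get-caller-identity (covered below) to confirm which principal the winning source resolves to.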


AWS CLI vs AWS SDK

Both tools interact with AWS APIs, but they serve different purposes:

Aspect          | AWS CLI                                          | AWS SDK
Use case        | Scripting, automation, CI/CD, ad-hoc operations  | Application code, complex logic, production services
Language        | Bash/shell scripts                               | Java, JavaScript, Python, Go, etc.
Error handling  | Exit codes, stderr                               | Try-catch blocks, typed exceptions
Type safety     | None (strings only)                              | Full type checking (in typed languages)
Performance     | Process invocation overhead                      | In-process library calls
Debugging       | --debug flag                                     | Application logging, debugger
Best for        | Deployments, infrastructure ops, debugging       | Business logic, API integrations, data processing

Use CLI when:

  • Automating deployments in GitLab CI/CD pipelines
  • Writing operational scripts (backups, migrations, maintenance)
  • Ad-hoc debugging and troubleshooting
  • Rapid prototyping before SDK implementation

Use SDK when:

  • Implementing application business logic (see AWS SDK Integration)
  • Need type safety and IDE autocomplete
  • Complex error handling and retry logic
  • Performance-critical operations (avoid process overhead)

Common CLI Patterns

Output Formats and Filtering

The CLI supports three output formats: json, text, and table. JSON is the most reliable for scripting because it's machine-parseable.

JSON output (default):

aws ec2 describe-instances
{
  "Reservations": [
    {
      "Instances": [
        {
          "InstanceId": "i-1234567890abcdef0",
          "State": {"Name": "running"},
          "PrivateIpAddress": "10.0.1.5"
        }
      ]
    }
  ]
}

Table output (human-readable):

aws ec2 describe-instances --output table
-------------------------------------------------------
| DescribeInstances |
+-----------------------------------------------------+
|| Instances ||
|+----------------+----------------------------------+|
|| InstanceId | i-1234567890abcdef0 ||
|| State | running ||
|| PrivateIp | 10.0.1.5 ||
|+----------------+----------------------------------+|

Text output (space-delimited):

aws ec2 describe-instances --output text
RESERVATIONS r-1234567890abcdef0
INSTANCES i-1234567890abcdef0 running 10.0.1.5

Filtering with --query (JMESPath)

The --query parameter filters JSON output using JMESPath expressions. This is more reliable than piping to grep because it operates on parsed JSON structure.

Example: Get all running instance IDs

aws ec2 describe-instances \
--query 'Reservations[*].Instances[?State.Name==`running`].InstanceId' \
--output text

i-abc123 i-def456 i-ghi789

JMESPath basics:

  • [*] - flatten array (all elements)
  • [?condition] - filter array elements
  • .field - access object property
  • [0] - access first element

Example: Get instance ID and private IP

aws ec2 describe-instances \
--query 'Reservations[*].Instances[*].[InstanceId,PrivateIpAddress]' \
--output text

i-abc123 10.0.1.5
i-def456 10.0.1.6

Example: Filter by tag

aws ec2 describe-instances \
--query 'Reservations[*].Instances[?Tags[?Key==`Environment` && Value==`production`]].InstanceId' \
--output text

Parsing with jq

For complex JSON transformations beyond --query capabilities, pipe to jq:

Example: Extract nested data

aws ecs describe-services --cluster prod --services payment-api | \
jq -r '.services[0].deployments[] | "\(.status): \(.desiredCount) tasks"'

PRIMARY: 3 tasks

Example: Create CSV output

aws s3api list-objects --bucket my-bucket | \
jq -r '.Contents[] | [.Key, .Size, .LastModified] | @csv'

"file1.txt",1024,"2025-01-15T10:30:00Z"
"file2.txt",2048,"2025-01-15T11:00:00Z"

Example: Conditional formatting

aws cloudwatch describe-alarms | \
jq -r '.MetricAlarms[] | select(.StateValue == "ALARM") | .AlarmName'

For more jq patterns, see the jq manual.
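As a self-contained illustration of the select/format idioms above, the same jq expression can be exercised on canned JSON shaped like describe-services output (all values are made up):

```shell
# Canned JSON standing in for `aws ecs describe-services` output
SAMPLE='{"services":[{"serviceName":"payment-api","deployments":[
  {"status":"PRIMARY","desiredCount":3},
  {"status":"ACTIVE","desiredCount":2}]}]}'

# Pick the PRIMARY deployment and format a one-line summary
SUMMARY=$(echo "$SAMPLE" | \
  jq -r '.services[0].deployments[] | select(.status=="PRIMARY") | "\(.status): \(.desiredCount) tasks"')
echo "$SUMMARY"
```

Because the input is inline, snippets like this double as cheap regression tests for jq filters before pointing them at live API output.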

Pagination

Many AWS APIs return paginated results. The CLI paginates automatically by default, issuing follow-up requests until the full result set is returned; you only see a single page if you disable pagination (--no-paginate) or truncate the output.

Truncated output:

aws s3api list-objects --bucket my-bucket --max-items 100
# Returns the first 100 objects plus a NextToken for resuming

Tuned page size:

aws s3api list-objects --bucket my-bucket --page-size 100
# CLI still fetches all pages, requesting 100 items per API call

What's happening:

  • --max-items: Limits total output (the CLI stops after N items and emits a NextToken you can pass to --starting-token to resume)
  • --page-size: Controls API call size (fetches N items per request, continues until all retrieved)

For scripting, leave automatic pagination enabled so you process the complete result set. Smaller page sizes reduce per-request memory but increase total API calls.
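The token-driven loop behind NextToken / --starting-token can be exercised locally. In this sketch, fetch_page is a mock standing in for a real call such as aws s3api list-objects --starting-token "$TOKEN"; it serves two fixed pages so the loop logic is testable without AWS access:

```shell
# Token-driven pagination loop. fetch_page mocks a paginated API: page one
# returns a NextToken, page two ends the sequence with null.
fetch_page() {
  if [ -z "${1:-}" ]; then
    echo '{"Items":["a","b"],"NextToken":"page2"}'
  else
    echo '{"Items":["c"],"NextToken":null}'
  fi
}

ALL_ITEMS=""
TOKEN=""
while :; do
  PAGE=$(fetch_page "$TOKEN")
  ALL_ITEMS="$ALL_ITEMS $(echo "$PAGE" | jq -r '.Items | join(" ")')"
  TOKEN=$(echo "$PAGE" | jq -r '.NextToken // empty')  # empty string ends loop
  [ -z "$TOKEN" ] && break
done
ALL_ITEMS="${ALL_ITEMS# }"
echo "$ALL_ITEMS"
```

Swapping fetch_page for a real command with --starting-token "$TOKEN" gives a manual pagination loop for the rare APIs or situations where automatic pagination is unavailable.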

Error Handling

The CLI returns exit code 0 on success, non-zero on failure. Always check exit codes in scripts.

Basic error handling:

if aws s3 cp myfile.txt s3://my-bucket/; then
  echo "Upload successful"
else
  echo "Upload failed with exit code $?"
  exit 1
fi

Capture stderr:

ERROR_OUTPUT=$(aws s3 cp myfile.txt s3://my-bucket/ 2>&1)
if [ $? -ne 0 ]; then
  echo "Error: $ERROR_OUTPUT"
  exit 1
fi

Retry with exponential backoff:

retry_count=0
max_retries=3
until aws s3 cp myfile.txt s3://my-bucket/; do
  retry_count=$((retry_count + 1))
  if [ $retry_count -ge $max_retries ]; then
    echo "Failed after $max_retries attempts"
    exit 1
  fi
  sleep_time=$((2 ** retry_count)) # 2, 4, 8 seconds
  echo "Retry $retry_count after $sleep_time seconds"
  sleep $sleep_time
done
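The inline loop can be wrapped as a reusable function: retry N cmd... retries cmd with exponential backoff up to N attempts. Demonstrated here against a mock command that only succeeds on its third call (the real sleep is shortened so the demo runs fast):

```shell
# Generic retry wrapper with exponential backoff.
retry() {
  local max=$1 attempt=0
  shift
  until "$@"; do
    attempt=$((attempt + 1))
    if [ "$attempt" -ge "$max" ]; then
      echo "Failed after $max attempts" >&2
      return 1
    fi
    backoff=$((2 ** attempt))
    echo "Retry $attempt (would back off ${backoff}s)" >&2
    sleep 0.1   # in production: sleep "$backoff"
  done
}

# Mock command: fails on calls 1 and 2, succeeds from call 3 onward
CALLS_FILE=$(mktemp)
echo 0 > "$CALLS_FILE"
flaky() {
  local n=$(($(cat "$CALLS_FILE") + 1))
  echo "$n" > "$CALLS_FILE"
  [ "$n" -ge 3 ]
}

retry 5 flaky && echo "succeeded after $(cat "$CALLS_FILE") calls"
```

In a real script the mock becomes the AWS command, e.g. retry 3 aws s3 cp myfile.txt s3://my-bucket/.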

For resilience patterns, see Spring Boot Resilience.


GitLab CI/CD Integration

The CLI is commonly used in GitLab CI/CD pipelines for deploying to AWS. Secure authentication is critical - never hardcode credentials.

OpenID Connect (OIDC) federation enables GitLab to assume AWS IAM roles without long-term credentials. GitLab issues a JWT token that AWS trusts, exchanging it for temporary credentials.

Why OIDC:

  • No credentials stored in GitLab (eliminates credential leakage risk)
  • Temporary credentials (auto-expire after session)
  • Auditable (CloudTrail logs all AssumeRoleWithWebIdentity calls)
  • Principle of least privilege (role permissions scope exactly what pipeline needs)

Setup: Create OIDC identity provider in AWS

# Terraform configuration
resource "aws_iam_openid_connect_provider" "gitlab" {
  url = "https://gitlab.com"

  client_id_list = [
    "https://gitlab.com"
  ]

  # GitLab's TLS certificate thumbprint
  thumbprint_list = [
    "b3dd7606d2b5a8b4a13771dbecc9ee1cecafa38a" # GitLab.com thumbprint
  ]
}

resource "aws_iam_role" "gitlab_deploy" {
  name = "gitlab-deploy-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = {
        Federated = aws_iam_openid_connect_provider.gitlab.arn
      }
      Action = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          "gitlab.com:sub" = "project_path:my-org/my-project:ref_type:branch:ref:main"
        }
      }
    }]
  })
}

resource "aws_iam_role_policy" "deploy_permissions" {
  role = aws_iam_role.gitlab_deploy.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "ecr:GetAuthorizationToken",
          "ecr:BatchCheckLayerAvailability",
          "ecr:PutImage",
          "ecr:InitiateLayerUpload",
          "ecr:UploadLayerPart",
          "ecr:CompleteLayerUpload"
        ]
        Resource = "*"
      },
      {
        Effect = "Allow"
        Action = [
          "ecs:UpdateService",
          "ecs:DescribeServices"
        ]
        Resource = "arn:aws:ecs:us-east-1:123456789012:service/prod-cluster/payment-api"
      }
    ]
  })
}

GitLab CI/CD pipeline using OIDC:

deploy:
  stage: deploy
  image:
    name: amazon/aws-cli:2.15.30
    entrypoint: [""]
  id_tokens:
    GITLAB_OIDC_TOKEN:
      aud: https://gitlab.com
  before_script:
    # Assume AWS role using GitLab OIDC token
    - >
      export $(printf "AWS_ACCESS_KEY_ID=%s AWS_SECRET_ACCESS_KEY=%s AWS_SESSION_TOKEN=%s"
      $(aws sts assume-role-with-web-identity
      --role-arn arn:aws:iam::123456789012:role/gitlab-deploy-role
      --role-session-name "gitlab-${CI_PIPELINE_ID}"
      --web-identity-token ${GITLAB_OIDC_TOKEN}
      --duration-seconds 3600
      --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]'
      --output text))
  script:
    # Now AWS CLI uses temporary credentials from assumed role
    - aws sts get-caller-identity # Verify authentication
    - aws ecr get-login-password | docker login --username AWS --password-stdin $ECR_REGISTRY
    - docker push $ECR_REGISTRY/payment-api:${CI_COMMIT_SHA}
    - aws ecs update-service --cluster prod-cluster --service payment-api --force-new-deployment
  only:
    - main

What's happening:

  1. GitLab issues JWT token (GITLAB_OIDC_TOKEN) identifying the pipeline
  2. Pipeline calls sts assume-role-with-web-identity with the token
  3. AWS validates token against OIDC provider, returns temporary credentials
  4. Export credentials as environment variables
  5. All subsequent AWS CLI calls use temporary credentials

For IAM role patterns and policy design, see AWS IAM. For complete pipeline patterns, see CI/CD Pipelines.

Environment Variables (Alternative)

For simpler setups (or when OIDC isn't available), store IAM user credentials as GitLab CI/CD variables:

GitLab Settings → CI/CD → Variables:

  • AWS_ACCESS_KEY_ID (protected, masked)
  • AWS_SECRET_ACCESS_KEY (protected, masked)
  • AWS_DEFAULT_REGION

Pipeline configuration:

deploy:
  stage: deploy
  image: amazon/aws-cli:2.15.30
  script:
    - aws sts get-caller-identity # Verify credentials work
    - aws s3 sync ./build s3://my-bucket/
  only:
    - main

Important: This requires creating an IAM user with long-term credentials. OIDC is preferred because it eliminates credential storage and provides automatic rotation.


Service-Specific Operations

ECR (Elastic Container Registry)

ECR stores Docker images. Common operations include authentication, pushing images, and lifecycle management.

Authenticate Docker to ECR:

aws ecr get-login-password --region us-east-1 | \
docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

Create repository:

aws ecr create-repository --repository-name payment-api --region us-east-1

Tag and push image:

docker tag payment-api:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/payment-api:v1.2.3
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/payment-api:v1.2.3

List images:

aws ecr describe-images --repository-name payment-api --region us-east-1 \
--query 'imageDetails[*].[imageTags[0],imagePushedAt]' --output table

Scan image for vulnerabilities:

# Trigger scan
aws ecr start-image-scan --repository-name payment-api --image-id imageTag=v1.2.3

# Wait for scan completion
aws ecr wait image-scan-complete --repository-name payment-api --image-id imageTag=v1.2.3

# Get scan results
aws ecr describe-image-scan-findings --repository-name payment-api --image-id imageTag=v1.2.3 \
--query 'imageScanFindings.findingSeverityCounts'

Delete old images (lifecycle management):

# Get images older than 30 days
aws ecr describe-images --repository-name payment-api \
--query 'imageDetails[?imagePushedAt<`'$(date -d '30 days ago' --iso-8601)'`].[imageDigest]' \
--output text | \
while read digest; do
aws ecr batch-delete-image --repository-name payment-api --image-ids imageDigest=$digest
done

For container image best practices, see Docker Guidelines.

ECS (Elastic Container Service)

ECS runs Docker containers. Common operations include updating services and checking deployment status.

Update service (force new deployment):

aws ecs update-service \
--cluster prod-cluster \
--service payment-api \
--force-new-deployment

Update service with new image:

# Get current task definition
TASK_DEF=$(aws ecs describe-services --cluster prod-cluster --services payment-api \
--query 'services[0].taskDefinition' --output text)

# Register new task definition with updated image
NEW_TASK_DEF=$(aws ecs describe-task-definition --task-definition $TASK_DEF \
--query 'taskDefinition' | \
jq '.containerDefinitions[0].image="123456789012.dkr.ecr.us-east-1.amazonaws.com/payment-api:v1.2.3"' | \
jq 'del(.taskDefinitionArn, .revision, .status, .requiresAttributes, .compatibilities, .registeredAt, .registeredBy)' | \
aws ecs register-task-definition --cli-input-json file:///dev/stdin \
--query 'taskDefinition.taskDefinitionArn' --output text)

# Update service to use new task definition
aws ecs update-service --cluster prod-cluster --service payment-api --task-definition $NEW_TASK_DEF

Check deployment status:

aws ecs describe-services --cluster prod-cluster --services payment-api \
--query 'services[0].deployments[*].[status,desiredCount,runningCount]' \
--output table

Wait for deployment completion:

aws ecs wait services-stable --cluster prod-cluster --services payment-api
echo "Deployment completed successfully"

List running tasks:

aws ecs list-tasks --cluster prod-cluster --service-name payment-api \
--desired-status RUNNING --output text --query 'taskArns[*]'

Get task logs (via CloudWatch):

# Get task details to find log stream
# Identify a running task (its ID names the CloudWatch log stream)
TASK_ID=$(aws ecs list-tasks --cluster prod-cluster --service-name payment-api --desired-status RUNNING \
  --query 'taskArns[0]' --output text | awk -F/ '{print $NF}')

# Fetch recent logs from the service's log group
aws logs tail /ecs/prod-cluster/payment-api --since 10m --follow --format short

For ECS deployment patterns, see AWS Compute.

EKS (Elastic Kubernetes Service)

EKS runs Kubernetes clusters. The CLI primarily configures kubectl access.

Update kubeconfig for cluster access:

aws eks update-kubeconfig --name prod-cluster --region us-east-1

# Verify connectivity
kubectl get nodes

This command:

  • Fetches cluster endpoint and certificate
  • Creates/updates ~/.kube/config with cluster details
  • Configures aws eks get-token as authentication provider
  • Enables kubectl to authenticate via IAM

Get cluster status:

aws eks describe-cluster --name prod-cluster --query 'cluster.status'

List node groups:

aws eks list-nodegroups --cluster-name prod-cluster --output table

Scale node group:

aws eks update-nodegroup-config \
--cluster-name prod-cluster \
--nodegroup-name general-purpose \
--scaling-config desiredSize=5,minSize=3,maxSize=10

For Kubernetes operations, use kubectl after configuring access. See Kubernetes Guidelines and AWS EKS for cluster management patterns.

S3 (Simple Storage Service)

S3 stores objects (files). The CLI provides high-level commands (s3) and low-level API commands (s3api).

Upload file:

aws s3 cp myfile.txt s3://my-bucket/path/to/myfile.txt

Sync a directory to S3:

aws s3 sync ./dist s3://my-bucket/static/ --delete
# --delete removes files in S3 not present locally (keeps S3 in sync)

Download file:

aws s3 cp s3://my-bucket/data.json ./data.json

List bucket contents:

aws s3 ls s3://my-bucket/logs/ --recursive --human-readable --summarize

Generate presigned URL (temporary download link):

aws s3 presign s3://my-bucket/private-file.pdf --expires-in 3600
# Returns URL valid for 1 hour

Set object ACL:

aws s3api put-object-acl --bucket my-bucket --key file.txt --acl private

Enable versioning:

aws s3api put-bucket-versioning --bucket my-bucket --versioning-configuration Status=Enabled

Lifecycle policy (delete old versions):

aws s3api put-bucket-lifecycle-configuration --bucket my-bucket --lifecycle-configuration file://lifecycle.json

lifecycle.json:

{
  "Rules": [{
    "Id": "DeleteOldVersions",
    "Status": "Enabled",
    "NoncurrentVersionExpiration": {
      "NoncurrentDays": 90
    }
  }]
}
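In scripts, such policy documents can be generated with jq -n rather than hand-edited; a sketch mirroring the values above (the /tmp path is just for the demo):

```shell
# Build the lifecycle policy from a variable; jq -n guarantees valid
# quoting and JSON structure.
RETENTION_DAYS=90
jq -n --argjson days "$RETENTION_DAYS" '{
  Rules: [{
    Id: "DeleteOldVersions",
    Status: "Enabled",
    NoncurrentVersionExpiration: { NoncurrentDays: $days }
  }]
}' > /tmp/lifecycle.json

# Then apply it:
#   aws s3api put-bucket-lifecycle-configuration --bucket my-bucket \
#     --lifecycle-configuration file:///tmp/lifecycle.json
```

Generating the document in the script keeps the retention period in one variable instead of duplicated between shell and JSON.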

For comprehensive S3 patterns, see File Storage Guidelines.

RDS (Relational Database Service)

RDS manages relational databases. CLI operations include backups, snapshots, and instance management.

Create snapshot:

aws rds create-db-snapshot \
--db-instance-identifier prod-db \
--db-snapshot-identifier prod-db-$(date +%Y%m%d-%H%M%S)

Restore from snapshot:

aws rds restore-db-instance-from-db-snapshot \
--db-instance-identifier restored-db \
--db-snapshot-identifier prod-db-20250115-100000

Modify instance (change instance type):

aws rds modify-db-instance \
--db-instance-identifier prod-db \
--db-instance-class db.r6g.xlarge \
--apply-immediately

Get connection endpoint:

aws rds describe-db-instances --db-instance-identifier prod-db \
--query 'DBInstances[0].[Endpoint.Address,Endpoint.Port]' --output text

For database patterns, see AWS Databases.

Lambda

Lambda executes serverless functions. CLI operations include invocation and deployment.

Invoke function:

aws lambda invoke \
--function-name payment-processor \
--payload '{"customerId": "123", "amount": 100}' \
--cli-binary-format raw-in-base64-out \
response.json

cat response.json

Update function code:

# Package code
zip function.zip index.js

# Update Lambda function
aws lambda update-function-code \
--function-name payment-processor \
--zip-file fileb://function.zip

Update environment variables:

aws lambda update-function-configuration \
--function-name payment-processor \
--environment Variables="{DB_HOST=db.example.com,LOG_LEVEL=INFO}"

Get function logs:

aws logs tail /aws/lambda/payment-processor --follow

For Lambda patterns, see AWS Compute.


Automation Patterns

Deployment Script Template

deploy.sh:

#!/bin/bash
set -euo pipefail # Exit on error, undefined variable, pipe failure

# Configuration
CLUSTER="prod-cluster"
SERVICE="payment-api"
IMAGE_TAG="${1:-latest}" # Default to 'latest' if no argument
ECR_REGISTRY="123456789012.dkr.ecr.us-east-1.amazonaws.com"
IMAGE="${ECR_REGISTRY}/payment-api:${IMAGE_TAG}"

echo "Deploying ${SERVICE} with image ${IMAGE}"

# Authenticate to ECR
echo "Authenticating to ECR..."
aws ecr get-login-password --region us-east-1 | \
docker login --username AWS --password-stdin "${ECR_REGISTRY}"

# Push image
echo "Pushing image..."
docker tag payment-api:latest "${IMAGE}"
docker push "${IMAGE}"

# Update task definition
echo "Updating task definition..."
TASK_FAMILY="${SERVICE}"
CURRENT_TASK_DEF=$(aws ecs describe-task-definition --task-definition "${TASK_FAMILY}" --query 'taskDefinition')

NEW_TASK_DEF=$(echo "${CURRENT_TASK_DEF}" | \
jq --arg IMAGE "${IMAGE}" '.containerDefinitions[0].image=$IMAGE' | \
jq 'del(.taskDefinitionArn, .revision, .status, .requiresAttributes, .compatibilities, .registeredAt, .registeredBy)')

NEW_TASK_ARN=$(echo "${NEW_TASK_DEF}" | \
aws ecs register-task-definition --cli-input-json file:///dev/stdin \
--query 'taskDefinition.taskDefinitionArn' --output text)

echo "Registered new task definition: ${NEW_TASK_ARN}"

# Update service
echo "Updating service..."
aws ecs update-service \
--cluster "${CLUSTER}" \
--service "${SERVICE}" \
--task-definition "${NEW_TASK_ARN}" \
--query 'service.serviceName' \
--output text

# Wait for deployment
echo "Waiting for deployment to complete..."
aws ecs wait services-stable --cluster "${CLUSTER}" --services "${SERVICE}"

# Verify deployment
RUNNING_COUNT=$(aws ecs describe-services --cluster "${CLUSTER}" --services "${SERVICE}" \
--query 'services[0].runningCount' --output text)

DESIRED_COUNT=$(aws ecs describe-services --cluster "${CLUSTER}" --services "${SERVICE}" \
--query 'services[0].desiredCount' --output text)

if [ "${RUNNING_COUNT}" -eq "${DESIRED_COUNT}" ]; then
  echo "✓ Deployment successful: ${RUNNING_COUNT}/${DESIRED_COUNT} tasks running"
  exit 0
else
  echo "✗ Deployment verification failed: ${RUNNING_COUNT}/${DESIRED_COUNT} tasks running"
  exit 1
fi

Usage:

./deploy.sh v1.2.3

Backup Automation

backup-rds.sh:

#!/bin/bash
set -euo pipefail

DB_INSTANCE="prod-db"
SNAPSHOT_PREFIX="automated-backup"
RETENTION_DAYS=30

# Create snapshot
SNAPSHOT_ID="${SNAPSHOT_PREFIX}-$(date +%Y%m%d-%H%M%S)"
echo "Creating snapshot: ${SNAPSHOT_ID}"

aws rds create-db-snapshot \
--db-instance-identifier "${DB_INSTANCE}" \
--db-snapshot-identifier "${SNAPSHOT_ID}" \
--tags Key=Type,Value=AutomatedBackup Key=CreatedAt,Value="$(date --iso-8601=seconds)"

# Wait for snapshot completion
echo "Waiting for snapshot to complete..."
aws rds wait db-snapshot-completed --db-snapshot-identifier "${SNAPSHOT_ID}"

echo "✓ Snapshot created successfully"

# Delete old snapshots
echo "Cleaning up snapshots older than ${RETENTION_DAYS} days..."
CUTOFF_DATE=$(date -d "${RETENTION_DAYS} days ago" --iso-8601)

aws rds describe-db-snapshots --db-instance-identifier "${DB_INSTANCE}" \
--query "DBSnapshots[?starts_with(DBSnapshotIdentifier, '${SNAPSHOT_PREFIX}')].[DBSnapshotIdentifier,SnapshotCreateTime]" \
--output text | \
while read -r snapshot_id snapshot_time; do
  if [[ "${snapshot_time}" < "${CUTOFF_DATE}" ]]; then
    echo "Deleting old snapshot: ${snapshot_id} (${snapshot_time})"
    aws rds delete-db-snapshot --db-snapshot-identifier "${snapshot_id}"
  fi
done

echo "✓ Backup completed"

Run via cron:

# Run daily at 2 AM
0 2 * * * /path/to/backup-rds.sh >> /var/log/backup-rds.log 2>&1

Multi-Account Operations

cross-account-deploy.sh:

#!/bin/bash
set -euo pipefail

# Assume role in target account
TARGET_ROLE="arn:aws:iam::999999999999:role/DeployRole"
SESSION_NAME="deploy-session-$(date +%s)"

echo "Assuming role: ${TARGET_ROLE}"
CREDENTIALS=$(aws sts assume-role \
--role-arn "${TARGET_ROLE}" \
--role-session-name "${SESSION_NAME}" \
--query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
--output text)

# Export temporary credentials
export AWS_ACCESS_KEY_ID=$(echo "${CREDENTIALS}" | awk '{print $1}')
export AWS_SECRET_ACCESS_KEY=$(echo "${CREDENTIALS}" | awk '{print $2}')
export AWS_SESSION_TOKEN=$(echo "${CREDENTIALS}" | awk '{print $3}')

# Verify assumed identity
echo "Current identity:"
aws sts get-caller-identity

# Perform operations in target account
echo "Deploying to target account..."
aws ecs update-service --cluster prod-cluster --service payment-api --force-new-deployment

echo "✓ Cross-account deployment completed"
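The awk parsing above depends on field order in the text output; keyed access with jq is less fragile. A sketch against a canned response shaped like aws sts assume-role JSON output (all credential values are fake):

```shell
# Canned assume-role response; in the real script this would be
#   CREDS_JSON=$(aws sts assume-role ... --output json)
CREDS_JSON='{"Credentials":{"AccessKeyId":"ASIAEXAMPLE","SecretAccessKey":"secretEXAMPLE","SessionToken":"tokenEXAMPLE"}}'

# Keyed extraction is immune to field reordering in the response
export AWS_ACCESS_KEY_ID=$(echo "$CREDS_JSON" | jq -r '.Credentials.AccessKeyId')
export AWS_SECRET_ACCESS_KEY=$(echo "$CREDS_JSON" | jq -r '.Credentials.SecretAccessKey')
export AWS_SESSION_TOKEN=$(echo "$CREDS_JSON" | jq -r '.Credentials.SessionToken')

echo "$AWS_ACCESS_KEY_ID"
```

Either approach works; the jq form also fails loudly (empty variables) if the response shape changes, rather than silently exporting the wrong field.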

For cross-account IAM patterns, see AWS IAM.


Debugging and Troubleshooting

Debug Output

Enable verbose output to see API calls and responses:

aws s3 ls s3://my-bucket/ --debug 2> debug.log

What's included:

  • API endpoint URLs
  • Request headers and body
  • Response status and headers
  • Credential resolution steps
  • Retry attempts

Use cases:

  • Authentication failures (which credentials are being used?)
  • API errors (what's the exact error message?)
  • Performance issues (how many retries? network latency?)

Credential Verification

# Which credentials is CLI using?
aws sts get-caller-identity

# Output shows:
{
  "UserId": "AIDAI23HXW2EQ",
  "Account": "123456789012",
  "Arn": "arn:aws:iam::123456789012:user/developer"
}

This confirms:

  • Authentication is working
  • Which IAM principal is active
  • Which AWS account you're operating in

CloudTrail for Audit

Every AWS CLI operation creates a CloudTrail event. Use CloudTrail to debug permission issues or unexpected behavior.

Find who deleted an S3 object:

aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=EventName,AttributeValue=DeleteObject \
--max-results 10 \
--query 'Events[*].[EventTime,Username,CloudTrailEvent]' \
--output text

Find all ECS service updates:

aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=EventName,AttributeValue=UpdateService \
--start-time $(date -d '24 hours ago' --iso-8601) \
--query 'Events[*].[EventTime,Username]'

For observability patterns, see AWS Observability.

Common Errors

"Unable to locate credentials"

  • Solution: Configure credentials via aws configure, environment variables, or IAM role

"Access Denied"

  • Solution: Check IAM policy attached to your user/role has required permissions
  • Debug: Run with --debug to see exact API call being denied
  • Verify: Use aws sts get-caller-identity to confirm which principal is active

"Rate exceeded"

  • Solution: AWS API rate limits exceeded, implement exponential backoff retry
  • Prevention: Reduce API call frequency, use pagination properly

"InvalidParameterException"

  • Solution: Check parameter format (e.g., ARNs must be fully qualified)
  • Debug: Run with --debug to see request payload

Best Practices

Security Best Practices

  • Never hardcode credentials: Use IAM roles, OIDC federation, or environment variables
  • Principle of least privilege: Grant only permissions required for specific operations
  • Use temporary credentials: Prefer sts assume-role over long-term access keys
  • Rotate access keys regularly: If IAM users are unavoidable, rotate keys every 90 days
  • Enable MFA for sensitive operations: Require MFA for production deployments

Scripting Best Practices

  • Use set -euo pipefail: Exit on errors, undefined variables, pipe failures
  • Check exit codes: Always validate command success before proceeding
  • Use --query for filtering: More reliable than regex on text output
  • Implement idempotency: Scripts should safely run multiple times
  • Add logging: Echo progress messages for debugging
  • Handle pagination: Use --page-size to fetch all results

Performance Best Practices

  • Batch operations: Use batch-delete-image instead of individual deletes
  • Parallel execution: Use xargs -P or background jobs for independent operations
  • Cache credentials: Assumed role credentials valid for 1 hour (don't re-assume every command)
  • Use local filtering: Filter with --query before downloading large result sets
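The parallel-execution point can be demonstrated locally; here echo stands in for a per-item aws command such as batch-delete-image:

```shell
# Fan out independent operations across 4 workers with xargs -P. The worker
# is a plain echo here; in practice it would be an aws command, e.g.
#   ... | xargs -P 4 -n 1 -I {} aws ecr batch-delete-image \
#           --repository-name payment-api --image-ids imageDigest={}
OUT_FILE=$(mktemp)
printf '%s\n' one two three four | \
  xargs -P 4 -n 1 echo processed > "$OUT_FILE"

sort "$OUT_FILE"   # completion order is nondeterministic; sort for display
```

Only use this for operations that are truly independent; parallel calls against the same resource can trip the "Rate exceeded" errors discussed above.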

Anti-Patterns

Security Anti-Patterns

  • Hardcoded credentials in scripts: Credentials leak via version control, logs, error messages
  • Root account usage: Root has unrestricted access; use IAM users/roles instead
  • Overly permissive policies: "Action": "*" on "Resource": "*" is never acceptable
  • Long-term credentials in CI/CD: Use OIDC federation instead of storing access keys

Scripting Anti-Patterns

  • Ignoring errors: Not checking exit codes leads to cascade failures
  • Parsing text output: Use JSON output and --query or jq for reliability
  • Missing pagination: Only processing first page of results
  • No retry logic: Network failures and rate limits require exponential backoff
  • Synchronous operations: Running serial commands when parallel execution possible

Operational Anti-Patterns

  • Manual deployments: Automate via CI/CD pipelines for consistency and auditability
  • No rollback plan: Always know how to revert deployments
  • Deploying untested changes: Test in lower environments first
  • No deployment verification: Check service health after deployment completes

Summary

The AWS CLI is a powerful tool for automation, scripting, and CI/CD integration:

Key Capabilities:

  • Unified interface for all AWS services
  • Credential management with IAM roles and OIDC federation
  • JSON output parsing with --query and jq
  • Pagination handling for large result sets
  • GitLab CI/CD integration for automated deployments

Common Operations:

  • ECR: Docker image push, vulnerability scanning, lifecycle management
  • ECS: Service updates, deployment verification, task management
  • EKS: Kubeconfig setup, cluster access configuration
  • S3: File upload/download, sync, presigned URLs
  • RDS: Snapshots, backups, instance management
  • Lambda: Function invocation, code deployment, log tailing

Best Practices:

  • Use OIDC federation for CI/CD (no stored credentials)
  • Implement error handling with retries and exponential backoff
  • Filter output with --query before piping to other tools
  • Handle pagination properly to process all results
  • Add idempotency to scripts for safe re-execution

When to Use:

  • GitLab CI/CD pipelines for deployments
  • Operational scripts (backups, migrations, cleanup)
  • Ad-hoc debugging and troubleshooting
  • Rapid prototyping before SDK implementation

When to Use SDK Instead:

  • Application business logic (see AWS SDK Integration)
  • Complex error handling and retry logic
  • Type-safe operations with IDE support
  • Performance-critical operations
