AWS S3 (Simple Storage Service)

Overview

S3 is AWS's object storage service and the foundational building block for cloud storage. S3 stores objects (files) in buckets with a flat namespace, accessible via HTTP/HTTPS. S3 should be your default choice for storing user-generated content, backups, static assets, and data lakes due to its unlimited scalability, 99.999999999% (11 9's) durability, and extensive ecosystem integration.

Unlike file systems, S3 doesn't have folders - it uses object keys with "/" delimiters to simulate directory structures. S3 automatically replicates objects across multiple devices in multiple facilities within a region, providing exceptional durability without requiring manual configuration.

Core Principles

  1. S3 by default - Use S3 for most storage needs unless you specifically require block or file system access
  2. Optimize with storage classes - Use Intelligent-Tiering or lifecycle policies to reduce costs by 80-90%
  3. Always encrypt - Enable default encryption for all buckets (SSE-S3 at minimum)
  4. Block public access - Enable Block Public Access unless you explicitly need public buckets
  5. Use presigned URLs - Prefer presigned URLs over public buckets when clients need temporary access to private objects

S3 Architecture and Key Concepts

Buckets: Top-level containers for objects. Bucket names are globally unique across all AWS accounts. Each bucket belongs to a specific region but can be accessed from anywhere.

Objects: Files stored in S3, consisting of data (the file content) and metadata (key-value pairs describing the object). Objects are identified by keys within a bucket. Maximum object size is 5 TB.

Keys: Unique identifiers for objects within a bucket. Keys can include "/" to simulate folder structures (e.g., users/12345/profile.jpg), but S3 treats the entire key as a single string.

Storage Classes: S3 offers multiple storage classes with different cost and access characteristics. Objects can automatically transition between classes based on lifecycle rules.

Durability and Availability: S3 automatically replicates objects across multiple devices in multiple facilities within a region, providing 99.999999999% (11 9's) durability and 99.99% availability for Standard storage class.

11 9's durability explained: If you store 10,000,000 objects in S3, you can expect to lose one object every 10,000 years on average. This is achieved through automatic replication across multiple devices and facilities within a region.
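
Because S3 treats the entire key as one opaque string, any "folder" view is purely client-side string handling on the "/" delimiter. A minimal sketch of that idea (helper class and method names are hypothetical, for illustration only):

```java
// Hypothetical helper showing that S3 "folders" are just key-string conventions:
// the simulated directory and file name both come from splitting on the last "/"
public class S3KeyUtil {

    // Everything up to and including the last "/" acts as the simulated folder
    public static String parentPrefix(String key) {
        int slash = key.lastIndexOf('/');
        return slash < 0 ? "" : key.substring(0, slash + 1);
    }

    // Everything after the last "/" acts as the simulated file name
    public static String baseName(String key) {
        return key.substring(key.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        String key = "users/12345/profile.jpg";
        System.out.println(parentPrefix(key)); // users/12345/
        System.out.println(baseName(key));     // profile.jpg
    }
}
```

ListObjectsV2 exposes the same convention server-side: passing `delimiter("/")` and a prefix makes S3 group deeper keys into common prefixes, which is how console "folders" are rendered.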

S3 Storage Classes

S3 offers multiple storage classes optimized for different access patterns and cost requirements. Choosing the right storage class can reduce costs by 80-90% for infrequently accessed data.

S3 Standard - Default storage class for frequently accessed data:

  • Cost: $0.023/GB-month (first 50 TB)
  • Latency: Milliseconds
  • Durability: 99.999999999% (11 9's)
  • Availability: 99.99%
  • Retrieval cost: None
  • Use cases: Active user uploads, frequently accessed application data, CDN origins, dynamic website content

S3 Intelligent-Tiering - Automatic cost optimization for unpredictable access patterns:

  • Cost: $0.023/GB-month (frequent access tier) + $0.0025/1000 objects monitoring fee
  • Automatic tiering: Moves objects between frequent and infrequent access tiers based on usage
  • No retrieval fees: Unlike Standard-IA
  • Latency: Milliseconds
  • Use cases: Unknown or changing access patterns, general-purpose storage when you want automatic optimization without lifecycle management complexity

S3 Standard-IA (Infrequent Access) - Lower cost for infrequently accessed data:

  • Cost: $0.0125/GB-month storage + $0.01/GB retrieval
  • Latency: Milliseconds (same as Standard)
  • Minimum storage duration: 30 days
  • Minimum object size: 128 KB (smaller objects charged as 128 KB)
  • Use cases: Backups, disaster recovery data, infrequently accessed user content (old photos)

S3 One Zone-IA - Lower cost, single AZ storage:

  • Cost: $0.01/GB-month storage + $0.01/GB retrieval
  • Latency: Milliseconds
  • Durability: 99.999999999% within a single AZ (data lost if AZ is destroyed)
  • Availability: 99.5% (lower than Standard)
  • Use cases: Reproducible data, thumbnails that can be regenerated, secondary backups

S3 Glacier Instant Retrieval - Archive storage with millisecond retrieval:

  • Cost: $0.004/GB-month storage + $0.03/GB retrieval
  • Latency: Milliseconds (same as Standard)
  • Minimum storage duration: 90 days
  • Use cases: Medical images, news media archives, user-generated content archives accessed quarterly

S3 Glacier Flexible Retrieval - Low-cost archive with retrieval times from minutes to hours:

  • Cost: $0.0036/GB-month storage
  • Retrieval options:
    • Expedited: 1-5 minutes ($0.03/GB + $0.01 per request)
    • Standard: 3-5 hours ($0.01/GB)
    • Bulk: 5-12 hours ($0.0025/GB)
  • Minimum storage duration: 90 days
  • Use cases: Long-term backups, compliance archives, data accessed once or twice per year

S3 Glacier Deep Archive - Lowest cost for long-term archival:

  • Cost: $0.00099/GB-month (~$1/TB-month)
  • Retrieval options:
    • Standard: 12 hours ($0.02/GB)
    • Bulk: 48 hours ($0.0025/GB)
  • Minimum storage duration: 180 days
  • Use cases: Compliance data retained for 7-10 years, regulatory archives, data rarely or never accessed

Choosing storage classes:

  • Unknown access patterns: Use Intelligent-Tiering (simplest, automatic)
  • Frequent access (daily/weekly): Use S3 Standard
  • Infrequent access (monthly): Use Standard-IA or Intelligent-Tiering
  • Archive (quarterly/yearly): Use Glacier Instant Retrieval
  • Compliance/long-term retention: Use Glacier Flexible Retrieval or Deep Archive
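
The decision list above can be captured as a simple lookup. This sketch uses plain strings so it stands alone; in the AWS SDK v2 you would pass the matching `StorageClass` enum value to `PutObjectRequest.builder().storageClass(...)` at upload time. The class name and access-pattern labels are hypothetical:

```java
// Hypothetical rule-of-thumb mapping of access pattern to storage class name,
// mirroring the "Choosing storage classes" list above
public class StorageClassChooser {

    public static String choose(String accessPattern) {
        switch (accessPattern) {
            case "unknown":    return "INTELLIGENT_TIERING"; // automatic optimization
            case "frequent":   return "STANDARD";            // daily/weekly access
            case "infrequent": return "STANDARD_IA";         // monthly access
            case "archive":    return "GLACIER_IR";          // quarterly/yearly access
            case "compliance": return "DEEP_ARCHIVE";        // long-term retention
            default:
                throw new IllegalArgumentException("Unrecognized pattern: " + accessPattern);
        }
    }

    public static void main(String[] args) {
        System.out.println(choose("unknown")); // INTELLIGENT_TIERING
        System.out.println(choose("archive")); // GLACIER_IR
    }
}
```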

Cost comparison example (1 TB for 1 year):

  • S3 Standard: $276
  • S3 Intelligent-Tiering: ~$150 (average with mixed access)
  • S3 Standard-IA: $150
  • S3 Glacier Instant: $48
  • S3 Glacier Flexible: $43.20
  • S3 Glacier Deep Archive: $11.88
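
The figures above are just the per-GB-month rate multiplied out over 12 months for 1,000 GB. A quick sketch of that arithmetic (class name hypothetical; rates are the list prices quoted above and subject to change):

```java
// Reproduces the annual cost comparison above from per-GB-month rates.
// Rates are the published prices quoted in this document; verify current pricing.
public class S3CostMath {

    // Annual cost in dollars for a given size (GB) at a per-GB-month rate, rounded to cents
    public static double annualCost(double gb, double ratePerGbMonth) {
        return Math.round(gb * ratePerGbMonth * 12 * 100) / 100.0;
    }

    public static void main(String[] args) {
        // 1 TB (1,000 GB) stored for one year
        System.out.println("Standard:     $" + annualCost(1000, 0.023));   // 276.0
        System.out.println("Standard-IA:  $" + annualCost(1000, 0.0125));  // 150.0
        System.out.println("Glacier IR:   $" + annualCost(1000, 0.004));   // 48.0
        System.out.println("Glacier Flex: $" + annualCost(1000, 0.0036));  // 43.2
        System.out.println("Deep Archive: $" + annualCost(1000, 0.00099)); // 11.88
    }
}
```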

S3 Lifecycle Policies

Lifecycle policies automatically transition objects between storage classes or delete them based on age, reducing costs without manual intervention. This is one of the most effective cost optimization strategies for S3.

// Configure comprehensive S3 lifecycle policies
@Service
public class S3LifecycleService {

    private final S3Client s3Client;

    public void configureLifecyclePolicy(String bucketName) {
        LifecycleRule rule = LifecycleRule.builder()
            .id("optimize-storage-costs")
            .status(ExpirationStatus.ENABLED)
            .filter(LifecycleRuleFilter.builder()
                .prefix("user-uploads/")
                .build())
            .transitions(
                // Move to Intelligent-Tiering immediately (simplest approach)
                Transition.builder()
                    .days(0)
                    .storageClass(TransitionStorageClass.INTELLIGENT_TIERING)
                    .build()

                // OR use manual transitions for predictable access patterns:
                // Move to Standard-IA after 30 days
                // Transition.builder()
                //     .days(30)
                //     .storageClass(TransitionStorageClass.STANDARD_IA)
                //     .build(),

                // Move to Glacier Instant Retrieval after 90 days
                // Transition.builder()
                //     .days(90)
                //     .storageClass(TransitionStorageClass.GLACIER_IR)
                //     .build(),

                // Move to Glacier Flexible Retrieval after 180 days
                // Transition.builder()
                //     .days(180)
                //     .storageClass(TransitionStorageClass.GLACIER)
                //     .build(),

                // Move to Deep Archive after 365 days
                // Transition.builder()
                //     .days(365)
                //     .storageClass(TransitionStorageClass.DEEP_ARCHIVE)
                //     .build()
            )
            .expiration(LifecycleExpiration.builder()
                .days(2555) // Delete after 7 years
                .build())
            .noncurrentVersionExpiration(NoncurrentVersionExpiration.builder()
                .noncurrentDays(30) // Delete old versions after 30 days
                .build())
            .abortIncompleteMultipartUpload(AbortIncompleteMultipartUpload.builder()
                .daysAfterInitiation(7) // Clean up abandoned multipart uploads
                .build())
            .build();

        s3Client.putBucketLifecycleConfiguration(
            PutBucketLifecycleConfigurationRequest.builder()
                .bucket(bucketName)
                .lifecycleConfiguration(BucketLifecycleConfiguration.builder()
                    .rules(rule)
                    .build())
                .build()
        );

        log.info("Configured lifecycle policy for bucket: {}", bucketName);
    }

    // Configure different policies for different prefixes
    public void configureMultipleLifecyclePolicies(String bucketName) {
        List<LifecycleRule> rules = new ArrayList<>();

        // User uploads - use Intelligent-Tiering
        rules.add(LifecycleRule.builder()
            .id("user-uploads-intelligent-tiering")
            .status(ExpirationStatus.ENABLED)
            .filter(LifecycleRuleFilter.builder()
                .prefix("uploads/")
                .build())
            .transitions(Transition.builder()
                .days(0)
                .storageClass(TransitionStorageClass.INTELLIGENT_TIERING)
                .build())
            .build());

        // Application logs - aggressive archival
        rules.add(LifecycleRule.builder()
            .id("logs-archival")
            .status(ExpirationStatus.ENABLED)
            .filter(LifecycleRuleFilter.builder()
                .prefix("logs/")
                .build())
            .transitions(
                Transition.builder()
                    .days(7)
                    .storageClass(TransitionStorageClass.GLACIER)
                    .build(),
                Transition.builder()
                    .days(90)
                    .storageClass(TransitionStorageClass.DEEP_ARCHIVE)
                    .build()
            )
            .expiration(LifecycleExpiration.builder()
                .days(2555) // 7 years retention
                .build())
            .build());

        // Temporary files - short retention
        rules.add(LifecycleRule.builder()
            .id("temp-files-cleanup")
            .status(ExpirationStatus.ENABLED)
            .filter(LifecycleRuleFilter.builder()
                .prefix("temp/")
                .build())
            .expiration(LifecycleExpiration.builder()
                .days(1) // Delete after 24 hours
                .build())
            .build());

        // Database backups - transition to cheaper storage
        rules.add(LifecycleRule.builder()
            .id("backup-archival")
            .status(ExpirationStatus.ENABLED)
            .filter(LifecycleRuleFilter.builder()
                .prefix("backups/")
                .build())
            .transitions(
                Transition.builder()
                    .days(30)
                    .storageClass(TransitionStorageClass.STANDARD_IA)
                    .build(),
                Transition.builder()
                    .days(90)
                    .storageClass(TransitionStorageClass.GLACIER_IR)
                    .build()
            )
            .expiration(LifecycleExpiration.builder()
                .days(365) // Keep backups for 1 year
                .build())
            .build());

        s3Client.putBucketLifecycleConfiguration(
            PutBucketLifecycleConfigurationRequest.builder()
                .bucket(bucketName)
                .lifecycleConfiguration(BucketLifecycleConfiguration.builder()
                    .rules(rules)
                    .build())
                .build()
        );
    }
}

Lifecycle policy best practices:

  • Use Intelligent-Tiering for unknown access patterns (simplest approach, no manual transitions needed)
  • Transition to Standard-IA after 30 days for known infrequent access
  • Move compliance data directly to Glacier Deep Archive with long retention
  • Always configure abort incomplete multipart uploads (prevents abandoned upload costs)
  • Use versioning with noncurrent version expiration to manage old versions
  • Create separate policies for different prefixes (logs vs user uploads vs backups)

Cost savings example: 1 TB of data with lifecycle policy:

  • Month 0-30: Standard ($23)
  • Month 30-90: Standard-IA ($12.50)
  • Month 90-365: Glacier Flexible ($3.60)
  • Month 365+: Deep Archive ($0.99)
  • Average monthly cost: ~$8 vs $23 without lifecycle management (65% savings)

S3 Security and Access Control

S3 provides multiple layers of security to protect your data. Always enable Block Public Access and encryption by default.

Block Public Access

Block Public Access is an account and bucket-level setting that overrides all other permissions, preventing accidental public exposure.

// Enable block public access at bucket level
@Configuration
public class S3SecurityConfig {

    private final S3Client s3Client;

    public void enforceBlockPublicAccess(String bucketName) {
        s3Client.putPublicAccessBlock(
            PutPublicAccessBlockRequest.builder()
                .bucket(bucketName)
                .publicAccessBlockConfiguration(PublicAccessBlockConfiguration.builder()
                    .blockPublicAcls(true)       // Block public ACLs on this bucket
                    .blockPublicPolicy(true)     // Block public bucket policies
                    .ignorePublicAcls(true)      // Ignore existing public ACLs
                    .restrictPublicBuckets(true) // Restrict cross-account access
                    .build())
                .build()
        );

        log.info("Enabled block public access for bucket: {}", bucketName);
    }

    // Enable at account level (recommended)
    public void enforceAccountLevelBlockPublicAccess() {
        S3ControlClient s3Control = S3ControlClient.create();

        s3Control.putPublicAccessBlock(
            software.amazon.awssdk.services.s3control.model.PutPublicAccessBlockRequest.builder()
                .accountId("123456789012")
                .publicAccessBlockConfiguration(
                    software.amazon.awssdk.services.s3control.model.PublicAccessBlockConfiguration.builder()
                        .blockPublicAcls(true)
                        .blockPublicPolicy(true)
                        .ignorePublicAcls(true)
                        .restrictPublicBuckets(true)
                        .build())
                .build()
        );

        log.info("Enabled block public access at account level");
    }
}

Always enable Block Public Access unless you have a specific reason for public buckets (like static website hosting). Most S3 data breaches occur because of accidentally public buckets.

Bucket Policies

Bucket policies are JSON-based access control policies attached to buckets, defining who can access objects and under what conditions.

// Comprehensive secure bucket policy
@Service
public class S3BucketPolicyService {

    private final S3Client s3Client;

    public void applySecureBucketPolicy(String bucketName, String accountId) {
        String policy = """
            {
              "Version": "2012-10-17",
              "Statement": [
                {
                  "Sid": "DenyInsecureTransport",
                  "Effect": "Deny",
                  "Principal": "*",
                  "Action": "s3:*",
                  "Resource": [
                    "arn:aws:s3:::%s",
                    "arn:aws:s3:::%s/*"
                  ],
                  "Condition": {
                    "Bool": {
                      "aws:SecureTransport": "false"
                    }
                  }
                },
                {
                  "Sid": "DenyUnencryptedObjectUploads",
                  "Effect": "Deny",
                  "Principal": "*",
                  "Action": "s3:PutObject",
                  "Resource": "arn:aws:s3:::%s/*",
                  "Condition": {
                    "StringNotEquals": {
                      "s3:x-amz-server-side-encryption": ["AES256", "aws:kms"]
                    }
                  }
                },
                {
                  "Sid": "RequireMFAForDelete",
                  "Effect": "Deny",
                  "Principal": "*",
                  "Action": [
                    "s3:DeleteObject",
                    "s3:DeleteObjectVersion"
                  ],
                  "Resource": "arn:aws:s3:::%s/*",
                  "Condition": {
                    "BoolIfExists": {
                      "aws:MultiFactorAuthPresent": "false"
                    }
                  }
                }
              ]
            }
            """.formatted(bucketName, bucketName, bucketName, bucketName);

        s3Client.putBucketPolicy(
            PutBucketPolicyRequest.builder()
                .bucket(bucketName)
                .policy(policy)
                .build()
        );

        log.info("Applied secure bucket policy to: {}", bucketName);
    }
}

Bucket policy use cases:

  • Enforce HTTPS-only access (deny non-SSL requests)
  • Require encryption for all uploads
  • Restrict access to specific IP ranges or VPCs
  • Require MFA for sensitive operations (delete)
  • Grant cross-account access

For comprehensive IAM and access control patterns, see AWS IAM and Authorization.

S3 Encryption

S3 supports multiple encryption options for protecting data at rest and in transit. Always enable default encryption for all buckets.

Server-Side Encryption (SSE) - S3 encrypts objects at rest:

SSE-S3 (S3-managed keys):

  • Uses AES-256 encryption
  • S3 manages encryption keys automatically
  • No additional cost
  • Default and recommended option for most use cases

SSE-KMS (KMS-managed keys):

  • Uses AWS KMS for key management
  • Provides audit trail via CloudTrail
  • Additional KMS API costs ($0.03 per 10,000 requests)
  • Required for compliance scenarios needing key rotation and access control

SSE-C (Customer-provided keys):

  • You provide and manage encryption keys
  • Keys must be sent with every request
  • Use only when you have specific requirements to manage keys externally

// Configure S3 with default encryption
@Service
public class S3EncryptionService {

    private final S3Client s3Client;

    // Enable default SSE-S3 encryption
    public void enableDefaultEncryption(String bucketName) {
        s3Client.putBucketEncryption(
            PutBucketEncryptionRequest.builder()
                .bucket(bucketName)
                .serverSideEncryptionConfiguration(
                    ServerSideEncryptionConfiguration.builder()
                        .rules(ServerSideEncryptionRule.builder()
                            .applyServerSideEncryptionByDefault(
                                ServerSideEncryptionByDefault.builder()
                                    .sseAlgorithm(ServerSideEncryption.AES256)
                                    .build())
                            .bucketKeyEnabled(true)
                            .build())
                        .build())
                .build()
        );

        log.info("Enabled SSE-S3 default encryption for bucket: {}", bucketName);
    }

    // Enable SSE-KMS encryption with custom key
    public void enableKMSEncryption(String bucketName, String kmsKeyId) {
        s3Client.putBucketEncryption(
            PutBucketEncryptionRequest.builder()
                .bucket(bucketName)
                .serverSideEncryptionConfiguration(
                    ServerSideEncryptionConfiguration.builder()
                        .rules(ServerSideEncryptionRule.builder()
                            .applyServerSideEncryptionByDefault(
                                ServerSideEncryptionByDefault.builder()
                                    .sseAlgorithm(ServerSideEncryption.AWS_KMS)
                                    .kmsMasterKeyId(kmsKeyId)
                                    .build())
                            .bucketKeyEnabled(true) // Reduces KMS API calls and costs
                            .build())
                        .build())
                .build()
        );

        log.info("Enabled SSE-KMS encryption for bucket: {} with key: {}", bucketName, kmsKeyId);
    }

    // Upload with explicit encryption
    public void uploadEncryptedFile(String bucketName, String key, byte[] data) {
        s3Client.putObject(
            PutObjectRequest.builder()
                .bucket(bucketName)
                .key(key)
                .serverSideEncryption(ServerSideEncryption.AES256)
                .build(),
            RequestBody.fromBytes(data)
        );
    }

    // Upload with SSE-KMS and specific key
    public void uploadWithKMS(String bucketName, String key, byte[] data, String kmsKeyId) {
        s3Client.putObject(
            PutObjectRequest.builder()
                .bucket(bucketName)
                .key(key)
                .serverSideEncryption(ServerSideEncryption.AWS_KMS)
                .ssekmsKeyId(kmsKeyId)
                .build(),
            RequestBody.fromBytes(data)
        );
    }
}

Bucket Key - Reduces KMS costs by using a bucket-level key that changes periodically, rather than calling KMS for every object. Enable this with bucketKeyEnabled(true) when using SSE-KMS.

For comprehensive secrets management and encryption key handling, see Secrets Management.

S3 Versioning and Object Lock

Versioning

Versioning preserves all versions of objects, protecting against accidental deletion and overwrites. When versioning is enabled, deleting an object creates a delete marker rather than permanently removing it.

// Enable and manage versioning
@Service
public class S3VersioningService {

    private final S3Client s3Client;

    public void enableVersioning(String bucketName) {
        s3Client.putBucketVersioning(
            PutBucketVersioningRequest.builder()
                .bucket(bucketName)
                .versioningConfiguration(VersioningConfiguration.builder()
                    .status(BucketVersioningStatus.ENABLED)
                    .build())
                .build()
        );

        log.info("Enabled versioning for bucket: {}", bucketName);
    }

    // List all versions of an object
    public List<ObjectVersion> listVersions(String bucketName, String key) {
        ListObjectVersionsResponse response = s3Client.listObjectVersions(
            ListObjectVersionsRequest.builder()
                .bucket(bucketName)
                .prefix(key)
                .build()
        );

        // Note: delete markers are returned separately via response.deleteMarkers()
        return response.versions();
    }

    // Restore previous version by copying it
    public void restorePreviousVersion(String bucketName, String key, String versionId) {
        s3Client.copyObject(
            CopyObjectRequest.builder()
                .sourceBucket(bucketName)
                .sourceKey(key)
                .sourceVersionId(versionId)
                .destinationBucket(bucketName)
                .destinationKey(key)
                .build()
        );

        log.info("Restored version {} for object {}", versionId, key);
    }

    // Permanently delete a specific version
    public void permanentlyDeleteVersion(String bucketName, String key, String versionId) {
        s3Client.deleteObject(
            DeleteObjectRequest.builder()
                .bucket(bucketName)
                .key(key)
                .versionId(versionId)
                .build()
        );

        log.info("Permanently deleted version {} of object {}", versionId, key);
    }

    // Delete all versions of an object (permanent deletion)
    public void permanentlyDeleteAllVersions(String bucketName, String key) {
        List<ObjectVersion> versions = listVersions(bucketName, key);

        for (ObjectVersion version : versions) {
            permanentlyDeleteVersion(bucketName, key, version.versionId());
        }

        log.info("Permanently deleted all versions of object: {}", key);
    }
}

Versioning costs: You pay for storage of all versions. Use lifecycle policies with NoncurrentVersionExpiration to delete old versions automatically:

// Lifecycle rule to clean up old versions
LifecycleRule versioningRule = LifecycleRule.builder()
    .id("cleanup-old-versions")
    .status(ExpirationStatus.ENABLED)
    .noncurrentVersionExpiration(NoncurrentVersionExpiration.builder()
        .noncurrentDays(30)         // Delete noncurrent versions after 30 days
        .newerNoncurrentVersions(3) // Keep 3 most recent noncurrent versions
        .build())
    .noncurrentVersionTransitions(
        NoncurrentVersionTransition.builder()
            .noncurrentDays(7)
            .storageClass(TransitionStorageClass.GLACIER)
            .build()
    )
    .build();

When to use versioning:

  • Critical data that must be protected from accidental deletion
  • Compliance requirements for audit trails
  • Collaborative environments where multiple users modify objects
  • Applications requiring rollback capabilities

Object Lock

Object Lock provides Write-Once-Read-Many (WORM) protection for compliance, preventing object deletion or modification for a specified retention period.

// Enable Object Lock (must be enabled at bucket creation)
@Service
public class S3ObjectLockService {

    private final S3Client s3Client;

    public String createBucketWithObjectLock(String bucketName) {
        s3Client.createBucket(
            CreateBucketRequest.builder()
                .bucket(bucketName)
                .objectLockEnabledForBucket(true) // Must be set at creation
                .build()
        );

        // Configure default retention
        s3Client.putObjectLockConfiguration(
            PutObjectLockConfigurationRequest.builder()
                .bucket(bucketName)
                .objectLockConfiguration(ObjectLockConfiguration.builder()
                    .objectLockEnabled(ObjectLockEnabled.ENABLED)
                    .rule(ObjectLockRule.builder()
                        .defaultRetention(DefaultRetention.builder()
                            .mode(ObjectLockRetentionMode.COMPLIANCE)
                            .days(2555) // 7 years
                            .build())
                        .build())
                    .build())
                .build()
        );

        log.info("Created bucket {} with Object Lock enabled", bucketName);
        return bucketName;
    }

    // Upload object with specific retention
    public void uploadWithRetention(String bucketName, String key, byte[] data, int retentionDays) {
        Instant retainUntil = Instant.now().plus(retentionDays, ChronoUnit.DAYS);

        s3Client.putObject(
            PutObjectRequest.builder()
                .bucket(bucketName)
                .key(key)
                .objectLockMode(ObjectLockMode.COMPLIANCE)
                .objectLockRetainUntilDate(retainUntil)
                .build(),
            RequestBody.fromBytes(data)
        );

        log.info("Uploaded {} with {} day retention until {}", key, retentionDays, retainUntil);
    }

    // Apply legal hold (indefinite retention)
    public void applyLegalHold(String bucketName, String key) {
        s3Client.putObjectLegalHold(
            PutObjectLegalHoldRequest.builder()
                .bucket(bucketName)
                .key(key)
                .legalHold(ObjectLockLegalHold.builder()
                    .status(ObjectLockLegalHoldStatus.ON)
                    .build())
                .build()
        );

        log.info("Applied legal hold to object: {}", key);
    }

    // Remove legal hold
    public void removeLegalHold(String bucketName, String key) {
        s3Client.putObjectLegalHold(
            PutObjectLegalHoldRequest.builder()
                .bucket(bucketName)
                .key(key)
                .legalHold(ObjectLockLegalHold.builder()
                    .status(ObjectLockLegalHoldStatus.OFF)
                    .build())
                .build()
        );

        log.info("Removed legal hold from object: {}", key);
    }
}

Retention modes:

  • Compliance mode: No one (including root user) can delete or modify the object during retention period. Cannot be shortened.
  • Governance mode: Users with special permissions (s3:BypassGovernanceRetention) can delete objects or shorten retention periods.

Legal hold: Independent of retention periods, provides indefinite WORM protection until explicitly removed. Useful for litigation or investigations.

Object Lock use cases:

  • Financial records retention (SEC requirements)
  • Healthcare data (HIPAA compliance)
  • Legal documents under litigation hold
  • Regulatory compliance requiring immutable records

For disaster recovery and backup strategies, see Disaster Recovery.

S3 Performance Optimization

S3 automatically scales to handle high request rates, but understanding performance characteristics helps optimize application design.

Request Rate Performance

S3 supports at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per partitioned prefix. A prefix is the portion of the object key between the bucket name and object name.

Examples of prefixes:

  • my-bucket/users/12345/photo.jpg - prefix is users/12345/
  • my-bucket/2024/01/15/log.txt - prefix is 2024/01/15/

To achieve higher aggregate throughput, distribute objects across multiple prefixes:

// Distribute objects across prefixes for higher throughput
@Service
public class S3HighThroughputService {

    private final S3Client s3Client;

    // Distribute uploads across multiple prefixes using hash
    public String generateOptimizedKey(String userId, String filename) {
        // Use 4 hex chars derived from the user ID hash as a distribution prefix
        // (zero-padded so short hash values don't break the format)
        String hashPrefix = String.format("%04x", userId.hashCode() & 0xffff);

        // Results in keys like: a3f2/users/12345/file.jpg
        // This distributes load across 65,536 prefixes (16^4)
        return String.format("%s/users/%s/%s", hashPrefix, userId, filename);
    }

    // Alternative: use date-based prefixes for time-series data
    public String generateDateBasedKey(String category, String filename) {
        LocalDate now = LocalDate.now();

        // Results in keys like: 2024/01/15/logs/app.log
        // Each day gets its own prefix
        return String.format("%d/%02d/%02d/%s/%s",
            now.getYear(), now.getMonthValue(), now.getDayOfMonth(),
            category, filename);
    }
}

Why prefix distribution matters: If all your uploads go to the same prefix and you exceed 3,500 PUT/s, requests will be throttled. By distributing across multiple prefixes, each prefix gets its own 3,500 PUT/s limit.

Multipart Upload

Multipart upload splits large files into parts and uploads them in parallel, improving performance and reliability for files larger than 100 MB.

// Multipart upload for large files
@Service
public class S3MultipartUploadService {

    private final S3Client s3Client;
    private static final long PART_SIZE = 10 * 1024 * 1024; // 10 MB

    public void uploadLargeFile(String bucketName, String key, File file) throws IOException {
        if (file.length() < 100 * 1024 * 1024) {
            // Small file - use regular upload
            uploadDirect(bucketName, key, file);
        } else {
            // Large file - use multipart upload
            uploadMultipart(bucketName, key, file);
        }
    }

    private void uploadDirect(String bucketName, String key, File file) {
        s3Client.putObject(
            PutObjectRequest.builder()
                .bucket(bucketName)
                .key(key)
                .build(),
            RequestBody.fromFile(file)
        );
    }

    private void uploadMultipart(String bucketName, String key, File file) throws IOException {
        // Initiate multipart upload
        CreateMultipartUploadResponse initResponse = s3Client.createMultipartUpload(
            CreateMultipartUploadRequest.builder()
                .bucket(bucketName)
                .key(key)
                .serverSideEncryption(ServerSideEncryption.AES256)
                .build()
        );

        String uploadId = initResponse.uploadId();
        List<CompletedPart> completedParts = new ArrayList<>();

        try {
            long fileLength = file.length();
            int partNumber = 1;

            try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
                for (long position = 0; position < fileLength; position += PART_SIZE) {
                    long currentPartSize = Math.min(PART_SIZE, fileLength - position);

                    byte[] buffer = new byte[(int) currentPartSize];
                    raf.seek(position);
                    raf.readFully(buffer);

                    UploadPartResponse partResponse = s3Client.uploadPart(
                        UploadPartRequest.builder()
                            .bucket(bucketName)
                            .key(key)
                            .uploadId(uploadId)
                            .partNumber(partNumber)
                            .build(),
                        RequestBody.fromBytes(buffer)
                    );

                    completedParts.add(CompletedPart.builder()
                        .partNumber(partNumber)
                        .eTag(partResponse.eTag())
                        .build());

                    log.info("Uploaded part {} of {}", partNumber,
                        (fileLength + PART_SIZE - 1) / PART_SIZE);

                    partNumber++;
                }
            }

            // Complete multipart upload
            s3Client.completeMultipartUpload(
                CompleteMultipartUploadRequest.builder()
                    .bucket(bucketName)
                    .key(key)
                    .uploadId(uploadId)
                    .multipartUpload(CompletedMultipartUpload.builder()
                        .parts(completedParts)
                        .build())
                    .build()
            );

            log.info("Completed multipart upload for {}", key);

        } catch (Exception e) {
            // Abort upload on failure
            s3Client.abortMultipartUpload(
                AbortMultipartUploadRequest.builder()
                    .bucket(bucketName)
                    .key(key)
                    .uploadId(uploadId)
                    .build()
            );

            log.error("Multipart upload failed, aborted upload ID: {}", uploadId);
            throw new RuntimeException("Failed to upload file", e);
        }
    }

    // Upload parts in parallel for maximum performance
    public void uploadMultipartParallel(String bucketName, String key, File file)
            throws IOException, InterruptedException, ExecutionException {

        CreateMultipartUploadResponse initResponse = s3Client.createMultipartUpload(
            CreateMultipartUploadRequest.builder()
                .bucket(bucketName)
                .key(key)
                .serverSideEncryption(ServerSideEncryption.AES256)
                .build()
        );

        String uploadId = initResponse.uploadId();
        long fileLength = file.length();
        int numParts = (int) ((fileLength + PART_SIZE - 1) / PART_SIZE);

        ExecutorService executor = Executors.newFixedThreadPool(10);
        List<Future<CompletedPart>> futures = new ArrayList<>();

        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            for (int partNumber = 1; partNumber <= numParts; partNumber++) {
                long position = (partNumber - 1) * PART_SIZE;
                long currentPartSize = Math.min(PART_SIZE, fileLength - position);

                byte[] buffer = new byte[(int) currentPartSize];
                raf.seek(position);
                raf.readFully(buffer);

                int finalPartNumber = partNumber;
                Future<CompletedPart> future = executor.submit(() -> {
                    UploadPartResponse response = s3Client.uploadPart(
                        UploadPartRequest.builder()
                            .bucket(bucketName)
                            .key(key)
                            .uploadId(uploadId)
                            .partNumber(finalPartNumber)
                            .build(),
                        RequestBody.fromBytes(buffer)
                    );

                    return CompletedPart.builder()
                        .partNumber(finalPartNumber)
                        .eTag(response.eTag())
                        .build();
                });

                futures.add(future);
            }

            // Wait for all uploads to complete
            List<CompletedPart> completedParts = new ArrayList<>();
            for (Future<CompletedPart> future : futures) {
                completedParts.add(future.get());
            }

            // Sort by part number
            completedParts.sort(Comparator.comparingInt(CompletedPart::partNumber));

            // Complete multipart upload
            s3Client.completeMultipartUpload(
                CompleteMultipartUploadRequest.builder()
                    .bucket(bucketName)
                    .key(key)
                    .uploadId(uploadId)
                    .multipartUpload(CompletedMultipartUpload.builder()
                        .parts(completedParts)
                        .build())
                    .build()
            );

            log.info("Completed parallel multipart upload for {}", key);

        } catch (Exception e) {
            s3Client.abortMultipartUpload(
                AbortMultipartUploadRequest.builder()
                    .bucket(bucketName)
                    .key(key)
                    .uploadId(uploadId)
                    .build()
            );
            throw e;
        } finally {
            executor.shutdown();
        }
    }
}

Multipart upload benefits:

  • Improved throughput: Upload parts in parallel
  • Quick recovery: Resume from failed parts instead of restarting entire upload
  • Required for files > 5 GB: Single PUT operations are limited to 5 GB
  • Recommended for files > 100 MB: Better performance and reliability
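
Part sizing interacts with two hard limits: an upload can have at most 10,000 parts, and every part except the last must be at least 5 MiB, so very large files force a larger part size. A sketch of that arithmetic (helper class name is hypothetical):

```java
// Part-size arithmetic behind multipart uploads: S3 allows at most 10,000 parts
// per upload and a 5 MiB minimum part size (the last part may be smaller)
public class MultipartMath {

    static final long MIN_PART_SIZE = 5L * 1024 * 1024; // 5 MiB
    static final int MAX_PARTS = 10_000;

    // Smallest part size that keeps the upload within the 10,000-part limit
    public static long minimumPartSize(long fileSize) {
        long required = (fileSize + MAX_PARTS - 1) / MAX_PARTS; // ceil(fileSize / 10000)
        return Math.max(required, MIN_PART_SIZE);
    }

    // Number of parts for a given file and part size
    public static int partCount(long fileSize, long partSize) {
        return (int) ((fileSize + partSize - 1) / partSize);
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;
        // A 1 GiB file at 10 MiB parts needs 103 parts
        System.out.println(partCount(oneGiB, 10L * 1024 * 1024)); // 103
        // A 100 GiB file cannot use 5 MiB parts: ceil(100 GiB / 10000) is above the floor
        System.out.println(minimumPartSize(100L * oneGiB) / (1024 * 1024)); // 10
    }
}
```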

Transfer Acceleration

Transfer Acceleration uploads files to CloudFront edge locations (closer to users), then transfers them to S3 over AWS's optimized network. This improves upload speeds for users far from the S3 bucket region.

// Enable Transfer Acceleration
@Service
public class S3TransferAccelerationService {

    private final S3Client s3Client;

    public void enableTransferAcceleration(String bucketName) {
        s3Client.putBucketAccelerateConfiguration(
            PutBucketAccelerateConfigurationRequest.builder()
                .bucket(bucketName)
                .accelerateConfiguration(AccelerateConfiguration.builder()
                    .status(BucketAccelerateStatus.ENABLED)
                    .build())
                .build()
        );

        log.info("Enabled Transfer Acceleration for bucket: {}", bucketName);
    }

    // Upload using the accelerated endpoint
    public void uploadWithAcceleration(String bucketName, String key, File file) {
        // Enable accelerate mode on the client; the SDK then routes requests
        // through the bucket's s3-accelerate endpoint automatically
        S3Client acceleratedClient = S3Client.builder()
            .serviceConfiguration(S3Configuration.builder()
                .accelerateModeEnabled(true)
                .build())
            .build();

        acceleratedClient.putObject(
            PutObjectRequest.builder()
                .bucket(bucketName)
                .key(key)
                .build(),
            RequestBody.fromFile(file)
        );

        log.info("Uploaded {} using Transfer Acceleration", key);
    }
}

Transfer Acceleration pricing: An additional per-GB transfer charge applies, on the order of $0.04-$0.08 per GB depending on which edge location serves the transfer. Test with the S3 Transfer Acceleration Speed Comparison tool to see if it's beneficial for your use case; AWS only charges for accelerated transfers that are actually faster than the standard path.

Byte-Range Fetches

Download specific byte ranges of large objects for parallel downloads or partial retrieval.

// Download specific byte range
@Service
public class S3RangeFetchService {

    private final S3Client s3Client;

    public byte[] downloadRange(String bucketName, String key, long start, long end) {
        // getObject with ResponseTransformer.toBytes() returns ResponseBytes,
        // which wraps the response metadata together with the body
        ResponseBytes<GetObjectResponse> objectBytes = s3Client.getObject(
            GetObjectRequest.builder()
                .bucket(bucketName)
                .key(key)
                .range(String.format("bytes=%d-%d", start, end))
                .build(),
            ResponseTransformer.toBytes()
        );

        return objectBytes.asByteArray();
    }

    // Parallel downloads using byte ranges
    public void downloadLargeFileParallel(String bucketName, String key, File outputFile)
            throws InterruptedException, ExecutionException, IOException {

        // Get object size
        HeadObjectResponse metadata = s3Client.headObject(
            HeadObjectRequest.builder()
                .bucket(bucketName)
                .key(key)
                .build()
        );

        long fileSize = metadata.contentLength();
        long partSize = 10 * 1024 * 1024; // 10 MB per thread
        int numParts = (int) ((fileSize + partSize - 1) / partSize);

        ExecutorService executor = Executors.newFixedThreadPool(10);
        List<Future<byte[]>> futures = new ArrayList<>();

        for (int i = 0; i < numParts; i++) {
            long start = i * partSize;
            long end = Math.min(start + partSize - 1, fileSize - 1);

            Future<byte[]> future = executor.submit(() ->
                downloadRange(bucketName, key, start, end)
            );

            futures.add(future);
        }

        // Write parts to file in order
        try (FileOutputStream fos = new FileOutputStream(outputFile)) {
            for (Future<byte[]> future : futures) {
                fos.write(future.get());
            }
        }

        executor.shutdown();
        log.info("Downloaded {} in parallel to {}", key, outputFile);
    }

    // Download only first N bytes (e.g., file header)
    public byte[] downloadHeader(String bucketName, String key, int headerSize) {
        return downloadRange(bucketName, key, 0, headerSize - 1);
    }
}

Byte-range fetch use cases:

  • Parallel downloads for large files
  • Streaming video (download chunks as needed)
  • Reading file headers without downloading entire file
  • Resume interrupted downloads
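The resume case in the list above can be sketched with an open-ended `Range` header ("bytes=N-" fetches from offset N to the end of the object). `resumeRange` and `resumeDownload` are hypothetical helpers, and the sketch assumes the partial file already on disk is intact:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

import software.amazon.awssdk.core.ResponseBytes;
import software.amazon.awssdk.core.sync.ResponseTransformer;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.model.GetObjectResponse;

public class S3ResumeDownload {

    // Open-ended Range header: "bytes=N-" fetches from offset N to the end
    static String resumeRange(long bytesAlreadyDownloaded) {
        return String.format("bytes=%d-", bytesAlreadyDownloaded);
    }

    // Resume by appending only the missing tail to the partial file
    public static void resumeDownload(S3Client s3, String bucket, String key,
                                      File partialFile) throws IOException {
        long alreadyHave = partialFile.exists() ? partialFile.length() : 0;

        ResponseBytes<GetObjectResponse> tail = s3.getObject(
                GetObjectRequest.builder()
                        .bucket(bucket)
                        .key(key)
                        .range(resumeRange(alreadyHave))
                        .build(),
                ResponseTransformer.toBytes());

        // Append mode preserves the bytes already on disk
        try (FileOutputStream fos = new FileOutputStream(partialFile, true)) {
            fos.write(tail.asByteArray());
        }
    }
}
```

For robustness, pair this with an `If-Match` condition on the object's ETag so a resumed download fails fast if the object changed between attempts.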

S3 Presigned URLs

Presigned URLs provide temporary access to S3 objects without requiring AWS credentials. This is the recommended pattern for allowing client applications to upload and download files directly to/from S3.

// Generate presigned URLs for uploads and downloads
@Service
public class S3PresignedUrlService {

private final S3Client s3Client; // needed for the multipart create/complete calls below
private final S3Presigner s3Presigner;
private final FileMetadataRepository fileMetadataRepository;

@Value("${s3.bucket.name}")
private String bucketName;

// Generate presigned URL for upload
public PresignedUploadResponse generateUploadUrl(String userId, String filename,
String contentType, long fileSize) {
// Validate file size
if (fileSize > 100 * 1024 * 1024) { // 100 MB limit
throw new IllegalArgumentException("File size exceeds maximum allowed");
}

// Generate unique key
String fileId = UUID.randomUUID().toString();
String key = String.format("uploads/%s/%s", userId, fileId);

// Create metadata record
FileMetadata metadata = new FileMetadata();
metadata.setId(UUID.fromString(fileId));
metadata.setStorageKey(key);
metadata.setOriginalFilename(filename);
metadata.setContentType(contentType);
metadata.setSizeBytes(fileSize);
metadata.setUploadedBy(userId);
metadata.setStatus(FileStatus.PENDING);

fileMetadataRepository.save(metadata);

// Generate presigned PUT URL
PutObjectRequest putRequest = PutObjectRequest.builder()
.bucket(bucketName)
.key(key)
.contentType(contentType)
.metadata(Map.of(
"user-id", userId,
"original-filename", filename
))
.build();

PutObjectPresignRequest presignRequest = PutObjectPresignRequest.builder()
.signatureDuration(Duration.ofMinutes(15))
.putObjectRequest(putRequest)
.build();

PresignedPutObjectRequest presigned = s3Presigner.presignPutObject(presignRequest);

return new PresignedUploadResponse(
fileId,
presigned.url().toString(),
presigned.expiration()
);
}

// Generate presigned URL for download
public String generateDownloadUrl(String fileId, String userId) {
// Verify user has access
FileMetadata metadata = fileMetadataRepository.findById(UUID.fromString(fileId))
.orElseThrow(() -> new FileNotFoundException(fileId));

if (!metadata.getUploadedBy().equals(userId)) {
throw new AccessDeniedException("User does not have access to this file");
}

if (metadata.getStatus() != FileStatus.AVAILABLE) {
throw new IllegalStateException("File is not available for download");
}

GetObjectRequest getRequest = GetObjectRequest.builder()
.bucket(bucketName)
.key(metadata.getStorageKey())
.responseContentDisposition(
String.format("attachment; filename=\"%s\"",
metadata.getOriginalFilename()))
.responseContentType(metadata.getContentType())
.build();

GetObjectPresignRequest presignRequest = GetObjectPresignRequest.builder()
.signatureDuration(Duration.ofMinutes(15))
.getObjectRequest(getRequest)
.build();

PresignedGetObjectRequest presigned = s3Presigner.presignGetObject(presignRequest);

return presigned.url().toString();
}

// Generate presigned URL for multipart upload
public MultipartUploadUrls generateMultipartUploadUrls(String userId, String filename,
String contentType, int numParts) {
String fileId = UUID.randomUUID().toString();
String key = String.format("uploads/%s/%s", userId, fileId);

// Persist a metadata record so completeMultipartUpload can look it up later
FileMetadata metadata = new FileMetadata();
metadata.setId(UUID.fromString(fileId));
metadata.setStorageKey(key);
metadata.setOriginalFilename(filename);
metadata.setContentType(contentType);
metadata.setUploadedBy(userId);
metadata.setStatus(FileStatus.PENDING);
fileMetadataRepository.save(metadata);

// Initiate multipart upload
CreateMultipartUploadRequest createRequest = CreateMultipartUploadRequest.builder()
.bucket(bucketName)
.key(key)
.contentType(contentType)
.build();

CreateMultipartUploadResponse createResponse =
s3Client.createMultipartUpload(createRequest);

String uploadId = createResponse.uploadId();

// Generate presigned URLs for each part
List<String> partUrls = new ArrayList<>();
for (int partNumber = 1; partNumber <= numParts; partNumber++) {
UploadPartRequest uploadPartRequest = UploadPartRequest.builder()
.bucket(bucketName)
.key(key)
.uploadId(uploadId)
.partNumber(partNumber)
.build();

UploadPartPresignRequest presignRequest = UploadPartPresignRequest.builder()
.signatureDuration(Duration.ofHours(1))
.uploadPartRequest(uploadPartRequest)
.build();

PresignedUploadPartRequest presigned =
s3Presigner.presignUploadPart(presignRequest);

partUrls.add(presigned.url().toString());
}

return new MultipartUploadUrls(fileId, uploadId, partUrls);
}

// Complete multipart upload
public void completeMultipartUpload(String fileId, String uploadId,
List<CompletedPartDto> parts) {
FileMetadata metadata = fileMetadataRepository.findById(UUID.fromString(fileId))
.orElseThrow(() -> new FileNotFoundException(fileId));

List<CompletedPart> completedParts = parts.stream()
.map(p -> CompletedPart.builder()
.partNumber(p.getPartNumber())
.eTag(p.getETag())
.build())
.toList();

s3Client.completeMultipartUpload(
CompleteMultipartUploadRequest.builder()
.bucket(bucketName)
.key(metadata.getStorageKey())
.uploadId(uploadId)
.multipartUpload(CompletedMultipartUpload.builder()
.parts(completedParts)
.build())
.build()
);

metadata.setStatus(FileStatus.AVAILABLE);
fileMetadataRepository.save(metadata);
}
}

Frontend implementation using presigned URLs:

// Upload file directly to S3 using presigned URL
async function uploadToS3(file: File, userId: string) {
// Request presigned URL from backend
const response = await fetch('/api/files/upload-url', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
filename: file.name,
contentType: file.type,
size: file.size,
userId
})
});

const { fileId, presignedUrl, expiration } = await response.json();

// Upload directly to S3 (bypasses backend)
await fetch(presignedUrl, {
method: 'PUT',
headers: {
'Content-Type': file.type,
},
body: file
});

// Notify backend that upload is complete
await fetch(`/api/files/${fileId}/complete`, {
method: 'POST'
});

return fileId;
}

// Download file via presigned URL
async function downloadFromS3(fileId: string) {
const response = await fetch(`/api/files/${fileId}/download`);
const { presignedUrl } = await response.json();

// Browser downloads directly from S3
window.location.href = presignedUrl;
}

// Multipart upload for large files
async function uploadLargeFile(file: File, userId: string) {
const partSize = 10 * 1024 * 1024; // 10 MB parts
const numParts = Math.ceil(file.size / partSize);

// Get presigned URLs for all parts
const response = await fetch('/api/files/multipart-upload-urls', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
filename: file.name,
contentType: file.type,
numParts,
userId
})
});

const { fileId, uploadId, partUrls } = await response.json();

// Upload each part
const uploadedParts = [];
for (let i = 0; i < numParts; i++) {
const start = i * partSize;
const end = Math.min(start + partSize, file.size);
const part = file.slice(start, end);

const partResponse = await fetch(partUrls[i], {
method: 'PUT',
body: part
});

// The browser can only read ETag if the bucket's CORS config lists it in ExposeHeaders
const etag = partResponse.headers.get('ETag');
uploadedParts.push({
partNumber: i + 1,
eTag: etag
});
}

// Complete multipart upload
await fetch(`/api/files/${fileId}/complete-multipart`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
uploadId,
parts: uploadedParts
})
});

return fileId;
}

Presigned URL security best practices:

  • Keep expiration times short (15 minutes for uploads, 5-15 minutes for downloads)
  • Validate file metadata (size, content type) before generating URLs
  • Check user permissions before generating download URLs
  • Log all presigned URL generation for audit trails
  • Use HTTPS-only URLs
  • Include content-type in upload presigned URLs to prevent type confusion
  • Remember that a presigned PUT does not enforce the declared file size - verify the actual object size server-side after the upload completes
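Server-side verification of an uploaded object can be done with a HeadObject call, which reports the real stored size regardless of what the client declared when requesting the URL. This is a sketch; `UploadVerifier` and its method names are illustrative, not part of the service above:

```java
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.DeleteObjectRequest;
import software.amazon.awssdk.services.s3.model.HeadObjectRequest;
import software.amazon.awssdk.services.s3.model.HeadObjectResponse;

public class UploadVerifier {

    // Pure check: stored size must equal the size declared when the URL was issued
    static boolean sizeMatches(long actualBytes, long declaredBytes) {
        return actualBytes == declaredBytes;
    }

    // Verify the object before marking the upload complete; delete it on mismatch
    public static boolean verifyOrDelete(S3Client s3, String bucket, String key,
                                         long declaredBytes) {
        HeadObjectResponse head = s3.headObject(HeadObjectRequest.builder()
                .bucket(bucket)
                .key(key)
                .build());

        if (!sizeMatches(head.contentLength(), declaredBytes)) {
            // Reject oversized or truncated uploads rather than serving them
            s3.deleteObject(DeleteObjectRequest.builder()
                    .bucket(bucket)
                    .key(key)
                    .build());
            return false;
        }
        return true;
    }
}
```

A natural place to call this is the "/complete" endpoint the frontend hits after the direct-to-S3 upload, before flipping the metadata record to AVAILABLE.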

For comprehensive file upload/download patterns, see File Storage.

S3 Event Notifications

S3 can trigger notifications when objects are created, deleted, or restored, enabling event-driven architectures.

// Configure S3 event notifications
@Service
public class S3EventService {

private final S3Client s3Client;

// Configure notifications to Lambda
public void configureEventNotifications(String bucketName, String lambdaArn) {
s3Client.putBucketNotificationConfiguration(
PutBucketNotificationConfigurationRequest.builder()
.bucket(bucketName)
.notificationConfiguration(NotificationConfiguration.builder()
.lambdaFunctionConfigurations(
LambdaFunctionConfiguration.builder()
.id("process-uploads")
.lambdaFunctionArn(lambdaArn)
.events(
Event.S3_OBJECT_CREATED_PUT,
Event.S3_OBJECT_CREATED_POST,
Event.S3_OBJECT_CREATED_COMPLETE_MULTIPART_UPLOAD
)
.filter(NotificationConfigurationFilter.builder()
.key(S3KeyFilter.builder()
.filterRules(
FilterRule.builder()
.name("prefix")
.value("uploads/")
.build(),
FilterRule.builder()
.name("suffix")
.value(".jpg")
.build()
)
.build())
.build())
.build()
)
.build())
.build()
);

log.info("Configured event notifications for bucket: {}", bucketName);
}

// Configure notifications to SQS
public void configureNotificationsToSQS(String bucketName, String queueArn) {
s3Client.putBucketNotificationConfiguration(
PutBucketNotificationConfigurationRequest.builder()
.bucket(bucketName)
.notificationConfiguration(NotificationConfiguration.builder()
.queueConfigurations(
QueueConfiguration.builder()
.id("queue-notifications")
.queueArn(queueArn)
.events(Event.S3_OBJECT_CREATED)
.build()
)
.build())
.build()
);
}

// Configure notifications to SNS
public void configureNotificationsToSNS(String bucketName, String topicArn) {
s3Client.putBucketNotificationConfiguration(
PutBucketNotificationConfigurationRequest.builder()
.bucket(bucketName)
.notificationConfiguration(NotificationConfiguration.builder()
.topicConfigurations(
TopicConfiguration.builder()
.id("topic-notifications")
.topicArn(topicArn)
.events(
Event.S3_OBJECT_CREATED,
Event.S3_OBJECT_REMOVED
)
.build()
)
.build())
.build()
);
}
}

// Lambda function to process S3 events
@Component
public class S3EventHandler implements RequestHandler<S3Event, Void> {

private final FileProcessingService fileProcessingService;

@Override
public Void handleRequest(S3Event event, Context context) {
for (S3EventNotification.S3EventNotificationRecord record : event.getRecords()) {
String eventName = record.getEventName();
String bucketName = record.getS3().getBucket().getName();
// Keys in event payloads are URL-encoded (e.g., spaces become "+"); use the decoded form
String key = record.getS3().getObject().getUrlDecodedKey();
long size = record.getS3().getObject().getSizeAsLong();

log.info("S3 Event: {} for s3://{}/{} ({} bytes)",
eventName, bucketName, key, size);

if (eventName.startsWith("ObjectCreated")) {
fileProcessingService.processUploadedFile(bucketName, key);
} else if (eventName.startsWith("ObjectRemoved")) {
fileProcessingService.handleFileRemoval(bucketName, key);
}
}
return null;
}
}

S3 event use cases:

  • Trigger virus scanning on file upload
  • Generate thumbnails for images
  • Process and validate uploaded documents
  • Update database metadata when files are added/removed
  • Replicate files to another bucket or service
  • Archive old files automatically

For event-driven architecture patterns, see Event-Driven Architecture and AWS Messaging.

S3 Cross-Region Replication

Cross-Region Replication (CRR) automatically replicates objects to buckets in different regions for disaster recovery, compliance, and low-latency access.

// Configure cross-region replication
@Service
public class S3ReplicationService {

private final S3Client s3Client;

public void configureCrossRegionReplication(
String sourceBucket,
String destinationBucket,
String destinationRegion,
String replicationRoleArn) {

// Enable versioning (required for replication)
s3Client.putBucketVersioning(
PutBucketVersioningRequest.builder()
.bucket(sourceBucket)
.versioningConfiguration(VersioningConfiguration.builder()
.status(BucketVersioningStatus.ENABLED)
.build())
.build()
);

// Configure replication
s3Client.putBucketReplication(
PutBucketReplicationRequest.builder()
.bucket(sourceBucket)
.replicationConfiguration(ReplicationConfiguration.builder()
.role(replicationRoleArn)
.rules(ReplicationRule.builder()
.id("replicate-all-to-" + destinationRegion)
.status(ReplicationRuleStatus.ENABLED)
.priority(1)
.filter(ReplicationRuleFilter.builder()
.prefix("") // Replicate all objects
.build())
.destination(Destination.builder()
.bucket("arn:aws:s3:::" + destinationBucket)
.storageClass(StorageClass.STANDARD)
.replicationTime(ReplicationTime.builder()
.status(ReplicationTimeStatus.ENABLED)
.time(ReplicationTimeValue.builder()
.minutes(15) // Replication Time Control (RTC)
.build())
.build())
.metrics(Metrics.builder()
.status(MetricsStatus.ENABLED)
.eventThreshold(ReplicationTimeValue.builder()
.minutes(15)
.build())
.build())
.build())
.deleteMarkerReplication(DeleteMarkerReplication.builder()
.status(DeleteMarkerReplicationStatus.ENABLED)
.build())
.build())
.build())
.build()
);

log.info("Configured CRR from {} to {} in region {}",
sourceBucket, destinationBucket, destinationRegion);
}
}

Replication features:

  • Replication Time Control (RTC): Guarantees 99.99% of objects replicate within 15 minutes (additional cost)
  • Delete marker replication: Replicates delete markers (optional)
  • Replica modification sync: Keep metadata changes synchronized
  • Metrics and notifications: Monitor replication progress and failures
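Replication progress for an individual object can be checked on the source bucket via the x-amz-replication-status header, exposed by SDK v2's HeadObject response. A sketch, with `ReplicationStatusChecker` as a hypothetical helper:

```java
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.HeadObjectRequest;

public class ReplicationStatusChecker {

    // x-amz-replication-status on the SOURCE bucket: "PENDING", "COMPLETED",
    // "FAILED", or null when the object is not covered by a replication rule
    public static String check(S3Client s3, String bucket, String key) {
        return s3.headObject(HeadObjectRequest.builder()
                .bucket(bucket)
                .key(key)
                .build())
                .replicationStatusAsString();
    }

    // FAILED objects are not retried automatically; re-upload or copy in place
    public static boolean needsAttention(String status) {
        return "FAILED".equals(status);
    }
}
```

On the destination bucket, the same header reads "REPLICA" for objects created by replication.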

Replication use cases:

  • Disaster recovery: Maintain copy in different region
  • Compliance: Meet data residency requirements
  • Latency reduction: Serve content from regions closer to users
  • Backup: Maintain separate copy for data protection

For disaster recovery strategies, see Disaster Recovery.

S3 Cost Optimization

S3 costs come from three components: storage, requests, and data transfer. Optimizing each component reduces total cost.

// Monitor S3 costs and usage
@Service
public class S3CostMonitorService {

private final CloudWatchClient cloudWatchClient;
private final S3Client s3Client;

@Scheduled(cron = "0 0 8 * * MON") // Weekly on Monday 8 AM
public void analyzeCosts() {
List<Bucket> buckets = s3Client.listBuckets().buckets();

for (Bucket bucket : buckets) {
analyzeBucketCosts(bucket.name());
}
}

private void analyzeBucketCosts(String bucketName) {
// Get storage class distribution
Map<String, Long> storageClasses = getStorageClassDistribution(bucketName);

long totalSize = storageClasses.values().stream().mapToLong(Long::longValue).sum();
double totalGB = totalSize / (1024.0 * 1024.0 * 1024.0);

// Calculate estimated monthly cost
double monthlyCost = calculateMonthlyCost(storageClasses);

// SLF4J placeholders don't support format specifiers like {:.2f}; pre-format the values
log.info("Bucket: {}, Size: {} GB, Monthly Cost: ${}",
bucketName, String.format("%.2f", totalGB), String.format("%.2f", monthlyCost));

// Provide recommendations
if (storageClasses.getOrDefault("STANDARD", 0L) > totalSize * 0.5) {
log.warn("Bucket {} has >50% in Standard storage. " +
"Consider Intelligent-Tiering or lifecycle policies.", bucketName);
}

// Check for lifecycle policy (SDK v2 reports a missing configuration via S3Exception)
try {
s3Client.getBucketLifecycleConfiguration(
GetBucketLifecycleConfigurationRequest.builder()
.bucket(bucketName)
.build()
);
} catch (S3Exception e) {
if ("NoSuchLifecycleConfiguration".equals(e.awsErrorDetails().errorCode())) {
log.warn("Bucket {} has no lifecycle policy. " +
"Configure policies to reduce costs.", bucketName);
} else {
throw e;
}
}

// Check for incomplete multipart uploads
ListMultipartUploadsResponse multipartUploads = s3Client.listMultipartUploads(
ListMultipartUploadsRequest.builder()
.bucket(bucketName)
.build()
);

if (!multipartUploads.uploads().isEmpty()) {
log.warn("Bucket {} has {} incomplete multipart uploads. " +
"Configure lifecycle rule to abort after 7 days.",
bucketName, multipartUploads.uploads().size());
}
}

private Map<String, Long> getStorageClassDistribution(String bucketName) {
Map<String, Long> distribution = new HashMap<>();

ListObjectsV2Request request = ListObjectsV2Request.builder()
.bucket(bucketName)
.build();

ListObjectsV2Response response;
do {
response = s3Client.listObjectsV2(request);

for (S3Object object : response.contents()) {
String storageClass = object.storageClassAsString();
if (storageClass == null) storageClass = "STANDARD";
distribution.merge(storageClass, object.size(), Long::sum);
}

request = request.toBuilder()
.continuationToken(response.nextContinuationToken())
.build();

} while (response.isTruncated());

return distribution;
}

private double calculateMonthlyCost(Map<String, Long> storageClasses) {
double cost = 0.0;

for (Map.Entry<String, Long> entry : storageClasses.entrySet()) {
double gb = entry.getValue() / (1024.0 * 1024.0 * 1024.0);
double pricePerGB = switch (entry.getKey()) {
case "STANDARD" -> 0.023;
case "INTELLIGENT_TIERING" -> 0.023; // Average
case "STANDARD_IA" -> 0.0125;
case "ONEZONE_IA" -> 0.01;
case "GLACIER_IR" -> 0.004;
case "GLACIER" -> 0.0036;
case "DEEP_ARCHIVE" -> 0.00099;
default -> 0.023;
};

cost += gb * pricePerGB;
}

return cost;
}
}

S3 cost optimization strategies:

  1. Use Intelligent-Tiering for unpredictable access patterns (automatic optimization, no guesswork)
  2. Configure lifecycle policies for predictable access patterns (manual transitions to IA, Glacier)
  3. Delete incomplete multipart uploads after 7 days (hidden costs from abandoned uploads)
  4. Use S3 Select to query only needed data instead of downloading entire objects (reduce data transfer)
  5. Enable S3 Inventory to analyze storage patterns and identify optimization opportunities
  6. Compress before uploading (especially logs, JSON, text files - can reduce size by 80-90%)
  7. Use CloudFront for frequent downloads (reduce S3 GET request costs; S3-to-CloudFront data transfer is free)
  8. Delete old versions with versioning lifecycle policies (noncurrent versions accumulate quickly)
  9. Use One Zone-IA for reproducible data (thumbnails, processed files that can be regenerated)
  10. Monitor and alert on storage growth to catch unexpected cost increases early
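Strategies 3 and 8 above can be enforced with a single lifecycle configuration. This is a minimal sketch; the rule ID, the 7- and 30-day thresholds, and applying it bucket-wide are illustrative choices:

```java
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.AbortIncompleteMultipartUpload;
import software.amazon.awssdk.services.s3.model.BucketLifecycleConfiguration;
import software.amazon.awssdk.services.s3.model.ExpirationStatus;
import software.amazon.awssdk.services.s3.model.LifecycleRule;
import software.amazon.awssdk.services.s3.model.LifecycleRuleFilter;
import software.amazon.awssdk.services.s3.model.NoncurrentVersionExpiration;
import software.amazon.awssdk.services.s3.model.PutBucketLifecycleConfigurationRequest;

public class S3CostLifecycle {

    // One rule covering two cost leaks: abandoned multipart uploads and
    // accumulating noncurrent versions on versioned buckets
    public static void applyCostRules(S3Client s3, String bucketName) {
        LifecycleRule rule = LifecycleRule.builder()
                .id("cost-hygiene")
                .status(ExpirationStatus.ENABLED)
                .filter(LifecycleRuleFilter.builder().prefix("").build()) // whole bucket
                .abortIncompleteMultipartUpload(AbortIncompleteMultipartUpload.builder()
                        .daysAfterInitiation(7)
                        .build())
                .noncurrentVersionExpiration(NoncurrentVersionExpiration.builder()
                        .noncurrentDays(30)
                        .build())
                .build();

        s3.putBucketLifecycleConfiguration(PutBucketLifecycleConfigurationRequest.builder()
                .bucket(bucketName)
                .lifecycleConfiguration(BucketLifecycleConfiguration.builder()
                        .rules(rule)
                        .build())
                .build());
    }
}
```

Note that putBucketLifecycleConfiguration replaces the bucket's entire lifecycle configuration, so merge with any existing rules before applying.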

For comprehensive cost optimization across AWS, see AWS Cost Optimization.

Anti-Patterns

Public S3 buckets: Leaving buckets publicly accessible causes data breaches. Always enable Block Public Access unless explicitly required (e.g., static website hosting).
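Enabling all four Block Public Access settings is a one-time call per bucket (it can also be set account-wide via the S3 Control API). A sketch:

```java
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PublicAccessBlockConfiguration;
import software.amazon.awssdk.services.s3.model.PutPublicAccessBlockRequest;

public class BlockPublicAccess {

    // Turn on all four Block Public Access settings for a bucket
    public static void enable(S3Client s3, String bucketName) {
        s3.putPublicAccessBlock(PutPublicAccessBlockRequest.builder()
                .bucket(bucketName)
                .publicAccessBlockConfiguration(PublicAccessBlockConfiguration.builder()
                        .blockPublicAcls(true)      // reject new public ACLs
                        .ignorePublicAcls(true)     // ignore any existing public ACLs
                        .blockPublicPolicy(true)    // reject public bucket policies
                        .restrictPublicBuckets(true) // limit access to AWS principals
                        .build())
                .build());
    }
}
```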

No lifecycle policies: Storing all data in S3 Standard wastes money. Use Intelligent-Tiering or lifecycle policies - can reduce costs by 65-95%.

Not enabling versioning: Losing data from accidental deletions or overwrites. Enable versioning for critical buckets and use lifecycle policies to manage old versions.

Hardcoded credentials: Embedding AWS access keys in code. Use IAM roles for EC2/ECS/Lambda or environment variables.

Single-region storage: No disaster recovery plan. Use Cross-Region Replication for critical data.

Ignoring encryption: Storing sensitive data unencrypted. Enable default encryption (SSE-S3 minimum) for all buckets.
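Default encryption is likewise a single bucket-level call; the sketch below applies SSE-S3 (AES-256), the minimum recommended here (swap in SSE-KMS with a key ARN where key-level audit trails are required):

```java
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutBucketEncryptionRequest;
import software.amazon.awssdk.services.s3.model.ServerSideEncryption;
import software.amazon.awssdk.services.s3.model.ServerSideEncryptionByDefault;
import software.amazon.awssdk.services.s3.model.ServerSideEncryptionConfiguration;
import software.amazon.awssdk.services.s3.model.ServerSideEncryptionRule;

public class DefaultEncryption {

    // Make SSE-S3 (AES-256) the bucket default; objects uploaded without an
    // explicit encryption header are encrypted automatically
    public static void enableSseS3(S3Client s3, String bucketName) {
        s3.putBucketEncryption(PutBucketEncryptionRequest.builder()
                .bucket(bucketName)
                .serverSideEncryptionConfiguration(
                        ServerSideEncryptionConfiguration.builder()
                                .rules(ServerSideEncryptionRule.builder()
                                        .applyServerSideEncryptionByDefault(
                                                ServerSideEncryptionByDefault.builder()
                                                        .sseAlgorithm(ServerSideEncryption.AES256)
                                                        .build())
                                        .build())
                                .build())
                .build());
    }
}
```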

Not monitoring costs: S3 costs grow silently. Monitor storage metrics, configure cost alerts, and regularly review storage class distribution.

Making buckets public for file access: Use presigned URLs instead of public buckets - provides temporary, secure access without making entire bucket public.

Not cleaning up multipart uploads: Abandoned multipart uploads continue to accrue storage costs. Configure lifecycle rule to abort incomplete uploads after 7 days.

Using Standard storage for archives: Glacier Deep Archive costs $1/TB-month vs $23/TB-month for Standard - 23x cheaper for long-term retention.

Further Reading