Storage: S3, EBS & EFS

AWS storage is three different animals: S3 is object storage for files, backups, static assets, and data lakes; EBS is block storage attached to EC2 like a virtual disk; EFS is shared NFS for multiple instances. Pick wrong and you pay 10× for the wrong tier, lose data on instance termination, or expose a bucket to the internet. This chapter covers the object model, storage classes, encryption, lifecycle, replication, and when to use each service in production.


S3 deep dive

Amazon S3 stores objects (files + metadata) in buckets. There is no directory hierarchy — only flat keys that look like paths. Understanding the object model, storage classes, and consistency model is the foundation for everything else: backups, static hosting, data lakes, and cross-region DR.

The object model

Every S3 object has a key (e.g. uploads/2024/invoice.pdf), a value (0 bytes to 5 TB), and metadata (system + user-defined). Buckets are globally unique names in the s3.amazonaws.com namespace — choose names carefully; you cannot rename a bucket without creating a new one and copying objects.
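To make the flat-key idea concrete, here is a minimal sketch of how a delimiter-based listing rolls flat keys up into apparent folders. The function name `listWithDelimiter` and the sample keys are illustrative assumptions; the sketch only imitates the grouping behaviour and does not call S3.

```typescript
// Simulates how a delimiter-based listing groups flat keys into "folders".
interface ListingResult {
  contents: string[];       // keys directly under the prefix
  commonPrefixes: string[]; // rolled-up "subfolders"
}

function listWithDelimiter(allKeys: string[], prefix: string, delimiter = "/"): ListingResult {
  const contents: string[] = [];
  const commonPrefixes = new Set<string>();
  for (const key of allKeys) {
    if (!key.startsWith(prefix)) continue;
    const rest = key.slice(prefix.length);
    const idx = rest.indexOf(delimiter);
    if (idx === -1) {
      contents.push(key); // no further delimiter: a direct "file"
    } else {
      commonPrefixes.add(prefix + rest.slice(0, idx + delimiter.length));
    }
  }
  return { contents: contents.sort(), commonPrefixes: Array.from(commonPrefixes).sort() };
}

// Hypothetical keys for illustration.
const exampleKeys = [
  "uploads/2024/invoice.pdf",
  "uploads/2024/receipt.pdf",
  "uploads/2025/invoice.pdf",
  "uploads/readme.txt",
];
const exampleListing = listWithDelimiter(exampleKeys, "uploads/");
// exampleListing.contents       -> ["uploads/readme.txt"]
// exampleListing.commonPrefixes -> ["uploads/2024/", "uploads/2025/"]
```

The "folders" exist only in the listing result; nothing is stored for `uploads/2024/` itself.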

| Concept | Details | Production note |
| --- | --- | --- |
| Bucket | Container in a region; name is global | One bucket per environment per purpose; avoid mega-buckets with mixed sensitivity |
| Key / prefix | Flat namespace; / is a convention, not a folder | Design prefixes for lifecycle rules, IAM conditions, and S3 Inventory reports |
| Version ID | Present when versioning is enabled; null for unversioned objects | Enable on production buckets; protects against accidental overwrite and ransomware |
| Storage class | Per-object tier: Standard, IA, Glacier, etc. | Set at upload or by lifecycle transition; wrong class = wrong cost profile |
| Object Lock | WORM retention in governance or compliance mode | Requires versioning; enable at bucket creation or, since late 2023, on an existing versioned bucket |

🔬 Under the Hood

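S3 provides strong read-after-write consistency for object PUT, DELETE, and LIST operations: after a successful PUT, subsequent GETs immediately return the new object. This changed in December 2020; before that, new-object PUTs were read-after-write consistent, but overwrite PUTs and DELETEs were only eventually consistent. Bucket-level configuration changes (for example, enabling versioning) can still take a short time to propagate. For exam purposes: S3 is strongly consistent today.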

Storage classes — comparison

Storage class determines durability, availability, minimum storage duration, retrieval fees, and access latency. Match the class to access pattern — not every object belongs in Standard.

| Storage class | Use case | Min duration | Retrieval | Availability |
| --- | --- | --- | --- | --- |
| S3 Standard | Hot data: frequent access, low latency | None | Instant, no fee | 99.99% |
| S3 Standard-IA | Infrequent access: backups, DR copies | 30 days | Instant + per-GB fee | 99.9% |
| S3 One Zone-IA | Recreatable infrequent data: lower cost, single AZ | 30 days | Instant + per-GB fee | 99.5% |
| S3 Intelligent-Tiering | Unknown or changing access patterns; auto-moves tiers | None (small monitoring fee) | Instant (no retrieval fee in Frequent/Infrequent tiers) | 99.9% |
| S3 Glacier Instant Retrieval | Archive with millisecond access; quarterly access OK | 90 days | Instant + per-GB fee | 99.9% |
| S3 Glacier Flexible Retrieval | Archive with minutes-to-hours retrieval (formerly Glacier) | 90 days | Expedited (1–5 min), Standard (3–5 hr), Bulk (5–12 hr) | 99.99% (after restore) |
| S3 Glacier Deep Archive | Long-term compliance; annual access or less | 180 days | Standard (12 hr), Bulk (48 hr) | 99.99% (after restore) |

💰 Cost

Standard-IA and the Glacier tiers charge per-GB retrieval fees, so a "cheap" archive bucket that gets read daily can cost more than Standard. Objects deleted or transitioned before the minimum storage duration are still billed for the full minimum. One Zone-IA saves ~20% vs Standard-IA but loses cross-AZ redundancy; only use it for data you can rebuild. Intelligent-Tiering adds a small per-object monitoring fee but eliminates manual tier management, which makes it a good default for mixed workloads. Objects smaller than 128 KB are not monitored or auto-tiered.
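The retrieval-fee trade-off can be sketched as simple arithmetic. The prices below are placeholders chosen for illustration, not quoted AWS rates, and the helper names (`monthlyCost`, `billableDays`, `breakEvenRetrievalGb`) are assumptions; request fees are ignored.

```typescript
interface ClassPricing {
  storagePerGbMonth: number; // USD per GB-month (placeholder value)
  retrievalPerGb: number;    // USD per GB retrieved (placeholder value)
  minStorageDays: number;    // minimum billable storage duration
}

// Placeholder prices for illustration only; look up current regional pricing.
const PRICING_STANDARD: ClassPricing = { storagePerGbMonth: 0.023, retrievalPerGb: 0, minStorageDays: 0 };
const PRICING_STANDARD_IA: ClassPricing = { storagePerGbMonth: 0.0125, retrievalPerGb: 0.01, minStorageDays: 30 };

// Storage plus retrieval cost for one month.
function monthlyCost(p: ClassPricing, storedGb: number, retrievedGbPerMonth: number): number {
  return storedGb * p.storagePerGbMonth + retrievedGbPerMonth * p.retrievalPerGb;
}

// Days billed for an object deleted or transitioned after `storedDays`.
function billableDays(p: ClassPricing, storedDays: number): number {
  return Math.max(storedDays, p.minStorageDays);
}

// GB retrieved per month at which the cheaper-at-rest class stops being cheaper.
function breakEvenRetrievalGb(hot: ClassPricing, cold: ClassPricing, storedGb: number): number {
  const storageSaving = storedGb * (hot.storagePerGbMonth - cold.storagePerGbMonth);
  const extraRetrieval = cold.retrievalPerGb - hot.retrievalPerGb;
  return extraRetrieval > 0 ? storageSaving / extraRetrieval : Infinity;
}
```

With these placeholder numbers, 1,000 GB in the infrequent-access class breaks even at roughly 1,050 GB retrieved per month; reading the whole data set daily would be far past that point.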

Lifecycle rules

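Lifecycle configurations automatically transition objects between storage classes or expire (delete) them based on object age. Rules can be scoped by prefix, object tags, or object size. S3 evaluates rules roughly once per day, not in real time, so transitions and expirations can lag the configured date.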

  • Transition — move to IA after 30 days, Glacier after 90 days, Deep Archive after 365 days
  • Expiration — delete objects or noncurrent versions after N days
  • Abort incomplete multipart uploads — after 7 days to stop paying for orphaned parts
  • Filter — apply rules to prefix (logs/) or object tags
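The rule types above can be sketched as a small evaluator that reports what a single rule would do to an object of a given age. This is a simplified model under stated assumptions: the names `LifecycleRule` and `evaluateRule` are hypothetical, only prefix filters are modelled, and real S3 evaluation is asynchronous.

```typescript
type TierName = "STANDARD" | "STANDARD_IA" | "GLACIER" | "DEEP_ARCHIVE";

interface LifecycleRule {
  prefix: string;
  transitions: { days: number; storageClass: TierName }[];
  expirationDays?: number;
}

type LifecycleOutcome = { action: "expire" } | { action: "store"; storageClass: TierName };

// Returns null when the rule's prefix filter does not match the key.
function evaluateRule(rule: LifecycleRule, key: string, ageDays: number): LifecycleOutcome | null {
  if (!key.startsWith(rule.prefix)) return null;
  if (rule.expirationDays !== undefined && ageDays >= rule.expirationDays) {
    return { action: "expire" };
  }
  let current: TierName = "STANDARD";
  const ordered = rule.transitions.slice().sort((a, b) => a.days - b.days);
  for (const t of ordered) {
    if (ageDays >= t.days) current = t.storageClass; // latest transition reached wins
  }
  return { action: "store", storageClass: current };
}

// Mirrors the example thresholds in the list above.
const uploadsRule: LifecycleRule = {
  prefix: "uploads/",
  transitions: [
    { days: 30, storageClass: "STANDARD_IA" },
    { days: 90, storageClass: "GLACIER" },
    { days: 365, storageClass: "DEEP_ARCHIVE" },
  ],
};
```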

Versioning

With versioning enabled, every PUT creates a new version; DELETE adds a delete marker (doesn't remove data). Restore by deleting the delete marker or copying a previous version. Pair with lifecycle rules to expire noncurrent versions after N days — otherwise storage grows forever.
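A toy in-memory model may help show why a DELETE is reversible on a versioned bucket. The class `VersionedKey` and its version IDs are hypothetical; real version IDs are opaque strings assigned by S3.

```typescript
interface ObjectVersion {
  versionId: string;
  isDeleteMarker: boolean;
  body?: string;
}

// Models the version stack behind a single key in a versioned bucket.
class VersionedKey {
  private versions: ObjectVersion[] = []; // newest last
  private counter = 0;

  put(body: string): string {
    const versionId = `v${++this.counter}`;
    this.versions.push({ versionId, isDeleteMarker: false, body });
    return versionId;
  }

  // A plain DELETE (no version ID) only stacks a delete marker on top.
  deleteObject(): string {
    const versionId = `v${++this.counter}`;
    this.versions.push({ versionId, isDeleteMarker: true });
    return versionId;
  }

  // Deleting a specific version ID removes that version (or marker) permanently.
  deleteVersion(versionId: string): void {
    this.versions = this.versions.filter(v => v.versionId !== versionId);
  }

  // GET without a version ID sees only the newest entry.
  get(): string | undefined {
    const current = this.versions[this.versions.length - 1];
    if (!current || current.isDeleteMarker) return undefined;
    return current.body;
  }

  versionCount(): number {
    return this.versions.length;
  }
}
```

Removing the delete marker with `deleteVersion` makes the previous version current again, and `versionCount` keeps growing until something expires noncurrent versions.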

Replication — CRR and SRR

S3 Replication copies objects from a source bucket to a destination bucket automatically. Requires versioning on both buckets. IAM role must allow replication actions.

| Type | Scope | Typical use |
| --- | --- | --- |
| CRR (Cross-Region) | Source region → different region | DR, lower latency for global users, compliance residency copy |
| SRR (Same-Region) | Source → bucket in same region | Aggregate logs, separate prod/analytics copies, compliance isolation |
| RTC (Replication Time Control) | Add-on for CRR or SRR; 15-minute replication SLA | Regulated DR with predictable RPO; additional cost |

CRR copies across regions (DR, compliance); SRR within the same region (log aggregation). Both require versioning on source and destination. RTC adds a 15-minute replication SLA.

Production bucket — encryption + lifecycle

bash
aws s3api create-bucket --bucket my-app-artifacts-prod-eu \
  --region eu-west-1 \
  --create-bucket-configuration LocationConstraint=eu-west-1

aws s3api put-bucket-versioning --bucket my-app-artifacts-prod-eu \
  --versioning-configuration Status=Enabled

aws s3api put-bucket-encryption --bucket my-app-artifacts-prod-eu \
  --server-side-encryption-configuration '{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "aws:kms",
        "KMSMasterKeyID": "alias/my-app-s3-key"
      },
      "BucketKeyEnabled": true
    }]
  }'

aws s3api put-public-access-block --bucket my-app-artifacts-prod-eu \
  --public-access-block-configuration \
    BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

cat > /tmp/lifecycle.json <<'EOF'
{
  "Rules": [{
    "ID": "archive-and-expire",
    "Status": "Enabled",
    "Filter": { "Prefix": "uploads/" },
    "Transitions": [
      { "Days": 30, "StorageClass": "STANDARD_IA" },
      { "Days": 90, "StorageClass": "GLACIER" }
    ],
    "NoncurrentVersionExpiration": { "NoncurrentDays": 30 },
    "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 7 }
  }]
}
EOF

aws s3api put-bucket-lifecycle-configuration --bucket my-app-artifacts-prod-eu \
  --lifecycle-configuration file:///tmp/lifecycle.json
hcl
resource "aws_s3_bucket" "artifacts" {
  bucket = "my-app-artifacts-prod-eu"
}

resource "aws_s3_bucket_versioning" "artifacts" {
  bucket = aws_s3_bucket.artifacts.id
  versioning_configuration { status = "Enabled" }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "artifacts" {
  bucket = aws_s3_bucket.artifacts.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.s3.arn
    }
    bucket_key_enabled = true
  }
}

resource "aws_s3_bucket_public_access_block" "artifacts" {
  bucket                  = aws_s3_bucket.artifacts.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

resource "aws_s3_bucket_lifecycle_configuration" "artifacts" {
  bucket = aws_s3_bucket.artifacts.id
  rule {
    id     = "archive-and-expire"
    status = "Enabled"
    filter { prefix = "uploads/" }
    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }
    transition {
      days          = 90
      storage_class = "GLACIER"
    }
    noncurrent_version_expiration { noncurrent_days = 30 }
    abort_incomplete_multipart_upload { days_after_initiation = 7 }
  }
}
typescript
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as kms from 'aws-cdk-lib/aws-kms';
import { Duration, RemovalPolicy } from 'aws-cdk-lib';

const key = kms.Key.fromLookup(this, 'S3Key', { aliasName: 'alias/my-app-s3-key' });

new s3.Bucket(this, 'ArtifactsBucket', {
  bucketName: 'my-app-artifacts-prod-eu',
  versioned: true,
  encryption: s3.BucketEncryption.KMS,
  encryptionKey: key,
  bucketKeyEnabled: true,
  blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
  enforceSSL: true,
  lifecycleRules: [{
    id: 'archive-and-expire',
    prefix: 'uploads/',
    transitions: [
      { storageClass: s3.StorageClass.INFREQUENT_ACCESS, transitionAfter: Duration.days(30) },
      { storageClass: s3.StorageClass.GLACIER, transitionAfter: Duration.days(90) },
    ],
    noncurrentVersionExpiration: Duration.days(30),
    abortIncompleteMultipartUploadAfter: Duration.days(7),
  }],
  removalPolicy: RemovalPolicy.RETAIN,
});
🎯 Exam Tip

Versioning must be enabled on both source and destination for replication. CRR does not replicate existing objects by default — only new/changed objects after rule creation (unless you use S3 Batch Replication for backfill). Glacier and Deep Archive objects cannot be replicated directly — transition happens at destination per its lifecycle rules.

S3 security & performance

Most S3 breaches are misconfiguration, not sophisticated attacks. Block Public Access, bucket policies, encryption enforcement, and presigned URLs are your toolkit. On the performance side: prefix design, multipart upload, and S3 Select reduce latency and cost at scale.

Block Public Access

Four account-level and bucket-level settings that override any policy making a bucket public. Enable at the account level in Organizations — defense in depth with SCPs and bucket policies. Even with Block Public Access on, a misconfigured bucket policy can still grant overly broad access to authenticated AWS principals — BPA only blocks anonymous/public access.

Bucket policies vs IAM policies

S3 bucket policies are resource-based: they attach to the bucket. Within one account, a bucket policy alone can grant a principal access. For cross-account access, both sides must allow it: the bucket policy grants the external principal, and that principal's own IAM policy must also allow the S3 actions. Use bucket policies for CloudFront OAC, cross-account read, denying unencrypted uploads, and requiring VPC endpoint access.

json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "DenyUnencryptedUploads",
    "Effect": "Deny",
    "Principal": "*",
    "Action": "s3:PutObject",
    "Resource": "arn:aws:s3:::my-app-artifacts-prod-eu/*",
    "Condition": {
      "StringNotEquals": {
        "s3:x-amz-server-side-encryption": "aws:kms"
      }
    }
  }]
}
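One subtlety of the policy above is that `StringNotEquals` also matches when the condition key is absent, so an upload that omits the header is denied even if the bucket has default encryption. A minimal sketch of that evaluation follows; `isPutDenied` is a hypothetical helper that models only this one condition, not IAM policy evaluation in general.

```typescript
// Mirrors the StringNotEquals condition in the policy above:
// a missing header does not equal "aws:kms", so the Deny statement matches.
function isPutDenied(requestHeaders: Record<string, string>): boolean {
  const normalized: Record<string, string> = {};
  for (const headerName of Object.keys(requestHeaders)) {
    normalized[headerName.toLowerCase()] = requestHeaders[headerName];
  }
  return normalized["x-amz-server-side-encryption"] !== "aws:kms";
}
```

In practice this means every client must send the header explicitly on PutObject, including SDK calls that would otherwise rely on the bucket default.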

Presigned URLs and POST

Presigned URLs grant temporary access to a specific object (GET or PUT) without making the bucket public. Generated by signing with IAM credentials — expiry from seconds to 7 days (SigV4). Presigned POST allows browser direct upload via HTML form — common for user file uploads from a Spring Boot app without proxying bytes through your API.

Generate a presigned download URL

bash
# Presigned GET — share invoice download for 15 minutes
aws s3 presign s3://my-app-artifacts-prod-eu/invoices/2024/inv-001.pdf \
  --expires-in 900 \
  --region eu-west-1

# Presigned PUT: aws s3 presign only generates GET URLs, so create upload
# URLs with an SDK presigner in the application and return them to the browser.

# Verify the URL works. Use GET, not HEAD (curl -I): the HTTP method is part of the signature.
curl -s -o /dev/null -w '%{http_code}\n' \
  "$(aws s3 presign s3://my-app-artifacts-prod-eu/invoices/2024/inv-001.pdf --expires-in 300 --region eu-west-1)"
hcl
# Presign at runtime — grant s3:GetObject on the prefix to the app role.
resource "aws_iam_role_policy" "presign" {
  role = aws_iam_role.api.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject"]
      Resource = "${aws_s3_bucket.artifacts.arn}/invoices/*"
    }]
  })
}
typescript
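import * as iam from 'aws-cdk-lib/aws-iam';

// taskRole and bucket are assumed to be defined elsewhere in the stack.
taskRole.addToPolicy(new iam.PolicyStatement({
  actions: ['s3:GetObject'],
  resources: [`${bucket.bucketArn}/invoices/*`],
}));
// Runtime (Java SDK v2 in the Spring app): S3Presigner.create().presignGetObject(...) with a 15-minute expiry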

Encryption — SSE-S3, SSE-KMS, SSE-C

| Type | Keys managed by | When to use |
| --- | --- | --- |
| SSE-S3 (AES256) | AWS; no KMS charges | Default for non-sensitive bulk storage; simplest setup |
| SSE-KMS | AWS KMS; audit trail, key rotation, cross-account | Production default for PII/financial data; enable Bucket Key to cut KMS API costs |
| SSE-C | Customer provides key per request | Rare; you manage key lifecycle; AWS never stores the key |

🔒 Security

Enforce encryption at rest with a bucket policy Deny on s3:PutObject when s3:x-amz-server-side-encryption is missing or wrong. Pair it with a Deny on aws:SecureTransport = false to block plain HTTP. For SSE-KMS, the caller also needs KMS permissions on the key: kms:GenerateDataKey to upload and kms:Decrypt to download. A missing kms:Decrypt is a common cause of "access denied" on GetObject.

Prefix strategy and request rate

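S3 scales automatically, but request rates are per prefix: at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix. A single prefix that exceeds those rates can return 503 Slow Down errors while S3 repartitions. For high-throughput workloads, spread keys across more prefixes, for example with hex hash prefixes: uploads/a3/f9/object-id instead of uploads/2024/06/10/object-id when date clustering creates hotspots. For most backend apps, date-based prefixes are fine.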
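A hex hash prefix like the one described can be derived from the object ID itself, so the key stays computable on read. This is a minimal sketch: `hashedKey` is a hypothetical helper, and FNV-1a is used only because it is short and dependency-free; any stable hash would do. ASCII object IDs are assumed.

```typescript
// FNV-1a (32-bit): a small non-cryptographic hash, used here only to spread keys.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i) & 0xff; // assumes ASCII object IDs
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force unsigned 32-bit
}

// Builds root/xx/yy/objectId from the first four hex digits of the hash.
function hashedKey(objectId: string, root = "uploads"): string {
  const hex = ("00000000" + fnv1a32(objectId).toString(16)).slice(-8);
  return `${root}/${hex.slice(0, 2)}/${hex.slice(2, 4)}/${objectId}`;
}
```

Because the prefix is a pure function of the ID, readers can recompute the full key without a lookup table. The trade-off is that date-range listings and date-scoped lifecycle rules no longer map to a single prefix.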

Multipart upload

Required for objects > 5 GB; recommended for > 100 MB. Upload parts in parallel (5 MB–5 GB each, max 10,000 parts). Failed uploads leave parts that cost money — lifecycle rule to abort incomplete uploads after 7 days is mandatory. Complete with CompleteMultipartUpload; list in-progress with ListMultipartUploads.
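The part-size limits translate into a small planning calculation. This is a sketch under the limits stated above, read as binary units (5 MiB to 5 GiB per part, at most 10,000 parts); `planMultipart` and the 16 MiB default are illustrative assumptions.

```typescript
const MIB = 1024 * 1024;
const MIN_PART_BYTES = 5 * MIB;        // minimum part size (the last part may be smaller)
const MAX_PART_BYTES = 5 * 1024 * MIB; // 5 GiB maximum part size
const MAX_PART_COUNT = 10000;

function planMultipart(objectBytes: number, preferredPartBytes = 16 * MIB) {
  if (objectBytes <= 0) throw new Error("objectBytes must be positive");
  // Grow the part size until the whole object fits in 10,000 parts.
  const neededPartBytes = Math.ceil(objectBytes / MAX_PART_COUNT);
  const partBytes = Math.max(MIN_PART_BYTES, preferredPartBytes, neededPartBytes);
  if (partBytes > MAX_PART_BYTES) throw new Error("object too large for multipart limits");
  return { partBytes, partCount: Math.ceil(objectBytes / partBytes) };
}
```

For example, `planMultipart(100 * MIB)` yields 7 parts of 16 MiB (the last one partial), while a 1 TiB object forces a larger part size so the count stays within the limit.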

S3 Select and events

  • S3 Select: SQL on CSV/JSON/Parquet in place; filter without a full download (AWS closed S3 Select to new customers in 2024, so check availability before designing around it)
  • Event notifications — Lambda, SQS, SNS, EventBridge on object create/delete (at-least-once)

Object Lock (WORM)

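Retention or legal hold prevents deletion of object versions. Governance mode lets principals with s3:BypassGovernanceRetention override the lock; Compliance mode lets nobody, including the root user, delete until the retention period expires. Requires versioning. Object Lock can be enabled at bucket creation or, since late 2023, on an existing versioned bucket.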
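The difference between the two modes can be sketched as a decision function. The names `LockState` and `canDeleteVersion` are hypothetical; `callerCanBypassGovernance` stands in for a caller that holds s3:BypassGovernanceRetention and explicitly requests the bypass.

```typescript
type LockMode = "GOVERNANCE" | "COMPLIANCE";

interface LockState {
  mode?: LockMode;
  retainUntil?: Date;
  legalHold: boolean;
}

// Decides whether a specific object version could be permanently deleted right now.
function canDeleteVersion(lock: LockState, now: Date, callerCanBypassGovernance: boolean): boolean {
  if (lock.legalHold) return false; // a legal hold blocks deletion until it is removed
  if (!lock.mode || !lock.retainUntil) return true; // no retention configured
  if (now.getTime() >= lock.retainUntil.getTime()) return true; // retention period is over
  if (lock.mode === "GOVERNANCE") return callerCanBypassGovernance;
  return false; // COMPLIANCE: nobody can delete before retainUntil
}
```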

⚠️ Pitfall

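A presigned URL works only while the signer's credentials are valid and permitted: S3 evaluates the request as the signing principal. Each URL is limited to the one operation and key that were signed, but a signer with s3:* can mint URLs for any object, so sign with a minimal role scoped to the specific key prefix. URLs signed with temporary credentials stop working when those credentials expire, even if the requested expiry is later. Never log presigned URLs; they are bearer tokens until expiry.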
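If request URLs have to appear in logs at all, one option is to redact the signing parameters first. This is a minimal sketch: `redactPresignedUrl` is a hypothetical helper, and the parameter list covers the SigV4 query parameters that carry the signature, session token, and access key ID.

```typescript
// SigV4 query parameters that should not reach log files.
const SENSITIVE_PARAMS = ["X-Amz-Signature", "X-Amz-Credential", "X-Amz-Security-Token"];

function redactPresignedUrl(raw: string): string {
  const url = new URL(raw);
  for (const paramName of SENSITIVE_PARAMS) {
    if (url.searchParams.has(paramName)) url.searchParams.set(paramName, "REDACTED");
  }
  return url.toString();
}
```

The bucket, key, and expiry stay visible for debugging, while the redacted URL can no longer be replayed.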

💡 Pro Tip

Enable S3 Bucket Key with SSE-KMS: it reduces KMS API calls by up to 99% for high-throughput buckets. S3 requests a short-lived bucket-level key from KMS and uses it to generate data keys locally, instead of calling KMS for every object. Costs drop significantly on workloads with millions of small objects.

EBS deep dive

Amazon EBS provides block storage volumes attached to EC2 instances — like a virtual hard drive. Data persists independently of the instance lifecycle (unlike instance store). Choose volume type based on IOPS, throughput, and cost; snapshot to S3 for backup and cross-region DR.

Block storage fundamentals

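EBS volumes live in a single Availability Zone. An EC2 instance must be in the same AZ to attach. You can detach a volume and reattach it to another instance in the same AZ (stop the instance first to detach a root volume). Size, type, and performance can be modified online for most volumes; size can only be increased. The maximum number of attached volumes is an instance-type limit, not an adjustable quota: many Nitro instance types share a limit of 28 attachments with network interfaces, and some newer types allow up to 128 EBS volumes.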

Volume types

| Type | Use case | IOPS | Throughput | Notes |
| --- | --- | --- | --- | --- |
| gp3 | General purpose: boot volumes, apps, databases | 3,000–16,000 (independent of size) | 125–1,000 MB/s | Default choice; decouples IOPS/throughput from capacity |
| gp2 | Legacy general purpose | 3 IOPS/GB | 128–250 MB/s | Migrate to gp3 |
| io2 Block Express | Mission-critical databases: Oracle, SAP HANA | Up to 256,000 | Up to 4,000 MB/s | 99.999% durability; supports Multi-Attach |
| io2 | High-IOPS databases | Up to 64,000 | Up to 1,000 MB/s | Supports Multi-Attach (Provisioned IOPS volumes only: io1 and io2) |
| st1 | Throughput-optimized: big data, logs, Kafka | Up to 500 IOPS | Up to 500 MB/s | HDD; cannot be a boot volume |
| sc1 | Cold HDD: infrequent access | Up to 250 IOPS | Up to 250 MB/s | Lowest-cost block storage; cannot be a boot volume |

💰 Cost

gp3 is ~20% cheaper than gp2 at the same size with baseline 3,000 IOPS included. You pay separately for provisioned IOPS and throughput above baseline — right-size instead of over-provisioning a 1 TB gp2 when 100 GB gp3 with 3,000 IOPS suffices. st1/sc1 are cheaper per GB but HDD latency — never use for database data files.
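The right-sizing argument follows from how gp2 ties IOPS to capacity. A small sketch, assuming the gp2 baseline of 3 IOPS per GiB with a floor of 100 and a ceiling of 16,000, and ignoring gp2 burst credits; the function names are illustrative.

```typescript
// gp2 baseline: 3 IOPS per GiB, floor 100, ceiling 16,000 (burst credits ignored).
function gp2BaselineIops(sizeGiB: number): number {
  return Math.min(16000, Math.max(100, 3 * sizeGiB));
}

// Smallest gp2 volume whose baseline reaches the target IOPS.
function gp2SizeForIops(targetIops: number): number {
  return Math.ceil(Math.min(targetIops, 16000) / 3);
}

// gp3 includes 3,000 IOPS at any size; only IOPS above that are provisioned separately.
function gp3ExtraIops(targetIops: number): number {
  return Math.max(0, targetIops - 3000);
}

// gp2BaselineIops(100)  -> 300
// gp2SizeForIops(3000)  -> 1000 (GiB of gp2 needed just to sustain 3,000 IOPS)
// gp3ExtraIops(3000)    -> 0    (already included with a 100 GiB gp3 volume)
```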

Snapshots and encryption

EBS snapshots are incremental backups in S3 (AWS-managed). Copy cross-region for DR; use DLM for automated schedules. Enable account-level EBS encryption by default — all new volumes use KMS; no performance penalty on Nitro.

Multi-Attach (io1/io2 only)

One volume on up to 16 instances in the same AZ — requires cluster-aware FS (Oracle RAC, GFS2). Standard ext4/xfs without clustering will corrupt. Use EFS for shared POSIX without cluster software.

EBS vs instance store

| Feature | EBS | Instance store |
| --- | --- | --- |
| Persistence | Survives instance stop/start; independent lifecycle | Ephemeral; lost on stop, terminate, or hardware failure |
| Performance | Network-attached; gp3/io2 predictable | Local NVMe; lowest latency and highest IOPS on storage-optimized (I-family) and "d"-suffix instance types such as m5d or r5d |
| Use case | Boot volumes, databases, anything that must persist | Caches, temp processing, Kafka log dirs (with replication), Spark shuffle |
| Snapshots | Yes; incremental to S3 | No; data is gone when the instance is gone |

Create a gp3 volume and attach

bash
aws ec2 create-volume \
  --availability-zone eu-west-1a \
  --size 100 \
  --volume-type gp3 \
  --iops 3000 \
  --throughput 125 \
  --encrypted \
  --kms-key-id alias/my-app-ebs-key \
  --tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=order-db-data}]'

# Attach to running instance (data volume — not root)
aws ec2 attach-volume \
  --volume-id vol-0abc123def456 \
  --instance-id i-0fedcba987654 \
  --device /dev/sdf

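# On the instance: confirm the device name, then mkfs and mount.
# Nitro instances expose /dev/sdf as an NVMe device; the exact name can differ.
# lsblk
# sudo mkfs -t xfs /dev/nvme1n1
# sudo mkdir -p /data && sudo mount /dev/nvme1n1 /data
# Add an /etc/fstab entry (by UUID, with nofail) to remount after reboot.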
hcl
resource "aws_ebs_volume" "order_db_data" {
  availability_zone = "eu-west-1a"
  size              = 100
  type              = "gp3"
  iops              = 3000
  throughput        = 125
  encrypted         = true
  kms_key_id        = aws_kms_key.ebs.arn

  tags = { Name = "order-db-data" }
}

resource "aws_volume_attachment" "order_db" {
  device_name = "/dev/sdf"
  volume_id   = aws_ebs_volume.order_db_data.id
  instance_id = aws_instance.order_db.id
}
typescript
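import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';

// key is assumed to be a kms.IKey defined elsewhere in the stack.
new ec2.Volume(this, 'OrderDbData', {
  availabilityZone: 'eu-west-1a',
  size: cdk.Size.gibibytes(100),
  volumeType: ec2.EbsDeviceVolumeType.GP3,
  iops: 3000,
  throughput: 125,
  encrypted: true,
  encryptionKey: key,
});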
🎯 Exam Tip

gp3 when the question says "general purpose SSD" or "cost-optimize boot volume." io2 when IOPS > 16,000 or Multi-Attach is required. st1 for streaming sequential reads (log processing), not random I/O. Instance store when latency matters and data is ephemeral/replicated elsewhere.

EFS — shared file storage

Amazon EFS is managed NFS (Network File System) — multiple EC2 instances, ECS tasks, and Lambda functions mount the same filesystem concurrently. Regional and Multi-AZ by default. Pay for what you use; no capacity planning like EBS.

NFS fundamentals

Mount EFS with standard NFS clients (mount -t nfs4 or EFS mount helper). Access via security groups (port 2049) and EFS access points for per-application POSIX identities. Works across all AZs in a region — unlike EBS, no AZ lock-in.

Performance modes

| Mode | Latency | Throughput | Use case |
| --- | --- | --- | --- |
| General Purpose | Lowest (default) | Scales with size (baseline) or provisioned | Web serving, CMS, dev environments, most workloads |
| Max I/O | Higher, with more variance | Higher aggregate, more ops/sec | Big data, media processing, very high numbers of concurrent clients |

Throughput modes

  • Bursting — throughput scales with filesystem size; burst credits for spikes (default, cost-effective for small FS)
  • Provisioned — set throughput independent of size; for consistent high throughput on small datasets
  • Elastic (recommended): automatically scales throughput up and down; you pay for the data actually read and written rather than provisioning for peak; replaces bursting for most new workloads

EFS storage classes

Standard for active files; a lifecycle policy moves files not accessed for 30 days (configurable) to EFS Infrequent Access (IA). One Zone variants suit dev/test: lower cost, no cross-AZ redundancy.

Selection guide — EFS vs EBS vs S3

| Requirement | Choose | Why |
| --- | --- | --- |
| Shared files across many EC2/ECS instances | EFS | POSIX NFS; concurrent mounts; no clustering software needed |
| Database data directory (PostgreSQL, MySQL) | EBS gp3/io2 | Block storage; low latency; single instance (or Multi-Attach + cluster FS) |
| Boot volume for EC2 | EBS gp3 | Persistent root volumes are EBS-backed (instance store-backed AMIs are legacy) |
| Static assets, backups, data lake, user uploads | S3 | Object storage; HTTP access; unlimited scale; cheapest at rest |
| Content management with file locking semantics | EFS | POSIX file locking; WordPress/Drupal shared uploads directory |
| Lowest latency scratch space on one instance | Instance store | Local NVMe; ephemeral; no network hop |
| Serve images/PDFs to browsers globally | S3 + CloudFront | HTTP CDN; not a filesystem mount from app servers |
| Lambda processing shared input files | EFS | Lambda mounts EFS through access points; use S3 for larger immutable inputs |

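The selection guide can be condensed into a rough decision function. This is a deliberate simplification with hypothetical names (`StorageNeed`, `chooseStorage`); it covers only the main axes of the guide and ignores cost and latency targets.

```typescript
interface StorageNeed {
  sharedAcrossInstances: boolean; // many instances/tasks read and write the same data
  needsPosixFilesystem: boolean;  // the app expects a mounted filesystem with file locking
  mustPersist: boolean;           // data must survive instance stop/terminate
  servedOverHttp: boolean;        // consumers fetch over HTTP (browsers, CDN)
}

type StorageChoice = "S3" | "EBS" | "EFS" | "Instance store";

function chooseStorage(need: StorageNeed): StorageChoice {
  if (need.servedOverHttp) return "S3";
  if (need.sharedAcrossInstances && need.needsPosixFilesystem) return "EFS";
  if (need.sharedAcrossInstances) return "S3";
  return need.mustPersist ? "EBS" : "Instance store";
}
```

For example, a database data directory (single instance, must persist, needs a filesystem) maps to EBS, while a shared CMS uploads directory maps to EFS.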
⚖️ Trade-off

EFS vs EBS for shared storage: EFS is simpler (true shared NFS) but higher cost per GB and higher latency than EBS. Don't mount EFS for database files — use RDS/Aurora instead. EFS shines for shared config, upload directories, and CI artifact caches across a fleet.

📦 Real World

ECS services running WordPress or legacy Java apps with local file uploads often use EFS access points — one access point per service with isolated root directory and POSIX user mapping. Modern greenfield apps store uploads in S3 and keep containers stateless; EFS is the bridge for lift-and-shift.

Storage patterns

Production storage is never one service — it's backup tiers, cross-region copies, lifecycle automation, and serving static assets without hitting your Spring Boot app. These patterns appear in every Well-Architected review and Solutions Architect exam scenario.

Backup strategy — 3-2-1 on AWS

Three copies (prod + snapshot + cross-region), two media types (EBS block + S3 object), one offsite (CRR or AWS Backup vault copy).
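As a quick sanity check, the 3-2-1 rule can be expressed as a predicate over an inventory of copies. The `BackupCopy` shape, the `satisfies321` name, and the sample inventory are assumptions for illustration; a snapshot is counted as an S3-backed medium, in line with the mapping above.

```typescript
interface BackupCopy {
  label: string;
  medium: "ebs" | "s3" | "efs";
  region: string;
}

// True when there are at least 3 copies, on at least 2 media, with at least 1 outside the primary region.
function satisfies321(copies: BackupCopy[], primaryRegion: string): boolean {
  const media = new Set(copies.map(c => c.medium));
  const hasOffsite = copies.some(c => c.region !== primaryRegion);
  return copies.length >= 3 && media.size >= 2 && hasOffsite;
}

// Hypothetical inventory: production volume, local snapshot, cross-region snapshot copy.
const orderDbCopies: BackupCopy[] = [
  { label: "prod volume", medium: "ebs", region: "eu-west-1" },
  { label: "DLM snapshot", medium: "s3", region: "eu-west-1" },
  { label: "snapshot copy", medium: "s3", region: "eu-central-1" },
];
```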

| Source | Backup mechanism | Restore target |
| --- | --- | --- |
| EBS volumes | DLM snapshots → optional cross-region copy | New volume in any AZ (same region) or DR region |
| RDS / Aurora | Automated backups + manual snapshots | Point-in-time restore, cross-region read replica |
| S3 buckets | Versioning + CRR + lifecycle to Glacier | Restore version, fail over to DR bucket |
| EFS | AWS Backup or EFS-to-EFS backup | Restore to new filesystem |
| Hybrid / on-prem | AWS Storage Gateway or DataSync → S3 | EC2 in cloud or reverse sync |

Cross-region DR with S3

Primary bucket with versioning + CRR to a DR region; lifecycle moves DR copies to Standard-IA. On regional failure, point config/DNS at the DR bucket. Enable RTC when you need a predictable RPO: it is designed to replicate 99.99% of new objects within 15 minutes.

Spring Boot static assets on S3 + CloudFront

Don't serve static JS/CSS from your JVM — offload to S3 behind CloudFront with Origin Access Control (OAC). CI uploads hashed assets; Spring templates reference CDN URLs in prod.

  • CI runs aws s3 sync with long cache on hashed assets, short cache on index.html
  • CloudFront OAC → private S3 origin; ACM cert on custom domain
  • Spring Boot serves API only in prod; templates reference CDN URLs, not classpath static
bash
# CI deploy after ./mvnw package
aws s3 sync target/classes/static/ s3://my-app-static-prod/assets/ \
  --cache-control "public, max-age=31536000, immutable" --exclude "index.html"
aws s3 cp target/classes/static/index.html s3://my-app-static-prod/index.html \
  --cache-control "public, max-age=60"
aws cloudfront create-invalidation --distribution-id E1234567890ABC \
  --paths "/index.html" "/assets/*"
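The long-cache versus short-cache split can also be decided per file, which is safer when not every asset name is content-hashed. This is a sketch under an assumed naming convention (at least 8 hex characters before the extension, e.g. app.3f9a1c2b.js); `cacheControlFor` is a hypothetical helper a deploy script could call for each file.

```typescript
// Matches names such as app.3f9a1c2b.js (assumed convention: 8+ hex characters before the extension).
const HASHED_ASSET = /\.[0-9a-f]{8,}\.[a-z0-9]+$/i;

function cacheControlFor(path: string): string {
  // Content-hashed files never change under the same name, so they can be cached for a year.
  if (HASHED_ASSET.test(path)) return "public, max-age=31536000, immutable";
  // Everything else (index.html, unhashed images) stays short-lived.
  return "public, max-age=60";
}
```

Unlike the sync command above, which gives every file except index.html the long cache, this variant applies it only to names that match the hash pattern.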

Operational checklist

  • Account-level Block Public Access + SCP deny on public bucket policies; review IAM Access Analyzer for S3 findings weekly
  • Versioning on prod buckets; lifecycle expires noncurrent versions; abort incomplete MPU after 7 days
  • CRR for critical buckets; EBS snapshot cross-region copy via DLM or AWS Backup
  • EBS encryption by default; S3 SSE-KMS with Bucket Key; org-wide Storage Lens for cost anomalies
🔒 Security

Enable S3 Object Ownership = Bucket owner enforced to disable ACLs — simplifies permissions to bucket policies and IAM only. For ransomware resilience: versioning + Object Lock (compliance mode) + cross-account replication to a security account where the destination bucket policy denies deletes from prod roles.

🎯 Exam Tip

When the scenario says "share files across 50 EC2 instances in multiple AZs" → EFS. "Lowest cost archival with retrieval in milliseconds" → Glacier Instant Retrieval, not Deep Archive. "Prevent accidental deletion for compliance" → Object Lock or MFA Delete on versioning. "Static website with global users" → S3 + CloudFront, not public S3 website endpoint alone.