Storage: S3, EBS & EFS

S3 deep dive

Amazon S3 stores objects (files + metadata) in buckets. There is no directory hierarchy — only flat keys that look like paths. Understanding the object model, storage classes, and consistency model is the foundation for everything else: backups, static hosting, data lakes, and cross-region DR.

The object model

Every S3 object has a key (e.g. uploads/2024/invoice.pdf), a value (0 bytes to 5 TB), and metadata (system + user-defined). Bucket names are globally unique across all AWS accounts, so choose them carefully; you cannot rename a bucket without creating a new one and copying objects.

Concept	Details	Production note
Bucket	Container in a region; name is global	One bucket per environment per purpose — avoid mega-buckets with mixed sensitivity
Key / prefix	Flat namespace; / is a convention, not a folder	Design prefixes for lifecycle rules, IAM conditions, and S3 Inventory reports
Version ID	Present when versioning enabled; null for unversioned	Enable on production buckets — protects against accidental overwrite and ransomware
Storage class	Per-object tier — Standard, IA, Glacier, etc.	Set via upload or lifecycle transition; wrong class = wrong cost profile
Object Lock	WORM retention — governance or compliance mode	Requires versioning; bucket must be created with Object Lock enabled
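
Because keys are flat strings, a "folder listing" is nothing more than prefix matching plus grouping on a delimiter. The sketch below imitates that grouping in memory. It is not the S3 API: `listByPrefix`, `Listing`, and the sample keys are hypothetical names used only to illustrate the idea.

```typescript
// In-memory illustration of prefix + delimiter listing over flat keys.
// listByPrefix is a hypothetical helper, not an AWS SDK call.
interface Listing {
  objects: string[];        // keys directly "inside" the prefix
  commonPrefixes: string[]; // rolled-up "subfolders"
}

function listByPrefix(keys: string[], prefix: string, delimiter: string = "/"): Listing {
  const objects: string[] = [];
  const commonPrefixes: string[] = [];
  for (const key of keys) {
    // Plain string comparison: there is no directory to descend into.
    if (key.slice(0, prefix.length) !== prefix) continue;
    const rest = key.slice(prefix.length);
    const cut = rest.indexOf(delimiter);
    if (cut === -1) {
      objects.push(key);
    } else {
      const rolled = prefix + rest.slice(0, cut + delimiter.length);
      if (commonPrefixes.indexOf(rolled) === -1) commonPrefixes.push(rolled);
    }
  }
  return { objects, commonPrefixes };
}

const sampleKeys: string[] = [
  "uploads/2024/invoice.pdf",
  "uploads/2024/receipt.pdf",
  "uploads/2025/invoice.pdf",
  "uploads/readme.txt",
];
const listing: Listing = listByPrefix(sampleKeys, "uploads/");
// listing.objects        -> ["uploads/readme.txt"]
// listing.commonPrefixes -> ["uploads/2024/", "uploads/2025/"]
```

The same string-matching view explains why lifecycle filters and IAM conditions are written against prefixes rather than folders.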

🔬 Under the Hood

S3 is strongly consistent for all operations: after a successful PUT, overwrite, or DELETE, subsequent GET and LIST requests immediately reflect the change. This has been the case since December 2020. Before that, S3 offered read-after-write consistency only for PUTs of new objects; overwrite PUTs, DELETEs, and listings were eventually consistent. For exam purposes: S3 is strongly consistent today.

Storage classes — comparison

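Storage class determines availability, the number of AZs that hold the data, minimum storage duration, retrieval fees, and access latency. Every class is designed for 99.999999999% (11 nines) durability, although One Zone-IA keeps its copies in a single AZ. Match the class to the access pattern; not every object belongs in Standard.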

Storage class	Use case	Min duration	Retrieval	Availability
S3 Standard	Hot data — frequent access, low latency	None	Instant, no fee	99.99%
S3 Standard-IA	Infrequent access — backups, DR copies	30 days	Instant + per-GB fee	99.9%
S3 One Zone-IA	Recreatable infrequent data — lower cost, single AZ	30 days	Instant + per-GB fee	99.5%
S3 Intelligent-Tiering	Unknown or changing access patterns — auto-moves tiers	None (small monitoring fee)	Instant (no retrieval fee in Frequent/Infrequent tiers)	99.9%
S3 Glacier Instant Retrieval	Archive with millisecond access — quarterly access OK	90 days	Instant + per-GB fee	99.9%
S3 Glacier Flexible Retrieval	Archive — minutes to hours retrieval (formerly Glacier)	90 days	Expedited (1–5 min), Standard (3–5 hr), Bulk (5–12 hr)	99.99% (after restore)
S3 Glacier Deep Archive	Long-term compliance — annual access or less	180 days	Standard (12 hr), Bulk (48 hr)	99.99% (after restore)

💰 Cost

Standard-IA and Glacier tiers charge retrieval fees — a "cheap" archive bucket that gets read daily will cost more than Standard. One Zone-IA saves ~20% vs Standard-IA but loses cross-AZ redundancy; only use for data you can rebuild. Intelligent-Tiering adds a small monitoring fee per object but eliminates manual tier management — good default for mixed workloads with objects > 128 KB.
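
The break-even point can be worked out with simple arithmetic. The sketch below is a rough model only: the prices are illustrative placeholders, not AWS list prices, and `ClassPricing`, `monthlyCost`, and `breakEvenRetrievedGb` are hypothetical names. Request fees and minimum-duration charges are ignored.

```typescript
// Monthly cost = storage + retrieval. All prices are illustrative placeholders,
// NOT current AWS list prices; substitute figures from the pricing page.
interface ClassPricing {
  storagePerGbMonth: number; // USD per GB-month stored
  retrievalPerGb: number;    // USD per GB retrieved
}

function monthlyCost(p: ClassPricing, storedGb: number, retrievedGb: number): number {
  return storedGb * p.storagePerGbMonth + retrievedGb * p.retrievalPerGb;
}

// GB retrieved per month at which the cheaper-at-rest class stops being cheaper.
function breakEvenRetrievedGb(hot: ClassPricing, cold: ClassPricing, storedGb: number): number {
  const storageSaving = storedGb * (hot.storagePerGbMonth - cold.storagePerGbMonth);
  const extraRetrievalFee = cold.retrievalPerGb - hot.retrievalPerGb;
  if (extraRetrievalFee <= 0) return Infinity; // cold class never loses on retrieval
  return storageSaving / extraRetrievalFee;
}

const hotClass: ClassPricing = { storagePerGbMonth: 0.02, retrievalPerGb: 0 };     // placeholder
const coldClass: ClassPricing = { storagePerGbMonth: 0.01, retrievalPerGb: 0.01 }; // placeholder

// With these placeholder numbers, 1,000 GB stored breaks even at about 1,000 GB read per month.
const breakEven: number = breakEvenRetrievedGb(hotClass, coldClass, 1000);
// Reading the full data set daily (~30,000 GB/month) makes the "cheap" class far more expensive.
const coldReadDaily: number = monthlyCost(coldClass, 1000, 30000);
const hotReadDaily: number = monthlyCost(hotClass, 1000, 30000);
```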

Lifecycle rules

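Lifecycle configurations automatically transition objects between storage classes or expire (delete) them based on age, optionally filtered by prefix, tags, or object size. Rules are evaluated asynchronously about once per day, so a transition or expiration can lag the configured day count; this is not a real-time mechanism.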

Transition — move to IA after 30 days, Glacier after 90 days, Deep Archive after 365 days
Expiration — delete objects or noncurrent versions after N days
Abort incomplete multipart uploads — after 7 days to stop paying for orphaned parts
Filter — apply rules to prefix (logs/) or object tags
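
To reason about a rule before deploying it, it can help to model which class an object should be in at a given age. The following is a minimal sketch, assuming the object was uploaded to Standard; `Transition` and `classAtAge` are hypothetical names, and because rules are evaluated roughly daily the real move can happen later than the modelled day.

```typescript
// Which storage class does a lifecycle rule put an object in at a given age?
// Transition mirrors the shape of the lifecycle JSON; classAtAge is a hypothetical helper.
interface Transition {
  days: number;
  storageClass: string;
}

function classAtAge(ageDays: number, transitions: Transition[], expireAfterDays?: number): string {
  if (expireAfterDays !== undefined && ageDays >= expireAfterDays) return "EXPIRED";
  let current = "STANDARD"; // assumption: objects are uploaded to Standard
  const ordered = transitions.slice().sort((a, b) => a.days - b.days);
  for (const t of ordered) {
    if (ageDays >= t.days) current = t.storageClass;
  }
  return current;
}

const rule: Transition[] = [
  { days: 30, storageClass: "STANDARD_IA" },
  { days: 90, storageClass: "GLACIER" },
  { days: 365, storageClass: "DEEP_ARCHIVE" },
];
// classAtAge(10, rule)       -> "STANDARD"
// classAtAge(45, rule)       -> "STANDARD_IA"
// classAtAge(400, rule)      -> "DEEP_ARCHIVE"
// classAtAge(400, rule, 380) -> "EXPIRED"
```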

Versioning

With versioning enabled, every PUT creates a new version; DELETE adds a delete marker (doesn't remove data). Restore by deleting the delete marker or copying a previous version. Pair with lifecycle rules to expire noncurrent versions after N days — otherwise storage grows forever.
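
The delete-marker behavior is easier to see in a toy model. The class below is not the S3 API; `VersionedKey` and its methods are hypothetical names that mimic one key in a versioned bucket, including why noncurrent versions pile up unless something expires them.

```typescript
// Toy model of one key in a versioned bucket (hypothetical, not an SDK type).
interface Version {
  versionId: string;
  body: string | null; // null marks a delete marker
}

class VersionedKey {
  private versions: Version[] = []; // newest last
  private counter = 0;

  put(body: string): string {
    const versionId = "v" + (++this.counter);
    this.versions.push({ versionId, body });
    return versionId;
  }

  // DELETE without a version ID only stacks a delete marker on top.
  deleteObject(): string {
    const versionId = "v" + (++this.counter);
    this.versions.push({ versionId, body: null });
    return versionId;
  }

  // DELETE with a version ID permanently removes that version or marker.
  deleteVersion(versionId: string): void {
    this.versions = this.versions.filter((v) => v.versionId !== versionId);
  }

  // GET without a version ID sees only the newest entry.
  get(): string | undefined {
    const latest: Version | undefined = this.versions[this.versions.length - 1];
    if (latest === undefined || latest.body === null) return undefined;
    return latest.body;
  }

  noncurrentCount(): number {
    return Math.max(0, this.versions.length - 1);
  }
}

const invoice = new VersionedKey();
invoice.put("draft");
invoice.put("final");
const marker: string = invoice.deleteObject();
const afterDelete = invoice.get();  // undefined: hidden by the delete marker
invoice.deleteVersion(marker);
const afterRestore = invoice.get(); // "final": the data was never removed
```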

Replication — CRR and SRR

S3 Replication copies objects from a source bucket to a destination bucket automatically. Requires versioning on both buckets. IAM role must allow replication actions.

Type	Scope	Typical use
CRR (Cross-Region)	Source region → different region	DR, lower latency for global users, compliance residency copy
SRR (Same-Region)	Source → bucket in same region	Aggregate logs, separate prod/analytics copies, compliance isolation
RTC (Replication Time Control)	Add-on for CRR or SRR; SLA of 99.99% of objects replicated within 15 minutes	Regulated DR with predictable RPO; additional cost

CRR copies across regions (DR, compliance); SRR within the same region (log aggregation). Both require versioning on source and destination. RTC adds a 15-minute replication SLA.

Production bucket — encryption + lifecycle

AWS CLI:
aws s3api create-bucket --bucket my-app-artifacts-prod-eu \
  --region eu-west-1 \
  --create-bucket-configuration LocationConstraint=eu-west-1

aws s3api put-bucket-versioning --bucket my-app-artifacts-prod-eu \
  --versioning-configuration Status=Enabled

aws s3api put-bucket-encryption --bucket my-app-artifacts-prod-eu \
  --server-side-encryption-configuration '{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "aws:kms",
        "KMSMasterKeyID": "alias/my-app-s3-key"
      },
      "BucketKeyEnabled": true
    }]
  }'

aws s3api put-public-access-block --bucket my-app-artifacts-prod-eu \
  --public-access-block-configuration \
    BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

cat > /tmp/lifecycle.json <<'EOF'
{
  "Rules": [{
    "ID": "archive-and-expire",
    "Status": "Enabled",
    "Filter": { "Prefix": "uploads/" },
    "Transitions": [
      { "Days": 30, "StorageClass": "STANDARD_IA" },
      { "Days": 90, "StorageClass": "GLACIER" }
    ],
    "NoncurrentVersionExpiration": { "NoncurrentDays": 30 },
    "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 7 }
  }]
}
EOF

aws s3api put-bucket-lifecycle-configuration --bucket my-app-artifacts-prod-eu \
  --lifecycle-configuration file:///tmp/lifecycle.json

Terraform:
resource "aws_s3_bucket" "artifacts" {
  bucket = "my-app-artifacts-prod-eu"
}

resource "aws_s3_bucket_versioning" "artifacts" {
  bucket = aws_s3_bucket.artifacts.id
  versioning_configuration { status = "Enabled" }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "artifacts" {
  bucket = aws_s3_bucket.artifacts.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.s3.arn
    }
    bucket_key_enabled = true
  }
}

resource "aws_s3_bucket_public_access_block" "artifacts" {
  bucket                  = aws_s3_bucket.artifacts.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

resource "aws_s3_bucket_lifecycle_configuration" "artifacts" {
  bucket = aws_s3_bucket.artifacts.id
  rule {
    id     = "archive-and-expire"
    status = "Enabled"
    filter { prefix = "uploads/" }
    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }
    transition {
      days          = 90
      storage_class = "GLACIER"
    }
    noncurrent_version_expiration { noncurrent_days = 30 }
    abort_incomplete_multipart_upload { days_after_initiation = 7 }
  }
}

AWS CDK (TypeScript):
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as kms from 'aws-cdk-lib/aws-kms';
import { Duration, RemovalPolicy } from 'aws-cdk-lib';

const key = kms.Key.fromLookup(this, 'S3Key', { aliasName: 'alias/my-app-s3-key' });

new s3.Bucket(this, 'ArtifactsBucket', {
  bucketName: 'my-app-artifacts-prod-eu',
  versioned: true,
  encryption: s3.BucketEncryption.KMS,
  encryptionKey: key,
  bucketKeyEnabled: true,
  blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
  enforceSSL: true,
  lifecycleRules: [{
    id: 'archive-and-expire',
    prefix: 'uploads/',
    transitions: [
      { storageClass: s3.StorageClass.INFREQUENT_ACCESS, transitionAfter: Duration.days(30) },
      { storageClass: s3.StorageClass.GLACIER, transitionAfter: Duration.days(90) },
    ],
    noncurrentVersionExpiration: Duration.days(30),
    abortIncompleteMultipartUploadAfter: Duration.days(7),
  }],
  removalPolicy: RemovalPolicy.RETAIN,
});

🎯 Exam Tip

Versioning must be enabled on both source and destination for replication. CRR does not replicate existing objects by default — only new/changed objects after rule creation (unless you use S3 Batch Replication for backfill). Glacier and Deep Archive objects cannot be replicated directly — transition happens at destination per its lifecycle rules.

S3 security & performance

Most S3 breaches are misconfiguration, not sophisticated attacks. Block Public Access, bucket policies, encryption enforcement, and presigned URLs are your toolkit. On the performance side: prefix design, multipart upload, and S3 Select reduce latency and cost at scale.

Block Public Access

Block Public Access (BPA) consists of four settings, available at both the account and the bucket level, that override any ACL or policy that would make a bucket public. Enable it at the account level for every account in your organization, alongside SCPs and bucket policies for defense in depth. BPA only blocks anonymous/public access: a misconfigured bucket policy can still grant overly broad access to authenticated AWS principals.

Bucket policies vs IAM policies

S3 bucket policies are resource-based: they attach to the bucket and name the principals they apply to. Within one account, an Allow in either the bucket policy or the caller's IAM policy is enough. For cross-account access both sides must allow it: the bucket policy grants the external principal, and that principal's own IAM policy must also permit the S3 action. Use bucket policies for: CloudFront OAC, cross-account read, denying unencrypted uploads, requiring VPC endpoint access.

{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "DenyUnencryptedUploads",
    "Effect": "Deny",
    "Principal": "*",
    "Action": "s3:PutObject",
    "Resource": "arn:aws:s3:::my-app-artifacts-prod-eu/*",
    "Condition": {
      "StringNotEquals": {
        "s3:x-amz-server-side-encryption": "aws:kms"
      }
    }
  }]
}

Presigned URLs and POST

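Presigned URLs grant temporary access to a specific object (GET or PUT) without making the bucket public. They are generated by signing the request with IAM credentials, and the expiry can range from seconds up to 7 days (SigV4). A URL signed with temporary credentials (an assumed role or instance profile) stops working when those credentials expire, even if the requested expiry is longer. Presigned POST allows direct browser upload via an HTML form, which is common for user file uploads from a Spring Boot app without proxying bytes through your API.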
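
A presigned URL signed with temporary credentials cannot outlive those credentials, so the usable lifetime is the smallest of three numbers. The sketch below is a planning aid rather than SDK behavior (SDKs typically reject an over-long expiry instead of clamping it); `effectiveExpirySeconds` is a hypothetical helper.

```typescript
const SIGV4_MAX_EXPIRY_SECONDS = 7 * 24 * 60 * 60; // 604,800 seconds = 7 days

// How long a presigned URL can actually stay usable.
// sessionSecondsLeft only matters when signing with temporary credentials.
function effectiveExpirySeconds(requestedSeconds: number, sessionSecondsLeft?: number): number {
  if (requestedSeconds <= 0) throw new RangeError("expiry must be positive");
  let effective = Math.min(requestedSeconds, SIGV4_MAX_EXPIRY_SECONDS);
  if (sessionSecondsLeft !== undefined) {
    effective = Math.min(effective, sessionSecondsLeft);
  }
  return effective;
}

// effectiveExpirySeconds(900)         -> 900    (15 minutes, long-lived credentials)
// effectiveExpirySeconds(10 * 86400)  -> 604800 (capped at 7 days)
// effectiveExpirySeconds(3600, 1200)  -> 1200   (role session ends first)
```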

Generate a presigned download URL

AWS CLI:
# Presigned GET — share invoice download for 15 minutes
aws s3 presign s3://my-app-artifacts-prod-eu/invoices/2024/inv-001.pdf \
  --expires-in 900 \
  --region eu-west-1

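# Presigned PUT: client uploads directly (Spring returns URL to browser)
# Note: `aws s3 presign` only signs GET requests, so PUT URLs come from an SDK.
# One option, assuming Python 3 and boto3 are installed on the machine:
python3 -c "import boto3; print(boto3.client('s3', region_name='eu-west-1').generate_presigned_url('put_object', Params={'Bucket': 'my-app-artifacts-prod-eu', 'Key': 'uploads/user-42/doc.pdf'}, ExpiresIn=3600))"

# Verify the URL works. Use a real GET: the signature covers the HTTP method,
# so a HEAD request (curl -I) against a GET-signed URL is rejected with 403.
curl -s -o /dev/null -w '%{http_code}\n' \
  "$(aws s3 presign s3://my-app-artifacts-prod-eu/invoices/2024/inv-001.pdf --expires-in 300 --region eu-west-1)"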

Terraform:
# Presign at runtime: grant s3:GetObject on the prefix to the app role.
# With SSE-KMS the role also needs kms:Decrypt on the bucket's key.
resource "aws_iam_role_policy" "presign" {
  role = aws_iam_role.api.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject"]
      Resource = "${aws_s3_bucket.artifacts.arn}/invoices/*"
    }]
  })
}

AWS CDK (TypeScript):
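import * as iam from 'aws-cdk-lib/aws-iam';

// `taskRole` (iam.Role) and `bucket` (s3.IBucket) are defined elsewhere in the stack.
taskRole.addToPolicy(new iam.PolicyStatement({
  actions: ['s3:GetObject'],
  resources: [`${bucket.bucketArn}/invoices/*`],
}));
// Runtime (AWS SDK for Java v2): S3Presigner.create().presignGetObject(...) with a 15-minute expiry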

Encryption — SSE-S3, SSE-KMS, SSE-C

Type	Keys managed by	When to use
SSE-S3 (AES256)	AWS — no KMS charges	Default for non-sensitive bulk storage; simplest setup
SSE-KMS	AWS KMS — audit trail, key rotation, cross-account	Production default for PII/financial data; enable Bucket Key to cut KMS API costs
SSE-C	Customer provides key per request	Rare — you manage key lifecycle; AWS never stores the key

🔒 Security

Enforce encryption at rest with a bucket policy Deny on s3:PutObject when s3:x-amz-server-side-encryption is missing or wrong. Pair it with a Deny on aws:SecureTransport = false to reject plain HTTP. For SSE-KMS the caller also needs permissions on the key itself: kms:GenerateDataKey to upload and kms:Decrypt to download. A missing kms:Decrypt is a common cause of "access denied" on GetObject.
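
One way to keep both Deny statements consistent across buckets is to generate the policy document from code. This is a minimal sketch: `buildBaselinePolicy` and the `Statement` type are hypothetical, and the output would still need to be attached with your usual tooling.

```typescript
// Builds a bucket policy with two Deny statements:
// uploads that do not request SSE-KMS, and any request made over plain HTTP.
interface Statement {
  Sid: string;
  Effect: "Deny";
  Principal: "*";
  Action: string;
  Resource: string | string[];
  Condition: { [operator: string]: { [key: string]: string } };
}

function buildBaselinePolicy(bucket: string): { Version: string; Statement: Statement[] } {
  const bucketArn = "arn:aws:s3:::" + bucket;
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Sid: "DenyUnencryptedUploads",
        Effect: "Deny",
        Principal: "*",
        Action: "s3:PutObject",
        Resource: bucketArn + "/*",
        Condition: { StringNotEquals: { "s3:x-amz-server-side-encryption": "aws:kms" } },
      },
      {
        Sid: "DenyInsecureTransport",
        Effect: "Deny",
        Principal: "*",
        Action: "s3:*",
        // Bucket-level and object-level actions both need to be covered.
        Resource: [bucketArn, bucketArn + "/*"],
        Condition: { Bool: { "aws:SecureTransport": "false" } },
      },
    ],
  };
}

const policyJson: string = JSON.stringify(buildBaselinePolicy("my-app-artifacts-prod-eu"), null, 2);
```

Note that the first statement also rejects uploads that omit the encryption header and rely on the bucket default, so clients must send the header explicitly.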

Prefix strategy and request rate

S3 scales automatically, but each prefix supports roughly 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second, and a very hot prefix can return 503 Slow Down while S3 scales. For high-throughput workloads, spread load across prefixes, for example with hex hash prefixes: uploads/a3/f9/object-id instead of uploads/2024/06/10/object-id when date clustering creates hotspots. For most backend apps, date-based prefixes are fine.
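
A hashed prefix can be derived from the object ID itself, so the key stays reproducible without a lookup table. The sketch below uses 32-bit FNV-1a purely because it is short and dependency-free; any stable, well-distributed hash would do. `fnv1a32` and `hashedKey` are hypothetical helpers, and the hash works on UTF-16 code units, which matches bytes only for ASCII IDs.

```typescript
// 32-bit FNV-1a over the string's UTF-16 code units.
function fnv1a32(text: string): number {
  let h = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts so the math stays in 32 bits.
    h += (h << 1) + (h << 4) + (h << 7) + (h << 8) + (h << 24);
  }
  return h >>> 0; // unsigned 32-bit result
}

// Build root/xx/yy/objectId where xx and yy are the first two hash bytes in hex.
function hashedKey(root: string, objectId: string): string {
  const hex = ("00000000" + fnv1a32(objectId).toString(16)).slice(-8);
  return root + "/" + hex.slice(0, 2) + "/" + hex.slice(2, 4) + "/" + objectId;
}

const exampleKey: string = hashedKey("uploads", "order-123");
// Same ID always maps to the same key, so readers can recompute it.
```

The trade-off is that hashed prefixes no longer group by date, so lifecycle rules and listings that relied on date prefixes need tags or a separate index instead.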

Multipart upload

Required for objects > 5 GB (the limit for a single PUT); recommended for > 100 MB. Upload parts in parallel (5 MB–5 GB each, except the last part, which may be smaller; max 10,000 parts). Failed uploads leave parts that cost money, so treat a lifecycle rule that aborts incomplete uploads after 7 days as mandatory. Complete with CompleteMultipartUpload; list in-progress uploads with ListMultipartUploads.
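
The part-size limits interact: a large object with a small part size runs out of part numbers. The following is a minimal sketch of choosing a part size that respects both limits, treating them as binary units (5 MiB and 5 GiB). `planParts` and the 16 MiB preferred size are assumptions for illustration, not SDK defaults.

```typescript
const MIB = 1024 * 1024;
const MIN_PART_BYTES = 5 * MIB;        // minimum for every part except the last
const MAX_PART_BYTES = 5 * 1024 * MIB; // 5 GiB
const MAX_PARTS = 10000;

interface PartPlan {
  partBytes: number;
  partCount: number;
}

// Pick the smallest whole-MiB part size, at least preferredBytes, that fits in 10,000 parts.
function planParts(objectBytes: number, preferredBytes: number = 16 * MIB): PartPlan {
  if (objectBytes <= 0) throw new RangeError("object size must be positive");
  const neededBytes = Math.ceil(objectBytes / MAX_PARTS);
  const partMiB = Math.ceil(Math.max(neededBytes, preferredBytes, MIN_PART_BYTES) / MIB);
  const partBytes = partMiB * MIB;
  if (partBytes > MAX_PART_BYTES) throw new RangeError("object too large for multipart upload");
  return { partBytes, partCount: Math.ceil(objectBytes / partBytes) };
}

const oneGiB: PartPlan = planParts(1024 * MIB);            // 16 MiB parts, 64 parts
const fiveTiB: PartPlan = planParts(5 * 1024 * 1024 * MIB); // 525 MiB parts, under 10,000 parts
```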

S3 Select and events

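S3 Select: SQL on CSV/JSON/Parquet in-place; filter without full download (AWS closed the feature to new customers in 2024; existing users keep access)
Event notifications: Lambda, SQS, SNS, EventBridge on object create/delete (at-least-once delivery, so consumers should be idempotent)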

Object Lock (WORM)

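A retention period or legal hold prevents deletion of an object version. Governance mode can be overridden by principals with the s3:BypassGovernanceRetention permission; in Compliance mode nobody, including the root user, can delete the version or shorten retention until it expires. Requires versioning; the bucket is created with Object Lock enabled.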
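
The difference between the two modes comes down to one decision, sketched below. This is a simplified model, not the service's authorization logic: `Retention` and `canDeleteVersion` are hypothetical names, and `hasBypassGovernance` stands for a caller that holds s3:BypassGovernanceRetention and asks for the bypass.

```typescript
type LockMode = "GOVERNANCE" | "COMPLIANCE";

interface Retention {
  mode: LockMode;
  retainUntil: Date;
}

// Can this object version be permanently deleted right now?
function canDeleteVersion(
  retention: Retention,
  now: Date,
  hasBypassGovernance: boolean,
  legalHold: boolean
): boolean {
  if (legalHold) return false; // legal hold blocks deletion regardless of retention
  if (now.getTime() >= retention.retainUntil.getTime()) return true; // retention has expired
  // Inside the retention window only Governance mode can be bypassed.
  return retention.mode === "GOVERNANCE" && hasBypassGovernance;
}

const until = new Date("2030-01-01T00:00:00Z");
const during = new Date("2029-06-01T00:00:00Z");
const after = new Date("2030-06-01T00:00:00Z");
// canDeleteVersion({ mode: "GOVERNANCE", retainUntil: until }, during, true, false)  -> true
// canDeleteVersion({ mode: "COMPLIANCE", retainUntil: until }, during, true, false)  -> false
// canDeleteVersion({ mode: "COMPLIANCE", retainUntil: until }, after, false, false)  -> true
```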

⚠️ Pitfall

A presigned URL is authorized as the signer. It works only for the operation and key that were signed, but if the signing role has s3:* on the whole bucket, any bug that lets a caller choose the key turns into access to every object. Sign with a minimal role scoped to the specific key prefix. Never log presigned URLs; they are bearer tokens until expiry.
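
A simple guard against leaking these tokens is to strip the query string, which carries the signature, before a URL reaches any log line. `redactPresignedUrl` below is a hypothetical helper and the sample URL is made up.

```typescript
// Keep the object path for debugging, drop everything after "?" (signature, credential, expiry).
function redactPresignedUrl(url: string): string {
  const queryStart = url.indexOf("?");
  return queryStart === -1 ? url : url.slice(0, queryStart) + "?<redacted>";
}

const sampleUrl =
  "https://my-app-artifacts-prod-eu.s3.eu-west-1.amazonaws.com/invoices/2024/inv-001.pdf" +
  "?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Signature=abc123";
const safeToLog: string = redactPresignedUrl(sampleUrl);
// safeToLog ends with "/invoices/2024/inv-001.pdf?<redacted>"
```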

💡 Pro Tip

Enable S3 Bucket Key with SSE-KMS; AWS states it can reduce KMS request costs by up to 99% for high-throughput buckets. S3 obtains a short-lived bucket-level key from KMS and uses it to create data keys for objects, instead of calling KMS for every object. Costs drop significantly on workloads with millions of small objects.

EBS deep dive

Amazon EBS provides block storage volumes attached to EC2 instances — like a virtual hard drive. Data persists independently of the instance lifecycle (unlike instance store). Choose volume type based on IOPS, throughput, and cost; snapshot to S3 for backup and cross-region DR.

Block storage fundamentals

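EBS volumes live in a single Availability Zone. An EC2 instance must be in the same AZ to attach. You can detach a volume and reattach it to another instance in the same AZ (a root volume can only be detached while the instance is stopped). With Elastic Volumes you can increase size and change type or performance online for most volume types; size cannot be decreased. The number of volumes per instance depends on the instance type: many Nitro instances share 28 attachment slots across volumes and network interfaces, and some newer types support up to 128 volumes.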

Volume types

Type	Use case	IOPS	Throughput	Notes
gp3	General purpose — boot volumes, apps, databases	3,000–16,000 (independent of size)	125–1,000 MB/s	Default choice — decouple IOPS/throughput from capacity
gp2	Legacy general purpose	3 IOPS/GB	128–250 MB/s	Migrate to gp3
io2 Block Express	Mission-critical databases — Oracle, SAP HANA	Up to 256,000	Up to 4,000 MB/s	99.999% durability; supports Multi-Attach
io2	High-IOPS databases	Up to 64,000	Up to 1,000 MB/s	Multi-Attach supported (Provisioned IOPS volumes only: io1 and io2)
st1	Throughput-optimized: big data, logs, Kafka	Up to 500 IOPS (1 MiB I/O)	Up to 500 MB/s	HDD; cannot be boot volume
sc1	Cold HDD: infrequent access	Up to 250 IOPS (1 MiB I/O)	Up to 250 MB/s	Lowest cost block storage; cannot be boot volume

💰 Cost

gp3 is ~20% cheaper than gp2 at the same size with baseline 3,000 IOPS included. You pay separately for provisioned IOPS and throughput above baseline, so right-size instead of over-provisioning a 1 TB gp2 when 100 GB gp3 with 3,000 IOPS suffices. st1/sc1 are cheaper per GB but have HDD latency; never use them for database data files.
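
The over-provisioning trap follows from how gp2 derives IOPS from size. The sketch below encodes the gp2 baseline rule (3 IOPS per GiB, with a floor of 100 and a cap of 16,000) and inverts it; burst credits are ignored, and the function names are hypothetical.

```typescript
// gp2 baseline IOPS are tied to size; gp3 includes 3,000 IOPS at any size.
function gp2BaselineIops(sizeGiB: number): number {
  return Math.min(16000, Math.max(100, 3 * sizeGiB));
}

// Smallest gp2 volume whose baseline reaches the requested IOPS.
function gp2SizeForIops(targetIops: number): number {
  if (targetIops > 16000) throw new RangeError("gp2 tops out at 16,000 IOPS");
  return Math.ceil(targetIops / 3);
}

const GP3_INCLUDED_IOPS = 3000;

const smallGp2: number = gp2BaselineIops(100);                   // 300 baseline IOPS on 100 GiB
const gp2SizeToMatchGp3: number = gp2SizeForIops(GP3_INCLUDED_IOPS); // 1,000 GiB
```

In other words, matching the IOPS that a 100 GiB gp3 volume includes would take a gp2 volume ten times the size, which is where the capacity waste comes from.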

Snapshots and encryption

EBS snapshots are incremental backups stored in S3 (AWS-managed; they do not appear in your buckets). Copy them cross-region for DR; use DLM for automated schedules. Enable EBS encryption by default, an account setting that is applied per Region, so that all new volumes use KMS; there is no performance penalty on Nitro.

Multi-Attach (io1/io2 only)

One volume on up to 16 Nitro instances in the same AZ; requires a cluster-aware file system or application (GFS2, Oracle RAC). Standard ext4/xfs without clustering will corrupt data. Use EFS for shared POSIX storage without cluster software.

EBS vs instance store

Feature	EBS	Instance store
Persistence	Survives instance stop/start; independent lifecycle	Ephemeral — lost on stop, terminate, or hardware failure
Performance	Network-attached; gp3/io2 predictable	Local NVMe: lowest latency, highest IOPS on storage-optimized (I-family) and "d"-suffixed instances such as m5d or r5d
Use case	Boot volumes, databases, anything that must persist	Caches, temp processing, Kafka log dirs (with replication), Spark shuffle
Snapshots	Yes — incremental to S3	No — data gone when instance gone

Create a gp3 volume and attach

AWS CLI:
aws ec2 create-volume \
  --availability-zone eu-west-1a \
  --size 100 \
  --volume-type gp3 \
  --iops 3000 \
  --throughput 125 \
  --encrypted \
  --kms-key-id alias/my-app-ebs-key \
  --tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=order-db-data}]'

# Attach to running instance (data volume — not root)
aws ec2 attach-volume \
  --volume-id vol-0abc123def456 \
  --instance-id i-0fedcba987654 \
  --device /dev/sdf

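# On the instance: find the device, then mkfs, mount, and persist in fstab.
# On Nitro instances /dev/sdf shows up as an NVMe device; confirm the name with lsblk.
# lsblk
# sudo mkfs -t xfs /dev/nvme1n1
# sudo mkdir -p /data && sudo mount /dev/nvme1n1 /data
# echo "UUID=$(sudo blkid -s UUID -o value /dev/nvme1n1) /data xfs defaults,nofail 0 2" | sudo tee -a /etc/fstab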

Terraform:
resource "aws_ebs_volume" "order_db_data" {
  availability_zone = "eu-west-1a"
  size              = 100
  type              = "gp3"
  iops              = 3000
  throughput        = 125
  encrypted         = true
  kms_key_id        = aws_kms_key.ebs.arn

  tags = { Name = "order-db-data" }
}

resource "aws_volume_attachment" "order_db" {
  device_name = "/dev/sdf"
  volume_id   = aws_ebs_volume.order_db_data.id
  instance_id = aws_instance.order_db.id
}

AWS CDK (TypeScript):
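import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';

// `key` is an existing kms.IKey defined elsewhere in the stack.
new ec2.Volume(this, 'OrderDbData', {
  availabilityZone: 'eu-west-1a',
  size: cdk.Size.gibibytes(100),
  volumeType: ec2.EbsDeviceVolumeType.GP3,
  iops: 3000,
  throughput: 125,
  encrypted: true,
  encryptionKey: key,
});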

🎯 Exam Tip

gp3 when the question says "general purpose SSD" or "cost-optimize boot volume." io2 when IOPS > 16,000 or Multi-Attach is required. st1 for streaming sequential reads (log processing), not random I/O. Instance store when latency matters and data is ephemeral/replicated elsewhere.

EFS — shared file storage

Amazon EFS is managed NFS (Network File System) — multiple EC2 instances, ECS tasks, and Lambda functions mount the same filesystem concurrently. Regional and Multi-AZ by default. Pay for what you use; no capacity planning like EBS.

NFS fundamentals

Mount EFS with standard NFS clients (mount -t nfs4 or EFS mount helper). Access via security groups (port 2049) and EFS access points for per-application POSIX identities. Works across all AZs in a region — unlike EBS, no AZ lock-in.

Performance modes

Mode	Latency	Throughput	Use case
General Purpose	Lowest — default	Scales with size (baseline) or provisioned	Web serving, CMS, dev environments, most workloads
Max I/O	Higher, with more variance	Higher aggregate, more ops/sec	Big data, media processing, many concurrent clients

Throughput modes

Bursting — throughput scales with filesystem size; burst credits for spikes (default, cost-effective for small FS)
Provisioned — set throughput independent of size; for consistent high throughput on small datasets
Elastic (recommended): automatically scales throughput up and down; you pay for the data actually read and written rather than for provisioned capacity; replaces Bursting for most new workloads

EFS storage classes

Standard for active files; a lifecycle policy moves files that have not been accessed for a set period (for example 30 days) to EFS-IA. One Zone variants suit dev/test: lower cost, no cross-AZ redundancy.

Selection guide — EFS vs EBS vs S3

Requirement	Choose	Why
Shared files across many EC2/ECS instances	EFS	POSIX NFS; concurrent mounts; no clustering software needed
Database data directory (PostgreSQL, MySQL)	EBS gp3/io2	Block storage; low latency; single instance (or Multi-Attach + cluster FS)
Boot volume for EC2	EBS gp3	Only EBS supports root block devices
Static assets, backups, data lake, user uploads	S3	Object storage; HTTP access; unlimited scale; cheapest at rest
Content management with file locking semantics	EFS	POSIX file locking; WordPress/Drupal shared uploads directory
Lowest latency scratch space on one instance	Instance store	Local NVMe; ephemeral; no network hop
Serve images/PDFs to browsers globally	S3 + CloudFront	HTTP CDN; not a filesystem mount from app servers
Lambda processing shared input files	EFS	Lambda mounts EFS through an access point (the function must be attached to the VPC); S3 for larger immutable inputs

⚖️ Trade-off

EFS vs EBS for shared storage: EFS is simpler (true shared NFS) but higher cost per GB and higher latency than EBS. Don't mount EFS for database files — use RDS/Aurora instead. EFS shines for shared config, upload directories, and CI artifact caches across a fleet.

📦 Real World

ECS services running WordPress or legacy Java apps with local file uploads often use EFS access points — one access point per service with isolated root directory and POSIX user mapping. Modern greenfield apps store uploads in S3 and keep containers stateless; EFS is the bridge for lift-and-shift.

Storage patterns

Production storage is never one service — it's backup tiers, cross-region copies, lifecycle automation, and serving static assets without hitting your Spring Boot app. These patterns appear in every Well-Architected review and Solutions Architect exam scenario.

Backup strategy — 3-2-1 on AWS

Three copies (prod + snapshot + cross-region), two media types (EBS block + S3 object), one offsite (CRR or AWS Backup vault copy).

Source	Backup mechanism	Restore target
EBS volumes	DLM snapshots → optional cross-region copy	New volume in any AZ (same region) or DR region
RDS / Aurora	Automated backups + manual snapshots	Point-in-time restore, cross-region read replica
S3 buckets	Versioning + CRR + lifecycle to Glacier	Restore version, fail over to DR bucket
EFS	AWS Backup or EFS-to-EFS backup	Restore to new filesystem
Hybrid / on-prem	AWS Storage Gateway or DataSync → S3	EC2 in cloud or reverse sync

Cross-region DR with S3

Primary bucket with versioning + CRR to a DR region; lifecycle moves DR copies to Standard-IA. On regional failure, point config/DNS at the DR bucket. Enable RTC when you need a predictable RPO of about 15 minutes; without it, replication is asynchronous with no time guarantee.

Spring Boot static assets on S3 + CloudFront

Don't serve static JS/CSS from your JVM — offload to S3 behind CloudFront with Origin Access Control (OAC). CI uploads hashed assets; Spring templates reference CDN URLs in prod.

CI runs aws s3 sync with long cache on hashed assets, short cache on index.html
CloudFront OAC → private S3 origin; ACM cert on custom domain
Spring Boot serves API only in prod; templates reference CDN URLs, not classpath static

# CI deploy after ./mvnw package
aws s3 sync target/classes/static/ s3://my-app-static-prod/assets/ \
  --cache-control "public, max-age=31536000, immutable" --exclude "index.html"
aws s3 cp target/classes/static/index.html s3://my-app-static-prod/index.html \
  --cache-control "public, max-age=60"
# Hashed assets get new names on every build, so only index.html needs invalidating
aws cloudfront create-invalidation --distribution-id E1234567890ABC \
  --paths "/index.html"
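
If the upload step is scripted rather than run as two CLI commands, the same caching policy can be expressed as one function. This is a sketch under an assumption about file naming: the pattern below treats eight or more hex characters before the extension as a content hash, which depends on your build tool. `cacheControlFor` is a hypothetical helper.

```typescript
// Fingerprinted assets never change under the same name, so they can be cached for a year.
// Everything else (index.html, unhashed files) gets a short lifetime.
// Assumption: hashed names look like app.3f9a1c2b.js or main-0123456789abcdef.css.
const HASHED_NAME = /[.-][0-9a-f]{8,}\.[a-z0-9]+$/i;

function cacheControlFor(path: string): string {
  if (HASHED_NAME.test(path)) return "public, max-age=31536000, immutable";
  return "public, max-age=60";
}

// cacheControlFor("assets/app.3f9a1c2b.js") -> "public, max-age=31536000, immutable"
// cacheControlFor("index.html")             -> "public, max-age=60"
// cacheControlFor("assets/logo.png")        -> "public, max-age=60"
```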

Operational checklist

Account-level Block Public Access + SCP deny on public bucket policies; review IAM Access Analyzer for S3 findings weekly
Versioning on prod buckets; lifecycle expires noncurrent versions; abort incomplete MPU after 7 days
CRR for critical buckets; EBS snapshot cross-region copy via DLM or AWS Backup
EBS encryption by default; S3 SSE-KMS with Bucket Key; org-wide Storage Lens for cost anomalies

🔒 Security

Enable S3 Object Ownership = Bucket owner enforced to disable ACLs — simplifies permissions to bucket policies and IAM only. For ransomware resilience: versioning + Object Lock (compliance mode) + cross-account replication to a security account where the destination bucket policy denies deletes from prod roles.

🎯 Exam Tip

When the scenario says "share files across 50 EC2 instances in multiple AZs" → EFS. "Lowest cost archival with retrieval in milliseconds" → Glacier Instant Retrieval, not Deep Archive. "Prevent accidental deletion for compliance" → Object Lock or MFA Delete on versioning. "Static website with global users" → S3 + CloudFront, not public S3 website endpoint alone.