AWS Core

Learn AWS the way production teams actually run it

Amazon Web Services through the lens of a backend Java/Spring engineer who already knows Docker, Kubernetes, Kafka, and microservices. No "what is cloud computing" — straight to IAM policy evaluation, VPC design, Graviton economics, Multi-AZ failover, and what actually happens inside AWS when you call CreateBucket. Every resource shown three ways: AWS CLI + Terraform + CDK.

What this site covers

IAM done right — roles, policies, federation, least privilege in practice
Production VPC design — three-tier networks, endpoints, NAT economics
EC2/ECS/EKS/Lambda — when to use which, with Java-specific guidance
S3, RDS, Aurora, DynamoDB, ElastiCache — internals and failure modes
Cost: pricing model + estimation formula + optimization per service
Security, encryption, and network isolation for every service
Solutions Architect Associate/Professional exam angles throughout

What it deliberately skips

"What is the cloud" intros — you ship software already
Console-only click-through tutorials with no IaC
Niche services you'll never touch (Ground Station, RoboMaker…)
Machine-learning service catalog tours — see Applied AI instead
Docker/K8s fundamentals — covered in DockerCore and K8sCore

The AWS mental model: everything is an API call

There is no magic in the console. Every button click, every CLI command, every Terraform apply, every CDK deploy becomes a signed HTTPS request to a regional service endpoint like ec2.eu-west-1.amazonaws.com. Once you internalize this, AWS stops being 200+ products and becomes one consistent system: authenticate → authorize (IAM) → execute → audit (CloudTrail).

Caller

Console, CLI, SDK, Terraform, CDK — all build the same API request

SigV4 signature

Request signed with credentials (never sent raw) + timestamp + region + service

IAM evaluation

Who are you? Is this action on this resource allowed? Explicit deny wins

Service executes

Control plane mutates state; data plane serves traffic

CloudTrail

Who called what API, when, from where — the audit trail
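To make the SigV4 step concrete, here is a minimal Python sketch of the signing-key derivation, using only the standard library. The secret key, date, and canonical-request placeholder are made-up values, and the function names are illustrative rather than part of any SDK. Building the canonical request itself (method, path, headers, payload hash) is left out.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """One HMAC-SHA256 step; returns raw bytes so the steps can be chained."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive a signing key scoped to one day, one region, and one service."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(string_to_sign: str, signing_key: bytes) -> str:
    """Final signature: hex-encoded HMAC of the string-to-sign."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder inputs, for illustration only.
canonical_request_hash = hashlib.sha256(b"<canonical request goes here>").hexdigest()
string_to_sign = "\n".join([
    "AWS4-HMAC-SHA256",                     # algorithm
    "20240115T120000Z",                     # request timestamp
    "20240115/eu-west-1/ec2/aws4_request",  # credential scope
    canonical_request_hash,
])
signing_key = derive_signing_key("EXAMPLE-SECRET-KEY", "20240115", "eu-west-1", "ec2")
signature = sign(string_to_sign, signing_key)
```

The secret key only ever feeds the first HMAC, which is why it never travels with the request, and why a signature computed for one region or service is useless against another.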

🔬 Under the Hood

Control plane vs data plane: AWS services split management operations (RunInstances, CreateTable — low volume, strongly consistent, regional) from data operations (GetObject, Query — massive volume, highly available, often zonal). During large AWS incidents the data plane usually keeps serving while the control plane is degraded — which is why "design so you don't need control-plane actions during failover" is a core resilience principle.
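One way to read that principle is as a sizing rule: run enough capacity in every AZ up front that losing one AZ requires no RunInstances call at all. The sketch below is a simplified calculation under that assumption; the function name and the example numbers are hypothetical.

```python
import math


def instances_per_az(peak_instances: int, az_count: int) -> int:
    """Instances to keep running in each AZ so that peak load still fits
    on the surviving AZs after one AZ is lost, with no scaling action."""
    if az_count < 2:
        raise ValueError("need at least two AZs to survive the loss of one")
    if peak_instances < 0:
        raise ValueError("peak_instances must be non-negative")
    return math.ceil(peak_instances / (az_count - 1))


# Hypothetical workload: 12 instances at peak, spread over 3 AZs.
per_az = instances_per_az(12, 3)
total = per_az * 3
```

With these inputs each AZ runs 6 instances, 18 in total, so the price of not depending on the control plane during failover is 50% extra capacity. Spreading over more AZs lowers that overhead.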

🎯 Exam Tip

IAM, CloudFront, Route53, and WAF (for CloudFront) are global; almost everything else is regional. S3 bucket names are global but buckets live in one region. Exam questions love testing whether you know a resource's scope.
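Scope shows up directly in endpoint hostnames. The following sketch assumes the standard aws partition and a deliberately tiny set of global services; real SDKs resolve endpoints through much larger rule sets with per-service exceptions, so treat this as an illustration of the idea only.

```python
# Services whose API endpoint carries no region in the standard partition.
# A deliberately small, illustrative set.
GLOBAL_SERVICES = {"iam", "cloudfront", "route53"}


def endpoint_for(service: str, region: str) -> str:
    """Return the API hostname for a service under the simplified rule above."""
    if service in GLOBAL_SERVICES:
        return f"{service}.amazonaws.com"
    return f"{service}.{region}.amazonaws.com"


regional = endpoint_for("ec2", "eu-west-1")
global_ = endpoint_for("iam", "eu-west-1")
```

Here `regional` is `ec2.eu-west-1.amazonaws.com`, while the IAM call ignores the region and resolves to `iam.amazonaws.com`.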

ARN anatomy — how AWS names everything

Every resource has an Amazon Resource Name. You will write hundreds of them in IAM policies, so learn the segments cold. The general shape is six colon-separated segments:

arn:partition:service:region:account-id:resource
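A small Python sketch of splitting an ARN into its segments (partition, service, region, account ID, resource). The `Arn` type and `parse_arn` function are illustrative names, and the account ID and resource names are made up.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str      # empty for global services and for S3 buckets
    account_id: str  # empty for S3 buckets
    resource: str    # may itself contain ':' or '/'


def parse_arn(arn: str) -> Arn:
    """Split an ARN into its five segments after the literal 'arn' prefix."""
    parts = arn.split(":", 5)  # maxsplit keeps colons inside the resource part
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return Arn(*parts[1:])


bucket = parse_arn("arn:aws:s3:::my-app-artifacts")
role = parse_arn("arn:aws:iam::123456789012:role/app-role")
fn = parse_arn("arn:aws:lambda:eu-west-1:123456789012:function:my-fn:1")
```

The empty segments are meaningful: the bucket ARN has no region or account because bucket names are globally unique, and the role ARN has no region because IAM is global.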

One resource, three tools — the pattern used everywhere on this site

Every resource example on this site ships as AWS CLI + Terraform + CDK. Pick your tool once — the choice persists across all pages. Here's a production-grade S3 bucket (encrypted, versioned, no public access):

AWS CLI:

aws s3api create-bucket --bucket my-app-artifacts --region us-east-1

aws s3api put-public-access-block --bucket my-app-artifacts \
  --public-access-block-configuration \
  BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

aws s3api put-bucket-versioning --bucket my-app-artifacts \
  --versioning-configuration Status=Enabled

aws s3api put-bucket-encryption --bucket my-app-artifacts \
  --server-side-encryption-configuration \
  '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"aws:kms"}}]}'

Terraform:

resource "aws_s3_bucket" "artifacts" {
  bucket = "my-app-artifacts"
}

resource "aws_s3_bucket_public_access_block" "artifacts" {
  bucket                  = aws_s3_bucket.artifacts.id
  block_public_acls       = true
  ignore_public_acls      = true
  block_public_policy     = true
  restrict_public_buckets = true
}

resource "aws_s3_bucket_versioning" "artifacts" {
  bucket = aws_s3_bucket.artifacts.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "artifacts" {
  bucket = aws_s3_bucket.artifacts.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

CDK (TypeScript):

import * as s3 from 'aws-cdk-lib/aws-s3';
import { RemovalPolicy } from 'aws-cdk-lib';

new s3.Bucket(this, 'Artifacts', {
  bucketName: 'my-app-artifacts',
  encryption: s3.BucketEncryption.KMS_MANAGED,
  blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
  versioned: true,
  removalPolicy: RemovalPolicy.RETAIN, // never delete data by accident
});

💡 Pro Tip

Notice the CDK version is about 10 lines versus roughly 27 for Terraform; L2 constructs encode AWS best practices as defaults. The CLI version requires four separate API calls because that's what's really happening: there is no single "create secure bucket" API. The console hides this; IaC doesn't.

Global infrastructure (2024 numbers)

AWS is physically organized as a hierarchy. Each layer trades blast-radius isolation against latency and cost. Knowing this hierarchy is the foundation for every availability decision you'll make.

Regions

Fully independent geographic deployments (us-east-1, eu-west-1). Separate power grids, separate control planes. Your data never leaves a region unless you move it.
33 regions worldwide

Availability Zones

One or more discrete data centers with independent power, cooling, and networking — connected by <2ms fiber. The unit of fault isolation for Multi-AZ design.
105 AZs · 3+ per region

Local Zones

Compute/storage extensions in metro areas (Los Angeles, Lagos) for single-digit-ms latency to end users. Subset of services, parent-region control plane.
35+ metro areas

Wavelength Zones

AWS compute embedded inside 5G carrier networks (Verizon, Vodafone) — traffic never leaves the telecom network. Ultra-low-latency mobile edge.
Inside carrier 5G networks

Edge Locations

CloudFront points of presence — caching, Lambda@Edge, Route53 DNS, AWS Shield. Far more numerous than regions; this is where your users actually connect.
600+ CloudFront PoPs · 90+ countries

⚠️ Pitfall

us-east-1 (N. Virginia) is special and dangerous. It's the oldest and biggest region and among the cheapest, it hosts global control planes (IAM, for example, and ACM certificates for CloudFront must live there), and it has historically had the most outages. Don't put your only production deployment there just because tutorials do. Also: inter-region and inter-AZ data transfer costs real money, so region choice is a cost decision, not just a latency one.

🎯 Exam Tip

Multi-AZ = high availability (survive an AZ failure, synchronous, same region). Multi-region = disaster recovery + latency (survive a region failure, asynchronous). The exam tests this distinction constantly: "company needs RPO of seconds across geographic failure" → multi-region; "survive data center failure with no data loss" → Multi-AZ.
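The availability half of this distinction can be put into numbers. The sketch below computes the combined availability of redundant copies under the strong assumption that they fail independently, which real AZs and regions only approximate. The 0.99 figure is an illustrative input, not an AWS SLA.

```python
def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant deployments, assuming independent
    failures: the system is down only when every copy is down at once."""
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be a probability between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - (1.0 - single) ** copies


# Illustrative: one deployment that is up 99% of the time, run in 1, 2, or 3 AZs.
one_az = combined_availability(0.99, 1)
two_az = combined_availability(0.99, 2)
three_az = combined_availability(0.99, 3)
```

Under this model each extra copy multiplies the downtime probability by 0.01, so two AZs give roughly 99.99%. The formula says nothing about data loss: that depends on whether replication is synchronous (Multi-AZ) or asynchronous (multi-region).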

The shared responsibility model

AWS secures the cloud; you secure what you put in the cloud. Almost every "AWS data breach" in the news was a customer misconfiguration — a public S3 bucket, leaked access keys, an overprivileged role. The line moves depending on the service: more managed = more AWS responsibility.

AWS's side of the line (security of the cloud):

Physical security — data centers, hardware, network infrastructure, supply chain
Hypervisor & host OS — the Nitro system isolating your EC2 instances from other tenants
Managed service infrastructure — RDS host patching, Lambda runtime environment, S3 storage durability (11 nines)
Global network — DDoS absorption at the edge (Shield Standard), inter-region backbone encryption
Compliance of the platform — SOC 2, ISO 27001, PCI DSS certification of AWS infrastructure itself

Your side of the line (security in the cloud):

IAM — who can call which APIs on which resources. The single most common failure point
OS patches — on EC2 you patch the guest OS (on RDS/Fargate/Lambda, AWS does)
Application code — your vulnerabilities are your problem, wherever they run
Data encryption — enabling encryption at rest, enforcing TLS in transit, key policies
Network configuration — security groups, NACLs, public vs private subnets, S3 Block Public Access
Compliance of your workload — AWS being PCI-certified doesn't make your app PCI-compliant

📦 Real World

The 2019 Capital One breach: a misconfigured WAF allowed SSRF against the EC2 metadata service, which handed over credentials for an overprivileged IAM role with s3:List* and s3:Get* on everything — 100M records exfiltrated. Every control that failed was on the customer side of the line: IMDSv1 (not v2), role scope, bucket policies. This single incident motivates half of the IAM chapter.
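To show why IMDSv2 matters here, the sketch below builds (but does not send) the two requests of the IMDSv2 session flow with Python's standard library. The helper names are hypothetical, and the code only produces a response when run on an EC2 instance and passed to `urllib.request.urlopen`.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"  # link-local metadata address


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: a PUT that asks for a session token with a TTL header."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(path: str, token: str) -> urllib.request.Request:
    """Step 2: a GET that must carry the token obtained in step 1."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/meta-data/{path.lstrip('/')}",
        headers={"X-aws-ec2-metadata-token": token},
    )


token_req = build_token_request()
# "EXAMPLE-TOKEN" stands in for the body returned by the token request.
creds_req = build_metadata_request("iam/security-credentials/", "EXAMPLE-TOKEN")
```

A typical SSRF bug lets an attacker make the server issue a plain GET to a chosen URL. It usually cannot force a PUT with a custom header, so with IMDSv1 disabled the credential path in step 2 is much harder to reach.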

The AWS service map — what you actually need

AWS lists 200+ services. A production backend uses maybe 25. This is the map of the ones that matter, grouped the way this guide is organized.

Compute

EC2 — virtual machines, the foundation
ECS / Fargate — containers, AWS-native
EKS — managed Kubernetes
Lambda — event-driven functions

Storage

S3 — object storage, 11 nines durability
EBS — block storage for EC2
EFS — shared NFS across AZs
Glacier — archival tiers

Database

RDS / Aurora — managed PostgreSQL/MySQL
DynamoDB — serverless NoSQL, single-digit ms
ElastiCache — managed Redis/Memcached
Redshift — columnar analytics warehouse

Networking

VPC — your isolated network
Route53 — DNS + routing policies
CloudFront — CDN, 600+ PoPs
ALB / NLB — L7 / L4 load balancing
API Gateway — managed API front door

Messaging

SQS — queues, the workhorse
SNS — pub/sub fan-out
EventBridge — event bus + scheduler
Kinesis — real-time streams (Kafka-like)

Security

IAM — identity and access, learn this first
KMS — encryption keys, envelope encryption
Secrets Manager — rotated credentials
WAF + Shield + GuardDuty — defense layers

Developer Tools

ECR — container registry
CodePipeline / CodeBuild — AWS-native CI/CD
CDK — infrastructure in TypeScript/Java
CloudFormation — the IaC engine underneath

Observability

CloudWatch — metrics, logs, alarms
X-Ray — distributed tracing
CloudTrail — API audit log

⚖️ Trade-off

AWS-native vs open-source: SQS vs RabbitMQ, Kinesis vs Kafka, DynamoDB vs Cassandra, ECS vs Kubernetes. AWS-native means less ops burden and deep IAM integration but vendor lock-in; open-source means portability and ecosystem but you (or MSK/EKS pricing) carry the operational load. Each chapter covers the decision explicitly.

Account structure — the part everyone gets wrong first

An AWS account is the strongest isolation boundary AWS offers — stronger than VPCs, stronger than IAM. Mature organizations run dozens of accounts under AWS Organizations. One account per environment is the minimum bar.

flowchart TB
  MGMT["Management account\n(billing only — no workloads!)"]
  ORG["AWS Organizations\n+ Service Control Policies"]
  SEC["Security OU"]
  WORK["Workloads OU"]
  SAND["Sandbox OU"]
  LOG["Log archive\naccount"]
  AUDIT["Security tooling\naccount"]
  DEV["dev\naccount"]
  STG["staging\naccount"]
  PROD["prod\naccount"]
  DEVS["per-engineer\nsandboxes"]
  MGMT --> ORG
  ORG --> SEC
  ORG --> WORK
  ORG --> SAND
  SEC --> LOG
  SEC --> AUDIT
  WORK --> DEV
  WORK --> STG
  WORK --> PROD
  SAND --> DEVS

Management account — owns the Organization and consolidated billing. Run zero workloads here.
One account per environment — dev, staging, prod. NEVER share an account between environments: a dev script with a wildcard delete cannot touch prod if prod is a different account.
Security accounts — CloudTrail logs and GuardDuty findings delivered to an account application teams can't write to.
SCPs as guardrails — org-level policies that even account admins can't bypass: deny leaving the org, deny disabling CloudTrail, deny unapproved regions.
Blast radius — leaked credentials, runaway costs, and quota exhaustion are all contained per account.
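The guardrail behavior of SCPs can be sketched as a toy evaluator: an explicit deny anywhere wins, and otherwise an action needs an allow from both the SCP and the identity policy. This is a heavily simplified model written for illustration. It ignores resources, conditions, permission boundaries, and resource-based policies, and it matches action names case-sensitively, which real IAM does not.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple, Sequence, Tuple


class Statement(NamedTuple):
    effect: str               # "Allow" or "Deny"
    actions: Tuple[str, ...]  # patterns such as "s3:Get*" or "*"


def _matches(statements: Sequence[Statement], effect: str, action: str) -> bool:
    """True if any statement with this effect covers the action."""
    return any(
        s.effect == effect and any(fnmatchcase(action, p) for p in s.actions)
        for s in statements
    )


def is_allowed(action: str, scp: Sequence[Statement], identity: Sequence[Statement]) -> bool:
    """Explicit deny wins; otherwise both layers must allow; default is deny."""
    if _matches(scp, "Deny", action) or _matches(identity, "Deny", action):
        return False
    return _matches(scp, "Allow", action) and _matches(identity, "Allow", action)


# Hypothetical org guardrail plus an account administrator's policy.
scp = [Statement("Allow", ("*",)), Statement("Deny", ("cloudtrail:StopLogging",))]
admin = [Statement("Allow", ("*",))]

can_stop_trail = is_allowed("cloudtrail:StopLogging", scp, admin)
can_read_s3 = is_allowed("s3:GetObject", scp, admin)
```

Even with `Allow *` in the account, `can_stop_trail` is False because the SCP deny sits above the account, while `can_read_s3` is True. An identity with no policy at all gets nothing, since the SCP only filters permissions and never grants them.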

🔒 Security

Root user checklist (do this today): enable MFA (hardware key for the management account), delete root access keys, set a strong unique password, document root recovery in a break-glass procedure, and never use root for daily work. Root bypasses all IAM policies: it can do anything, including closing the account.

⚠️ Pitfall

Starting with one account "to keep it simple" and untangling it two years later is one of the most expensive migrations in cloud engineering — resources, IAM history, and data gravity all resist moving. Use AWS Organizations from day one even if you only create three accounts.

AWS timeline — how the platform was built

The launch order explains the architecture: storage and queues first, compute second, managed databases third, serverless and containers nearly a decade later. Older services are lower-level; newer ones abstract them away.

2006
S3 + SQS launch · EC2 beta

The original trio: durable objects, reliable queues, rentable VMs. Everything since builds on these primitives.
2009
RDS · VPC

Managed relational databases and software-defined private networking — the enterprise unlock.
2012
DynamoDB · Glacier

Serverless NoSQL born from the Dynamo paper (Amazon's own shopping-cart database), plus deep archival storage.
2014
Lambda · ECS

Functions-as-a-service invents the serverless category; ECS answers Docker's rise with AWS-native orchestration.
2016
Application Load Balancer

Layer-7 routing, host/path rules, WebSocket, HTTP/2 — the default front door for containerized services.
2017–2019
Fargate preview → EKS → Fargate GA

Serverless containers and managed Kubernetes — AWS concedes K8s won the orchestration war while betting Fargate abstracts it away.
2020–2022
Graviton2 → Graviton3

AWS-designed ARM CPUs: 20–40% better price/performance. The biggest free cost optimization most teams ignore.
2023
Bedrock

Managed foundation-model APIs (Claude, Llama, Titan) — AWS's entry into the GenAI platform race.
2024
Graviton4

Fourth-generation ARM with 75% more memory bandwidth than Graviton3. Powers the r8g memory-optimized instances.

Explore the guide — all sections

Thirteen deep-dive chapters plus cheat sheets. Recommended path: IAM → VPC → Compute → Storage → Databases, then messaging, edge, and architecture patterns as your role requires.

Learning path: IAM · VPC & Networking · Compute · Storage · Databases · Cost

Backend Developer

Java/Spring engineer deploying a first production workload: EC2, ECS, RDS, S3, SQS/SNS, API Gateway, and enough IAM to not get the account compromised.

DevOps / Platform Engineer

Reusable infrastructure with Terraform/CDK: VPCs, EKS clusters, multi-account strategy, cost dashboards, CI/CD pipelines on AWS.

Solutions Architect

Resilient multi-region systems, AWS-native vs open-source selection, build-vs-buy decisions, SA Associate/Professional exam preparation.

IAM: Identity & Access Management

VPC & Networking

Compute: EC2, ECS, EKS & Lambda

Storage: S3, EBS & EFS

Databases: RDS, Aurora, DynamoDB & ElastiCache

Messaging: SQS, SNS, EventBridge & Kinesis

API Gateway, ALB & CloudFront

Observability: CloudWatch, X-Ray & CloudTrail

Security Services

Infrastructure as Code: CDK, Terraform & CloudFormation

Cost Management & Optimization

Production Architecture Patterns

Cheat Sheets