AWS Core

Learn AWS the way production teams actually run it

Amazon Web Services through the lens of a backend Java/Spring engineer who already knows Docker, Kubernetes, Kafka, and microservices. No "what is cloud computing" — straight to IAM policy evaluation, VPC design, Graviton economics, Multi-AZ failover, and what actually happens inside AWS when you call CreateBucket. Every resource shown three ways: AWS CLI + Terraform + CDK.

What this site covers

  • IAM done right — roles, policies, federation, least privilege in practice
  • Production VPC design — three-tier networks, endpoints, NAT economics
  • EC2/ECS/EKS/Lambda — when to use which, with Java-specific guidance
  • S3, RDS, Aurora, DynamoDB, ElastiCache — internals and failure modes
  • Cost: pricing model + estimation formula + optimization per service
  • Security, encryption, and network isolation for every service
  • Solutions Architect Associate/Professional exam angles throughout

What it deliberately skips

  • "What is the cloud" intros — you ship software already
  • Console-only click-through tutorials with no IaC
  • Niche services you'll never touch (Ground Station, RoboMaker…)
  • Machine-learning service catalog tours — see Applied AI instead
  • Docker/K8s fundamentals — covered in DockerCore and K8sCore

The AWS mental model: everything is an API call

There is no magic in the console. Every button click, every CLI command, every Terraform apply, every CDK deploy becomes a signed HTTPS request to a regional service endpoint like ec2.eu-west-1.amazonaws.com. Once you internalize this, AWS stops being 200+ products and becomes one consistent system: authenticate → authorize (IAM) → execute → audit (CloudTrail).
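
As a minimal sketch of what "a signed HTTPS request to a regional service endpoint" looks like, the helpers below build the endpoint host name and the SigV4 credential scope for a service and region. The function names are hypothetical and not part of any AWS SDK, and the host pattern is an assumption that only holds for standard commercial regions; global services and other partitions (China, GovCloud) use different hosts.

```typescript
// Hypothetical helpers for illustration; not part of any AWS SDK.

// Standard regional endpoint pattern, e.g. ec2.eu-west-1.amazonaws.com.
// Assumption: commercial "aws" partition only.
function regionalEndpoint(service: string, region: string): string {
  return `${service}.${region}.amazonaws.com`;
}

// SigV4 credential scope: <YYYYMMDD>/<region>/<service>/aws4_request.
// The signature is bound to this scope, so a captured request cannot be
// replayed against another region, service, or day.
function credentialScope(date: Date, region: string, service: string): string {
  const yyyymmdd = date.toISOString().slice(0, 10).split("-").join("");
  return `${yyyymmdd}/${region}/${service}/aws4_request`;
}

const requestTime = new Date(Date.UTC(2024, 0, 15));
console.log(regionalEndpoint("ec2", "eu-west-1"));
console.log(credentialScope(requestTime, "eu-west-1", "ec2"));
```

Because region and service are baked into the scope, the same credentials produce a different signature for every endpoint they talk to.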

Caller

Console, CLI, SDK, Terraform, CDK — all build the same API request

SigV4 signature

Request signed with credentials (never sent raw) + timestamp + region + service

IAM evaluation

Who are you? Is this action on this resource allowed? Explicit deny wins

Service executes

Control plane mutates state; data plane serves traffic

CloudTrail

Who called what API, when, from where — the audit trail
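
The "explicit deny wins" rule from the IAM evaluation step can be sketched as a tiny evaluator. This is a deliberately simplified model, not the real IAM engine: it assumes identity policies only and ignores conditions, SCPs, permission boundaries, resource policies, the `?` wildcard, and case-insensitive action matching. All type and function names are hypothetical.

```typescript
type Effect = "Allow" | "Deny";
type Decision = "ExplicitDeny" | "Allow" | "ImplicitDeny";

interface Statement {
  effect: Effect;
  actions: string[];   // e.g. "s3:GetObject" or "s3:*"
  resources: string[]; // ARNs; "*" wildcards allowed
}

// Turn a pattern with "*" wildcards into an anchored RegExp and test it.
function matches(pattern: string, value: string): boolean {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`).test(value);
}

// Order of precedence: any matching Deny, then any matching Allow,
// otherwise the request is denied by default.
function evaluate(statements: Statement[], action: string, resource: string): Decision {
  const applicable = statements.filter(
    (s) =>
      s.actions.some((a) => matches(a, action)) &&
      s.resources.some((r) => matches(r, resource)),
  );
  if (applicable.some((s) => s.effect === "Deny")) return "ExplicitDeny";
  if (applicable.some((s) => s.effect === "Allow")) return "Allow";
  return "ImplicitDeny";
}

const policy: Statement[] = [
  { effect: "Allow", actions: ["s3:*"], resources: ["*"] },
  { effect: "Deny", actions: ["s3:DeleteBucket"], resources: ["*"] },
];

console.log(evaluate(policy, "s3:GetObject", "arn:aws:s3:::my-app-artifacts/build.jar")); // Allow
console.log(evaluate(policy, "s3:DeleteBucket", "arn:aws:s3:::my-app-artifacts"));        // ExplicitDeny
console.log(evaluate(policy, "ec2:RunInstances", "*"));                                   // ImplicitDeny
```

Note the third case: nothing mentions EC2, so the answer is still a deny. Access in AWS is closed by default.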

🔬 Under the Hood

Control plane vs data plane: AWS services split management operations (RunInstances, CreateTable — low volume, strongly consistent, regional) from data operations (GetObject, Query — massive volume, highly available, often zonal). During large AWS incidents the data plane usually keeps serving while the control plane is degraded — which is why "design so you don't need control-plane actions during failover" is a core resilience principle.
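
One way to apply that principle is to pre-provision enough capacity that the surviving AZs can absorb peak load without any RunInstances call (AWS calls this static stability). The sketch below works through the arithmetic under a simple assumption: identical instances, load spread evenly, and one AZ lost at a time. The function names and the 12-instance figure are hypothetical.

```typescript
// Instances to keep running in EACH AZ so that losing one AZ still leaves
// enough capacity for peak load, with no scale-out (control-plane) call.
function instancesPerAz(peakInstances: number, azCount: number): number {
  if (azCount < 2) throw new Error("need at least 2 AZs to survive an AZ loss");
  return Math.ceil(peakInstances / (azCount - 1));
}

// Extra capacity paid for in steady state, as a fraction of peak.
function overProvisionRatio(peakInstances: number, azCount: number): number {
  const total = instancesPerAz(peakInstances, azCount) * azCount;
  return total / peakInstances - 1;
}

console.log(instancesPerAz(12, 3));     // 6 per AZ, 18 in total
console.log(overProvisionRatio(12, 3)); // 0.5 (50% headroom)
console.log(overProvisionRatio(12, 2)); // 1 (100% headroom)
```

Going from two AZs to three halves the idle headroom, which is one reason three-AZ layouts are the common default.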

🎯 Exam Tip

IAM, CloudFront, Route53, and WAF (for CloudFront) are global; almost everything else is regional. S3 bucket names are global but buckets live in one region. Exam questions love testing whether you know a resource's scope.

ARN anatomy — how AWS names everything

Every resource has an Amazon Resource Name. You will write hundreds of them in IAM policies, so learn the segments cold. The general format has six colon-separated segments:

arn:partition:service:region:account-id:resource
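
To make the segments concrete, here is a minimal ARN parser. It is a sketch, not a validator: `parseArn` and the `Arn` interface are hypothetical names, the account ID and resource names in the examples are made up, and the only check is that the string has at least six colon-separated parts starting with `arn`.

```typescript
interface Arn {
  partition: string; // "aws", "aws-cn", "aws-us-gov"
  service: string;   // "s3", "iam", "lambda", ...
  region: string;    // empty for global services such as IAM, and for S3
  accountId: string; // empty for S3 bucket and object ARNs
  resource: string;  // may itself contain ":" or "/"
}

function parseArn(arn: string): Arn {
  const parts = arn.split(":");
  if (parts.length < 6 || parts[0] !== "arn") {
    throw new Error(`not an ARN: ${arn}`);
  }
  // Everything after the fifth colon belongs to the resource segment.
  const [, partition, service, region, accountId, ...rest] = parts;
  return { partition, service, region, accountId, resource: rest.join(":") };
}

console.log(parseArn("arn:aws:iam::123456789012:role/app-role"));
console.log(parseArn("arn:aws:s3:::my-app-artifacts/build.jar"));
console.log(parseArn("arn:aws:lambda:eu-west-1:123456789012:function:orders:prod"));
```

The empty segments are the part that trips people up in policies: IAM ARNs have no region, and S3 ARNs have neither region nor account ID.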

One resource, three tools — the pattern used everywhere on this site

Every resource example on this site ships as AWS CLI + Terraform + CDK. Pick your tool once — the choice persists across all pages. Here's a production-grade S3 bucket (encrypted, versioned, no public access):

bash
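# 1. Create the bucket. us-east-1 takes no LocationConstraint; any other region
#    needs --create-bucket-configuration LocationConstraint=<region>.
aws s3api create-bucket --bucket my-app-artifacts --region us-east-1

# 2. Block every form of public access.
aws s3api put-public-access-block --bucket my-app-artifacts \
  --public-access-block-configuration \
  BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

# 3. Keep prior object versions on overwrite and delete.
aws s3api put-bucket-versioning --bucket my-app-artifacts \
  --versioning-configuration Status=Enabled

# 4. Default encryption with SSE-KMS (no key ID given, so the AWS managed aws/s3 key is used).
aws s3api put-bucket-encryption --bucket my-app-artifacts \
  --server-side-encryption-configuration \
  '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"aws:kms"}}]}'

hcl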
resource "aws_s3_bucket" "artifacts" {
  bucket = "my-app-artifacts"
}

resource "aws_s3_bucket_public_access_block" "artifacts" {
  bucket                  = aws_s3_bucket.artifacts.id
  block_public_acls       = true
  ignore_public_acls      = true
  block_public_policy     = true
  restrict_public_buckets = true
}

resource "aws_s3_bucket_versioning" "artifacts" {
  bucket = aws_s3_bucket.artifacts.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "artifacts" {
  bucket = aws_s3_bucket.artifacts.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

typescript
import * as s3 from 'aws-cdk-lib/aws-s3';
import { RemovalPolicy } from 'aws-cdk-lib';

new s3.Bucket(this, 'Artifacts', {
  bucketName: 'my-app-artifacts',
  encryption: s3.BucketEncryption.KMS_MANAGED,
  blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
  versioned: true,
  removalPolicy: RemovalPolicy.RETAIN, // never delete data by accident
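});

💡 Pro Tip

Notice the CDK version is a single construct in under 10 lines, versus four resources and roughly 30 lines in Terraform. The L2 construct bundles the bucket, its public access block, versioning, and encryption into one object and encodes AWS best practices as defaults. The CLI version requires four separate API calls because that's what's really happening: there is no single "create secure bucket" API. The console hides this; IaC doesn't.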

Global infrastructure (2024 numbers)

AWS is physically organized as a hierarchy. Each layer trades blast-radius isolation against latency and cost. Knowing this hierarchy is the foundation for every availability decision you'll make.

⚠️ Pitfall

us-east-1 (N. Virginia) is special and dangerous. It's the oldest and biggest region and among the cheapest, it hosts the control planes of global services such as IAM, it is where ACM certificates for CloudFront must live, and historically it has had the most high-profile outages. Don't put your only production deployment there just because tutorials do. Also: inter-region and inter-AZ data transfer costs real money, so region choice is a cost decision, not just a latency one.
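
To put a number on "costs real money", here is a rough estimation sketch. The per-GB rates in the example calls are illustrative assumptions, not quoted prices; look up the current rate for your region pair before relying on the result. The function name and the 30-day month are also assumptions.

```typescript
// Monthly data-transfer cost between two AZs or two regions.
// ratePerGb is an input on purpose: prices differ per region pair and change.
// billedDirections = 2 models traffic that is charged on both sides.
function monthlyTransferCost(
  gbPerDay: number,
  ratePerGb: number,
  billedDirections: 1 | 2,
): number {
  const daysPerMonth = 30; // simplifying assumption
  return gbPerDay * daysPerMonth * ratePerGb * billedDirections;
}

// Chatty microservices across AZs: 500 GB/day at an ASSUMED $0.01/GB per side.
console.log(monthlyTransferCost(500, 0.01, 2).toFixed(2));

// Cross-region replication: 200 GB/day at an ASSUMED $0.02/GB, one direction.
console.log(monthlyTransferCost(200, 0.02, 1).toFixed(2));
```

Under these assumed rates the cross-AZ chatter costs more per month than the cross-region replication, which is why traffic topology belongs in the cost model alongside instance pricing.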

🎯 Exam Tip

Multi-AZ = high availability (survive an AZ failure, synchronous, same region). Multi-region = disaster recovery + latency (survive a region failure, asynchronous). The exam tests this distinction constantly: "company needs RPO of seconds across geographic failure" → multi-region; "survive data center failure with no data loss" → Multi-AZ.
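
Why Multi-AZ buys availability can be shown with a back-of-the-envelope calculation. The sketch below assumes AZ failures are independent, which is the design intent of AZs but not a guarantee, and uses 99% per-AZ availability as an illustrative figure rather than any AWS SLA. Function names are hypothetical.

```typescript
// Availability of n redundant copies, ASSUMING independent failures:
// the system is down only when every copy is down at the same time.
function parallelAvailability(single: number, copies: number): number {
  return 1 - Math.pow(1 - single, copies);
}

function downtimeMinutesPerYear(availability: number): number {
  const minutesPerYear = 365 * 24 * 60; // 525,600
  return (1 - availability) * minutesPerYear;
}

const perAzAvailability = 0.99; // illustrative figure, not an AWS SLA
for (const azs of [1, 2, 3]) {
  const a = parallelAvailability(perAzAvailability, azs);
  console.log(`${azs} AZ(s): ${downtimeMinutesPerYear(a).toFixed(2)} min/year`);
}
```

Each added AZ cuts expected downtime by a factor of 100 in this model (roughly 5256, 52.6, and 0.5 minutes per year). It says nothing about region-wide failures, which is the gap multi-region covers.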

The shared responsibility model

AWS secures the cloud; you secure what you put in the cloud. Almost every "AWS data breach" in the news was a customer misconfiguration — a public S3 bucket, leaked access keys, an overprivileged role. The line moves depending on the service: more managed = more AWS responsibility.

AWS is responsible for (security of the cloud):

  • Physical security — data centers, hardware, network infrastructure, supply chain
  • Hypervisor & host OS — the Nitro system isolating your EC2 instances from other tenants
  • Managed service infrastructure — RDS host patching, Lambda runtime environment, S3 storage durability (11 nines)
  • Global network — DDoS absorption at the edge (Shield Standard), inter-region backbone encryption
  • Compliance of the platform — SOC 2, ISO 27001, PCI DSS certification of AWS infrastructure itself

You are responsible for (security in the cloud):

  • IAM — who can call which APIs on which resources. The single most common failure point
  • OS patches — on EC2 you patch the guest OS (on RDS/Fargate/Lambda, AWS does)
  • Application code — your vulnerabilities are your problem, wherever they run
  • Data encryption — enabling encryption at rest, enforcing TLS in transit, key policies
  • Network configuration — security groups, NACLs, public vs private subnets, S3 Block Public Access
  • Compliance of your workload — AWS being PCI-certified doesn't make your app PCI-compliant
📦 Real World

The 2019 Capital One breach: a misconfigured WAF allowed SSRF against the EC2 metadata service, which handed over credentials for an overprivileged IAM role with s3:List* and s3:Get* on everything — 100M records exfiltrated. Every control that failed was on the customer side of the line: IMDSv1 (not v2), role scope, bucket policies. This single incident motivates half of the IAM chapter.
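
The "overprivileged role" shape from that incident is easy to detect mechanically. The check below is a minimal sketch of such a lint, assuming statements are already normalized so that Action and Resource are arrays (real IAM JSON also allows single strings, NotAction, and conditions, all ignored here). The names are hypothetical.

```typescript
interface PolicyStatement {
  Effect: "Allow" | "Deny";
  Action: string[];
  Resource: string[];
}

// Flags Allow statements that combine a wildcard action with Resource "*":
// the "s3:List* and s3:Get* on everything" pattern.
function isOverbroad(stmt: PolicyStatement): boolean {
  const wildcardAction = stmt.Action.some((a) => a.indexOf("*") !== -1);
  const anyResource = stmt.Resource.some((r) => r === "*");
  return stmt.Effect === "Allow" && wildcardAction && anyResource;
}

const breachShaped: PolicyStatement = {
  Effect: "Allow",
  Action: ["s3:List*", "s3:Get*"],
  Resource: ["*"],
};

const scoped: PolicyStatement = {
  Effect: "Allow",
  Action: ["s3:GetObject"],
  Resource: ["arn:aws:s3:::my-app-artifacts/*"],
};

console.log(isOverbroad(breachShaped)); // true
console.log(isOverbroad(scoped));       // false
```

A check like this fits in a CI step over rendered policies. It catches the obvious cases only and does not prove least privilege.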

The AWS service map — what you actually need

AWS lists 200+ services. A production backend uses maybe 25. This is the map of the ones that matter, grouped the way this guide is organized.

Compute

  • EC2 — virtual machines, the foundation
  • ECS / Fargate — containers, AWS-native
  • EKS — managed Kubernetes
  • Lambda — event-driven functions

Storage

  • S3 — object storage, 11 nines durability
  • EBS — block storage for EC2
  • EFS — shared NFS across AZs
  • Glacier — archival tiers

Database

  • RDS / Aurora — managed PostgreSQL/MySQL
  • DynamoDB — serverless NoSQL, single-digit ms
  • ElastiCache — managed Redis/Memcached
  • Redshift — columnar analytics warehouse

Networking

  • VPC — your isolated network
  • Route53 — DNS + routing policies
  • CloudFront — CDN, 600+ PoPs
  • ALB / NLB — L7 / L4 load balancing
  • API Gateway — managed API front door

Messaging

  • SQS — queues, the workhorse
  • SNS — pub/sub fan-out
  • EventBridge — event bus + scheduler
  • Kinesis — real-time streams (Kafka-like)

Security

  • IAM — identity and access, learn this first
  • KMS — encryption keys, envelope encryption
  • Secrets Manager — rotated credentials
  • WAF + Shield + GuardDuty — defense layers

Developer Tools

  • ECR — container registry
  • CodePipeline / CodeBuild — AWS-native CI/CD
  • CDK — infrastructure in TypeScript/Java
  • CloudFormation — the IaC engine underneath

Observability

  • CloudWatch — metrics, logs, alarms
  • X-Ray — distributed tracing
  • CloudTrail — API audit log
⚖️ Trade-off

AWS-native vs open-source: SQS vs RabbitMQ, Kinesis vs Kafka, DynamoDB vs Cassandra, ECS vs Kubernetes. AWS-native means less ops burden and deep IAM integration but vendor lock-in; open-source means portability and ecosystem but you (or MSK/EKS pricing) carry the operational load. Each chapter covers the decision explicitly.

Account structure — the part everyone gets wrong first

An AWS account is the strongest isolation boundary AWS offers — stronger than VPCs, stronger than IAM. Mature organizations run dozens of accounts under AWS Organizations. One account per environment is the minimum bar.

flowchart TB
  MGMT["Management account\n(billing only — no workloads!)"]
  ORG["AWS Organizations\n+ Service Control Policies"]
  SEC["Security OU"]
  WORK["Workloads OU"]
  SAND["Sandbox OU"]
  LOG["Log archive\naccount"]
  AUDIT["Security tooling\naccount"]
  DEV["dev\naccount"]
  STG["staging\naccount"]
  PROD["prod\naccount"]
  DEVS["per-engineer\nsandboxes"]
  MGMT --> ORG
  ORG --> SEC
  ORG --> WORK
  ORG --> SAND
  SEC --> LOG
  SEC --> AUDIT
  WORK --> DEV
  WORK --> STG
  WORK --> PROD
  SAND --> DEVS
🔒 Security

Root user checklist (do this today): enable MFA (hardware key for the management account), delete root access keys, set a strong unique password, document recovery in a break-glass procedure, and never use root for daily work. Root bypasses all IAM policies: it can do anything, including closing the account. In member accounts, only a Service Control Policy can restrict it.

⚠️ Pitfall

Starting with one account "to keep it simple" and untangling it two years later is one of the most expensive migrations in cloud engineering — resources, IAM history, and data gravity all resist moving. Use AWS Organizations from day one even if you only create three accounts.
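
Service Control Policies are what make the multi-account layout enforceable. As a sketch, here is a region-restriction guardrail expressed as a TypeScript object that serializes to SCP JSON. The approved regions are an assumption for the example, and the NotAction list of exempted global services is illustrative and incomplete; a real policy needs the full list for the services you use.

```typescript
// Assumption: these two are the only approved regions.
const allowedRegions = ["eu-west-1", "eu-central-1"];

// Deny everything outside the approved regions. Global services are exempted
// through NotAction because their API calls are not made in those regions.
const regionGuardrail = {
  Version: "2012-10-17",
  Statement: [
    {
      Sid: "DenyOutsideApprovedRegions",
      Effect: "Deny",
      NotAction: ["iam:*", "organizations:*", "route53:*", "cloudfront:*", "sts:*", "support:*"],
      Resource: "*",
      Condition: {
        StringNotEquals: { "aws:RequestedRegion": allowedRegions },
      },
    },
  ],
};

console.log(JSON.stringify(regionGuardrail, null, 2));
```

Attached to an OU, a policy like this caps what every principal in the accounts below it can do, whatever their IAM policies allow.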

AWS timeline — how the platform was built

The launch order explains the architecture: storage and queues first, compute second, managed databases third, serverless and containers a decade later. Older services are lower-level; newer ones abstract them away.

  1. 2006

    S3 + SQS launch · EC2 beta

    The original trio: durable objects, reliable queues, rentable VMs. Everything since builds on these primitives.

  2. 2009

    RDS · VPC

    Managed relational databases and software-defined private networking — the enterprise unlock.

  3. 2012

    DynamoDB · Glacier

    Serverless NoSQL born from the Dynamo paper (Amazon's own shopping-cart database), plus deep archival storage.

  4. 2014

    Lambda · ECS

    Functions-as-a-service invents the serverless category; ECS answers Docker's rise with AWS-native orchestration.

  5. 2016

    Application Load Balancer

    Layer-7 routing, host/path rules, WebSocket, HTTP/2 — the default front door for containerized services.

  6. 2017–2019

    Fargate (for ECS) → EKS GA → Fargate for EKS

    Serverless containers and managed Kubernetes — AWS concedes K8s won the orchestration war while betting Fargate abstracts it away.

  7. 2020–2022

    Graviton2 → Graviton3

    AWS-designed ARM CPUs: 20–40% better price/performance. The biggest free cost optimization most teams ignore.

  8. 2023

    Bedrock

    Managed foundation-model APIs (Claude, Llama, Titan) — AWS's entry into the GenAI platform race.

  9. 2024

    Graviton4

    Fourth-generation ARM — 75% more memory bandwidth than Graviton3. New default for r8g memory-optimized instances.
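
The "20–40% better price/performance" figure for Graviton translates into a smaller cost reduction than the headline number suggests. The sketch below works through the arithmetic, assuming price/performance means work done per dollar and that your workload ports to ARM without regressions. The $10,000 monthly spend is a hypothetical input.

```typescript
// Converts a price/performance improvement into the cost reduction for a
// fixed workload: same work, so cost scales by 1 / (1 + improvement).
function costReduction(pricePerfImprovement: number): number {
  return 1 - 1 / (1 + pricePerfImprovement);
}

function monthlySavings(currentMonthlySpend: number, pricePerfImprovement: number): number {
  return currentMonthlySpend * costReduction(pricePerfImprovement);
}

console.log((costReduction(0.2) * 100).toFixed(1)); // "16.7"
console.log((costReduction(0.4) * 100).toFixed(1)); // "28.6"
console.log(monthlySavings(10000, 0.2).toFixed(0)); // "1667"
```

So a 20–40% price/performance gain is roughly a 17–29% smaller compute bill for the same throughput.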

Explore the guide — all sections

Thirteen deep-dive chapters plus cheat sheets. Recommended path: IAM → VPC → Compute → Storage → Databases, then messaging, edge, and architecture patterns as your role requires.

Learning path: IAM · VPC & Networking · Compute · Storage · Databases · Cost

Backend Developer

Java/Spring engineer deploying a first production workload: EC2, ECS, RDS, S3, SQS/SNS, API Gateway, and enough IAM to not get the account compromised.

DevOps / Platform Engineer

Reusable infrastructure with Terraform/CDK: VPCs, EKS clusters, multi-account strategy, cost dashboards, CI/CD pipelines on AWS.

Solutions Architect

Resilient multi-region systems, AWS-native vs open-source selection, build-vs-buy decisions, SA Associate/Professional exam preparation.