API Gateway, ALB & CloudFront

Application Load Balancer — L7 deep dive

ALB operates at Layer 7 (HTTP/HTTPS). It understands paths, host headers, query strings, and HTTP methods — routing /api/* to one target group and /admin/* to another. This is the default front door for ECS, EC2, and IP-mode Kubernetes ingress when you run long-lived HTTP services.

L7 routing and listener rules

An ALB has one or more listeners (typically port 443 HTTPS). Each listener has ordered rules evaluated top-to-bottom; the first match wins. A default rule catches everything else. Rules can match on host header (api.example.com), path (/v2/*), HTTP method, query string, or source IP.

Path-based — /orders/* → orders target group; multiple microservices on one ALB
Host-based — payments.example.com → dedicated target group
Weighted — 90% v1 / 10% v2 for blue/green without API Gateway
Redirect / fixed response — HTTP→HTTPS on :80; return 403 for /metrics before app tier
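
The evaluation order described above can be modelled in a few lines. This is only a sketch of the "lowest priority first, first match wins, otherwise default" behaviour; the types, the prefix-only path matching, and the target group names are illustrative assumptions, not an AWS API.

```typescript
// Hypothetical in-memory model of ALB listener rules (not an AWS SDK type).
type HttpRequest = { host: string; path: string; method: string };

type ListenerRule = {
  priority: number;      // lower number is evaluated first
  hostHeader?: string;   // exact host match, e.g. "payments.example.com"
  pathPrefix?: string;   // "/orders/" stands in for the pattern /orders/*
  method?: string;
  targetGroup: string;
};

// Returns the target group of the first matching rule, or the default action.
function routeRequest(
  rules: ListenerRule[],
  defaultTargetGroup: string,
  req: HttpRequest,
): string {
  const ordered = rules.slice().sort((a, b) => a.priority - b.priority);
  for (const rule of ordered) {
    const hostOk = rule.hostHeader === undefined || rule.hostHeader === req.host;
    const pathOk = rule.pathPrefix === undefined || req.path.startsWith(rule.pathPrefix);
    const methodOk = rule.method === undefined || rule.method === req.method;
    if (hostOk && pathOk && methodOk) {
      return rule.targetGroup; // first match wins; later rules are never consulted
    }
  }
  return defaultTargetGroup;
}

// Illustrative rule set: host rule (priority 10) beats path rule (priority 20).
const exampleRules: ListenerRule[] = [
  { priority: 20, pathPrefix: "/orders/", targetGroup: "orders-tg" },
  { priority: 10, hostHeader: "payments.example.com", targetGroup: "payments-tg" },
];
```

A request for payments.example.com/orders/1 lands on payments-tg here because the host rule has the lower priority number, which is the kind of overlap worth checking before adding rules.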

Target groups

A target group is a pool of backends. Target types:

IP — Fargate tasks or any private IP address reachable from the VPC
Instance — EC2 instances registered by Auto Scaling Group
Lambda — a Lambda function invoked once per request
ALB — used only in NLB target groups (NLB in front of an ALB); an ALB cannot forward to another ALB

Each target group has a protocol/port (HTTP:8080 for Spring Boot), a health check path, and optional stickiness. Targets register automatically when ECS attaches tasks or ASG launches instances.

Health checks

ALB polls targets on the health check path (default /). Configure it to match your app's readiness: for Spring Boot use /actuator/health, or /actuator/health/readiness once management.endpoint.health.probes.enabled=true exposes separate liveness and readiness groups. A target stops receiving traffic only after it fails the unhealthy threshold of consecutive checks (roughly interval × threshold), and rejoins after the healthy threshold of consecutive successes.

Tune interval (15–30s), healthy/unhealthy thresholds (2–3), timeout (< interval), and matcher (200 only — not 503 during graceful shutdown). JVM slow-start needs higher healthy threshold to avoid flapping.
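
To see how the thresholds interact, here is a small sketch of the per-target state machine, assuming the usual "N consecutive contradicting probes flip the state" rule. The type and function names are made up for illustration, and the detection estimate ignores the probe timeout.

```typescript
type HealthConfig = {
  healthyThreshold: number;    // consecutive successes needed to rejoin
  unhealthyThreshold: number;  // consecutive failures needed to be removed
  intervalSec: number;
};

// streak counts consecutive probes that contradict the current state
type TargetHealth = { healthy: boolean; streak: number };

function applyProbe(state: TargetHealth, probeOk: boolean, cfg: HealthConfig): TargetHealth {
  if (probeOk === state.healthy) {
    return { healthy: state.healthy, streak: 0 }; // probe agrees: reset the streak
  }
  const streak = state.streak + 1;
  const needed = state.healthy ? cfg.unhealthyThreshold : cfg.healthyThreshold;
  if (streak >= needed) {
    return { healthy: !state.healthy, streak: 0 }; // threshold reached: flip state
  }
  return { healthy: state.healthy, streak: streak };
}

// Rough time before a failing target is taken out of rotation.
function approxDetectionSec(cfg: HealthConfig): number {
  return cfg.intervalSec * cfg.unhealthyThreshold;
}
```

With a 30-second interval and an unhealthy threshold of 3, a dead target can keep receiving traffic for about 90 seconds, which is why shorter intervals are attractive for latency-sensitive services.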

SSL/TLS and ACM

Terminate TLS at the ALB using an ACM certificate — free, auto-renewed, but must be in the same region as the ALB. Use an HTTPS listener on 443 with a security policy like ELBSecurityPolicy-TLS13-1-2-2021-06. Traffic from ALB to targets can be HTTP (common inside VPC) or HTTPS (re-encrypt) depending on compliance requirements.

Wildcard certs (*.example.com) cover every first-level subdomain on one ALB (api.example.com, but not v2.api.example.com)
SNI allows multiple certificates on one listener — different hostnames, different certs
CloudFront in front of ALB uses a cert in us-east-1 (CloudFront requirement)

Sticky sessions (session affinity)

ALB can pin a client to one target using an AWSALB cookie (duration 1 second to 7 days). Use only when your app stores session state in memory — prefer external session stores (Redis/ElastiCache) so any task can serve any request. Sticky sessions complicate rolling deploys: old sessions stick to draining targets until cookie expires.

Connection draining (deregistration delay)

When a target is deregistered (deploy, scale-in, or replacement of a task that failed health checks), ALB puts it in the draining state for the configured delay (default 300 seconds, max 3600). In-flight requests complete; new requests route to healthy targets only. Set delay ≥ your longest request timeout. For Spring Boot graceful shutdown, align:

ECS stopTimeout ≥ deregistration delay (Fargate caps stopTimeout at 120 seconds, so keep the delay within that limit)
Spring server.shutdown=graceful + adequate grace period
Health check returns non-200 during shutdown so ALB stops sending new connections early
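
The alignment rules in this list can be turned into a pre-deploy check. The sketch below encodes them literally; the field names are hypothetical, and the two rules about the Spring grace period (covering the longest request, and fitting inside stopTimeout) are my reading of "adequate grace period" rather than something the list states.

```typescript
type ShutdownBudget = {
  longestRequestSec: number;
  deregistrationDelaySec: number;
  ecsStopTimeoutSec: number;
  springGracePeriodSec: number; // e.g. spring.lifecycle.timeout-per-shutdown-phase
};

// Returns a list of human-readable problems; an empty list means the budget lines up.
function checkShutdownBudget(b: ShutdownBudget): string[] {
  const problems: string[] = [];
  if (b.deregistrationDelaySec < b.longestRequestSec) {
    problems.push("deregistration delay is shorter than the longest request");
  }
  if (b.ecsStopTimeoutSec < b.deregistrationDelaySec) {
    problems.push("ECS stopTimeout is shorter than the deregistration delay");
  }
  if (b.springGracePeriodSec < b.longestRequestSec) {
    // Assumption: "adequate" means in-flight requests can finish within the grace period.
    problems.push("Spring grace period is shorter than the longest request");
  }
  if (b.springGracePeriodSec > b.ecsStopTimeoutSec) {
    // Assumption: the container is killed when stopTimeout expires, cutting the grace period short.
    problems.push("Spring grace period exceeds ECS stopTimeout");
  }
  return problems;
}
```

Running a check like this in CI against the values in your task definition and application config catches drift between the three settings before a deploy drops requests.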

WAF integration

Associate AWS WAF with an ALB to filter SQL injection, rate-limit by IP, block geo regions, or enforce managed rule groups (OWASP Top 10). WAF evaluates before the request reaches listener rules. For CloudFront-fronted ALB, you can attach WAF to CloudFront instead — one WAF at the edge protects both static and dynamic paths.

ALB vs NLB — when to use which

Dimension	ALB (Layer 7)	NLB (Layer 4)
Protocols	HTTP, HTTPS, HTTP/2, gRPC, WebSocket	TCP, UDP, TLS passthrough
Routing	Path, host, header, query, method	Port and IP only — no URL awareness
Targets	EC2 instances, IP (ECS/Fargate tasks), Lambda	EC2, IP, ALB (not Lambda directly)
Performance	Millions of requests/sec; slight L7 overhead	Ultra-low latency, static IP, preserves source IP
TLS	Terminate at ALB (ACM)	TLS passthrough or terminate at NLB
Best for	REST APIs, Spring Boot, microservices HTTP	Non-HTTP (MQTT, gaming), extreme latency, VPC Link to API GW
Pricing	LCU-based (connections, rules, bytes)	LCU-based (connections, bytes, new flows/sec)

Provision ALB + target group for ECS

# Target group — IP mode for Fargate
aws elbv2 create-target-group \
  --name order-service-tg \
  --protocol HTTP --port 8080 \
  --vpc-id vpc-0abc123 \
  --target-type ip \
  --health-check-path /actuator/health \
  --health-check-interval-seconds 30 \
  --matcher HttpCode=200

aws elbv2 create-load-balancer \
  --name order-service-alb \
  --subnets subnet-pub-a subnet-pub-b \
  --security-groups sg-alb-public \
  --scheme internet-facing \
  --type application

aws elbv2 create-listener \
  --load-balancer-arn arn:aws:elasticloadbalancing:eu-west-1:123:loadbalancer/app/order-service-alb/abc \
  --protocol HTTPS --port 443 \
  --ssl-policy ELBSecurityPolicy-TLS13-1-2-2021-06 \
  --certificates CertificateArn=arn:aws:acm:eu-west-1:123:certificate/xyz \
  --default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:eu-west-1:123:targetgroup/order-service-tg/abc

# Terraform: same target group, ALB, and HTTPS listener
resource "aws_lb_target_group" "app" {
  name        = "order-service-tg"
  port        = 8080
  protocol    = "HTTP"
  vpc_id      = aws_vpc.main.id
  target_type = "ip"

  health_check {
    path                = "/actuator/health"
    interval            = 30
    healthy_threshold   = 2
    unhealthy_threshold = 3
    matcher             = "200"
  }

  deregistration_delay = 120
}

resource "aws_lb" "app" {
  name               = "order-service-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = aws_subnet.public[*].id
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.app.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = aws_acm_certificate.api.arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app.arn
  }
}

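// AWS CDK (TypeScript): assumes vpc, cluster, taskDefinition, and certArn are defined in the enclosing stack
import { Duration } from 'aws-cdk-lib';
import * as elbv2 from 'aws-cdk-lib/aws-elasticloadbalancingv2';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as acm from 'aws-cdk-lib/aws-certificatemanager';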

const cert = acm.Certificate.fromCertificateArn(this, 'Cert', certArn);

const alb = new elbv2.ApplicationLoadBalancer(this, 'Alb', {
  vpc,
  internetFacing: true,
});

const tg = new elbv2.ApplicationTargetGroup(this, 'Tg', {
  vpc,
  port: 8080,
  protocol: elbv2.ApplicationProtocol.HTTP,
  targetType: elbv2.TargetType.IP,
  healthCheck: { path: '/actuator/health', interval: Duration.seconds(30) },
  deregistrationDelay: Duration.seconds(120),
});

alb.addListener('Https', {
  port: 443,
  protocol: elbv2.ApplicationProtocol.HTTPS,
  certificates: [cert],
  defaultTargetGroups: [tg],
});

const service = new ecs.FargateService(this, 'Service', { cluster, taskDefinition });
service.attachToApplicationTargetGroup(tg);

🔬 Under the Hood

ALB nodes scale horizontally per AZ; AWS manages the fleet, and you never see individual load balancer instances. Each enabled AZ gets at least one node; cross-zone load balancing (enabled by default) distributes traffic evenly even if targets are unevenly distributed across AZs. Targets see the ALB node's private IP as the source address, so read the original client IP from the X-Forwarded-For header (proxy protocol v2 is an NLB feature and is not available on ALB).
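
Reading X-Forwarded-For safely takes a little care, because clients can send their own value and the load balancer appends to it. The sketch below assumes the ALB runs in its default append mode, so the rightmost entry is the address the ALB itself saw; the function name and the trustedHops parameter are illustrative.

```typescript
// Picks the client IP from X-Forwarded-For by counting trusted proxies from the right.
// trustedHops = 1: only the ALB is in front of the app.
// trustedHops = 2: for example CloudFront in front of the ALB.
function clientIpFromXff(xff: string | undefined, trustedHops: number = 1): string | undefined {
  if (xff === undefined || trustedHops < 1) return undefined;
  const parts = xff
    .split(",")
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
  const index = parts.length - trustedHops;
  if (index < 0) return undefined; // header is shorter than the proxy chain we expect
  return parts[index];
}
```

Taking the leftmost entry instead is a common mistake: that position is fully controlled by the caller and can be spoofed.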

💰 Cost

ALB bills hourly (~$0.0225/hr) plus LCU (Load Balancer Capacity Units) for new connections, active connections, processed bytes, and rule evaluations. One moderately busy API ALB often runs $20–40/month. Idle ALBs in dev accounts are a common waste — delete or share one ALB with path-based rules instead of one ALB per microservice in non-prod.
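
A back-of-the-envelope estimate makes the LCU model concrete. The sketch below bills the highest of the four dimensions each hour. The hourly rate comes from the paragraph above; the LCU price and the per-LCU divisors are assumptions taken from published list pricing for one Region, so check the current pricing page before relying on the output.

```typescript
type AlbUsage = {
  newConnectionsPerSec: number;
  activeConnectionsPerMin: number;
  processedGbPerHour: number;
  ruleEvaluationsPerSec: number;
};

// Assumed list prices and LCU definitions; verify for your Region.
const ASSUMED_ALB_PRICING = {
  albHourUsd: 0.0225,
  lcuHourUsd: 0.008,
  hoursPerMonth: 730,
  newConnectionsPerLcu: 25,
  activeConnectionsPerLcu: 3000,
  gbPerHourPerLcu: 1,
  ruleEvaluationsPerLcu: 1000,
};

// Only the most-consumed dimension is billed.
function lcusConsumed(u: AlbUsage): number {
  const p = ASSUMED_ALB_PRICING;
  return Math.max(
    u.newConnectionsPerSec / p.newConnectionsPerLcu,
    u.activeConnectionsPerMin / p.activeConnectionsPerLcu,
    u.processedGbPerHour / p.gbPerHourPerLcu,
    u.ruleEvaluationsPerSec / p.ruleEvaluationsPerLcu,
  );
}

function estimateMonthlyUsd(u: AlbUsage): number {
  const p = ASSUMED_ALB_PRICING;
  return (p.albHourUsd + lcusConsumed(u) * p.lcuHourUsd) * p.hoursPerMonth;
}
```

Under these assumptions an idle ALB still costs about $16 a month in hourly charges alone, which is the number to multiply by every forgotten dev load balancer.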

🎯 Exam Tip

ALB for HTTP routing, NLB for TCP/static IP. Classic Load Balancer is legacy; never choose it on new designs. When the question mentions WebSocket or HTTP/2 with path routing, ALB. When it mentions millions of TCP connections with client IP preservation, NLB. ALB cannot carry SSH on port 22 or any other non-HTTP protocol; that calls for an NLB or a direct connection to the instance.

API Gateway — managed API front door

API Gateway is a managed API layer: it handles authentication, request validation, throttling, API keys, CORS, and OpenAPI documentation — then forwards to Lambda, HTTP endpoints, or AWS services. You don't run it; you configure it. Choose the API type based on features needed, not habit.

REST vs HTTP vs WebSocket APIs

Type	Protocol	Key features	When to use
REST API	HTTP/HTTPS	Full feature set: API keys, usage plans, request validation, caching, WAF, SDK generation	Public/partner APIs with billing tiers, legacy integrations, maximum control
HTTP API	HTTP/HTTPS	Lower latency, lower cost, JWT/Cognito authorizers, Lambda proxy — fewer features	Default for new serverless APIs — internal microservices, Lambda backends
WebSocket API	WebSocket	$connect, $disconnect, $default routes; push to clients via connection ID	Real-time chat, live dashboards, collaborative editing

⚖️ Trade-off

HTTP API vs REST API: HTTP API is ~70% cheaper and has lower latency, but lacks usage plans, API keys, request validation, response caching, and edge-optimized endpoints. If you need per-customer rate limits and API key billing, REST API. If you need a thin proxy to Lambda with JWT auth, HTTP API. Don't default to REST API out of habit.
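
The "~70% cheaper" figure is easy to sanity-check. This sketch assumes first-tier list prices of $3.50 (REST) and $1.00 (HTTP) per million requests; those two numbers are assumptions for illustration and ignore data transfer, caching, and volume tiers.

```typescript
type ApiKind = "rest" | "http";

// Assumed first-tier request prices in USD per million requests.
const ASSUMED_USD_PER_MILLION: { rest: number; http: number } = { rest: 3.5, http: 1.0 };

function monthlyRequestCostUsd(kind: ApiKind, requestsPerMonth: number): number {
  return (requestsPerMonth / 1e6) * ASSUMED_USD_PER_MILLION[kind];
}

// Fraction of the REST API request bill saved by moving the same traffic to HTTP API.
function httpApiSavingsFraction(requestsPerMonth: number): number {
  const rest = monthlyRequestCostUsd("rest", requestsPerMonth);
  const http = monthlyRequestCostUsd("http", requestsPerMonth);
  return rest === 0 ? 0 : (rest - http) / rest;
}
```

At 10 million requests a month that is $35 versus $10 under these assumed prices, a saving of roughly 71%.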

Integrations

API Gateway connects routes to backends via integrations:

Lambda — synchronous invoke; most common serverless pattern
HTTP — proxy to any HTTPS endpoint (ALB, EC2, external SaaS)
AWS service — direct integration (SQS, Step Functions, Kinesis) without Lambda middleman
VPC Link — private connection into your VPC: REST APIs have traditionally required an NLB behind the link, while HTTP APIs can also target an ALB or a Cloud Map service
Mock — return static response for testing or CORS preflight

Lambda proxy integration

With Lambda proxy, API Gateway passes the entire request as a JSON event and expects a structured response with statusCode, headers, and body. Your Lambda owns routing logic inside the handler — API Gateway doesn't map individual paths unless you configure them. This is the fastest way to ship but pushes HTTP semantics into Lambda code; for complex APIs, use explicit routes per endpoint.
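
A minimal handler shows what "Lambda owns routing" means in practice. The event type below is a hand-written subset of the HTTP API payload format 2.0, and the route keys and order payload are hypothetical. It assumes explicit routes pointing at one function, so routeKey carries values like "GET /orders/{id}" rather than "$default".

```typescript
// Minimal subset of the payload format 2.0 event; the real event has many more fields.
type HttpApiEvent = {
  routeKey: string;
  rawPath: string;
  pathParameters?: { [name: string]: string | undefined };
  body?: string;
};

type ProxyResult = {
  statusCode: number;
  headers: { [name: string]: string };
  body: string;
};

function jsonResponse(statusCode: number, payload: unknown): ProxyResult {
  return {
    statusCode: statusCode,
    headers: { "content-type": "application/json" },
    body: JSON.stringify(payload),
  };
}

// Routing happens inside the function because API Gateway hands over the whole request.
function routeOrderRequest(event: HttpApiEvent): ProxyResult {
  switch (event.routeKey) {
    case "GET /orders/{id}": {
      const id = event.pathParameters ? event.pathParameters["id"] : undefined;
      if (id === undefined) return jsonResponse(400, { message: "missing id" });
      return jsonResponse(200, { id: id, status: "PLACED" }); // hypothetical order payload
    }
    case "POST /orders":
      return jsonResponse(201, { received: event.body !== undefined });
    default:
      return jsonResponse(404, { message: "no route for " + event.routeKey });
  }
}

// Lambda entry point; export it from the module you deploy.
const orderHandler = async (event: HttpApiEvent): Promise<ProxyResult> => routeOrderRequest(event);
```

Every new endpoint adds a case to this switch, which is the maintenance cost the paragraph above warns about once an API grows.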

Authorizers

Type	How it works	Best for
JWT (HTTP API)	Validates JWT signature against issuer's JWKS; passes claims to Lambda	Cognito User Pool, Auth0, Okta — preferred for HTTP API
Lambda authorizer	Custom Lambda returns IAM policy Allow/Deny + context	Custom auth logic, legacy token formats, fine-grained per-route decisions
Cognito User Pool (REST)	Built-in Cognito JWT validation on REST API	Mobile/web apps already on Cognito
IAM	SigV4-signed requests; callers need AWS credentials (an SDK or any SigV4 signer)	Service-to-service, private internal APIs

Authorizer cache TTL (default 300 seconds) avoids re-invoking the Lambda authorizer on every request; tune it down for high-security endpoints, up for cost savings. JWT authorizers on HTTP API are validated inside API Gateway itself, with no Lambda call.
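
For the Lambda authorizer row, the response shape is the part people most often get wrong. Below is a sketch of the IAM-policy style response used by REST API Lambda authorizers; the token table, principal IDs, and context values are placeholders standing in for real token validation.

```typescript
type PolicyEffect = "Allow" | "Deny";

type AuthorizerResponse = {
  principalId: string;
  policyDocument: {
    Version: "2012-10-17";
    Statement: { Action: "execute-api:Invoke"; Effect: PolicyEffect; Resource: string }[];
  };
  context: { [key: string]: string | number | boolean };
};

function buildAuthorizerResponse(
  principalId: string,
  effect: PolicyEffect,
  resourceArn: string,
  context: { [key: string]: string | number | boolean },
): AuthorizerResponse {
  return {
    principalId: principalId,
    policyDocument: {
      Version: "2012-10-17",
      Statement: [{ Action: "execute-api:Invoke", Effect: effect, Resource: resourceArn }],
    },
    context: context,
  };
}

// Hypothetical token store; a real authorizer would verify a signature or call an identity service.
const KNOWN_TOKENS = new Map<string, string>([["demo-token", "user-123"]]);

function authorize(token: string | undefined, methodArn: string): AuthorizerResponse {
  const principal = token === undefined ? undefined : KNOWN_TOKENS.get(token);
  if (principal === undefined) {
    return buildAuthorizerResponse("anonymous", "Deny", methodArn, {});
  }
  return buildAuthorizerResponse(principal, "Allow", methodArn, { tier: "paid" });
}
```

With caching enabled, the cached policy is reused for later requests carrying the same token, so a policy scoped to a single method ARN can deny the caller's next call to a different route; scoping the Resource to the stage is a common way around that.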

Throttling and quotas

API Gateway enforces throttling at two levels:

Account-level — 10,000 requests/second steady, 5,000 burst, per Region (increasable through a Service Quotas request)
Stage/method-level — configure rate (steady RPS) and burst per method or stage

Exceeded limits return 429 Too Many Requests. Set client-side retry with exponential backoff. For per-API-key limits, use REST API usage plans.
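
Rate and burst map naturally onto a token bucket, and the client side of a 429 is exponential backoff. The sketch below shows both under simplifying assumptions: time and randomness are passed in so the behaviour is deterministic, and the class is a teaching model, not how API Gateway is implemented internally.

```typescript
// Token bucket: `burst` is the bucket size, `ratePerSec` is the refill speed.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly ratePerSec: number;
  private readonly burst: number;

  constructor(ratePerSec: number, burst: number, nowMs: number) {
    this.ratePerSec = ratePerSec;
    this.burst = burst;
    this.tokens = burst; // start full so an initial burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Returns true if the request is admitted, false if it would be throttled (429).
  tryAcquire(nowMs: number): boolean {
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.burst, this.tokens + elapsedSec * this.ratePerSec);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Exponential backoff with full jitter: wait a random time up to min(cap, base * 2^attempt).
function backoffDelayMs(attempt: number, baseMs: number, capMs: number, random: () => number): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}
```

In production code the random argument would be Math.random; jitter keeps a fleet of throttled clients from retrying in lockstep.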

Caching

REST API stage caching stores responses keyed by path + query + headers (configurable). TTL 0–3600 seconds. Cache hits skip the backend entirely — massive cost savings for read-heavy public APIs. Invalidate cache on deploy or use a cache-busting header. Not available on HTTP API.

Usage plans and API keys

Usage plans tie API keys to throttle/quota limits — e.g. free tier 1000 req/day, paid tier 1M req/day. Clients pass x-api-key header. This is the AWS-native API monetization model. HTTP API does not support API keys — use a Lambda authorizer or third-party gateway if you need keys on HTTP API.

Canary deployments

REST API supports canary releases on a stage: route X% of traffic to a new deployment, monitor CloudWatch alarms, then promote or roll back. HTTP API has no stage-level canary setting; for gradual rollout, shift traffic with weighted Lambda aliases behind the integration. Pair either approach with CloudWatch alarms on 5xx rate and p99 latency before promoting.
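
The canary idea (send X% of traffic to the new deployment, watch error rate and p99 latency, then decide) fits in two small functions. This is an illustration of the decision logic only; the metric fields and limits are hypothetical, and in practice the split is performed by API Gateway and the metrics come from CloudWatch.

```typescript
type Deployment = "stable" | "canary";

// Picks a deployment for one request; `random` returns a number in [0, 1).
function pickDeployment(canaryPercent: number, random: () => number): Deployment {
  if (canaryPercent < 0 || canaryPercent > 100) {
    throw new RangeError("canaryPercent must be between 0 and 100");
  }
  return random() * 100 < canaryPercent ? "canary" : "stable";
}

type CanaryMetrics = { errorRate5xx: number; p99LatencyMs: number };

// Promote only when both signals stay within the agreed limits.
function shouldPromote(observed: CanaryMetrics, limits: CanaryMetrics): boolean {
  return observed.errorRate5xx <= limits.errorRate5xx && observed.p99LatencyMs <= limits.p99LatencyMs;
}
```

Agreeing on the limits before the rollout starts keeps the promote-or-rollback decision mechanical instead of a judgment call made under pressure.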

HTTP API + Lambda (production pattern)

# AWS CLI: quick-create an HTTP API whose default route invokes the Lambda
aws apigatewayv2 create-api \
  --name order-api \
  --protocol-type HTTP \
  --target "arn:aws:lambda:eu-west-1:123456789012:function:order-handler"

aws apigatewayv2 create-stage \
  --api-id abc123 \
  --stage-name prod \
  --auto-deploy

# JWT authorizer (Cognito)
aws apigatewayv2 create-authorizer \
  --api-id abc123 \
  --name cognito-jwt \
  --authorizer-type JWT \
  --identity-source '$request.header.Authorization' \
  --jwt-configuration Audience=client-id,Issuer=https://cognito-idp.eu-west-1.amazonaws.com/eu-west-1_XXX

# Terraform: HTTP API, Lambda proxy integration, JWT-protected route, and prod stage
resource "aws_apigatewayv2_api" "orders" {
  name          = "order-api"
  protocol_type = "HTTP"
}

resource "aws_apigatewayv2_integration" "lambda" {
  api_id                 = aws_apigatewayv2_api.orders.id
  integration_type       = "AWS_PROXY"
  integration_uri        = aws_lambda_function.handler.invoke_arn
  payload_format_version = "2.0"
}

resource "aws_apigatewayv2_route" "get_order" {
  api_id             = aws_apigatewayv2_api.orders.id
  route_key          = "GET /orders/{id}"
  target             = "integrations/${aws_apigatewayv2_integration.lambda.id}"
  authorizer_id      = aws_apigatewayv2_authorizer.jwt.id
  authorization_type = "JWT"
}

resource "aws_apigatewayv2_authorizer" "jwt" {
  api_id           = aws_apigatewayv2_api.orders.id
  authorizer_type  = "JWT"
  identity_sources = ["$request.header.Authorization"]
  name             = "cognito-jwt"
  jwt_configuration {
    audience = [aws_cognito_user_pool_client.app.id]
    issuer   = "https://cognito-idp.${var.region}.amazonaws.com/${aws_cognito_user_pool.main.id}"
  }
}

resource "aws_apigatewayv2_stage" "prod" {
  api_id      = aws_apigatewayv2_api.orders.id
  name        = "prod"
  auto_deploy = true
  default_route_settings {
    throttling_burst_limit = 500
    throttling_rate_limit  = 1000
  }
}

// AWS CDK (TypeScript)
import * as apigwv2 from 'aws-cdk-lib/aws-apigatewayv2';
import * as integrations from 'aws-cdk-lib/aws-apigatewayv2-integrations';
import * as authorizers from 'aws-cdk-lib/aws-apigatewayv2-authorizers';
import * as lambda from 'aws-cdk-lib/aws-lambda';

const fn = new lambda.Function(this, 'Handler', { /* ... */ });

const httpApi = new apigwv2.HttpApi(this, 'OrderApi', {
  defaultAuthorizer: new authorizers.HttpJwtAuthorizer('JwtAuth', 'https://cognito-idp...', {
    jwtAudience: ['client-id'],
  }),
});

httpApi.addRoutes({
  path: '/orders/{id}',
  methods: [apigwv2.HttpMethod.GET],
  integration: new integrations.HttpLambdaIntegration('OrderInt', fn),
});

⚠️ Pitfall

Putting API Gateway in front of an ALB that already handles auth, rate limiting, and routing — you pay for both and add latency. Use API Gateway when you need its features (API keys, usage plans, serverless integration, request validation). Use ALB when you run containers with Spring Security. Exception: API Gateway + VPC Link → private ALB for hybrid public API surface on internal services.

🔒 Security

Enable WAF on REST API stages for production; HTTP APIs cannot be associated with WAF directly, so front them with CloudFront plus WAF if you need it. Use JWT authorizers over Lambda authorizers when possible: fewer moving parts, no cold starts on auth. Never expose internal ALB endpoints directly; route through API Gateway with VPC Link or use private API Gateway with resource policies restricting source VPC endpoints.

CloudFront — global CDN and edge compute

CloudFront caches content at 600+ edge locations worldwide. Users in Tokyo hit a Tokyo PoP, not your eu-west-1 ALB. Use it for static assets (S3), cacheable API responses, and TLS termination at the edge — but understand cache behaviors or you'll serve stale data for hours.

How a CDN works in AWS

A distribution has one or more origins (S3, ALB, custom HTTP server, MediaPackage, etc.) and cache behaviors that map URL patterns to origins and caching rules. Viewer requests hit the nearest edge; on a cache miss, CloudFront fetches from the origin (the optional Origin Shield layer reduces origin load further).

flowchart LR
  USER["User in Tokyo"]
  EDGE["CloudFront Edge PoP\n(cache check)"]
  ORIGIN["Origin\nS3 or ALB in eu-west-1"]
  USER --> EDGE
  EDGE -->|Cache HIT| USER
  EDGE -->|Cache MISS| ORIGIN
  ORIGIN --> EDGE
  EDGE --> USER

Origins and cache behaviors

Setting	What it controls	Typical value
Path pattern	Which URLs use this behavior	/static/* → S3; /api/* → ALB; default → S3
TTL (min/default/max)	How long objects stay cached	Static: 86400s; API: 0–60s or no cache
Cache key	Headers, query strings, cookies in key	Static: none; API: only the query params that change the response (keep auth tokens out of the key)
Compress	Gzip/Brotli at edge	Enable for text assets — free bandwidth savings
Viewer protocol	HTTP vs HTTPS to viewer	Redirect HTTP → HTTPS always
Allowed methods	GET/HEAD/OPTIONS vs full CRUD	S3 static: GET, HEAD; API origin: all methods

Origin Access Control (OAC) for S3

Never make S3 buckets public for CloudFront. Use OAC (replaces legacy OAI): CloudFront signs requests to S3 with SigV4; bucket policy allows only the specific distribution. Users cannot bypass CloudFront to access S3 directly.
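
The bucket policy that goes with OAC is short enough to generate. The sketch below builds the policy document as a plain object: the CloudFront service principal is allowed s3:GetObject only when the request comes from one distribution. The function name, type, and Sid are illustrative; the bucket name and ARN are whatever your stack provides.

```typescript
type BucketPolicy = {
  Version: string;
  Statement: {
    Sid: string;
    Effect: "Allow";
    Principal: { Service: string };
    Action: string;
    Resource: string;
    Condition: { StringEquals: { [key: string]: string } };
  }[];
};

// Builds a policy that lets exactly one CloudFront distribution read objects from the bucket.
function oacBucketPolicy(bucketName: string, distributionArn: string): BucketPolicy {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Sid: "AllowCloudFrontReadOnly",
        Effect: "Allow",
        Principal: { Service: "cloudfront.amazonaws.com" },
        Action: "s3:GetObject",
        Resource: "arn:aws:s3:::" + bucketName + "/*",
        // Without this condition any distribution in any account could read the bucket.
        Condition: { StringEquals: { "AWS:SourceArn": distributionArn } },
      },
    ],
  };
}
```

Serialize the result with JSON.stringify and attach it as the bucket policy; Block Public Access can stay fully enabled because the grant targets a service principal, not the public.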

Signed URLs and signed cookies

For private content (paid videos, user-specific downloads), generate signed URLs or cookies with a CloudFront key pair. Signed URLs grant access to one object; signed cookies grant access to a path pattern (better for HLS streaming with many segment files). Store the private key in Secrets Manager; rotate key groups without downtime using multiple active keys.
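
The moving parts of a signed URL are a policy string, a signature over it, and CloudFront's URL-safe base64 variant. The sketch below covers the canned-policy case and deliberately leaves out the signing step: the signature is assumed to be produced elsewhere with the private key. Function names and the example domain are placeholders.

```typescript
// Canned policy: one resource, one expiry time, serialized without whitespace.
function cannedPolicy(resourceUrl: string, expiresEpochSec: number): string {
  return JSON.stringify({
    Statement: [
      {
        Resource: resourceUrl,
        Condition: { DateLessThan: { "AWS:EpochTime": expiresEpochSec } },
      },
    ],
  });
}

// CloudFront replaces characters that are invalid in a URL query string.
function toCloudFrontSafeBase64(b64: string): string {
  return b64.replace(/\+/g, "-").replace(/=/g, "_").replace(/\//g, "~");
}

// Assembles the final URL from a signature created elsewhere (base64 of the signed policy).
function assembleSignedUrl(
  resourceUrl: string,
  expiresEpochSec: number,
  signatureB64: string,
  keyPairId: string,
): string {
  const separator = resourceUrl.indexOf("?") === -1 ? "?" : "&";
  return (
    resourceUrl + separator +
    "Expires=" + expiresEpochSec +
    "&Signature=" + toCloudFrontSafeBase64(signatureB64) +
    "&Key-Pair-Id=" + keyPairId
  );
}
```

In a real service the AWS SDK's CloudFront signer helpers are the safer choice; the value of writing it out is seeing that the expiry is part of what gets signed, so it cannot be extended by editing the URL.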

Lambda@Edge and CloudFront Functions

Feature	CloudFront Functions	Lambda@Edge
Runtime	Custom lightweight JS (~1ms)	Node.js, Python — full Lambda
Triggers	Viewer request/response only	Viewer + origin request/response
Use cases	URL rewrite, header inject, A/B cookie	JWT validation, image resize, dynamic origin selection
Limits	10KB code, no network I/O	5s timeout on viewer triggers, 30s on origin triggers; must deploy to us-east-1
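
URL rewriting is the typical CloudFront Functions job from the table above. The sketch below appends index.html to directory-style URIs on the viewer-request event. It is written in TypeScript with a hand-rolled subset of the event shape for type checking; CloudFront Functions run plain JavaScript, so the type annotations would be stripped before deploying.

```typescript
// Hand-written subset of the viewer-request event; the real event carries more fields.
type CfRequest = { uri: string; headers: { [name: string]: { value: string } } };
type CfEvent = { request: CfRequest };

// CloudFront Functions look for a function named `handler`.
function handler(event: CfEvent): CfRequest {
  const request = event.request;
  if (request.uri.endsWith("/")) {
    // "/docs/" -> "/docs/index.html"
    request.uri += "index.html";
  } else if (request.uri.lastIndexOf(".") <= request.uri.lastIndexOf("/")) {
    // No file extension in the last path segment: "/docs" -> "/docs/index.html"
    request.uri += "/index.html";
  }
  return request; // returning the request lets CloudFront continue to the cache and origin
}
```

Because the function runs before the cache lookup, /docs, /docs/, and /docs/index.html all resolve to the same cached object.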

Price class

CloudFront price classes control which edge locations serve traffic:

PriceClass_All — all edge locations worldwide (best latency, highest cost)
PriceClass_200 — excludes the most expensive regions (South America, Australia/New Zealand)
PriceClass_100 — US, Canada, Europe, Israel only — fine for US/EU-focused B2B SaaS

CloudFront + S3 with OAC

# OAC + distribution (simplified — use IaC for full config)
aws cloudfront create-origin-access-control \
  --origin-access-control-config '{
    "Name": "spa-oac",
    "SigningProtocol": "sigv4",
    "SigningBehavior": "always",
    "OriginAccessControlOriginType": "s3"
  }'

# Bucket policy: allow only this CloudFront distribution
# "Condition": { "StringEquals": { "AWS:SourceArn": "arn:aws:cloudfront::123:distribution/E1234" } }

aws cloudfront create-distribution --distribution-config file://dist-config.json
# dist-config.json: Origins with OAC id, DefaultCacheBehavior TTL=86400, Compress=true

# Terraform: OAC, distribution, and (not shown) bucket policy
resource "aws_cloudfront_origin_access_control" "spa" {
  name                              = "spa-oac"
  origin_access_control_origin_type = "s3"
  signing_behavior                  = "always"
  signing_protocol                  = "sigv4"
}

resource "aws_cloudfront_distribution" "spa" {
  enabled             = true
  default_root_object = "index.html"
  price_class         = "PriceClass_100"

  origin {
    domain_name              = aws_s3_bucket.spa.bucket_regional_domain_name
    origin_id                = "s3-spa"
    origin_access_control_id = aws_cloudfront_origin_access_control.spa.id
  }

  default_cache_behavior {
    target_origin_id       = "s3-spa"
    viewer_protocol_policy = "redirect-to-https"
    allowed_methods        = ["GET", "HEAD", "OPTIONS"]
    cached_methods         = ["GET", "HEAD"]
    compress               = true
    min_ttl                = 0
    default_ttl            = 86400
    max_ttl                = 31536000
    forwarded_values {
      query_string = false
      cookies { forward = "none" }
    }
  }

  restrictions {
    geo_restriction { restriction_type = "none" }
  }

  viewer_certificate {
    acm_certificate_arn      = aws_acm_certificate.cdn.arn # must be us-east-1
    ssl_support_method       = "sni-only"
    minimum_protocol_version = "TLSv1.2_2021"
  }
}

# aws_s3_bucket_policy.spa — allow GetObject only from this distribution ARN

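// AWS CDK (TypeScript): assumes usEast1CertArn is defined in the enclosing stack
import { Duration } from 'aws-cdk-lib';
import * as cloudfront from 'aws-cdk-lib/aws-cloudfront';
import * as origins from 'aws-cdk-lib/aws-cloudfront-origins';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as acm from 'aws-cdk-lib/aws-certificatemanager';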

const bucket = new s3.Bucket(this, 'SpaBucket', { blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL });

const cert = acm.Certificate.fromCertificateArn(this, 'CdnCert', usEast1CertArn);

const distribution = new cloudfront.Distribution(this, 'SpaCdn', {
  defaultBehavior: {
    origin: origins.S3BucketOrigin.withOriginAccessControl(bucket),
    viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
    compress: true,
    cachePolicy: cloudfront.CachePolicy.CACHING_OPTIMIZED,
  },
  defaultRootObject: 'index.html',
  domainNames: ['app.example.com'],
  certificate: cert,
  priceClass: cloudfront.PriceClass.PRICE_CLASS_100,
  errorResponses: [
    { httpStatus: 404, responseHttpStatus: 200, responsePagePath: '/index.html', ttl: Duration.seconds(0) },
  ],
});

📦 Real World

Spotify and Slack serve static web clients entirely from CDN: the browser downloads the SPA shell once, then calls regional APIs. Netflix delivers video through its own Open Connect CDN while its control plane runs on AWS. Pattern: CloudFront for cacheable content, ALB/API Gateway for dynamic requests; if both share one distribution, give the API paths an explicit no-cache behavior.

🎯 Exam Tip

CloudFront ACM certificates must be in us-east-1, even if your ALB origin is in eu-west-1. OAC replaced OAI; choose OAC on new designs. For SPA routing (React/Vue), configure custom error response: 404 → /index.html with 200 so client-side routing works. Invalidations are billed beyond the first 1,000 paths per month, so version assets with content hashes instead of invalidating /* on every deploy.

Production front-door patterns

Architecture choices at the edge compound. These four patterns cover most backend deployments: pick one primary front door per traffic type (static, API containers, serverless API) and combine only when each layer adds distinct value.

Pattern 1: ALB + ECS (containerized APIs)

The standard pattern for Spring Boot, Node, or Go services on Fargate/EC2:

Internet-facing ALB in public subnets (two+ AZs)
ECS service in private subnets, registered as IP targets
ALB SG allows 443 from internet (or CloudFront prefix list); app SG allows app port only from ALB SG
ACM cert on ALB; optional CloudFront in front for global users
Route53 alias record → ALB or CloudFront distribution

Pattern 2: API Gateway + Lambda (serverless API)

For event-driven, spiky, or low-ops APIs:

HTTP API with JWT authorizer (Cognito) — no ALB to manage
Lambda per domain or per route; Step Functions for multi-step workflows
DynamoDB, or Aurora Serverless reached from a VPC-attached Lambda if needed
Custom domain on API Gateway with ACM cert (regional)
CloudWatch + X-Ray for tracing; throttling at stage level

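When Lambda needs VPC access (private RDS), measure cold-start latency rather than assuming it: the VPC penalty shrank sharply after AWS reworked Lambda networking in 2019, but function initialization and database connection setup still add up. Consider provisioned concurrency for latency-sensitive routes, or move to ECS if p99 latency requirements are strict.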

Pattern 3: CloudFront + S3 (static SPA)

React/Vue/Angular admin consoles, marketing sites, documentation:

Build pipeline uploads hashed assets to S3 (main.a3f2b1.js)
CloudFront OAC — bucket fully private
SPA fallback: 403/404 → index.html
API calls from browser go to a different subdomain (api.example.com → ALB or API GW)
CORS configured on API origin, not on S3 static bucket

Pattern 4: Spring Boot behind ALB — practical guidance

Java/Spring teams hit specific edge issues. Configure these before production:

Concern	Configuration
Health checks	/actuator/health/liveness and /readiness — ALB uses readiness
Graceful shutdown	server.shutdown=graceful; ECS stopTimeout ≥ ALB deregistration delay
Forwarded headers	server.forward-headers-strategy=framework — trust X-Forwarded-Proto for HTTPS links
Keep-alive	ALB idle timeout default 60s — keep the app's keep-alive timeout longer than it to avoid 502s on reused connections; increase the ALB timeout to 120–300s for long polls
Session state	Redis/ElastiCache + Spring Session — disable ALB stickiness
Request size	ALB sets no body limit for instance/IP targets (Lambda targets cap at 1 MB), but Spring's multipart default is 1 MB per file — use S3 presigned upload for large files
Observability	Pass X-Amzn-Trace-Id to Micrometer/OTel; log correlation ID from ALB request ID header

# application-prod.yml — Spring Boot behind ALB
server:
  shutdown: graceful
  forward-headers-strategy: framework
  tomcat:
    connection-timeout: 60s
    keep-alive-timeout: 75s  # assumed value; keep it above the ALB idle timeout (default 60s)

management:
  endpoint:
    health:
      probes:
        enabled: true
  health:
    livenessstate:
      enabled: true
    readinessstate:
      enabled: true
  endpoints:
    web:
      exposure:
        include: health,info,prometheus

Combining layers — when it makes sense

CloudFront → ALB → ECS — global dynamic API with edge WAF and TLS
CloudFront path split — /static/* → S3 (cached), /api/* → ALB (uncached)
API GW → VPC Link → NLB → ECS — usage plans on private backends without public ALB
CloudFront → S3 only — pure static; no compute origin

⚠️ Pitfall

Running Spring Boot on port 8080 in a public subnet "because ALB handles security" — the task is still reachable if SG rules drift. Always place tasks in private subnets; only ALB and NAT in public subnets. Security groups reference other SGs (ALB SG → App SG), never CIDR 0.0.0.0/0 on app ports.

🎯 Exam Tip

Decision tree: static content → S3 + CloudFront. HTTP microservices in containers → ALB. Serverless REST with API keys → API Gateway REST. Low-latency Lambda proxy → HTTP API. WebSockets → API Gateway WebSocket or ALB (ALB supports WebSocket natively). Don't stack API Gateway on ALB unless the question explicitly requires API management features on private backends.
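
The decision tree above reads well as code, which also makes the precedence explicit. This is a simplified sketch: the four boolean inputs and the order of the checks are my condensation of the tip, and real designs weigh more factors than these.

```typescript
type Workload = {
  staticContentOnly: boolean;
  runsInContainers: boolean;
  needsWebSocket: boolean;
  needsApiKeysOrUsagePlans: boolean;
};

type FrontDoor =
  | "S3 + CloudFront"
  | "ALB"
  | "API Gateway WebSocket API"
  | "API Gateway REST API"
  | "API Gateway HTTP API";

function pickFrontDoor(w: Workload): FrontDoor {
  if (w.staticContentOnly) return "S3 + CloudFront";
  if (w.runsInContainers) return "ALB"; // ALB also handles WebSocket natively
  if (w.needsWebSocket) return "API Gateway WebSocket API";
  if (w.needsApiKeysOrUsagePlans) return "API Gateway REST API";
  return "API Gateway HTTP API"; // thin, low-latency Lambda proxy
}
```

Note that the container check comes before the API-management check, mirroring the advice not to stack API Gateway on an ALB unless the requirement is explicit.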