API Gateway, ALB & CloudFront
Every user request hits a front door before it reaches your code. ALB terminates TLS and routes HTTP to containers; API Gateway manages API contracts, auth, and throttling for serverless backends; CloudFront caches static assets at the edge so your JVM never serves bundle.js again. Pick the wrong layer and you pay twice — ALB + API Gateway in series for a simple Lambda, or no CDN when 90% of traffic is cacheable assets. This chapter covers how each service works, when to combine them, and how to wire Spring Boot, ECS, and Lambda behind production-grade entry points.
Application Load Balancer — L7 deep dive
ALB operates at Layer 7 (HTTP/HTTPS). It understands paths, host headers, query strings, and HTTP methods — routing /api/* to one target group and /admin/* to another. This is the default front door for ECS, EC2, and IP-mode Kubernetes ingress when you run long-lived HTTP services.
L7 routing and listener rules
An ALB has one or more listeners (typically port 443 HTTPS). Each listener has rules evaluated in priority order (lowest number first); the first match wins. The default rule catches everything else. Rules can match on host header (api.example.com), path (/v2/*), HTTP header, HTTP method, query string, or source IP.
- Path-based — /orders/* → orders target group; multiple microservices on one ALB
- Host-based — payments.example.com → dedicated target group
- Weighted — 90% v1 / 10% v2 for blue/green without API Gateway
- Redirect / fixed response — HTTP→HTTPS on :80; return 403 for /metrics before app tier
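The first-match-wins evaluation described above can be sketched as a small pure function. This is an illustrative model, not the ELB API: the `Rule` shape, `route`, and the target group names are hypothetical, and only host, path, and method conditions are modelled.

```typescript
// Illustrative model of ALB listener rule evaluation (not the real ELB API).
type Rule = {
  priority: number;       // lower number is evaluated first
  hostPattern?: string;   // e.g. "payments.example.com" or "*.example.com"
  pathPattern?: string;   // e.g. "/orders/*"
  methods?: string[];     // e.g. ["GET", "POST"]
  targetGroup: string;
};

type IncomingRequest = { host: string; path: string; method: string };

// Turn a wildcard pattern ("*" = any run of characters, "?" = one character) into a RegExp.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$");
}

// A rule matches only when every condition it declares is satisfied.
function matches(rule: Rule, req: IncomingRequest): boolean {
  if (rule.hostPattern !== undefined &&
      !wildcardToRegExp(rule.hostPattern.toLowerCase()).test(req.host.toLowerCase())) {
    return false;
  }
  if (rule.pathPattern !== undefined && !wildcardToRegExp(rule.pathPattern).test(req.path)) {
    return false;
  }
  if (rule.methods !== undefined && rule.methods.indexOf(req.method.toUpperCase()) === -1) {
    return false;
  }
  return true;
}

// Walk the rules in priority order; fall back to the default action.
function route(rules: Rule[], defaultTargetGroup: string, req: IncomingRequest): string {
  const ordered = rules.slice().sort((a, b) => a.priority - b.priority);
  for (const rule of ordered) {
    if (matches(rule, req)) return rule.targetGroup;
  }
  return defaultTargetGroup;
}

const exampleRules: Rule[] = [
  { priority: 10, pathPattern: "/orders/*", targetGroup: "orders-tg" },
  { priority: 20, hostPattern: "payments.example.com", targetGroup: "payments-tg" },
  { priority: 30, pathPattern: "/*", methods: ["GET"], targetGroup: "web-tg" },
];
```

With these rules, a POST to payments.example.com/orders/1 lands on orders-tg because the path rule has the lower priority number, which is why rule ordering deserves the same review as the patterns themselves.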
Target groups
A target group is a pool of backends. Target types:
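- IP — Fargate tasks (awsvpc networking) or any routable private IP address
- Instance — EC2 instances registered by Auto Scaling Group
- Lambda — a single function invoked per request; this is its own target type, not an IP target
The ALB target type exists only on NLB target groups (an NLB in front of an ALB, for example to get static IPs); it cannot be used to chain one ALB to another.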
Each target group has a protocol/port (HTTP:8080 for Spring Boot), a health check path, and optional stickiness. Targets register automatically when ECS attaches tasks or ASG launches instances.
Health checks
ALB polls targets on the health check path (default /). Configure it to match your app's readiness: for Spring Boot use /actuator/health, or set management.endpoint.health.probes.enabled=true and point the check at /actuator/health/readiness to keep liveness and readiness separate. A target stops receiving traffic once it fails the unhealthy threshold of consecutive checks, and rejoins after the healthy threshold of consecutive successes.
Tune interval (15–30s), healthy/unhealthy thresholds (2–3), timeout (< interval), and matcher (200 only, so a 503 returned during graceful shutdown counts as a failure). Slow-starting JVMs need headroom before checks count against them: a higher healthy threshold or an ECS health check grace period avoids flapping during warm-up.
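To make the threshold behaviour concrete, here is a minimal sketch of the consecutive-success / consecutive-failure bookkeeping. It is a simplified model under stated assumptions, not how ELB is implemented: `TargetState`, `CheckConfig`, and `recordCheck` are hypothetical names, and a timeout is represented as a `null` status code.

```typescript
// Simplified model of target health transitions (hypothetical names, not the ELB API).
type TargetState = { healthy: boolean; streak: number }; // streak = consecutive results that disagree with the current state
type CheckConfig = { healthyThreshold: number; unhealthyThreshold: number; successCodes: number[] };

function recordCheck(state: TargetState, cfg: CheckConfig, statusCode: number | null): TargetState {
  // A check passes only if it returned in time with a code listed in the matcher.
  const passed = statusCode !== null && cfg.successCodes.indexOf(statusCode) !== -1;

  // A result that agrees with the current state resets the streak.
  if (passed === state.healthy) return { healthy: state.healthy, streak: 0 };

  const streak = state.streak + 1;
  const needed = state.healthy ? cfg.unhealthyThreshold : cfg.healthyThreshold;

  // Flip the state only after enough consecutive disagreeing results.
  return streak >= needed
    ? { healthy: !state.healthy, streak: 0 }
    : { healthy: state.healthy, streak: streak };
}
```

Under this model, with a 30 second interval and an unhealthy threshold of 3, a target that starts failing keeps receiving traffic for roughly three intervals before it is taken out, which is the trade-off behind the tuning advice above.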
SSL/TLS and ACM
Terminate TLS at the ALB using an ACM certificate — free, auto-renewed, but must be in the same region as the ALB. Use an HTTPS listener on 443 with a security policy like ELBSecurityPolicy-TLS13-1-2-2021-06. Traffic from ALB to targets can be HTTP (common inside VPC) or HTTPS (re-encrypt) depending on compliance requirements.
- Wildcard certs (*.example.com) cover one level of subdomain on one ALB (api.example.com, but not a.b.example.com or the bare example.com)
- SNI allows multiple certificates on one listener — different hostnames, different certs
- CloudFront in front of ALB uses a cert in us-east-1 (CloudFront requirement)
Sticky sessions (session affinity)
ALB can pin a client to one target using an AWSALB cookie (duration 1 second to 7 days). Use it only when your app stores session state in memory; prefer external session stores (Redis/ElastiCache) so any task can serve any request. Sticky sessions complicate rolling deploys: when a target is deregistered, its clients are rerouted to another target and lose whatever session state lived in the old task's memory.
Connection draining (deregistration delay)
When a target is deregistered (deploy, scale-in, or replacement after failed health checks), the target enters the draining state for the configured delay (default 300 seconds, max 3600). In-flight requests complete; new requests route to healthy targets only. Set delay ≥ your longest request timeout. For Spring Boot graceful shutdown, align:
- ECS stopTimeout ≥ Spring's shutdown grace period (ECS drains the target first, then sends SIGTERM and waits stopTimeout before SIGKILL; Fargate caps stopTimeout at 120 seconds)
- Spring server.shutdown=graceful + adequate grace period
- Health check returns non-200 during shutdown so ALB stops sending new connections early
WAF integration
Associate AWS WAF with an ALB to filter SQL injection, rate-limit by IP, block geo regions, or enforce managed rule groups (OWASP Top 10). WAF evaluates before the request reaches listener rules. For CloudFront-fronted ALB, you can attach WAF to CloudFront instead — one WAF at the edge protects both static and dynamic paths.
ALB vs NLB — when to use which
| Dimension | ALB (Layer 7) | NLB (Layer 4) |
|---|---|---|
| Protocols | HTTP, HTTPS, HTTP/2, gRPC, WebSocket | TCP, UDP, TLS |
| Routing | Path, host, header, query, method | Port and IP only — no URL awareness |
| Targets | Instance (EC2), IP (ECS/Fargate tasks), Lambda | Instance (EC2), IP, ALB (not Lambda directly) |
| Performance | Millions of requests/sec; slight L7 overhead | Ultra-low latency, static IP, preserves source IP |
| TLS | Terminate at ALB (ACM) | TLS passthrough or terminate at NLB |
| Best for | REST APIs, Spring Boot, microservices HTTP | Non-HTTP (MQTT, gaming), latency-critical workloads, VPC Link to API GW |
| Pricing | LCU-based (new connections, active connections, bytes, rule evaluations) | NLCU-based (new flows, active flows, bytes) |
Provision ALB + target group for ECS
# AWS CLI
# Target group: IP mode for Fargate
aws elbv2 create-target-group \
  --name order-service-tg \
  --protocol HTTP --port 8080 \
  --vpc-id vpc-0abc123 \
  --target-type ip \
  --health-check-path /actuator/health \
  --health-check-interval-seconds 30 \
  --matcher HttpCode=200

# Internet-facing ALB in two public subnets
aws elbv2 create-load-balancer \
  --name order-service-alb \
  --subnets subnet-pub-a subnet-pub-b \
  --security-groups sg-alb-public \
  --scheme internet-facing \
  --type application

# HTTPS listener that forwards to the target group by default
aws elbv2 create-listener \
  --load-balancer-arn arn:aws:elasticloadbalancing:eu-west-1:123:loadbalancer/app/order-service-alb/abc \
  --protocol HTTPS --port 443 \
  --ssl-policy ELBSecurityPolicy-TLS13-1-2-2021-06 \
  --certificates CertificateArn=arn:aws:acm:eu-west-1:123:certificate/xyz \
  --default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:eu-west-1:123:targetgroup/order-service-tg/abc

# Terraform equivalent
resource "aws_lb_target_group" "app" {
  name        = "order-service-tg"
  port        = 8080
  protocol    = "HTTP"
  vpc_id      = aws_vpc.main.id
  target_type = "ip"

  health_check {
    path                = "/actuator/health"
    interval            = 30
    healthy_threshold   = 2
    unhealthy_threshold = 3
    matcher             = "200"
  }

  deregistration_delay = 120
}

resource "aws_lb" "app" {
  name               = "order-service-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = aws_subnet.public[*].id
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.app.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = aws_acm_certificate.api.arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app.arn
  }
}

// AWS CDK (TypeScript) equivalent; vpc, certArn, cluster, and taskDefinition are defined elsewhere in the stack
import { Duration } from 'aws-cdk-lib';
import * as elbv2 from 'aws-cdk-lib/aws-elasticloadbalancingv2';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as acm from 'aws-cdk-lib/aws-certificatemanager';

const cert = acm.Certificate.fromCertificateArn(this, 'Cert', certArn);

const alb = new elbv2.ApplicationLoadBalancer(this, 'Alb', {
  vpc,
  internetFacing: true,
});

const tg = new elbv2.ApplicationTargetGroup(this, 'Tg', {
  vpc,
  port: 8080,
  protocol: elbv2.ApplicationProtocol.HTTP,
  targetType: elbv2.TargetType.IP,
  healthCheck: { path: '/actuator/health', interval: Duration.seconds(30) },
  deregistrationDelay: Duration.seconds(120),
});

alb.addListener('Https', {
  port: 443,
  protocol: elbv2.ApplicationProtocol.HTTPS,
  certificates: [cert],
  defaultTargetGroups: [tg],
});

// Register the Fargate tasks as IP targets of the target group
const service = new ecs.FargateService(this, 'Service', { cluster, taskDefinition });
service.attachToApplicationTargetGroup(tg);

ALB nodes scale horizontally per AZ. AWS manages the fleet; you never see individual load balancer instances. Each enabled AZ gets at least one node; cross-zone load balancing (enabled by default) distributes traffic evenly even if targets are unevenly distributed across AZs. Targets see the ALB node's private IP as the source address, so read the client IP from X-Forwarded-For. Proxy protocol v2 is an NLB feature and is not available on ALB.
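Reading the client IP from X-Forwarded-For is easy to get wrong, because clients can send their own value and the load balancer appends to it by default. The sketch below assumes that default append behaviour and counts trusted hops from the right; `clientIpFromXff` and the `trustedHops` parameter are hypothetical names for illustration.

```typescript
// Minimal sketch: pick the client IP from X-Forwarded-For by trusting only the
// right-most entries, which are the ones added by proxies you control.
// trustedHops = 1 means "only the ALB is in front of the app";
// trustedHops = 2 would fit a CloudFront -> ALB chain.
function clientIpFromXff(xff: string | undefined, trustedHops: number = 1): string | null {
  if (xff === undefined || trustedHops < 1) return null;

  const parts = xff
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);

  // Entries to the left of this index were supplied by the caller and may be forged.
  const index = parts.length - trustedHops;
  if (index < 0) return null;
  return parts[index] ?? null;
}
```

Taking the left-most entry instead is the common mistake: a request that arrives with `X-Forwarded-For: 10.0.0.1` already set would then appear to come from 10.0.0.1.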
ALB bills hourly (~$0.0225/hr) plus LCU (Load Balancer Capacity Units) for new connections, active connections, processed bytes, and rule evaluations. One moderately busy API ALB often runs $20–40/month. Idle ALBs in dev accounts are a common waste — delete or share one ALB with path-based rules instead of one ALB per microservice in non-prod.
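The billing arithmetic can be sketched as follows. The hourly rate comes from the text above; the LCU rate of $0.008 and the per-LCU dimension sizes (25 new connections/s, 3,000 active connections, 1 GB/hour, 1,000 rule evaluations/s) are assumptions based on the published us-east-1 list price and should be checked against the current pricing page. `AlbLoad`, `lcusForHour`, and `monthlyCostUsd` are hypothetical names, and the model ignores details such as the free rule allowance.

```typescript
// Rough ALB cost model: you pay the hourly charge plus the LCU dimension with the highest usage.
type AlbLoad = {
  newConnectionsPerSec: number;
  activeConnections: number;
  processedGbPerHour: number;
  ruleEvaluationsPerSec: number;
};

type AlbPricing = { hourlyUsd: number; lcuHourUsd: number; hoursPerMonth: number };

// Assumption: us-east-1 list prices; adjust for your Region.
const ASSUMED_PRICING: AlbPricing = { hourlyUsd: 0.0225, lcuHourUsd: 0.008, hoursPerMonth: 730 };

// Only the largest dimension is billed, not the sum.
function lcusForHour(load: AlbLoad): number {
  return Math.max(
    load.newConnectionsPerSec / 25,
    load.activeConnections / 3000,
    load.processedGbPerHour / 1,
    load.ruleEvaluationsPerSec / 1000
  );
}

function monthlyCostUsd(load: AlbLoad, pricing: AlbPricing = ASSUMED_PRICING): number {
  return pricing.hoursPerMonth * (pricing.hourlyUsd + lcusForHour(load) * pricing.lcuHourUsd);
}
```

Under these assumptions an idle ALB costs 730 × 0.0225 ≈ $16.43 a month before any traffic, which is why one ALB per microservice in non-prod adds up quickly.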
ALB for HTTP routing, NLB for TCP/static IP. Classic Load Balancer is legacy; never choose it for new designs. When the question mentions WebSocket or HTTP/2 with path routing, pick ALB. When it mentions millions of TCP connections with client IP preservation, pick NLB. ALB listeners speak only HTTP and HTTPS, so an ALB cannot front SSH on port 22; that is NLB or direct-connection territory.
API Gateway — managed API front door
API Gateway is a managed API layer: it handles authentication, request validation, throttling, API keys, CORS, and OpenAPI documentation — then forwards to Lambda, HTTP endpoints, or AWS services. You don't run it; you configure it. Choose the API type based on features needed, not habit.
REST vs HTTP vs WebSocket APIs
| Type | Protocol | Key features | When to use |
|---|---|---|---|
| REST API | HTTP/HTTPS | Full feature set: API keys, usage plans, request validation, caching, WAF, SDK generation | Public/partner APIs with billing tiers, legacy integrations, maximum control |
| HTTP API | HTTP/HTTPS | Lower latency, lower cost, JWT/Cognito authorizers, Lambda proxy — fewer features | Default for new serverless APIs — internal microservices, Lambda backends |
| WebSocket API | WebSocket | $connect, $disconnect, $default routes; push to clients via connection ID | Real-time chat, live dashboards, collaborative editing |
HTTP API vs REST API: HTTP API is roughly 70% cheaper and has lower latency, but lacks usage plans, API keys, request validation, response caching, AWS WAF association, and edge-optimized endpoints. If you need per-customer rate limits and API key billing, use REST API. If you need a thin proxy to Lambda with JWT auth, use HTTP API. Don't default to REST API out of habit.
Integrations
API Gateway connects routes to backends via integrations:
- Lambda — synchronous invoke; most common serverless pattern
- HTTP — proxy to any HTTPS endpoint (ALB, EC2, external SaaS)
- AWS service — direct integration (SQS, Step Functions, Kinesis) without Lambda middleman
- VPC Link — private connection into your VPC: HTTP API VPC links can target an ALB, NLB, or Cloud Map service; REST API VPC links have traditionally required an NLB
- Mock — return static response for testing or CORS preflight
Lambda proxy integration
With Lambda proxy, API Gateway passes the entire request as a JSON event and expects a structured response with statusCode, headers, and body. Your Lambda owns routing logic inside the handler — API Gateway doesn't map individual paths unless you configure them. This is the fastest way to ship but pushes HTTP semantics into Lambda code; for complex APIs, use explicit routes per endpoint.
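A minimal sketch of a proxy-style handler, assuming the HTTP API payload format 2.0 (`rawPath`, `requestContext.http.method`). The local `HttpApiEvent` and `ProxyResult` types are trimmed-down stand-ins for the real event shape, and the `/orders` routes and response bodies are hypothetical.

```typescript
// Trimmed-down local types; the real event carries many more fields.
type HttpApiEvent = {
  rawPath: string;
  requestContext: { http: { method: string } };
  body?: string;
};

type ProxyResult = { statusCode: number; headers: { [name: string]: string }; body: string };

function json(statusCode: number, payload: unknown): ProxyResult {
  return {
    statusCode: statusCode,
    headers: { "content-type": "application/json" },
    body: JSON.stringify(payload),
  };
}

// Routing lives inside the function: API Gateway only hands over the raw request.
function routeRequest(event: HttpApiEvent): ProxyResult {
  const method = event.requestContext.http.method;

  const orderMatch = /^\/orders\/([^/]+)$/.exec(event.rawPath);
  if (method === "GET" && orderMatch !== null) {
    return json(200, { orderId: orderMatch[1] });
  }

  if (method === "POST" && event.rawPath === "/orders") {
    let parsed: unknown;
    try {
      parsed = JSON.parse(event.body ?? "");
    } catch {
      return json(400, { message: "Body must be valid JSON" });
    }
    return json(201, { received: parsed });
  }

  return json(404, { message: "Not found" });
}

// Lambda entry point: async wrapper around the pure routing function.
async function handler(event: HttpApiEvent): Promise<ProxyResult> {
  return routeRequest(event);
}
```

Keeping the routing in a pure function makes it testable without deploying, but the regex-per-route style is exactly what stops scaling as the API grows, which is the argument for explicit routes per endpoint.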
Authorizers
| Type | How it works | Best for |
|---|---|---|
| JWT (HTTP API) | Validates JWT signature against issuer's JWKS; passes claims to Lambda | Cognito User Pool, Auth0, Okta — preferred for HTTP API |
| Lambda authorizer | Custom Lambda returns IAM policy Allow/Deny + context | Custom auth logic, legacy token formats, fine-grained per-route decisions |
| Cognito User Pool (REST) | Built-in Cognito JWT validation on REST API | Mobile/web apps already on Cognito |
| IAM | SigV4-signed requests — AWS SDK callers only | Service-to-service, private internal APIs |
Authorizer cache TTL (default 300 seconds) avoids re-invoking the Lambda authorizer on every request; tune it down for high-security endpoints, up for cost savings. JWT authorizers on HTTP API are validated by API Gateway itself, with no Lambda call.
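For the Lambda authorizer row, the response shape is the part people most often get wrong. The sketch below assumes a REST API token authorizer, which returns a principal, an IAM policy for `execute-api:Invoke`, and optional context. The in-memory `KNOWN_TOKENS` map is a hypothetical stand-in for real token verification and must not be used as-is.

```typescript
type AuthorizerEvent = { authorizationToken?: string; methodArn: string };

type PolicyStatement = { Action: "execute-api:Invoke"; Effect: "Allow" | "Deny"; Resource: string };

type AuthorizerResponse = {
  principalId: string;
  policyDocument: { Version: "2012-10-17"; Statement: PolicyStatement[] };
  context?: { [key: string]: string };
};

type KnownUser = { userId: string; plan: string };

// Hypothetical lookup table standing in for signature or database verification.
const KNOWN_TOKENS = new Map<string, KnownUser>([["demo-token", { userId: "user-123", plan: "free" }]]);

function buildPolicy(principalId: string, effect: "Allow" | "Deny", resource: string): AuthorizerResponse {
  return {
    principalId: principalId,
    policyDocument: {
      Version: "2012-10-17",
      Statement: [{ Action: "execute-api:Invoke", Effect: effect, Resource: resource }],
    },
  };
}

function authorize(event: AuthorizerEvent): AuthorizerResponse {
  // Accept "Bearer <token>" or a bare token.
  const token = (event.authorizationToken ?? "").replace(/^Bearer\s+/i, "");
  const user = KNOWN_TOKENS.get(token);
  if (user === undefined) return buildPolicy("anonymous", "Deny", event.methodArn);

  const response = buildPolicy(user.userId, "Allow", event.methodArn);
  response.context = { plan: user.plan }; // forwarded to the backend integration
  return response;
}
```

Because the returned policy is cached for the authorizer TTL, scoping `Resource` to the single `methodArn` as done here means a cached Allow for one route will not cover other routes; many teams widen the resource pattern deliberately for that reason.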
Throttling and quotas
API Gateway enforces throttling at two levels:
- Account-level — 10,000 requests/second steady, 5,000 burst, per account per Region (increasable through Service Quotas or a support ticket)
- Stage/method-level — configure rate (steady RPS) and burst per method or stage
Exceeded limits return 429 Too Many Requests. Set client-side retry with exponential backoff. For per-API-key limits, use REST API usage plans.
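Rate plus burst limits are commonly modelled as a token bucket, and the sketch below uses that model to show how the two numbers interact. It is an illustration of the concept, not API Gateway's implementation; `Bucket`, `Limits`, `tryConsume`, and `backoffDelayMs` are hypothetical names, and time is passed in explicitly so the behaviour is deterministic.

```typescript
type Bucket = { tokens: number; lastRefillMs: number };
type Limits = { ratePerSec: number; burst: number };

// Refill according to elapsed time (capped at the burst size), then try to spend one token.
function tryConsume(bucket: Bucket, limits: Limits, nowMs: number): { allowed: boolean; bucket: Bucket } {
  const elapsedSec = Math.max(0, nowMs - bucket.lastRefillMs) / 1000;
  const tokens = Math.min(limits.burst, bucket.tokens + elapsedSec * limits.ratePerSec);

  if (tokens >= 1) {
    return { allowed: true, bucket: { tokens: tokens - 1, lastRefillMs: nowMs } };
  }
  // No token available: this is the request that would receive a 429.
  return { allowed: false, bucket: { tokens: tokens, lastRefillMs: nowMs } };
}

// Client side: exponential backoff with a cap. attempt 0 is the first retry.
function backoffDelayMs(attempt: number, baseMs: number = 100, capMs: number = 5000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

The burst value decides how many requests can arrive at once; the rate decides how fast capacity comes back. Production clients usually add random jitter on top of `backoffDelayMs` so that throttled callers do not retry in lockstep.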
Caching
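REST API stage caching stores responses keyed by resource path by default, plus any query strings and headers you add to the key. TTL is 0–3600 seconds (default 300). Cache hits skip the backend entirely, which cuts backend cost for read-heavy public APIs, though the cache itself is billed hourly by size. Flush the cache on deploy, or let authorized clients bypass it with Cache-Control: max-age=0. Not available on HTTP API.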
Usage plans and API keys
Usage plans tie API keys to throttle/quota limits — e.g. free tier 1000 req/day, paid tier 1M req/day. Clients pass x-api-key header. This is the AWS-native API monetization model. HTTP API does not support API keys — use a Lambda authorizer or third-party gateway if you need keys on HTTP API.
Canary deployments
REST API supports canary releases on a stage: route X% of traffic to a new deployment, monitor CloudWatch alarms, then promote or roll back. HTTP API has no built-in canary setting; shift traffic gradually with weighted Lambda aliases behind the integration instead. Either way, watch CloudWatch alarms on 5xx rate and p99 latency before promoting.
HTTP API + Lambda (production pattern)
# AWS CLI
# Quick-create an HTTP API with a Lambda proxy integration
aws apigatewayv2 create-api \
  --name order-api \
  --protocol-type HTTP \
  --target "arn:aws:lambda:eu-west-1:123456789012:function:order-handler"

aws apigatewayv2 create-stage \
  --api-id abc123 \
  --stage-name prod \
  --auto-deploy

# JWT authorizer (Cognito); --name is required
aws apigatewayv2 create-authorizer \
  --api-id abc123 \
  --name cognito-jwt \
  --authorizer-type JWT \
  --identity-source '$request.header.Authorization' \
  --jwt-configuration Audience=client-id,Issuer=https://cognito-idp.eu-west-1.amazonaws.com/eu-west-1_XXX

# Terraform equivalent
resource "aws_apigatewayv2_api" "orders" {
  name          = "order-api"
  protocol_type = "HTTP"
}

resource "aws_apigatewayv2_integration" "lambda" {
  api_id                 = aws_apigatewayv2_api.orders.id
  integration_type       = "AWS_PROXY"
  integration_uri        = aws_lambda_function.handler.invoke_arn
  payload_format_version = "2.0"
}

resource "aws_apigatewayv2_route" "get_order" {
  api_id             = aws_apigatewayv2_api.orders.id
  route_key          = "GET /orders/{id}"
  target             = "integrations/${aws_apigatewayv2_integration.lambda.id}"
  authorizer_id      = aws_apigatewayv2_authorizer.jwt.id
  authorization_type = "JWT"
}

resource "aws_apigatewayv2_authorizer" "jwt" {
  api_id           = aws_apigatewayv2_api.orders.id
  authorizer_type  = "JWT"
  identity_sources = ["$request.header.Authorization"]
  name             = "cognito-jwt"

  jwt_configuration {
    audience = [aws_cognito_user_pool_client.app.id]
    issuer   = "https://cognito-idp.${var.region}.amazonaws.com/${aws_cognito_user_pool.main.id}"
  }
}

resource "aws_apigatewayv2_stage" "prod" {
  api_id      = aws_apigatewayv2_api.orders.id
  name        = "prod"
  auto_deploy = true

  default_route_settings {
    throttling_burst_limit = 500
    throttling_rate_limit  = 1000
  }
}

// AWS CDK (TypeScript) equivalent
import * as apigwv2 from 'aws-cdk-lib/aws-apigatewayv2';
import * as integrations from 'aws-cdk-lib/aws-apigatewayv2-integrations';
import * as authorizers from 'aws-cdk-lib/aws-apigatewayv2-authorizers';
import * as lambda from 'aws-cdk-lib/aws-lambda';

const fn = new lambda.Function(this, 'Handler', { /* ... */ });

// Every route uses the JWT authorizer unless it overrides it
const httpApi = new apigwv2.HttpApi(this, 'OrderApi', {
  defaultAuthorizer: new authorizers.HttpJwtAuthorizer('JwtAuth', 'https://cognito-idp...', {
    jwtAudience: ['client-id'],
  }),
});

httpApi.addRoutes({
  path: '/orders/{id}',
  methods: [apigwv2.HttpMethod.GET],
  integration: new integrations.HttpLambdaIntegration('OrderInt', fn),
});

Putting API Gateway in front of an ALB that already handles auth, rate limiting, and routing means you pay for both and add latency. Use API Gateway when you need its features (API keys, usage plans, serverless integration, request validation). Use ALB when you run containers with Spring Security. Exception: API Gateway + VPC Link → private ALB, when internal services need a managed public API surface.
Enable WAF on REST API stages for production; HTTP APIs cannot be associated with WAF directly, so front them with CloudFront if you need it. Use JWT authorizers over Lambda authorizers when possible: fewer moving parts, no cold starts on auth. Never expose internal ALB endpoints directly; route through API Gateway with VPC Link or use private API Gateway with resource policies restricting source VPC endpoints.
CloudFront — global CDN and edge compute
CloudFront caches content at 600+ edge locations worldwide. Users in Tokyo hit a Tokyo PoP, not your eu-west-1 ALB. Use it for static assets (S3), cacheable API responses, and TLS termination at the edge — but understand cache behaviors or you'll serve stale data for hours.
How a CDN works in AWS
A distribution has one or more origins (S3, ALB, custom HTTP server, MediaPackage, etc.) and cache behaviors that map URL patterns to origins and caching rules. Viewer requests hit the nearest edge; on a cache miss, CloudFront fetches from the origin (the optional Origin Shield layer reduces origin load further).
flowchart LR
    USER["User in Tokyo"]
    EDGE["CloudFront Edge PoP\n(cache check)"]
    ORIGIN["Origin\nS3 or ALB in eu-west-1"]
    USER --> EDGE
    EDGE -->|Cache HIT| USER
    EDGE -->|Cache MISS| ORIGIN
    ORIGIN --> EDGE
    EDGE --> USER
Origins and cache behaviors
| Setting | What it controls | Typical value |
|---|---|---|
| Path pattern | Which URLs use this behavior | /static/* → S3; /api/* → ALB; default → S3 |
| TTL (min/default/max) | How long objects stay cached | Static: 86400s; API: 0–60s or no cache |
| Cache key | Headers, query strings, cookies in key | Static: none; API: only the query params and headers that change the response |
| Compress | Gzip/Brotli at edge | Enable for text assets — free bandwidth savings |
| Viewer protocol | HTTP vs HTTPS to viewer | Redirect HTTP → HTTPS always |
| Allowed methods | GET/HEAD/OPTIONS vs full CRUD | S3 static: GET, HEAD; API origin: all methods |
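How the path pattern and the three TTL settings combine can be sketched in a few lines. This is a simplified model under stated assumptions: behaviors are checked in listed order with the default behavior last, the origin's `Cache-Control: max-age` is clamped between the minimum and maximum TTL, and directives such as `no-store`, `s-maxage`, and `Expires` are ignored. `Behavior`, `pickBehavior`, and `effectiveTtlSec` are hypothetical names.

```typescript
type Behavior = {
  pathPattern: string; // e.g. "/static/*"
  origin: string;
  minTtl: number;      // seconds
  defaultTtl: number;  // seconds
  maxTtl: number;      // seconds
};

// "*" matches any run of characters, "?" matches exactly one.
function pathPatternToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$");
}

// First matching behavior in listed order wins; otherwise the default behavior applies.
function pickBehavior(ordered: Behavior[], defaultBehavior: Behavior, path: string): Behavior {
  for (const behavior of ordered) {
    if (pathPatternToRegExp(behavior.pathPattern).test(path)) return behavior;
  }
  return defaultBehavior;
}

// No max-age from the origin: use the default TTL. Otherwise clamp max-age to [minTtl, maxTtl].
function effectiveTtlSec(behavior: Behavior, originMaxAgeSec: number | null): number {
  if (originMaxAgeSec === null) return behavior.defaultTtl;
  return Math.min(behavior.maxTtl, Math.max(behavior.minTtl, originMaxAgeSec));
}
```

The clamp is what produces "stale for hours" surprises: an API behavior with a large maximum TTL will honour whatever `max-age` the origin sends, so keep the maximum low on dynamic paths.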
Origin Access Control (OAC) for S3
Never make S3 buckets public for CloudFront. Use OAC (replaces legacy OAI): CloudFront signs requests to S3 with SigV4; bucket policy allows only the specific distribution. Users cannot bypass CloudFront to access S3 directly.
Signed URLs and signed cookies
For private content (paid videos, user-specific downloads), generate signed URLs or cookies with a CloudFront key pair. Signed URLs grant access to one object; signed cookies grant access to a path pattern (better for HLS streaming with many segment files). Store the private key in Secrets Manager; rotate key groups without downtime using multiple active keys.
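The policy document behind a signed URL or cookie is plain JSON. The sketch below builds a custom policy (expiry, optional start time, optional source IP range) and applies the character substitutions CloudFront expects on base64 values. The signing step itself is deliberately left out because it needs the private key and a crypto library; `SignedAccess`, `buildCustomPolicy`, and `toCloudFrontSafeBase64` are hypothetical names.

```typescript
type SignedAccess = {
  resource: string;            // URL or wildcard pattern, e.g. "https://cdn.example.com/videos/*"
  expiresEpochSec: number;     // required: access ends at this time
  notBeforeEpochSec?: number;  // optional: access starts at this time
  sourceIpCidr?: string;       // optional: restrict to a client IP range
};

type Condition = { [operator: string]: { [key: string]: number | string } };

// Build the policy JSON without extra whitespace, since the exact string is what gets signed.
function buildCustomPolicy(access: SignedAccess): string {
  const condition: Condition = { DateLessThan: { "AWS:EpochTime": access.expiresEpochSec } };
  if (access.notBeforeEpochSec !== undefined) {
    condition["DateGreaterThan"] = { "AWS:EpochTime": access.notBeforeEpochSec };
  }
  if (access.sourceIpCidr !== undefined) {
    condition["IpAddress"] = { "AWS:SourceIp": access.sourceIpCidr };
  }
  return JSON.stringify({ Statement: [{ Resource: access.resource, Condition: condition }] });
}

// CloudFront replaces characters that are unsafe in URLs: "+" -> "-", "=" -> "_", "/" -> "~".
function toCloudFrontSafeBase64(base64: string): string {
  return base64.replace(/\+/g, "-").replace(/=/g, "_").replace(/\//g, "~");
}
```

The policy is then signed with the private key, and the encoded policy, signature, and public key ID travel as the `Policy`, `Signature`, and `Key-Pair-Id` query parameters (or cookies). A wildcard resource with signed cookies is what makes the HLS case above practical.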
Lambda@Edge and CloudFront Functions
| Feature | CloudFront Functions | Lambda@Edge |
|---|---|---|
| Runtime | Custom lightweight JS (~1ms) | Node.js, Python — full Lambda |
| Triggers | Viewer request/response only | Viewer + origin request/response |
| Use cases | URL rewrite, header inject, A/B cookie | JWT validation, image resize, dynamic origin selection |
| Limits | 10KB code, no network I/O | 5s (viewer triggers) or 30s (origin triggers) timeout; must be created in us-east-1 |
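As an example of the URL-rewrite use case in the table, here is a sketch of a viewer-request function that sends extension-less paths to the SPA shell. CloudFront Functions run plain JavaScript, so the type annotations below would need to be stripped or compiled away before deployment; the `CfEvent` and `CfRequest` types are trimmed-down local stand-ins for the real event object.

```typescript
// Trimmed-down local types for the viewer-request event.
type CfRequest = { method: string; uri: string; headers: { [name: string]: { value: string } } };
type CfEvent = { request: CfRequest };

// Viewer-request handler: rewrite client-side routes to the SPA entry point.
function handler(event: CfEvent): CfRequest {
  const request = event.request;
  const lastSegment = request.uri.substring(request.uri.lastIndexOf("/") + 1);

  // No file extension in the last path segment: treat it as a client-side route.
  if (lastSegment.indexOf(".") === -1) {
    request.uri = "/index.html";
  }
  return request;
}
```

Attach a function like this only to the behavior that serves the static site. On an API behavior it would rewrite every extension-less API path to index.html.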
Price class
CloudFront price classes control which edge locations serve traffic:
- PriceClass_All — all edge locations worldwide (best latency, highest cost)
- PriceClass_200 — excludes the most expensive regions (South America, Australia/New Zealand)
- PriceClass_100 — US, Canada, Europe, Israel only — fine for US/EU-focused B2B SaaS
CloudFront + S3 with OAC
# AWS CLI
# OAC + distribution (simplified; use IaC for full config)
aws cloudfront create-origin-access-control \
  --origin-access-control-config '{
    "Name": "spa-oac",
    "SigningProtocol": "sigv4",
    "SigningBehavior": "always",
    "OriginAccessControlOriginType": "s3"
  }'

# Bucket policy: allow only this CloudFront distribution
# "Condition": { "StringEquals": { "AWS:SourceArn": "arn:aws:cloudfront::123:distribution/E1234" } }

aws cloudfront create-distribution --distribution-config file://dist-config.json
# dist-config.json: Origins with OAC id, DefaultCacheBehavior TTL=86400, Compress=true

# Terraform equivalent
resource "aws_cloudfront_origin_access_control" "spa" {
  name                              = "spa-oac"
  origin_access_control_origin_type = "s3"
  signing_behavior                  = "always"
  signing_protocol                  = "sigv4"
}

resource "aws_cloudfront_distribution" "spa" {
  enabled             = true
  default_root_object = "index.html"
  price_class         = "PriceClass_100"

  origin {
    domain_name              = aws_s3_bucket.spa.bucket_regional_domain_name
    origin_id                = "s3-spa"
    origin_access_control_id = aws_cloudfront_origin_access_control.spa.id
  }

  default_cache_behavior {
    target_origin_id       = "s3-spa"
    viewer_protocol_policy = "redirect-to-https"
    allowed_methods        = ["GET", "HEAD", "OPTIONS"]
    cached_methods         = ["GET", "HEAD"]
    compress               = true
    min_ttl                = 0
    default_ttl            = 86400
    max_ttl                = 31536000

    forwarded_values {
      query_string = false
      cookies { forward = "none" }
    }
  }

  restrictions {
    geo_restriction { restriction_type = "none" }
  }

  viewer_certificate {
    acm_certificate_arn      = aws_acm_certificate.cdn.arn # must be us-east-1
    ssl_support_method       = "sni-only"
    minimum_protocol_version = "TLSv1.2_2021"
  }
}

# aws_s3_bucket_policy.spa: allow GetObject only from this distribution ARN

// AWS CDK (TypeScript) equivalent; usEast1CertArn is defined elsewhere in the stack
import { Duration } from 'aws-cdk-lib';
import * as cloudfront from 'aws-cdk-lib/aws-cloudfront';
import * as origins from 'aws-cdk-lib/aws-cloudfront-origins';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as acm from 'aws-cdk-lib/aws-certificatemanager';

const bucket = new s3.Bucket(this, 'SpaBucket', { blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL });
const cert = acm.Certificate.fromCertificateArn(this, 'CdnCert', usEast1CertArn);

const distribution = new cloudfront.Distribution(this, 'SpaCdn', {
  defaultBehavior: {
    origin: origins.S3BucketOrigin.withOriginAccessControl(bucket),
    viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
    compress: true,
    cachePolicy: cloudfront.CachePolicy.CACHING_OPTIMIZED,
  },
  defaultRootObject: 'index.html',
  domainNames: ['app.example.com'],
  certificate: cert,
  priceClass: cloudfront.PriceClass.PRICE_CLASS_100,
  // SPA fallback: a private S3 origin answers 403 for missing keys, so map both codes
  errorResponses: [
    { httpStatus: 403, responseHttpStatus: 200, responsePagePath: '/index.html', ttl: Duration.seconds(0) },
    { httpStatus: 404, responseHttpStatus: 200, responsePagePath: '/index.html', ttl: Duration.seconds(0) },
  ],
});

Web apps such as Spotify and Slack serve their static web clients from a CDN: the browser downloads the SPA shell once, then calls regional APIs. Netflix delivers video through its own CDN, Open Connect, rather than CloudFront, while its control plane runs on AWS. Pattern: CDN for cacheable content, ALB/API Gateway for dynamic. Never route dynamic API paths through a cached behavior without deciding the cache key and TTL deliberately.
CloudFront ACM certificates must be in us-east-1, even if your ALB origin is in eu-west-1. OAC replaced OAI; choose OAC on new designs. For SPA routing (React/Vue), configure a custom error response that maps 404 (and 403, which a private S3 origin returns for missing keys) to /index.html with status 200 so client-side routing works. Invalidations are billed per path beyond the free monthly allowance and evict hot objects from every edge; version assets with content hashes instead of invalidating /* on every deploy.
Production front-door patterns
Architecture choices at the edge compound. These four patterns cover most backend deployments: pick one primary front door per traffic type (static, API containers, serverless API) and combine only when each layer adds distinct value.
Pattern 1: ALB + ECS (containerized APIs)
The standard pattern for Spring Boot, Node, or Go services on Fargate/EC2:
- Internet-facing ALB in public subnets (two+ AZs)
- ECS service in private subnets, registered as IP targets
- ALB SG allows 443 from internet (or CloudFront prefix list); app SG allows app port only from ALB SG
- ACM cert on ALB; optional CloudFront in front for global users
- Route53 alias record → ALB or CloudFront distribution
Pattern 2: API Gateway + Lambda (serverless API)
For event-driven, spiky, or low-ops APIs:
- HTTP API with JWT authorizer (Cognito) — no ALB to manage
- Lambda per domain or per route; Step Functions for multi-step workflows
- DynamoDB, or Aurora Serverless reached from a VPC-attached Lambda if needed
- Custom domain on API Gateway with ACM cert (regional)
- CloudWatch + X-Ray for tracing; throttling at stage level
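When Lambda needs VPC access (private RDS), budget for cold starts: VPC attachment itself adds little since Lambda moved to shared network interfaces, but runtime initialization and database connection setup still show up in p99. Consider provisioned concurrency for latency-sensitive routes, or move to ECS if p99 latency requirements are strict.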
Pattern 3: CloudFront + S3 (static SPA)
React/Vue/Angular admin consoles, marketing sites, documentation:
- Build pipeline uploads hashed assets to S3 (main.a3f2b1.js)
- CloudFront OAC — bucket fully private
- SPA fallback: 403/404 → index.html
- API calls from browser go to a different subdomain (api.example.com → ALB or API GW)
- CORS configured on API origin, not on S3 static bucket
Pattern 4: Spring Boot behind ALB — practical guidance
Java/Spring teams hit specific edge issues. Configure these before production:
| Concern | Configuration |
|---|---|
| Health checks | /actuator/health/liveness and /readiness — ALB uses readiness |
| Graceful shutdown | server.shutdown=graceful; ECS stopTimeout ≥ Spring's shutdown grace period; ALB deregistration delay ≥ longest request |
| Forwarded headers | server.forward-headers-strategy=framework — trust X-Forwarded-Proto for HTTPS links |
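| Keep-alive | ALB idle timeout defaults to 60s; keep the app's keep-alive timeout above it so the server does not close connections the ALB still considers open (a common source of 502s); raise the idle timeout to 120–300s for long polls |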
| Session state | Redis/ElastiCache + Spring Session — disable ALB stickiness |
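| Request size | ALB has no body-size cap for instance/IP targets (Lambda targets are limited to 1 MB), but Spring's multipart limit defaults to 1 MB per file; use S3 presigned uploads for large files |
| Observability | Pass X-Amzn-Trace-Id (the ALB's request ID header) to Micrometer/OTel and log it as the correlation ID |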
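# application-prod.yml: Spring Boot behind ALB
server:
  shutdown: graceful
  forward-headers-strategy: framework
  tomcat:
    connection-timeout: 60s
    # Assumption: keep idle keep-alive above the ALB idle timeout (60s default)
    keep-alive-timeout: 75s
spring:
  lifecycle:
    # Grace period for in-flight requests during shutdown
    timeout-per-shutdown-phase: 30s
management:
  endpoint:
    health:
      probes:
        enabled: true
  health:
    livenessstate:
      enabled: true
    readinessstate:
      enabled: true
  endpoints:
    web:
      exposure:
        include: health,info,prometheus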
Combining layers — when it makes sense
- CloudFront → ALB → ECS — global dynamic API with edge WAF and TLS
- CloudFront path split — /static/* → S3 (cached), /api/* → ALB (uncached)
- API GW → VPC Link → NLB → ECS — usage plans on private backends without public ALB
- CloudFront → S3 only — pure static; no compute origin
Running Spring Boot on port 8080 in a public subnet "because ALB handles security" is a mistake: the task is still reachable if SG rules drift. Always place tasks in private subnets; only ALB and NAT belong in public subnets. Security groups should reference other SGs (ALB SG → App SG), never CIDR 0.0.0.0/0 on app ports.
Decision tree: static content → S3 + CloudFront. HTTP microservices in containers → ALB. Serverless REST with API keys → API Gateway REST. Low-latency Lambda proxy → HTTP API. WebSockets → API Gateway WebSocket or ALB (ALB supports WebSocket natively). Don't stack API Gateway on ALB unless the question explicitly requires API management features on private backends.
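The decision tree can be written down as a small function, which makes the branching explicit. This is a sketch of the rules stated in this chapter rather than an exhaustive selector: the `Workload` shape is hypothetical, and the WebSocket branch assumes that services already running on containers stay on ALB while serverless ones use API Gateway.

```typescript
type Workload =
  | { kind: "static-assets" }
  | { kind: "http-containers" }
  | { kind: "serverless-api"; needsApiKeysOrUsagePlans: boolean }
  | { kind: "websocket"; runsOnContainers: boolean }
  | { kind: "tcp-or-static-ip" };

function chooseFrontDoor(w: Workload): string {
  if (w.kind === "static-assets") return "S3 + CloudFront";
  if (w.kind === "http-containers") return "ALB";
  if (w.kind === "serverless-api") {
    // API keys and usage plans exist only on REST APIs.
    return w.needsApiKeysOrUsagePlans ? "API Gateway REST API" : "API Gateway HTTP API";
  }
  if (w.kind === "websocket") {
    // Assumption: container-hosted WebSocket services stay behind the ALB they already use.
    return w.runsOnContainers ? "ALB" : "API Gateway WebSocket API";
  }
  return "NLB"; // non-HTTP protocols or static IP requirements
}
```

For example, `chooseFrontDoor({ kind: "serverless-api", needsApiKeysOrUsagePlans: false })` returns "API Gateway HTTP API", matching the "low-latency Lambda proxy" branch above.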