Cost Management & Optimization
AWS bills in dimensions most teams discover only after the first invoice shock: data transfer between AZs, NAT Gateway processing fees, idle EBS volumes, and cross-region replication you forgot to turn off. This chapter teaches the pricing model, the levers that actually move the needle for backend workloads, and the monitoring stack that catches runaway spend before finance does.
Billing fundamentals
AWS charges along four axes: compute hours, storage GB-months, data transfer GB, and API requests. Understanding which axis dominates your architecture is the first step — a Spring Boot fleet on EC2 is compute-heavy; a document pipeline with cross-region replication is transfer-heavy.
Cost dimensions
| Dimension | What you pay for | Common services | Optimization lever |
|---|---|---|---|
| Compute | vCPU-hours, memory, GPU seconds | EC2, Fargate, Lambda, RDS instance hours | Right-sizing, Graviton, Savings Plans, Spot for batch |
| Storage | GB stored per month + IOPS/throughput provisioned | S3, EBS, EFS, RDS storage, DynamoDB | Lifecycle rules, gp3 vs io2, Intelligent-Tiering, delete snapshots |
| Data transfer | GB moved between AZs, regions, or to the internet | NAT Gateway, CloudFront, cross-AZ NLB, S3 replication | VPC endpoints, same-AZ placement, CloudFront caching |
| Requests | Per-million API calls, per-query charges | S3 PUT/GET, DynamoDB RCU/WCU, API Gateway, Lambda invocations | Batch writes, caching, connection pooling, SQS buffering |
Billing is computed at one-second granularity for most compute (EC2, Fargate) and per-hour or per-month for storage. Data transfer charges appear on the invoice under AWS Data Transfer, often split across multiple line items that don't obviously map to a single resource. Use Cost and Usage Reports (CUR) with Athena for line-item attribution; Cost Explorer groups spend by dimensions such as service, usage type, and tag, but does not expose individual line items.
AWS Free Tier (credit-based model)
Accounts created on or after July 15, 2025 receive up to $200 in credits for select services, and the free plan ends after six months or when the credits run out, whichever comes first. Always-free tiers for specific services still apply within their limits. The old "12-month free tier for everything" model applies only to accounts created before that date, so always verify current terms on the AWS pricing page before architecture decisions.
| Category | Typical always-free / credit-eligible | Production reality |
|---|---|---|
| Compute | 750 hrs/mo of t2/t3.micro (legacy accounts), Lambda 1M requests | A single m7g.large running 24/7 exceeds free tier in days |
| Storage | 5 GB S3 Standard, 20 GB EBS gp2/gp3 | Dev artifact buckets and log retention blow past 5 GB quickly |
| Data transfer | 100 GB/month outbound to internet (always free, summed across regions) | NAT Gateway hourly charges are never free; only data egress has free allowances |
| Database | 750 hrs db.t3.micro RDS, 25 GB DynamoDB storage | Multi-AZ doubles instance cost immediately — not covered by free tier math |
A common mistake is assuming "we're on free tier" because the account is new. NAT Gateways, interface VPC endpoints, Aurora I/O-Optimized, and cross-AZ data transfer have no meaningful free tier. A three-AZ VPC with one NAT Gateway per AZ costs about $100/month in hourly charges alone (3 × ~$32.85) before you deploy a single application container.
Data transfer pricing (us-east-1 reference — verify your region)
Data transfer is the #1 surprise on AWS invoices for backend teams. Inbound to AWS is free; movement within and out of AWS is where costs accumulate.
| Transfer type | Direction / scope | Typical price (us-east-1) | Example scenario |
|---|---|---|---|
| Intra-AZ | Same Availability Zone, same VPC | Free | ECS task → RDS in same AZ; ALB → target in same AZ |
| Inter-AZ | Different AZ, same region | $0.01/GB each direction | App tier in one AZ calling a database or service in another AZ; NLB cross-zone load balancing; NAT cross-AZ routing (Multi-AZ RDS replication itself is not billed) |
| Internet egress | From AWS to public internet | $0.09/GB (first 10 TB tier) | API responses to mobile clients; S3 direct download without CloudFront |
| Cross-region | Between AWS regions | $0.02/GB (varies by pair) | S3 cross-region replication; DR read replica in eu-west-1 from us-east-1 primary |
| CloudFront → viewer | CDN edge to end user | Lower than direct S3 egress at scale | Static assets, API caching at edge — often 30–50% egress savings vs S3 website |
| Gateway VPC endpoint | S3 / DynamoDB from VPC | Free (no hourly, no per-GB) | Private subnet → S3 without NAT — eliminates NAT processing for that traffic |
A microservices fleet with chatty cross-AZ service-to-service calls can generate $500–2,000/month in inter-AZ transfer alone; at $0.01/GB in each direction, that corresponds to roughly 25–100 TB/month crossing AZ boundaries. Mitigations: keep latency-sensitive service pairs in the same AZ, enable cross-zone load balancing on NLBs only when needed (traffic between an ALB and its targets inside the VPC is not billed as inter-AZ transfer), and use VPC endpoints for AWS service traffic. Run Cost Explorer grouped by Usage Type Group → Data Transfer to find the culprit.
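The inter-AZ figure above is simple arithmetic, sketched here. It assumes the $0.01/GB per-direction rate from the table and treats 1 TB as 1,000 GB; the function name is illustrative, not part of any AWS tool.

```python
# Assumed us-east-1 rate from the table above; verify for your region.
INTER_AZ_PRICE_PER_GB = 0.01  # USD, billed once on each side of the AZ boundary


def inter_az_monthly_cost(gb_per_month, price_per_gb=INTER_AZ_PRICE_PER_GB):
    """Monthly cost of traffic crossing an AZ boundary (charged in both directions)."""
    if gb_per_month < 0:
        raise ValueError("gb_per_month must be non-negative")
    return gb_per_month * price_per_gb * 2


if __name__ == "__main__":
    for tb in (5, 25, 100):
        cost = inter_az_monthly_cost(tb * 1000)
        print(f"{tb:>4} TB/month across AZs -> ${cost:,.2f}")
```

Plugging in your own cross-AZ volume from Cost Explorer gives a quick estimate of what same-AZ placement could save.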
Cost allocation tags
Tags are how finance maps AWS spend to teams, products, and environments. Without them, Cost Explorer shows one blob labeled "Amazon EC2" — useless for chargeback. Activate cost allocation tags in the Billing console, then enforce tag-on-create with SCPs, Config rules, or Terraform default_tags.
| Tag key | Purpose | Example value |
|---|---|---|
| Environment | Separate prod vs non-prod spend | production, staging |
| Team | Chargeback to engineering team | payments, platform |
| Service | Map to microservice or product | order-api |
| CostCenter | Finance ERP integration | CC-1042 |
Enforce tags at deploy time
# AWS CLI
# Activate user-defined tags for Cost Explorer (once per account).
# All tag entries follow a single --cost-allocation-tags-status flag.
aws ce update-cost-allocation-tags-status \
  --cost-allocation-tags-status \
    TagKey=Environment,Status=Active \
    TagKey=Team,Status=Active
# Tag an existing resource
aws ec2 create-tags --resources i-0abc123 \
  --tags Key=Environment,Value=production Key=Team,Value=payments
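# Terraform (assumes var.environment and data.aws_ami.amazon_linux are declared elsewhere)
provider "aws" {
  default_tags {
    tags = {
      Environment = var.environment
      Team        = "platform"
      ManagedBy   = "terraform"
    }
  }
}
resource "aws_instance" "app" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "m7g.large"
  # Merged with the provider-level default_tags
  tags = {
    Service = "order-api"
  }
}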
// AWS CDK (TypeScript)
import * as cdk from 'aws-cdk-lib';
const app = new cdk.App();
// Stack-level tags propagate to every taggable resource in the stack
new cdk.Stack(app, 'OrderApiStack', {
  tags: {
    Environment: 'production',
    Team: 'payments',
    Service: 'order-api',
  },
});
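Tag enforcement is easier to reason about with a small audit in hand. The sketch below checks an inventory of resources against the four tag keys from the table above. The inventory shape (resource ID mapped to a tag dictionary) and the `vol-0example` ID are assumptions for illustration; in practice the data would come from whatever inventory export you already have.

```python
# Tag keys taken from the cost allocation table above.
REQUIRED_TAGS = ("Environment", "Team", "Service", "CostCenter")


def missing_tags(resource_tags, required=REQUIRED_TAGS):
    """Return the required keys that are absent or blank on one resource."""
    return [key for key in required if not resource_tags.get(key, "").strip()]


def audit(resources, required=REQUIRED_TAGS):
    """Map each non-compliant resource ID to the tag keys it is missing."""
    report = {}
    for resource_id, tags in resources.items():
        gaps = missing_tags(tags, required)
        if gaps:
            report[resource_id] = gaps
    return report


if __name__ == "__main__":
    inventory = {
        "i-0abc123": {
            "Environment": "production",
            "Team": "payments",
            "Service": "order-api",
            "CostCenter": "CC-1042",
        },
        # Hypothetical resource that would show up as untagged spend
        "vol-0example": {"Environment": "staging"},
    }
    for resource_id, gaps in audit(inventory).items():
        print(f"{resource_id}: missing {', '.join(gaps)}")
```

Running this on a schedule gives product teams a concrete list to fix, which complements the deploy-time enforcement shown above.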
When the exam asks how to allocate costs to departments, answer cost allocation tags + Cost Explorer or AWS Organizations consolidated billing. When it asks about preventing spend in unapproved regions, answer SCP — not Budgets (Budgets alert; SCPs block). Data transfer into AWS from the internet is always free.
Compute costs
EC2 and Fargate typically dominate AWS spend for backend teams. The optimization stack is layered: right-size first (free savings), then Graviton (20–40% price/performance), then commit baseline with Savings Plans, then add Spot for interruptible work on top.
Purchasing options compared
| Option | Commitment | Typical discount | Flexibility | Best for |
|---|---|---|---|---|
| On-Demand | None | 0% (baseline) | Change instance type/region anytime | Spiky traffic, new workloads, burst above committed baseline |
| Reserved Instances (RI) | 1 or 3 years; specific instance family + region + tenancy | Up to ~72% (Standard, 3-yr); less for 1-yr or Convertible | Low; locked to family/region, and Convertible RIs trade discount for exchange rights | Legacy; prefer Savings Plans unless you need capacity reservation |
| Compute Savings Plans | $/hr commit for 1 or 3 years | Up to ~66% (3-yr); less for 1-yr | High; applies to EC2, Fargate, Lambda regardless of family/region/OS | Mixed fleets, multi-region, container-heavy architectures |
| EC2 Instance Savings Plans | $/hr commit; specific instance family in a region | Up to ~72% (3-yr); less for 1-yr | Medium; family locked, size flexible within family | Homogeneous fleets (all m7g in us-east-1) |
| Spot Instances | None; 2-minute interruption notice | Up to ~90% off On-Demand | High — any available capacity; can be reclaimed | Batch jobs, CI runners, Karpenter nodes, stateless workers with checkpointing |
RI vs Savings Plans: AWS recommends Savings Plans for most customers. Compute Savings Plans apply across instance sizes, families, and regions, and also cover Fargate and Lambda. RIs still win when you need a capacity reservation (a zonal RI guarantees capacity in a specific AZ during shortages) or Dedicated Host licensing. For a standard Spring Boot on ECS fleet, a 3-year Compute Savings Plan sized to the steady-state baseline is the default answer.
Savings Plans calculator
Model your monthly On-Demand compute spend and steady-state percentage. Committed plans cover only the steady portion; burst traffic stays On-Demand. Use Cost Explorer → Savings Plans recommendations for AWS-generated commit amounts before purchasing.
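The model described above reduces to one formula: the steady portion is billed at the discounted rate and the rest stays On-Demand. The sketch below uses a 30% discount purely as a placeholder, since the real rate depends on term, payment option, and instance mix.

```python
def blended_monthly_cost(on_demand_monthly, steady_fraction, sp_discount):
    """Monthly cost when only the steady share of spend is covered by a Savings Plan.

    on_demand_monthly: current compute spend at On-Demand rates (USD)
    steady_fraction:   share of that spend that runs 24/7 (0..1)
    sp_discount:       Savings Plan discount on covered usage (0..1), an assumption
    """
    if not 0 <= steady_fraction <= 1 or not 0 <= sp_discount < 1:
        raise ValueError("steady_fraction must be in [0, 1] and sp_discount in [0, 1)")
    committed = on_demand_monthly * steady_fraction * (1 - sp_discount)
    burst = on_demand_monthly * (1 - steady_fraction)
    return committed + burst


if __name__ == "__main__":
    spend = 10_000.0
    for steady in (0.4, 0.6, 0.8):
        cost = blended_monthly_cost(spend, steady, sp_discount=0.30)
        print(f"steady {steady:.0%}: ${cost:,.2f}/month (saves ${spend - cost:,.2f})")
```

Treat the output as a first approximation; the Cost Explorer recommendation uses your actual hourly usage rather than a single steady-state percentage.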
Graviton (ARM) economics
Graviton3 instances (m7g, c7g, r7g) and their Graviton4 successors (m8g, c8g, r8g) typically deliver 20–40% better price/performance than equivalent x86 for Java/Spring workloads on Corretto 17+. Savings Plans and RIs apply to Graviton the same way, so you're stacking architecture choice on top of commitment discounts.
| Instance | vCPU / RAM | On-Demand (us-east-1) | Notes |
|---|---|---|---|
| m7i.large | 2 / 8 GiB | ~$0.1008/hr | Intel baseline for comparison |
| m7g.large | 2 / 8 GiB | ~$0.0816/hr | ~19% cheaper On-Demand; often faster for JVM |
| m7g.xlarge | 4 / 16 GiB | ~$0.1632/hr | Sweet spot for medium Spring Boot services |
Run a one-week A/B test: deploy the same service on m7i.large and m7g.large behind the same ALB, one target group per instance type, with weighted routing between the two target groups. Compare p99 latency, GC pause times, and CPU utilization before committing to Graviton fleet-wide. Watch for JNI/native dependencies (some encryption libs, old JDBC drivers) that lack ARM builds.
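To turn the A/B results into a price/performance number, one option is cost per million requests at the sustained throughput each instance handled. The hourly prices come from the table above; the throughput figures are hypothetical placeholders for your own measurements.

```python
def cost_per_million_requests(hourly_price, sustained_rps):
    """USD per million requests for one instance at a sustained request rate."""
    if sustained_rps <= 0:
        raise ValueError("sustained_rps must be positive")
    requests_per_hour = sustained_rps * 3600
    return hourly_price / requests_per_hour * 1_000_000


def relative_saving(baseline_cost, candidate_cost):
    """Fraction saved by the candidate relative to the baseline."""
    return 1 - candidate_cost / baseline_cost


if __name__ == "__main__":
    # Prices from the table above; rps values are made up for illustration.
    intel = cost_per_million_requests(0.1008, sustained_rps=500)
    graviton = cost_per_million_requests(0.0816, sustained_rps=550)
    print(f"m7i.large: ${intel:.4f} per 1M requests")
    print(f"m7g.large: ${graviton:.4f} per 1M requests")
    print(f"saving:    {relative_saving(intel, graviton):.1%}")
```

At equal throughput the saving equals the On-Demand price gap; any throughput gain measured in the test widens it.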
Right-sizing
The cheapest instance is the one that's appropriately sized — not the smallest SKU. Over-provisioned instances waste money; under-provisioned ones cause latency incidents that cost more in engineering time.
- Enable Detailed Monitoring (1-minute CloudWatch metrics) on production EC2 — $2.10/instance/month well spent
- Review CPUUtilization p95 over 14 days — sustained <20% suggests downsizing
- Check MemoryUtilization via CloudWatch agent — CPU low but memory high → move to r family
- Use AWS Compute Optimizer — free ML recommendations for EC2, EBS, Lambda, ECS on Fargate
- Schedule non-prod environments to stop nights/weekends with Instance Scheduler or ASG scheduled actions
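The CPU and memory rules in the checklist can be expressed as a small decision function. This is a sketch of the heuristic only: the 20% CPU threshold comes from the list above, the 80% memory threshold is an assumption, and the samples would be utilization percentages exported from CloudWatch.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def rightsizing_hint(cpu_samples, mem_samples, cpu_low=20.0, mem_high=80.0):
    """Classify an instance from utilization samples (percent values)."""
    cpu_p95 = percentile(cpu_samples, 95)
    mem_p95 = percentile(mem_samples, 95)
    if cpu_p95 < cpu_low and mem_p95 >= mem_high:
        return "memory-bound: consider the r family"
    if cpu_p95 < cpu_low:
        return "downsize"
    return "keep"


if __name__ == "__main__":
    print(rightsizing_hint([5, 8, 12, 15, 10], [30, 35, 40, 38, 33]))
    print(rightsizing_hint([5, 8, 12, 15, 10], [82, 85, 90, 88, 84]))
    print(rightsizing_hint([45, 60, 72, 55, 65], [50, 55, 60, 58, 52]))
```

Compute Optimizer applies a far richer model; a rule like this is mainly useful as a quick filter before reading its recommendations.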
AWS Compute Optimizer
Compute Optimizer analyzes 14 days of CloudWatch metrics and recommends instance type changes, EBS volume type changes, and Lambda memory settings. It requires opt-in (free) and appropriate IAM permissions. Export findings to CSV or integrate with Systems Manager for automated right-sizing workflows.
$ aws compute-optimizer get-ec2-instance-recommendations \
    --account-ids 123456789012 \
    --filters name=Finding,values=Underprovisioned,Overprovisioned
→ currentInstanceType: m5.xlarge
→ recommendation: m7g.large (estimated savings 34%)
$ aws compute-optimizer get-lambda-function-recommendations \
    --function-arns arn:aws:lambda:us-east-1:123456789012:function:order-processor
→ memory 1024 MB
→ recommendation: 512 MB (no perf impact, 50% cost reduction)
Stripe and Netflix run mixed On-Demand + Spot fleets with aggressive auto-scaling — baseline covered by Savings Plans, burst handled by Spot with interruption handling. Platform teams review Compute Optimizer exports monthly; any recommendation with >30% savings and low migration risk goes into the next sprint as a standard change.
A common mistake is buying a 3-year Compute Savings Plan sized to peak holiday traffic. Commit only to minimum steady-state spend: the portion that runs 24/7 year-round. Burst and seasonal peaks should stay On-Demand or Spot. Over-committing means paying for unused capacity for three years; under-committing means leaving 40%+ on the table for your baseline fleet.
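The over-commit trap shows up clearly when a commitment is replayed against hourly usage. In this sketch the commitment is charged every hour whether used or not, and usage beyond what it covers is billed On-Demand. The 30% discount and the one-day usage profile are illustrative assumptions.

```python
def commitment_outcome(hourly_on_demand_usd, commit_per_hour, sp_discount):
    """Replay a $/hr commitment against hourly usage priced at On-Demand rates.

    Returns (total_cost, unused_commit), both in USD.
    """
    if not 0 <= sp_discount < 1:
        raise ValueError("sp_discount must be in [0, 1)")
    # A commitment of C $/hr absorbs C / (1 - discount) of On-Demand-priced usage.
    covered_capacity = commit_per_hour / (1 - sp_discount)
    total_cost = 0.0
    unused_commit = 0.0
    for usage in hourly_on_demand_usd:
        covered = min(usage, covered_capacity)
        total_cost += commit_per_hour + (usage - covered)
        unused_commit += commit_per_hour - covered * (1 - sp_discount)
    return total_cost, unused_commit


if __name__ == "__main__":
    # Hypothetical day: $10/hr baseline with a 4-hour peak at $30/hr.
    day = [10.0] * 20 + [30.0] * 4
    for label, commit in (("none", 0.0), ("baseline", 7.0), ("peak", 21.0)):
        cost, unused = commitment_outcome(day, commit, sp_discount=0.30)
        print(f"commit={label:<8} cost=${cost:7.2f} unused=${unused:7.2f}")
```

With these inputs the baseline-sized commitment is the cheapest option, while the peak-sized one costs more than having no commitment at all.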
Storage, network & database costs
After compute, the next largest line items are usually S3 storage accumulation, NAT Gateway fees, and database instance hours. These costs are almost always architectural — fixable with lifecycle policies, VPC endpoints, and capacity mode choices rather than negotiating with AWS.
S3 lifecycle & Intelligent-Tiering
| Storage class | Use case | Price (us-east-1 per GB/mo) | Retrieval |
|---|---|---|---|
| Standard | Hot data, frequent access | ~$0.023 | Instant |
| Intelligent-Tiering | Unknown or changing access patterns | ~$0.023 + monitoring fee per object | Auto-moves to IA/Archive after 30/90 days idle |
| Standard-IA | Infrequent access, milliseconds retrieval | ~$0.0125 | Min 128 KB bill + retrieval fee |
| Glacier Instant | Archive with instant retrieval | ~$0.004 | Instant; higher retrieval $/GB |
| Glacier Deep Archive | Compliance retention 7+ years | ~$0.00099 | 12–48 hour retrieval |
Lifecycle rules transition objects by prefix and age — e.g. move logs/ to Glacier after 90 days, expire temp/ after 7 days. Intelligent-Tiering removes manual tier decisions for buckets with mixed access patterns (user uploads, ML training sets).
# AWS CLI
aws s3api put-bucket-lifecycle-configuration --bucket my-app-logs \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "archive-old-logs",
      "Status": "Enabled",
      "Filter": { "Prefix": "app/" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }]
  }'
# Terraform
resource "aws_s3_bucket_lifecycle_configuration" "logs" {
  bucket = aws_s3_bucket.logs.id
  rule {
    id     = "archive-old-logs"
    status = "Enabled"
    filter { prefix = "app/" }
    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }
    transition {
      days          = 90
      storage_class = "GLACIER"
    }
    expiration { days = 365 }
  }
}
// AWS CDK (TypeScript); `bucket` is an existing s3.Bucket defined in your stack
import * as cdk from 'aws-cdk-lib';
import * as s3 from 'aws-cdk-lib/aws-s3';
bucket.addLifecycleRule({
  id: 'archive-old-logs',
  prefix: 'app/',
  transitions: [
    { storageClass: s3.StorageClass.INFREQUENT_ACCESS, transitionAfter: cdk.Duration.days(30) },
    { storageClass: s3.StorageClass.GLACIER, transitionAfter: cdk.Duration.days(90) },
  ],
  expiration: cdk.Duration.days(365),
});
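To see what the 30/90/365-day rule is worth, the steady-state storage bill for a constant daily log volume can be estimated as below. Standard and Standard-IA prices come from the table above; the Glacier Flexible Retrieval price is an assumption, and the sketch ignores transition request fees, retrieval fees, and minimum object size or duration charges.

```python
# USD per GB-month. GLACIER (Flexible Retrieval) is an assumed figure.
PRICES = {"STANDARD": 0.023, "STANDARD_IA": 0.0125, "GLACIER": 0.0036}


def steady_state_monthly_cost(gb_per_day, transitions, expire_days, prices=PRICES):
    """Monthly storage cost once the pipeline has run longer than expire_days.

    transitions: list of (age_in_days, storage_class); objects start in STANDARD.
    """
    boundaries = [(0, "STANDARD")] + sorted(transitions)
    cost = 0.0
    for index, (start, storage_class) in enumerate(boundaries):
        if index + 1 < len(boundaries):
            end = boundaries[index + 1][0]
        else:
            end = expire_days
        days_in_class = max(0, min(end, expire_days) - start)
        # gb_per_day * days_in_class is the volume resident in this class.
        cost += gb_per_day * days_in_class * prices[storage_class]
    return cost


if __name__ == "__main__":
    rule = [(30, "STANDARD_IA"), (90, "GLACIER")]
    with_rule = steady_state_monthly_cost(10, rule, expire_days=365)
    without_rule = steady_state_monthly_cost(10, [], expire_days=365)
    print(f"with lifecycle rule:    ${with_rule:.2f}/month")
    print(f"Standard only, 365 days: ${without_rule:.2f}/month")
```

The gap grows linearly with daily volume, which is why log buckets are usually the first place a lifecycle rule pays off.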
EBS cleanup
Detached EBS volumes and old snapshots are silent budget killers — they bill at full storage rates forever until deleted. A 500 GB gp3 volume costs ~$40/month doing nothing.
- Run weekly reports: aws ec2 describe-volumes --filters Name=status,Values=available
- Enable DLM (Data Lifecycle Manager) for automated snapshot retention — keep 7 daily, 4 weekly, delete older
- Deregister AMIs you no longer deploy from, then delete their backing snapshots; a snapshot cannot be deleted while its AMI is registered, and deregistering does not remove it by default
- Prefer gp3 over gp2 — decouple IOPS from volume size; right-size baseline 3,000 IOPS
- Use AWS Config rule ec2-volume-inuse-check to flag unattached volumes
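The weekly report from the first bullet becomes more persuasive with a dollar figure attached. The sketch below sums the monthly cost of unattached volumes from records shaped like the `Volumes` entries that `describe-volumes` returns (`VolumeId`, `Size`, `State`, `VolumeType`). The gp3 price matches the ~$40 per 500 GB figure above; the gp2 price is an assumption.

```python
# USD per GB-month; gp2 is an assumed figure, verify for your region.
PRICE_PER_GB_MONTH = {"gp3": 0.08, "gp2": 0.10}


def idle_volume_cost(volumes, prices=PRICE_PER_GB_MONTH):
    """Return (monthly_cost, volume_ids) for volumes in the 'available' state."""
    total = 0.0
    idle_ids = []
    for volume in volumes:
        if volume["State"] != "available":
            continue  # attached volumes are not waste
        idle_ids.append(volume["VolumeId"])
        total += volume["Size"] * prices.get(volume["VolumeType"], 0.0)
    return total, idle_ids


if __name__ == "__main__":
    # Hypothetical inventory for illustration.
    sample = [
        {"VolumeId": "vol-a", "Size": 500, "State": "available", "VolumeType": "gp3"},
        {"VolumeId": "vol-b", "Size": 100, "State": "in-use", "VolumeType": "gp3"},
        {"VolumeId": "vol-c", "Size": 200, "State": "available", "VolumeType": "gp2"},
    ]
    cost, ids = idle_volume_cost(sample)
    print(f"{len(ids)} idle volumes costing ${cost:.2f}/month: {', '.join(ids)}")
```

Volume types missing from the price map contribute zero, so extend the map before trusting the total on a mixed fleet.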
The NAT Gateway surprise
NAT Gateway charges hourly per AZ plus per-GB processed — even for traffic destined to S3 or DynamoDB that could use free gateway endpoints. This is the most common "why is our AWS bill so high?" finding for teams new to VPC networking.
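NAT Gateway pricing (us-east-1): $0.045/hour (~$32.85/month per NAT) plus $0.045/GB processed. A three-AZ production VPC with three NAT Gateways and 2 TB/month outbound costs roughly $190/month on NAT alone (3 × $32.85 hourly plus about $90 for processing), before the $0.09/GB internet egress on the same traffic and before compute.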
Mitigations:
- Gateway endpoints for S3 and DynamoDB — free, bypass NAT entirely
- Interface endpoints for ECR, Secrets Manager, CloudWatch Logs — hourly cost but often cheaper than NAT GB charges at scale
- Single NAT in dev/staging; NAT per AZ only in production
- VPC Flow Logs + Athena to identify NAT-heavy subnets
Use the interactive calculator on the VPC & Networking → Internet & NAT page (#awsc-nat-calc) to model hourly + data processing for your traffic profile.
A common mistake is routing all private subnet traffic through NAT when 40–60% of it is AWS service calls (S3, ECR, DynamoDB, CloudWatch). Without gateway endpoints, you pay NAT processing fees on traffic that could be free. After adding endpoints, verify route tables: an S3 gateway endpoint adds a prefix-list route only to the route tables you associate it with, so subnets on other route tables still send that traffic through NAT.
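The NAT arithmetic, including the effect of moving traffic onto gateway endpoints, fits in a few lines. This sketch uses the us-east-1 rates quoted above and assumes 730 hours per month; `endpoint_offload` is an illustrative parameter for the share of traffic that bypasses NAT.

```python
NAT_HOURLY_USD = 0.045   # per gateway-hour (us-east-1 figure from above)
NAT_PER_GB_USD = 0.045   # per GB processed
HOURS_PER_MONTH = 730    # assumed average month


def nat_monthly_cost(gateways, gb_processed, endpoint_offload=0.0):
    """Monthly NAT Gateway cost; endpoint_offload is the share of GB moved off NAT."""
    if not 0 <= endpoint_offload <= 1:
        raise ValueError("endpoint_offload must be in [0, 1]")
    hourly = gateways * NAT_HOURLY_USD * HOURS_PER_MONTH
    processing = gb_processed * (1 - endpoint_offload) * NAT_PER_GB_USD
    return hourly + processing


if __name__ == "__main__":
    base = nat_monthly_cost(gateways=3, gb_processed=2000)
    with_endpoints = nat_monthly_cost(gateways=3, gb_processed=2000, endpoint_offload=0.5)
    print(f"3 NATs, 2 TB/month:          ${base:.2f}")
    print(f"same, 50% via gateway endpoints: ${with_endpoints:.2f}")
```

Note that the hourly component is fixed per gateway, so endpoints reduce only the processing share; consolidating gateways in non-production is what cuts the fixed part.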
CloudFront egress savings
Serving content directly from S3 to global users incurs S3 data transfer-out charges from the origin region. CloudFront caches at edge locations — repeat requests hit the edge, not your origin. PriceClass_100 (US/Europe only) reduces cost for US-centric apps.
| Pattern | Egress path | Cost profile |
|---|---|---|
| S3 presigned URL direct | Client → S3 origin region | S3 egress $0.09/GB + no caching |
| CloudFront + S3 OAI/OAC | Client → edge → S3 (first request only) | CloudFront egress (tiered, often lower) + S3→CloudFront transfer free |
| API Gateway + CloudFront | Cache GET responses at edge | Reduces origin Lambda/ALB invocations + transfer |
Set CloudFront cache policies with long TTLs for immutable assets (hashed filenames in /static/). Use Cache-Control: max-age=31536000, immutable from your Spring Boot resource handler. Origin requests drop 90%+ for typical SPAs — egress and origin load both fall.
Aurora Serverless v2
Aurora Serverless v2 scales ACUs (Aurora Capacity Units) automatically between a min and max you define. Pay per ACU-second — ideal for dev/staging databases that idle nights/weekends, and prod workloads with predictable baseline but occasional spikes. Not a replacement for provisioned Aurora when you need sustained high throughput at lowest $/query — benchmark both.
| Mode | Billing | When to choose |
|---|---|---|
| Provisioned Aurora | Instance hours + storage + I/O (or I/O-Optimized flat rate) | Steady production traffic, known capacity needs, lowest latency SLA |
| Aurora Serverless v2 | ACU-hours (min 0.5 ACU ≈ 1 GiB RAM scale) | Variable load, multi-tenant SaaS, dev/staging auto-pause patterns |
| RDS Proxy + either | Proxy hourly + underlying DB costs | Connection pooling for Lambda/serverless — reduces DB instance size needed |
DynamoDB: on-demand vs provisioned
| Capacity mode | Billing | Best for | Cost trap |
|---|---|---|---|
| On-demand | Per-request pricing (RCU/WCU equivalent) | Unpredictable traffic, new products, spiky workloads <20% utilization of provisioned | Steady high throughput — 3–5× more expensive than well-provisioned |
| Provisioned | Hourly RCU/WCU capacity units | Predictable traffic, production tables with known QPS | Over-provisioning idle capacity; throttling if under-provisioned |
| Provisioned + Auto Scaling | Provisioned base + scaling within bounds | Production default — predictable baseline with headroom | Scaling lag during sudden spikes — pair with on-demand burst or DAX cache |
DynamoDB on-demand vs provisioned: switch to provisioned when the capacity you would have to provision stays more than ~20% utilized on average; below that, idle provisioned capacity costs more than per-request pricing. To compare, tag each table and group DynamoDB spend in Cost Explorer by that tag and by usage type. For tables with diurnal patterns (high day, low night), provisioned + scheduled scaling beats on-demand; for viral/unpredictable spikes, on-demand avoids throttling incidents.
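The break-even utilization can be derived rather than memorized: one provisioned capacity unit serves one request per second, so its cost per million requests depends on how busy it is. The prices below are placeholders, not quoted AWS rates; substitute the current figures for your region, since the resulting threshold shifts with them.

```python
SECONDS_PER_HOUR = 3600

# Placeholder prices for illustration only; look up current values.
WCU_HOUR_USD = 0.00065                 # provisioned write capacity unit, per hour
ON_DEMAND_PER_MILLION_WRITES = 0.625   # on-demand write request units, per million


def provisioned_cost_per_million(unit_hour_price, utilization):
    """Effective USD per million requests for provisioned capacity at a given utilization."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    requests_per_hour = SECONDS_PER_HOUR * utilization
    return unit_hour_price / requests_per_hour * 1_000_000


def break_even_utilization(unit_hour_price, on_demand_per_million):
    """Utilization above which provisioned capacity is cheaper than on-demand."""
    full_utilization_cost = provisioned_cost_per_million(unit_hour_price, 1.0)
    return full_utilization_cost / on_demand_per_million


if __name__ == "__main__":
    threshold = break_even_utilization(WCU_HOUR_USD, ON_DEMAND_PER_MILLION_WRITES)
    print(f"break-even utilization: {threshold:.1%}")
    for utilization in (0.1, 0.3, 0.7):
        cost = provisioned_cost_per_million(WCU_HOUR_USD, utilization)
        print(f"  at {utilization:.0%} utilization: ${cost:.3f} per 1M writes")
```

Because auto scaling rarely holds utilization constant, compare against your average utilization over a full traffic cycle rather than the peak.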
Cost monitoring & alerting
Optimization without observability is guesswork. AWS provides native tools from free dashboards to ML-powered anomaly detection — wire them into Slack/PagerDuty before the first five-figure surprise invoice.
AWS Budgets
Budgets let you set monthly, quarterly, or annual spend thresholds at account, linked account, service, or tag level. Alerts fire at configurable percentages (50%, 80%, 100%, forecasted 100%). Budgets are reactive alerts — they don't stop spend (use SCPs for hard blocks).
| Budget type | Tracks | Typical alert threshold |
|---|---|---|
| Cost budget | Actual spend vs limit | 80% actual, 100% forecasted |
| Usage budget | Service usage quantity (e.g. EC2 hours) | 90% of reserved capacity |
| RI/SP utilization | Commitment coverage | <80% utilization — you're wasting commit |
| RI/SP coverage | On-Demand vs covered hours | <70% coverage — buying more SP may help |
# AWS CLI (finops@example.com is a placeholder; use your team's address)
aws budgets create-budget --account-id 123456789012 --budget '{
  "BudgetName": "monthly-prod-cap",
  "BudgetLimit": { "Amount": "15000", "Unit": "USD" },
  "TimeUnit": "MONTHLY",
  "BudgetType": "COST",
  "CostFilters": {
    "TagKeyValue": ["user:Environment$production"]
  }
}' --notifications-with-subscribers '[{
  "Notification": {
    "NotificationType": "FORECASTED",
    "ComparisonOperator": "GREATER_THAN",
    "Threshold": 100
  },
  "Subscribers": [{ "SubscriptionType": "EMAIL", "Address": "finops@example.com" }]
}]'
# Terraform (finops@example.com is a placeholder; use your team's address)
resource "aws_budgets_budget" "prod_monthly" {
  name         = "monthly-prod-cap"
  budget_type  = "COST"
  limit_amount = "15000"
  limit_unit   = "USD"
  time_unit    = "MONTHLY"
  cost_filter {
    name   = "TagKeyValue"
    values = ["user:Environment$production"]
  }
  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 80
    threshold_type             = "PERCENTAGE"
    notification_type          = "ACTUAL"
    subscriber_email_addresses = ["finops@example.com"]
  }
}
// AWS CDK (TypeScript); runs inside a Stack or Construct, where `this` is the scope.
// finops@example.com is a placeholder; use your team's address.
import * as budgets from 'aws-cdk-lib/aws-budgets';
new budgets.CfnBudget(this, 'ProdMonthlyCap', {
  budget: {
    budgetName: 'monthly-prod-cap',
    budgetType: 'COST',
    timeUnit: 'MONTHLY',
    budgetLimit: { amount: 15000, unit: 'USD' },
    costFilters: { TagKeyValue: ['user:Environment$production'] },
  },
  notificationsWithSubscribers: [{
    notification: {
      notificationType: 'FORECASTED',
      comparisonOperator: 'GREATER_THAN',
      threshold: 100,
    },
    subscribers: [{ subscriptionType: 'EMAIL', address: 'finops@example.com' }],
  }],
});
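The difference between ACTUAL and FORECASTED notifications is easiest to see with numbers. The sketch below evaluates both against the $15,000 budget used above. It projects month-end spend with a naive linear run rate, which is an assumption made for illustration; AWS Budgets uses its own forecasting model.

```python
def budget_alerts(limit, spend_to_date, day_of_month, days_in_month,
                  actual_pct=80.0, forecast_pct=100.0):
    """Return (forecast, alerts) for a monthly cost budget.

    forecast is a linear projection of spend_to_date to month end.
    alerts lists which notification types would exceed their threshold.
    """
    if not 1 <= day_of_month <= days_in_month:
        raise ValueError("day_of_month must be between 1 and days_in_month")
    forecast = spend_to_date / day_of_month * days_in_month
    alerts = []
    if spend_to_date > limit * actual_pct / 100:
        alerts.append("ACTUAL")
    if forecast > limit * forecast_pct / 100:
        alerts.append("FORECASTED")
    return forecast, alerts


if __name__ == "__main__":
    forecast, alerts = budget_alerts(limit=15_000, spend_to_date=9_000,
                                     day_of_month=15, days_in_month=30)
    print(f"projected month-end spend: ${forecast:,.0f}, alerts: {alerts}")
```

In this example the forecasted alert fires mid-month while actual spend is still under 80%, which is the argument for configuring both notification types.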
Cost Explorer
Cost Explorer is the primary analysis UI — group by service, linked account, tag, usage type, or API operation. Enable 13-month retention (free). Saved reports and monthly PDF exports feed finance reviews.
- Monthly cost by service — find the top 3 services; drill into usage type
- Cost by tag (Team, Environment) — requires activated allocation tags
- Net Amortized cost — includes RI/SP benefit spread across covered usage
- Right-sizing recommendations — EC2 instances with low utilization
- Savings Plans recommendations — suggested commit $/hr based on 7/30/60-day history
Create a weekly Cost Explorer saved report filtered to Usage Type Group = Data Transfer for manual review, and schedule a Lambda with EventBridge that runs the same query through the Cost Explorer API and posts the top movers to Slack. Spikes in NatGateway-Bytes or DataTransfer-Regional-Bytes (inter-AZ traffic) are actionable within hours, not at month-end close.
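The "top movers" step of such a job is a small pure function once the two periods of cost data are in hand. This sketch assumes each period is a dictionary of usage type to cost in USD, however you obtained it; the sample figures are invented.

```python
def top_movers(previous, current, n=3):
    """Return the n usage types with the largest absolute cost change.

    Each result is (usage_type, delta), where delta = current - previous.
    Ties are broken alphabetically so the output is deterministic.
    """
    usage_types = set(previous) | set(current)
    deltas = [(key, current.get(key, 0) - previous.get(key, 0)) for key in usage_types]
    deltas.sort(key=lambda item: (-abs(item[1]), item[0]))
    return deltas[:n]


if __name__ == "__main__":
    # Hypothetical weekly totals in USD.
    last_week = {"NatGateway-Bytes": 120, "DataTransfer-Regional-Bytes": 80,
                 "DataTransfer-Out-Bytes": 300}
    this_week = {"NatGateway-Bytes": 410, "DataTransfer-Regional-Bytes": 85,
                 "DataTransfer-Out-Bytes": 260, "NatGateway-Hours": 23}
    for usage_type, delta in top_movers(last_week, this_week):
        print(f"{usage_type:<30} {delta:+d} USD")
```

Ranking by absolute change surfaces both new spend and unexpected drops, which can indicate a broken pipeline as often as a saving.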
Cost Anomaly Detection
Cost Anomaly Detection is an ML service that learns your normal spend patterns and alerts on anomalies, e.g. a misconfigured S3 sync loop, a crypto-mining instance, or a forgotten load test. Create monitor scopes (entire account, specific service, tag-based) and attach email or SNS/Chatbot subscriptions. The service itself carries no additional charge; you pay only for the notifications it sends through SNS.
# AWS CLI
aws ce create-anomaly-monitor --anomaly-monitor '{
  "MonitorName": "prod-ec2-anomalies",
  "MonitorType": "DIMENSIONAL",
  "MonitorDimension": "SERVICE"
}'
# SNS subscribers require IMMEDIATE frequency; DAILY and WEEKLY are email-only
aws ce create-anomaly-subscription --anomaly-subscription '{
  "SubscriptionName": "platform-slack-alerts",
  "Threshold": 50.0,
  "Frequency": "IMMEDIATE",
  "MonitorArnList": ["arn:aws:ce::123456789012:anomalymonitor/..."],
  "Subscribers": [{ "Type": "SNS", "Address": "arn:aws:sns:us-east-1:123456789012:cost-alerts" }]
}'
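For intuition about what a dollar-impact threshold like the 50.0 above means, here is a deliberately simple stand-in: flag any day whose cost exceeds the trailing average by at least that amount. This is not how the AWS service works internally (its model is not reproduced here); the window length and the sample series are assumptions.

```python
from statistics import mean


def flag_anomalies(daily_costs, window=7, min_impact=50.0):
    """Flag days whose cost exceeds the trailing-window average by min_impact USD.

    Returns a list of (day_index, impact) tuples.
    """
    flagged = []
    for index in range(window, len(daily_costs)):
        baseline = mean(daily_costs[index - window:index])
        impact = daily_costs[index] - baseline
        if impact >= min_impact:
            flagged.append((index, impact))
    return flagged


if __name__ == "__main__":
    # Hypothetical daily spend: a stable week, a small bump, then a spike.
    series = [100, 100, 100, 100, 100, 100, 100, 105, 180]
    for day, impact in flag_anomalies(series):
        print(f"day {day}: ${impact:.2f} above the trailing average")
```

A fixed window like this misses weekly seasonality and slow drifts, which is the gap the managed service is meant to close.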
AWS Trusted Advisor
Trusted Advisor scans your account against AWS best practices across several categories, including Cost Optimization. Checks include idle load balancers, underutilized EBS/RDS/EC2, unassociated Elastic IPs, and S3 bucket versioning without lifecycle. A core set of security and service-quota checks is free; the full set, including the cost checks, requires a Business or Enterprise Support plan.
| Check category | Example cost checks | Plan required |
|---|---|---|
| Cost Optimization | Idle RDS, low-utilization EC2, overprovisioned EBS | Business+ |
| Performance | CloudFront optimization, EC2 to EBS throughput | Full: Business+ |
| Security | Public S3 buckets, open security groups | Basic free; expanded Business+ |
| Fault Tolerance | Multi-AZ RDS, ELB across AZs | Full: Business+ |
Recommended monitoring stack
- Cost Explorer — weekly saved reports by service and tag; net amortized view for unit economics
- Budgets — per-environment caps at 80%/100% actual and forecasted; SNS → Slack
- Anomaly Detection — ML alerts on unexpected spikes before month-end close
- Allocation tags — Environment + Team + Service on every resource; SCP deny if missing
- CUR + Athena — hourly line-item billing for engineering-level dashboards
- Trusted Advisor — monthly idle-resource review; automate remediation with Config + Lambda
FinOps Foundation practitioners at scale run a weekly "cost standup" — 30 minutes reviewing Cost Explorer deltas, open Trusted Advisor findings, and Compute Optimizer exports. Platform teams own the tooling; product teams own tag compliance. Anomaly Detection caught a misconfigured cross-region S3 replication loop at $12k/month run rate within 48 hours at multiple companies — Budgets alone would have triggered only at month-end forecast.
Budgets alert; SCPs block. Cost Anomaly Detection uses ML — no manual thresholds per service. Trusted Advisor cost checks require Business Support for the full set. For organization-wide consolidated billing and RI/SP sharing, answer AWS Organizations. CUR delivers hourly granularity; Cost Explorer is daily aggregates — choose CUR for custom Athena dashboards.
Restrict ce:* and budgets:* to finance/platform roles, because billing data reveals architecture details. Deliver CUR to an S3 bucket in a dedicated security/finance account, with a bucket policy that denies non-finance principals. Enable MFA on the root user and on every principal with billing access.