IAM: Identity & Access Management

IAM fundamentals

Everything in AWS is an API call. IAM decides whether that call is allowed. There is no separate "login server" per service — one IAM engine evaluates all requests, whether they come from the console, CLI, SDK, Terraform, or CDK.

Everything is an API call

When you click "Create bucket" in the console, the browser sends a signed CreateBucket request to S3, and IAM authorizes it as s3:CreateBucket. When your Spring Boot app on ECS reads a secret, the task role credentials sign secretsmanager:GetSecretValue. When Terraform refreshes an EC2 instance, the AWS provider calls DescribeInstances, so the IAM user or role behind the provider must be allowed ec2:DescribeInstances. Same engine, same evaluation order, every time.

IAM is global

IAM users, groups, roles, and policies exist once per account, not per region. You create a role once and attach it to EC2 instances in any region. Policies can still restrict actions by region through condition keys such as aws:RequestedRegion, but the IAM entities themselves are global. Route 53 and CloudFront are also global services; most others are regional.

Core components

Component	What it is	Production guidance
Users	Long-term identity with optional access keys and console password	Avoid for humans — use SSO. OK for legacy CI with rotation; prefer OIDC roles
Groups	Collection of users; policies attach to the group	Map SSO groups to permission sets instead of IAM groups when possible
Roles	Temporary credentials via sts:AssumeRole	Default for everything — EC2, ECS, Lambda, cross-account, CI/CD
Policies	JSON documents: Effect, Action, Resource, Condition	Customer-managed for app-specific; AWS managed to bootstrap, then narrow
Identity providers	SAML 2.0 / OIDC trust for federation	Corporate IdP → IAM Identity Center; GitHub → OIDC provider for Actions

🔬 Under the Hood

IAM is a global service: its control plane runs in a single region per partition (us-east-1 for the commercial partition), and policy data is replicated to every enabled region, where each service endpoint evaluates requests locally. When you assume a role, STS returns temporary credentials (access key, secret key, and session token) valid for 15 minutes up to the role's maximum session duration (at most 12 hours). The session token is what marks them as temporary; without it, the access key and secret are rejected.

💰 Cost

IAM itself is free. You pay nothing for users, roles, or policies, and IAM Identity Center carries no additional charge. Costs appear indirectly: overly broad s3:* policies enable data exfiltration; missing SCPs enable crypto-mining in rogue regions. IAM Access Analyzer is free for external access findings and policy validation; unused access findings and custom policy checks are billed separately.

Policy evaluation logic

AWS evaluates policies in a fixed order. One explicit Deny anywhere overrides any number of Allow statements. If nothing allows the action, the default is implicit deny — the request fails.

flowchart TD
  START["Incoming API request\n(principal + action + resource)"]
  START --> EXPLICIT{"Explicit Deny\nin any policy?"}
  EXPLICIT -->|Yes| DENY["❌ DENY"]
  EXPLICIT -->|No| SCP{"SCP allows?\n(org guardrail)"}
  SCP -->|No| DENY
  SCP -->|Yes| RP{"Resource-based policy\nallows?"}
  RP -->|Explicit Deny| DENY
  RP -->|Allow| ALLOW["✅ ALLOW"]
  RP -->|No match| ID{"Identity-based policies\n(user/role/group) allow?"}
  ID -->|Explicit Deny| DENY
  ID -->|Allow| PB{"Permission boundary\nallows?"}
  ID -->|No Allow| IMPLICIT["❌ Implicit DENY"]
  PB -->|No| DENY
  PB -->|Yes| SP{"Session policy\n(AssumeRole) allows?"}
  SP -->|No| DENY
  SP -->|Yes| ALLOW

Evaluation order (simplified — exam-critical):

1. Explicit Deny: any policy, any type → immediate deny
2. SCPs: organization guardrails; cannot grant permissions, only deny or pass-through
3. Resource-based policies: S3 bucket policy, KMS key policy, SQS queue policy, Lambda resource policy
4. Identity-based policies: attached to user, group, or role
5. Permission boundaries: cap maximum permissions even if identity policies allow more
6. Session policies: passed during AssumeRole to further restrict the session
7. Implicit deny: default if no Allow matched
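
The ordering above can be sketched as a small function. This is a simplified model under stated assumptions, not AWS's actual engine: each policy layer is already reduced to "allow", "deny" (explicit), or "none"; the request stays within one account; and the names LayerOutcome, LayeredRequest, and evaluateRequest are made up for illustration.

```typescript
// Outcome of one policy layer for a single request. "deny" means an explicit Deny.
type LayerOutcome = "allow" | "deny" | "none";

// Hypothetical, pre-digested view of a request (not an AWS SDK type).
interface LayeredRequest {
  scp: LayerOutcome;            // "none" = the SCPs do not allow the action
  resourcePolicy: LayerOutcome;
  identityPolicy: LayerOutcome;
  boundary?: LayerOutcome;      // undefined = no permission boundary attached
  sessionPolicy?: LayerOutcome; // undefined = no session policy passed
}

type EvalResult = "ALLOW" | "EXPLICIT_DENY" | "IMPLICIT_DENY";

function evaluateRequest(req: LayeredRequest): EvalResult {
  const layers = [req.scp, req.resourcePolicy, req.identityPolicy, req.boundary, req.sessionPolicy];
  // 1. An explicit Deny in any layer wins over every Allow.
  if (layers.some((layer) => layer === "deny")) return "EXPLICIT_DENY";
  // 2. SCPs never grant, but they must let the action through.
  if (req.scp !== "allow") return "IMPLICIT_DENY";
  // 3. Same-account assumption: a resource-based Allow is sufficient on its own.
  if (req.resourcePolicy === "allow") return "ALLOW";
  // 4. Otherwise an identity-based policy must allow the action.
  if (req.identityPolicy !== "allow") return "IMPLICIT_DENY";
  // 5 and 6. Boundary and session policy only narrow; check them when present.
  if (req.boundary !== undefined && req.boundary !== "allow") return "IMPLICIT_DENY";
  if (req.sessionPolicy !== undefined && req.sessionPolicy !== "allow") return "IMPLICIT_DENY";
  return "ALLOW";
}
```

In this model, an identity policy Allow combined with a boundary that does not cover the action yields IMPLICIT_DENY, while a Deny in any layer yields EXPLICIT_DENY regardless of the others.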

🎯 Exam Tip

SCPs never grant permissions; they only filter what identity policies can grant. A common trap: "We attached PowerUserAccess SCP to the OU". SCPs don't work that way. Also: within a single account, a resource-based policy can grant access on its own, with no identity policy. For cross-account access (for example, an S3 bucket policy allowing another account's role), both sides must allow the action: the resource policy and the caller's identity policy.

⚠️ Pitfall

Attaching AdministratorAccess to a role "temporarily for debugging" and forgetting to remove it. Combined with a permissive trust policy ("Principal": {"AWS": "*"}), you've created a persistent backdoor. Use aws sts assume-role with MFA and time-bounded break-glass roles instead.

IAM policy deep dive

Policies are JSON. Master the structure once and you can read any AWS permission document — IAM policies, S3 bucket policies, KMS key policies, and SCPs all share the same grammar.

Policy structure

{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "AllowReadOwnBucketPrefix",
    "Effect": "Allow",
    "Action": ["s3:GetObject", "s3:ListBucket"],
    "Resource": [
      "arn:aws:s3:::my-app-artifacts",
      "arn:aws:s3:::my-app-artifacts/${aws:username}/*"
    ],
    "Condition": {
      "Bool": { "aws:SecureTransport": "true" },
      "StringEquals": { "aws:RequestedRegion": "eu-west-1" }
    }
  }]
}

Element	Purpose
Version	Always 2012-10-17 for new policies
Effect	Allow or Deny
Action	API operations — service:Operation or wildcards
Resource	ARN(s) the action applies to; required in identity-based policies, so use "*" for actions that don't support resource-level permissions
Condition	Optional constraints — IP, MFA, tags, encryption headers, source VPC
Principal	Only in resource-based policies (including role trust policies): who is allowed or denied

Wildcards — when to use and when to avoid

s3:Get* — acceptable during prototyping; replace with explicit actions in production
s3:* on arn:aws:s3:::* — almost never acceptable for application roles
ec2:Describe* — common for read-only ops roles; low risk (no mutations)
? — single-character wildcard in some resource patterns; rarely needed
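
To make the wildcard rules concrete, here is a minimal sketch of how a pattern such as s3:Get* can be matched against an action name. The helper names iamPatternToRegExp and actionMatches are invented for this example; the sketch covers only action names, which are case-insensitive, and not ARNs.

```typescript
// Convert an IAM-style action pattern ("s3:Get*", "ec2:Describe*") to a RegExp.
// "*" matches any run of characters, "?" matches exactly one character.
function iamPatternToRegExp(pattern: string): RegExp {
  // Escape regex metacharacters, leaving the two IAM wildcards untouched.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const body = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp("^" + body + "$", "i"); // action names are case-insensitive
}

function actionMatches(pattern: string, action: string): boolean {
  return iamPatternToRegExp(pattern).test(action);
}
```

Under this sketch, s3:* matches s3:DeleteBucket just as readily as s3:GetObject, which is why broad patterns are risky on application roles.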

Condition operators and keys

Operator	Use case	Example key
StringEquals	Exact match	aws:RequestedRegion, aws:PrincipalTag/Team
StringLike	Pattern match with *	s3:prefix = uploads/${aws:username}/*
IpAddress	Restrict to CIDR	aws:SourceIp — corporate VPN only
DateLessThan	Time-bound access	aws:CurrentTime — contractor expiry
Bool	True/false flags	aws:SecureTransport = true (HTTPS only)
ArnLike	ARN pattern	aws:SourceArn — Lambda can only be invoked by this SNS topic
NumericLessThan	Thresholds	s3:max-keys — limit list operations
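
A point the table does not show is how several conditions combine: every operator and key in a Condition block must match (AND), while multiple values listed for one key are alternatives (OR). The sketch below illustrates that rule for four of the operators above. The names ContextValues, ConditionBlock, comparators, and conditionMatches are hypothetical, and the model ignores modifiers such as IfExists.

```typescript
type ContextValues = Record<string, string>;
type ConditionBlock = Record<string, Record<string, string | string[]>>;

// Per-operator comparison; only a handful of operators are modelled here.
const comparators: Record<string, (actual: string, expected: string) => boolean> = {
  StringEquals: (actual, expected) => actual === expected,
  Bool: (actual, expected) => actual.toLowerCase() === expected.toLowerCase(),
  NumericLessThan: (actual, expected) => Number(actual) < Number(expected),
  DateLessThan: (actual, expected) => Date.parse(actual) < Date.parse(expected),
};

function conditionMatches(block: ConditionBlock, ctx: ContextValues): boolean {
  // Every operator and every key under it must match (AND)...
  return Object.keys(block).every((operator) => {
    const compare = comparators[operator];
    if (compare === undefined) throw new Error("Unsupported operator in this sketch: " + operator);
    const keys = block[operator];
    return Object.keys(keys).every((key) => {
      const actual = ctx[key];
      if (actual === undefined) return false; // key absent from the request: no match
      const expected = keys[key];
      const candidates = Array.isArray(expected) ? expected : [expected];
      // ...but any one of the listed values is enough (OR).
      return candidates.some((value) => compare(actual, value));
    });
  });
}
```

For instance, a block requiring aws:SecureTransport = true and aws:RequestedRegion in [eu-west-1, us-east-1] matches an HTTPS request to us-east-1 but not an HTTPS request to ap-south-1.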

High-value condition keys for backend engineers:

aws:MultiFactorAuthPresent — require MFA for sensitive operations
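aws:CalledVia: allow a request only when a named AWS service (for example CloudFormation or Athena) makes it on the principal's behalf; for confused deputy prevention in resource policies, use aws:SourceArn and aws:SourceAccount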
aws:SourceVpc / aws:SourceVpce — only from your VPC or endpoint
ec2:Region — legacy; prefer aws:RequestedRegion

Policy variables

Dynamic policies use ${aws:username}, ${aws:userid}, ${aws:PrincipalTag/key} in Resource or Condition values — one policy template for all developers instead of one policy per person.
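
As a rough illustration of the substitution, the sketch below expands placeholders in a Resource string from a map of request-context values. resolvePolicyVariables is an invented helper, and leaving unknown variables untouched is an assumption of this sketch rather than documented AWS behavior.

```typescript
// Replace ${...} placeholders with values from the request context.
function resolvePolicyVariables(template: string, ctx: Record<string, string>): string {
  return template.replace(/\$\{([^}]+)\}/g, (placeholder: string, name: string) => {
    const value = ctx[name];
    // Assumption: an unresolved variable stays as literal text, so it cannot match a real ARN.
    return value === undefined ? placeholder : value;
  });
}
```

With a context of aws:username = alice, the resource arn:aws:s3:::my-app-artifacts/${aws:username}/* resolves to arn:aws:s3:::my-app-artifacts/alice/*, so the same policy document scopes each developer to their own prefix.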

NotAction and NotResource

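NotAction matches every action except the ones listed. Combined with Deny it is a common guardrail, e.g. deny everything except s3:GetObject. Combined with Allow it is dangerous: it grants every other action, including ones AWS adds later, so prefer explicit Allow lists when possible. NotResource works the same way for resources: the statement applies to every resource except the ARNs listed.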

Inline vs managed policies

Managed policies:

AWS managed: for example AmazonS3ReadOnlyAccess; maintained by AWS, good starting point
Customer managed: your reusable policies; versioned, attachable to multiple roles, reviewable in PRs
Attach up to 10 managed policies per user/role/group by default (the quota can be raised)

Inline policies:

Embedded directly on one user/role/group; deleted when the entity is deleted
No versioning, no reuse, hard to audit across accounts
Acceptable only for generated one-off break-glass policies

Example: least-privilege S3 read/write for a Spring service role

cat > /tmp/order-service-s3-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["s3:GetObject", "s3:PutObject"],
    "Resource": "arn:aws:s3:::my-app-artifacts/orders/*",
    "Condition": { "Bool": { "aws:SecureTransport": "true" } }
  }]
}
EOF

aws iam create-policy --policy-name OrderServiceS3Access \
  --policy-document file:///tmp/order-service-s3-policy.json

aws iam attach-role-policy --role-name order-service-ecs-task \
  --policy-arn arn:aws:iam::123456789012:policy/OrderServiceS3Access

resource "aws_iam_policy" "order_service_s3" {
  name = "OrderServiceS3Access"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject", "s3:PutObject"]
      Resource = "${aws_s3_bucket.artifacts.arn}/orders/*"
      Condition = {
        Bool = { "aws:SecureTransport" = "true" }
      }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "order_service" {
  role       = aws_iam_role.ecs_task.name
  policy_arn = aws_iam_policy.order_service_s3.arn
}

import * as iam from 'aws-cdk-lib/aws-iam';
import * as s3 from 'aws-cdk-lib/aws-s3';

// Fragment from inside a Stack or Construct; taskRole is an iam.Role defined elsewhere.
const bucket = s3.Bucket.fromBucketName(this, 'Artifacts', 'my-app-artifacts');

taskRole.addToPolicy(new iam.PolicyStatement({
  effect: iam.Effect.ALLOW,
  actions: ['s3:GetObject', 's3:PutObject'],
  resources: [`${bucket.bucketArn}/orders/*`],
  conditions: { Bool: { 'aws:SecureTransport': 'true' } },
}));

🔒 Security

Enforce HTTPS on S3 with aws:SecureTransport in the identity policy and a bucket policy deny when the condition is false — defense in depth. Pair with bucket policy denying uploads without x-amz-server-side-encryption.

IAM roles — the right way to grant access

Never use long-term access keys if a role can be used. Roles provide temporary credentials automatically rotated by AWS. Every compute service on AWS is designed around roles — fighting that design means credential sprawl and 3 AM incidents.

Trust policy vs permissions policy

Every role has two documents: a trust policy (who can assume the role) and one or more permissions policies (what the role can do once assumed). Confusing them is the #1 IAM mistake — granting s3:* in the trust policy does nothing; putting ec2.amazonaws.com in a permissions policy does nothing.

Pattern	Trust principal	Typical permissions
EC2 instance profile	ec2.amazonaws.com	S3, SSM, CloudWatch — app needs on the instance
ECS task role	ecs-tasks.amazonaws.com	Per-service: RDS via proxy, SQS publish, Secrets Manager — not shared with the EC2 host role
Lambda execution role	lambda.amazonaws.com	CloudWatch Logs + exactly what the handler calls — avoid AdministratorAccess
Cross-account role	Trusted account's root or a specific role ARN, optionally with an sts:ExternalId condition	Read-only audit, shared services, CI deploy to prod account
Service-linked role	Pre-created by AWS service	Auto-managed; don't delete — breaks the service (e.g. AWSServiceRoleForECS)

ECS task role vs EC2 instance role

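On ECS with the EC2 launch type, the container instance has an instance profile used by the ECS agent (pull images from ECR, send logs). Each task gets its own task role, delivered through the container credentials endpoint that the SDK's default credential chain reads automatically. Your Spring service gets only the S3/DynamoDB permissions it needs, not the host's ECR pull permissions. On Fargate there is no instance profile: the task execution role covers image pull and log delivery, and the task role covers the application's own API calls.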

Cross-account access

Account A assumes a role in Account B: B's role trust policy allows A's principal; B's permissions policy grants the actions; A's identity policy must also allow sts:AssumeRole on the role ARN. Use an External ID when a third party assumes a role in your account. It prevents confused deputy attacks, where another customer tricks the third party into assuming your role.

Role chaining

Assume role A, then from A assume role B — maximum session duration 1 hour for chained roles. Useful for hub-and-spoke access patterns; adds latency (two STS calls). Prefer direct trust where possible.

Create an ECS task role (production pattern)

aws iam create-role --role-name order-service-task \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": { "Service": "ecs-tasks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }]
  }'

# Reference in task definition:
# "taskRoleArn": "arn:aws:iam::123456789012:role/order-service-task"

resource "aws_iam_role" "ecs_task" {
  name = "order-service-task"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ecs-tasks.amazonaws.com" }
    }]
  })
}

resource "aws_ecs_task_definition" "app" {
  family                   = "order-service"
  task_role_arn            = aws_iam_role.ecs_task.arn
  execution_role_arn       = aws_iam_role.ecs_execution.arn
  # ...
}

import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as iam from 'aws-cdk-lib/aws-iam';

const taskRole = new iam.Role(this, 'OrderServiceTaskRole', {
  assumedBy: new iam.ServicePrincipal('ecs-tasks.amazonaws.com'),
  description: 'Runtime permissions for order-service containers',
});

new ecs.FargateTaskDefinition(this, 'TaskDef', {
  taskRole,
  // executionRole for ECR pull + logs — separate, tighter role
});

📦 Real World

Netflix pioneered cross-account IAM roles for their microservices — each service assumes a role scoped to its dataset, with no long-lived keys on instances. Stripe uses strict role boundaries between PCI and non-PCI accounts. Both treat IAM roles as the only credential mechanism in production.

⚖️ Trade-off

One role per service vs shared role: shared roles simplify ops but violate least privilege when services diverge (payments needs KMS, catalog doesn't). Start with one role per ECS service / Lambda function; merge only when permissions are identical and lifecycle is identical.

Least privilege in practice

Least privilege isn't a one-time policy write — it's a continuous loop: deploy with broader AWS managed policies, observe actual usage in CloudTrail, narrow to customer-managed policies, repeat. Tools exist at every step.

The narrowing workflow

1. Start with AWS managed policy closest to need (e.g. AmazonDynamoDBFullAccess in dev only)
2. Enable CloudTrail in all regions; log to a security account
3. Run IAM Access Advisor: see last-accessed services per user/role; remove unused actions
4. Use IAM Access Analyzer: detect resources shared externally; validate policies before deploy
5. Generate policies from CloudTrail with Access Analyzer policy generator (last 90 days of actual API calls)
6. Replace with customer-managed policy; attach permission boundary for delegated admin teams

Permission boundaries

A permission boundary caps what a role/user can do even if attached policies allow more. Use case: let a product team create their own roles in CI, but cap them with a boundary that denies iam:*, organizations:*, and s3:DeleteBucket. The effective permission is the intersection of boundary and identity policies.
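
The intersection rule can be illustrated with a few lines of code. This is a deliberately small sketch: effectiveActions is a made-up helper, and it compares exact action names only, with no wildcards, conditions, or explicit Deny statements.

```typescript
// Effective permissions = actions allowed by BOTH the identity policies and the boundary.
function effectiveActions(identityAllowed: string[], boundaryAllowed: string[]): string[] {
  return identityAllowed.filter((action) => boundaryAllowed.indexOf(action) !== -1);
}
```

If a CI-created role's identity policy allows s3:GetObject, sqs:SendMessage, and iam:CreateUser, but the boundary allows only s3:GetObject, sqs:SendMessage, and dynamodb:GetItem, the role ends up with s3:GetObject and sqs:SendMessage. The boundary grants nothing by itself, so dynamodb:GetItem is not gained.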

CloudTrail → Athena: find what a role actually used

-- Run against the CloudTrail logs table in Athena.
-- eventTime is stored as an ISO 8601 string, so parse it before comparing.
SELECT eventSource, eventName, count(*) AS calls
FROM cloudtrail_logs
WHERE userIdentity.arn LIKE '%order-service-task%'
  AND from_iso8601_timestamp(eventTime) > date_add('day', -90, current_timestamp)
GROUP BY eventSource, eventName
ORDER BY calls DESC
LIMIT 50;

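$ # Simulate: can this role publish to our queue?
$ aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::123456789012:role/order-service-task \
  --action-names sqs:SendMessage \
  --resource-arns arn:aws:sqs:eu-west-1:123456789012:orders.fifo
→ EvalDecision: allowed | implicitDeny | explicitDeny
$ # Start a last-accessed report; the call returns a JobId
$ aws iam generate-service-last-accessed-details \
  --arn arn:aws:iam::123456789012:role/order-service-task
$ # Fetch the report once the job completes (JOB_ID holds the JobId returned above)
$ aws iam get-service-last-accessed-details --job-id "$JOB_ID"
$ # Lint a policy document before deploying it
$ aws accessanalyzer validate-policy --policy-type IDENTITY_POLICY \
  --policy-document file://policy.json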

💡 Pro Tip

Run aws iam simulate-principal-policy in CI when changing IAM — assert that deploy roles cannot call iam:CreateUser or s3:DeleteBucket. Treat IAM policy changes like application code: PR review + automated checks.
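
One way such a CI check might look is sketched below. It assumes the JSON output of aws iam simulate-principal-policy has already been parsed and reads only the EvalActionName and EvalDecision fields of each entry in EvaluationResults. The names SimulationOutput and findForbiddenAllowed are hypothetical.

```typescript
// The subset of the simulate-principal-policy output that this sketch reads.
interface SimulationOutput {
  EvaluationResults: { EvalActionName: string; EvalDecision: string }[];
}

// Return every forbidden action that the simulation reports as allowed.
function findForbiddenAllowed(output: SimulationOutput, forbidden: string[]): string[] {
  const banned = forbidden.map((action) => action.toLowerCase());
  return output.EvaluationResults
    .filter((result) => result.EvalDecision === "allowed")
    .filter((result) => banned.indexOf(result.EvalActionName.toLowerCase()) !== -1)
    .map((result) => result.EvalActionName);
}
```

A pipeline step could fail the build whenever the returned list is non-empty, for example when a deploy role turns out to be allowed s3:DeleteBucket.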

🎯 Exam Tip

A permission boundary limits only the user or role it is attached to, not the admin who sets it. SCPs apply to every principal in the member accounts of an OU, including each account's root user, but never to the management account or to service-linked roles. Know the difference: SCP = org-wide guardrail; boundary = per-role cap; session policy = per-assume-role session cap.

Identity federation

Humans and CI pipelines should never hold long-lived AWS access keys. Federation exchanges IdP tokens for temporary AWS credentials via STS — same roles and policies as always, but no keys to leak.

Mechanism	Use case	How it works
SAML 2.0	Corporate IdP (Okta, AD FS, Azure AD)	User logs in via IdP → SAML assertion → AssumeRoleWithSAML
OIDC	GitHub Actions, GitLab CI, Spacelift	OIDC token from provider → AssumeRoleWithWebIdentity — no stored AWS keys in GitHub
IAM Identity Center	Multi-account human access (successor to AWS SSO)	Central portal → permission sets → temporary role in target account
Cognito User Pools	App user authentication (sign-up/sign-in)	JWT for your app — not the same as AWS console access
Cognito Identity Pools	Mobile/web apps needing AWS credentials	Federated identity → temporary AWS creds scoped by IAM role mapping

GitHub Actions → AWS (OIDC) — the modern CI pattern

# 1. Create OIDC provider (once per account)
aws iam create-open-id-connect-provider \
  --url https://token.actions.githubusercontent.com \
  --client-id-list sts.amazonaws.com \
  --thumbprint-list 6938fd4d98bab03faadb97b34396831e3780aea1

# 2. Role trust policy — only repo main branch can assume
# "Condition": {
#   "StringEquals": { "token.actions.githubusercontent.com:aud": "sts.amazonaws.com" },
#   "StringLike": { "token.actions.githubusercontent.com:sub": "repo:myorg/my-app:ref:refs/heads/main" }
# }

resource "aws_iam_openid_connect_provider" "github" {
  url             = "https://token.actions.githubusercontent.com"
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = ["6938fd4d98bab03faadb97b34396831e3780aea1"]
}

resource "aws_iam_role" "github_deploy" {
  name = "github-actions-deploy"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = "sts:AssumeRoleWithWebIdentity"
      Principal = {
        Federated = aws_iam_openid_connect_provider.github.arn
      }
      Condition = {
        StringEquals = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
        }
        StringLike = {
          "token.actions.githubusercontent.com:sub" = "repo:myorg/my-app:ref:refs/heads/main"
        }
      }
    }]
  })
}

import * as iam from 'aws-cdk-lib/aws-iam';

const provider = new iam.OpenIdConnectProvider(this, 'GitHubOidc', {
  url: 'https://token.actions.githubusercontent.com',
  clientIds: ['sts.amazonaws.com'],
});

new iam.Role(this, 'GitHubDeployRole', {
  assumedBy: new iam.WebIdentityPrincipal(provider.openIdConnectProviderArn, {
    StringEquals: {
      'token.actions.githubusercontent.com:aud': 'sts.amazonaws.com',
    },
    StringLike: {
      'token.actions.githubusercontent.com:sub': 'repo:myorg/my-app:ref:refs/heads/main',
    },
  }),
  managedPolicies: [iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonEC2ContainerRegistryPowerUser')],
});

IAM Identity Center (AWS SSO)

For human access across multiple accounts: connect Okta/Azure AD once, define permission sets (templates of IAM policies), assign to groups per account. Engineers get 1–12 hour sessions via the SSO portal — no IAM users, no access keys. This is the AWS-recommended enterprise pattern and appears frequently on the SA Pro exam.

⚖️ Trade-off

IAM users vs Identity Center: IAM users scale poorly (no central offboarding, key rotation burden). Identity Center adds setup complexity but gives one place to revoke access when someone leaves. Exception: break-glass IAM user with MFA in a sealed envelope — max two, never used in normal operations.

IAM security best practices

A checklist distilled from AWS Well-Architected Security pillar, incident post-mortems, and SA exam rubrics. If you implement only this section, you'll be ahead of most production accounts.

Root account

MFA enabled (hardware key). Access keys deleted. Never used for daily work. Email goes to a group alias, not one person.

Break-glass

Separate procedure for emergency root use: documented, audited, requires two people. Credentials in physical safe.

SCPs everywhere

Deny leaving org, deny disabling CloudTrail, deny unapproved regions, deny root API calls except from break-glass IP.

CloudTrail all regions

Multi-region trail, log file validation, deliver to security account S3 with MFA delete. Alert on StopLogging.

GuardDuty on

Detects credential exfiltration, unusual API geography, crypto mining, Tor activity. Enable org-wide from admin account.

Roles not keys

EC2/ECS/Lambda use roles. CI uses OIDC. Never hardcode keys in application.properties or GitHub secrets long-term.

Credential hygiene

Rotate IAM access keys at least every 90 days if they must exist; prefer elimination over rotation
Alert on access keys unused > 45 days (Config rule or Access Analyzer)
Deny iam:CreateAccessKey for human users via SCP
Use aws:MultiFactorAuthPresent for sensitive console/API operations
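
The two age thresholds in the list above can be expressed as a small check. This is only a sketch: AccessKeyInfo and flagAccessKeys are invented names, the key metadata is assumed to have been fetched already (for example from a credential report), and a key that was never used is treated as idle since its creation date.

```typescript
// Hypothetical, already-fetched metadata about one access key.
interface AccessKeyInfo {
  id: string;
  createdAt: Date;
  lastUsedAt?: Date; // undefined = never used
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Flag keys due for rotation (age of 90 days or more) and keys idle for more than 45 days.
function flagAccessKeys(keys: AccessKeyInfo[], now: Date): { rotate: string[]; unused: string[] } {
  const rotate: string[] = [];
  const unused: string[] = [];
  for (const key of keys) {
    const ageDays = (now.getTime() - key.createdAt.getTime()) / DAY_MS;
    const idleSince = key.lastUsedAt ?? key.createdAt;
    const idleDays = (now.getTime() - idleSince.getTime()) / DAY_MS;
    if (ageDays >= 90) rotate.push(key.id);
    if (idleDays > 45) unused.push(key.id);
  }
  return { rotate, unused };
}
```

A scheduled job could feed the two lists into an alert, keeping in mind that the preferred fix for an unused key is deletion rather than rotation.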

Example SCP: deny risky actions org-wide

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyLeaveOrg",
      "Effect": "Deny",
      "Action": "organizations:LeaveOrganization",
      "Resource": "*"
    },
    {
      "Sid": "DenyDisableAudit",
      "Effect": "Deny",
      "Action": [
        "cloudtrail:DeleteTrail",
        "cloudtrail:StopLogging",
        "guardduty:DeleteDetector",
        "guardduty:DisassociateFromMasterAccount"
      ],
      "Resource": "*"
    },
    {
      "Sid": "DenyUnapprovedRegions",
      "Effect": "Deny",
      "NotAction": [
        "iam:*", "organizations:*", "route53:*", "support:*",
        "budgets:*", "ce:*", "cloudfront:*", "globalaccelerator:*"
      ],
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:RequestedRegion": ["eu-west-1", "us-east-1"]
        }
      }
    }
  ]
}
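
The DenyUnapprovedRegions statement combines NotAction with StringNotEquals, which is easy to misread. The sketch below restates its logic for one request. It is an illustration only: deniedByRegionScp is a made-up helper, the two lists are copied from the SCP above, and actions are matched by service prefix alone.

```typescript
// Services listed under NotAction in the SCP above (the statement skips them).
const exemptServices = ["iam", "organizations", "route53", "support", "budgets", "ce", "cloudfront", "globalaccelerator"];
// Regions listed in the StringNotEquals condition.
const approvedRegions = ["eu-west-1", "us-east-1"];

// True when the DenyUnapprovedRegions statement applies to the request.
function deniedByRegionScp(action: string, requestedRegion: string): boolean {
  const service = action.split(":")[0].toLowerCase();
  // NotAction: the Deny covers every action EXCEPT those of the exempt services.
  const coveredByStatement = exemptServices.indexOf(service) === -1;
  // StringNotEquals with several values: true only if the region equals none of them.
  const regionUnapproved = approvedRegions.indexOf(requestedRegion) === -1;
  return coveredByStatement && regionUnapproved;
}
```

So ec2:RunInstances in ap-southeast-2 is denied, the same call in eu-west-1 passes this statement, and iam:CreateRole passes whatever region the request is signed for, because global services are exempted through NotAction.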

⚠️ Pitfall

Storing AWS_ACCESS_KEY_ID in Spring application.yml committed to Git — even a private repo. Use ECS task role + default credential chain (DefaultCredentialsProvider in AWS SDK v2). For local dev, use aws sso login profiles, never production keys on laptops.

📦 Real World

Airbnb and Slack run multi-account AWS Organizations with SCPs denying public resource creation at the org level. Even if a developer uses s3:PutBucketPublicAccessBlock to loosen a bucket's settings, the SCP layer catches it. Defense in depth: SCP + resource policy + Block Public Access settings.

🎯 Exam Tip

When the exam asks "how to prevent any principal from doing Y across all accounts," the answer is almost always SCP on an OU, not IAM policy on individual users. When it asks about limiting what a delegated admin can grant, answer permission boundary. When it asks about CI without long-lived keys, answer OIDC + AssumeRoleWithWebIdentity.
