Secrets Management

The secrets problem in modern delivery

Secrets are credentials that authenticate machines and humans to systems: API keys, DB passwords, TLS certs, signing keys. Storing them in git, in CI variables that are never rotated, or in K8s Secrets (which are only base64-encoded, not encrypted) gives attackers a durable foothold.

flowchart TB
  DEV["Developer laptop<br/>.env file"] --> GIT["Accidental git commit"]
  CI["CI variables<br/>long-lived AWS key"] --> LEAK["Log exposure / fork PR"]
  K8S["K8s Secret<br/>base64 in etcd"] --> RBAC["Over-broad RBAC read"]
  GIT --> BOT["Bot harvests in minutes"]
  LEAK --> LATERAL["Lateral movement to prod"]

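# GitHub Actions workflow
name: Secret baseline
on: [push]
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history, so older commits are scanned too
      # gitleaks is not preinstalled on the runner; run it from its container image
      - run: docker run --rm -v "$PWD:/repo" zricethezav/gitleaks:latest detect --source /repo --verbose --redact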

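# GitLab CI job
secret-audit:
  stage: test
  image:
    name: zricethezav/gitleaks:latest
    entrypoint: [""]  # the image's entrypoint is gitleaks itself; clear it so script runs in a shell
  variables:
    GIT_DEPTH: 0  # full history
  script:
    - gitleaks detect --source . --verbose --redact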

Where secrets leak

Surface	Failure mode	Severity
Git history	Committed .env, force-pushed but cached on forks	CRITICAL
CI logs	echo $SECRET, curl with token in URL	CRITICAL
Container images	ARG/ENV baking credentials into layers	HIGH
K8s etcd	Unencrypted Secret objects	HIGH
Chat / tickets	Paste for debugging	MEDIUM
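
To make the git-history and CI-log rows concrete, here is a minimal sketch of pattern-based detection and redaction, the general idea behind scanners such as gitleaks. It is not how gitleaks is implemented; the function name is hypothetical, and the only rule is the AWS access key ID shape (AKIA plus 16 characters).

```python
import re

# Shape of an AWS long-term access key ID: the AKIA prefix plus 16 characters.
AWS_KEY_ID = re.compile(r"AKIA[0-9A-Z]{16}")


def find_and_redact(text: str) -> tuple[int, str]:
    """Return how many key-shaped strings were found and a redacted copy."""
    redacted, count = AWS_KEY_ID.subn("AKIA" + "*" * 16, text)
    return count, redacted


# AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
sample_log = "export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE\nnpm test"
hits, safe_log = find_and_redact(sample_log)
print(hits)
print(safe_log)
```

A single regex catches only one credential format; real scanners combine many rules with entropy checks, so treat this as an illustration rather than a replacement.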

🔒 Security

Assume any secret that touched git is compromised—rotate, do not just delete the file in a follow-up commit.

📦 Real World

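The 2016 Uber breach began when attackers gained access to a private GitHub repository and found AWS credentials in it, which they used to reach data on about 57 million riders and drivers. Private ≠ secure.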

Best practices

Prefer dynamic secrets with TTL under 1 hour for databases and cloud APIs.
Never pass secrets as CLI arguments—visible in process list and shell history.
Use environment: protection rules for production credentials.
Audit secret access monthly; alert on anomalous read patterns.
Document rotation runbooks with RTO targets; test them quarterly.
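
The advice about CLI arguments can be sketched as follows: the secret travels in the child's environment, so it never appears in the argument vector that process listings show. The child command and the DB_PASSWORD name are hypothetical, chosen only for illustration.

```python
import os
import subprocess
import sys

# Hypothetical child command; it reads the secret from its environment.
CHILD = "import os; print(len(os.environ['DB_PASSWORD']))"


def run_with_secret(secret: str) -> str:
    """Run the child with the secret in its environment, never in argv."""
    env = dict(os.environ, DB_PASSWORD=secret)
    result = subprocess.run(
        [sys.executable, "-c", CHILD],  # argv holds no secret material
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(run_with_secret("example-only-password"))
```

Environment variables are still readable by the same user on the host, so this narrows exposure rather than removing it; a mounted file or a short-lived token narrows it further.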

Anti-patterns

Shared root AWS keys in a team 1Password folder.
Same staging and production DB password "for simplicity".
Mounting the same K8s Secret as env vars in 50 microservices: one compromised pod exposes it everywhere.
Disabling fork PR pipelines entirely instead of running them as scoped, unprivileged workflows.

Zero-trust secret lifecycle

Treat every secret as time-bounded: creation → scoped use → audit → rotation → revocation. Long-lived credentials violate zero trust because compromise detection depends on luck, not architecture.

Phase	Control	Owner
Creation	Automated via Vault/database engine—no human-generated passwords	Platform
Distribution	OIDC/JWT to CI; ESO/CSI to pods—never email or Slack	DevOps
Use	Least privilege IAM policy; read-only where possible	App team
Rotation	Calendar + event-driven (employee offboarding, incident)	Security
Revocation	Break-glass playbook; kill switch in IdP + cloud	SRE + Security
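
One way to picture a time-bounded secret is as a record that knows when it stops being usable and when it is due for rotation. The sketch below is an assumption-laden model, not any product's data structure: the Credential class, its fields, and the 0.8 rotation threshold are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Credential:
    """Hypothetical record for one issued secret."""
    name: str
    issued_at: datetime
    ttl: timedelta
    revoked: bool = False

    def is_usable(self, now: datetime) -> bool:
        # Usable only while unexpired and not explicitly revoked.
        return not self.revoked and now < self.issued_at + self.ttl

    def needs_rotation(self, now: datetime, threshold: float = 0.8) -> bool:
        # Rotate once a chosen fraction of the TTL has elapsed (0.8 is an assumption).
        return now >= self.issued_at + self.ttl * threshold


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
cred = Credential("payments-db", start, timedelta(hours=1))
usable_midway = cred.is_usable(start + timedelta(minutes=30))
due_for_rotation = cred.needs_rotation(start + timedelta(minutes=50))
cred.revoked = True  # revocation wins over any remaining TTL
usable_after_revoke = cred.is_usable(start + timedelta(minutes=30))
print(usable_midway, due_for_rotation, usable_after_revoke)
```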

🔬 Under the Hood

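GitHub's secret scanning partner program notifies providers (AWS, Azure, Stripe, etc.) when their token formats appear in public repositories, and the provider may then revoke or quarantine the token. Push protection is a separate control that blocks the push before the secret lands. Enable both org-wide.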

🎯 Interview Tip

Be ready to explain static vs dynamic secrets, rotation, and why OIDC beats long-lived CI keys.

⚖️ Trade-off

Managed Vault clusters cost money and ops time—but a single leaked production DB password costs more in incident hours and regulatory fines.

HashiCorp Vault & dynamic secrets

Vault is the reference secrets control plane: static secrets with versioning, dynamic database credentials with TTL, PKI certificate issuance, and encryption as a service through the transit engine.

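# GitHub Actions workflow
name: Vault in CI
on: [push]
permissions:
  id-token: write  # required so the job can request the OIDC token used by method: jwt
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/vault-action@v2
        with:
          url: https://vault.example.com
          method: jwt
          role: github-actions-deploy
          secrets: |
            secret/data/prod/db password | DB_PASSWORD
      # vault-action exports DB_PASSWORD to the environment of later steps
      - run: ./deploy.sh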

# GitLab CI job (assumes the job image provides the vault CLI)
deploy:
  stage: deploy
  variables:
    VAULT_ADDR: https://vault.example.com
  id_tokens:
    VAULT_ID_TOKEN:
      aud: https://vault.example.com
  script:
    - export VAULT_TOKEN="$(vault write -field=token auth/jwt/login role=gitlab-deploy jwt="$VAULT_ID_TOKEN")"
    - export DB_PASSWORD="$(vault kv get -field=password secret/prod/db)"
    - ./deploy.sh
  rules:
    - if: $CI_COMMIT_BRANCH == "main"

🔬 Under the Hood

Use Vault Agent sidecars in K8s to renew leases—applications should never embed long-lived Vault tokens.

Vault policy example

# payments-api can read only its DB creds
path "database/creds/payments-role" {
  capabilities = ["read"]
}
path "secret/data/payments/*" {
  capabilities = ["read"]
}
# deny all else implicitly

Dynamic database credentials

Vault's database secrets engine creates a unique SQL user for each credential request and binds it to a lease with a TTL. When the lease expires or is explicitly revoked, Vault drops the user, so stolen creds stop working without a global password rotation.

Engine	Use case	Typical TTL
Database	Postgres/MySQL app connections	1h
AWS	Dynamic IAM keys	15m–1h
PKI	mTLS service certs	24h
Transit	Encrypt PII fields in app	N/A (key versioned)
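
The lease behaviour described for the database engine can be simulated in memory. This is a stand-in to show the idea, not the Vault API: the FakeDatabaseEngine class, its methods, and the username format are hypothetical.

```python
import secrets
from datetime import datetime, timedelta, timezone


class FakeDatabaseEngine:
    """In-memory stand-in for a dynamic-credential issuer (not the Vault API)."""

    def __init__(self, ttl: timedelta):
        self.ttl = ttl
        self.leases = {}  # username -> expiry time

    def issue(self, role: str, now: datetime) -> tuple[str, str]:
        # Every request gets a fresh user and password tied to its own lease.
        username = f"v-{role}-{secrets.token_hex(4)}"
        password = secrets.token_urlsafe(24)
        self.leases[username] = now + self.ttl
        return username, password

    def is_valid(self, username: str, now: datetime) -> bool:
        return username in self.leases and now < self.leases[username]

    def revoke_expired(self, now: datetime) -> list[str]:
        # Drop users whose lease has run out; a real engine would drop the SQL user.
        expired = [user for user, expiry in self.leases.items() if expiry <= now]
        for user in expired:
            del self.leases[user]
        return expired


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
engine = FakeDatabaseEngine(ttl=timedelta(hours=1))
user_a, _ = engine.issue("payments-role", t0)
user_b, _ = engine.issue("payments-role", t0)
print(user_a != user_b)                                   # separate requests, separate users
print(engine.is_valid(user_a, t0 + timedelta(minutes=59)))
print(len(engine.revoke_expired(t0 + timedelta(hours=1))))
```

Because each consumer holds its own user, revoking one lease cuts off one consumer without touching the others.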

Cloud-native secret managers

AWS Secrets Manager, GCP Secret Manager, and Azure Key Vault integrate with IAM and workload identity. Prefer short-lived credentials via OIDC over static keys in CI variables.

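# GitHub Actions workflow
name: OIDC to AWS
on: [push]
permissions:
  id-token: write
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-deploy
          aws-region: us-east-1
      - run: |
          # Capture the value instead of printing it to the job log, then mask it
          DB_SECRET="$(aws secretsmanager get-secret-value --secret-id prod/db --query SecretString --output text)"
          echo "::add-mask::$DB_SECRET"
          echo "DB_SECRET=$DB_SECRET" >> "$GITHUB_ENV"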

# GitLab CI job
deploy-aws:
  stage: deploy
  id_tokens:
    AWS_TOKEN:
      aud: sts.amazonaws.com
  script:
    - |
      export $(printf "AWS_ACCESS_KEY_ID=%s AWS_SECRET_ACCESS_KEY=%s AWS_SESSION_TOKEN=%s" \
        $(aws sts assume-role-with-web-identity \
          --role-arn "$AWS_ROLE_ARN" \
          --web-identity-token "$AWS_TOKEN" \
          --role-session-name gitlab-ci \
          --query "Credentials.[AccessKeyId,SecretAccessKey,SessionToken]" \
          --output text))
    # Capture the value instead of printing it to the job log
    - export DB_SECRET="$(aws secretsmanager get-secret-value --secret-id prod/db --query SecretString --output text)"

🔒 Security

Federation trust policies must scope sub claims to specific repos—wildcard trust enables cross-repo privilege escalation.
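
A quick lint for this rule might look like the sketch below. It assumes an AWS-style trust policy statement for GitHub's OIDC provider, where the sub claim has the shape repo:OWNER/REPO:... and is matched under StringEquals or StringLike. The function names and the example org and repo are hypothetical.

```python
SUB_KEY = "token.actions.githubusercontent.com:sub"


def sub_patterns(statement: dict) -> list[str]:
    """Collect every sub pattern from StringEquals/StringLike conditions."""
    patterns = []
    for operator in ("StringEquals", "StringLike"):
        value = statement.get("Condition", {}).get(operator, {}).get(SUB_KEY)
        if isinstance(value, str):
            patterns.append(value)
        elif isinstance(value, list):
            patterns.extend(value)
    return patterns


def is_repo_scoped(statement: dict) -> bool:
    """True only if every sub pattern names one owner/repo without wildcards."""
    patterns = sub_patterns(statement)
    if not patterns:
        return False  # no sub condition: any repo on the provider could assume the role
    for pattern in patterns:
        parts = pattern.split(":")
        # Expected shape: repo:<owner>/<repo>:<rest>
        if len(parts) < 3 or parts[0] != "repo":
            return False
        if "/" not in parts[1] or any(ch in parts[1] for ch in "*?"):
            return False
    return True


scoped = {"Condition": {"StringLike": {SUB_KEY: "repo:example-org/payments-api:ref:refs/heads/main"}}}
wildcard = {"Condition": {"StringLike": {SUB_KEY: "repo:example-org/*:*"}}}
print(is_repo_scoped(scoped), is_repo_scoped(wildcard))
```

The check is deliberately strict about the owner/repo segment only; whether to also pin the branch or environment in the rest of the claim is a policy decision for each role.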

Cloud manager comparison

Service	Strength	CI integration
AWS Secrets Manager	Native RDS rotation Lambdas	OIDC → IAM → GetSecretValue
GCP Secret Manager	Workload Identity Federation	GitHub OIDC → WIF → accessor
Azure Key Vault	HSM-backed keys, CMK	Federated credentials on app registration

GCP Workload Identity Federation sketch

# GCP: pool provider maps GitHub OIDC iss/sub to service account
# No JSON key file in CI — ever
attributeMapping:
  google.subject: assertion.sub
  attribute.repository: assertion.repository
serviceAccountImpersonation:
  serviceAccount: [email protected]

Kubernetes secrets & external secret operators

Native Secret objects are convenient but risky. External Secrets Operator (ESO) or Secrets Store CSI Driver sync from Vault/AWS/GCP into pods without landing plaintext in git or CI logs.

External Secrets Operator

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-db
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets
    kind: ClusterSecretStore
  target:
    name: app-db-credentials
  data:
    - secretKey: password
      remoteRef:
        key: prod/database
        property: password

Sealed Secrets for GitOps

# Sealed Secrets — encrypt Secret manifests for GitOps
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: app-db
spec:
  encryptedData:
    password: AgBx...sealed...blob

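# GitHub Actions workflow
# The server-side dry run needs a kubeconfig for a cluster that has the ESO CRDs installed
name: Validate ExternalSecrets
on: [pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: azure/setup-kubectl@v4
      - run: kubectl apply --dry-run=server -f k8s/external-secrets/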

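# GitLab CI job (also needs cluster credentials for the server-side dry run)
validate-secrets:
  stage: test
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]  # the image's entrypoint is kubectl; clear it so script runs in a shell
  script:
    - kubectl apply --dry-run=server -f k8s/external-secrets/
  rules:
    - changes:
        - k8s/external-secrets/**/*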

💡 Pro Tip

Enable an EncryptionConfiguration so Secrets are encrypted in etcd, and restrict who can read Secret objects: roles that grant get or list on every Secret in a namespace are too permissive.

Encryption at rest

Configure the API server with an EncryptionConfiguration that uses a KMS provider (AWS KMS, GCP Cloud KMS, Vault transit). Without it, anyone with access to an etcd backup can read Secrets, because they are stored unencrypted.

apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources: [secrets]
    providers:
      - kms:
          apiVersion: v2  # KMS v1 is deprecated in current Kubernetes releases
          name: aws-kms
          endpoint: unix:///var/run/kmsplugin/socket.sock
      - identity: {}

CSI vs ESO vs SOPS

External Secrets Operator: syncs to native Secret objects; familiar to apps using envFrom.
Secrets Store CSI Driver: mounts secrets as volumes; no Secret object in etcd unless sync is enabled; better for a zero-k8s-secret footprint.
SOPS + GitOps: SOPS (originally a Mozilla project) encrypts YAML values in git; Flux decrypts natively at apply time with KMS, while Argo CD needs a plugin.

Secrets in CI/CD platforms

GitHub Encrypted Secrets, GitLab masked/protected variables, and environment-scoped secrets reduce leakage—but forks, log echo, and artifact uploads remain exfil paths. Design pipelines assuming logs are public.

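# GitHub Actions workflow
name: Safe CI secrets
on: [pull_request]
jobs:
  test:
    # Run only for branches in this repository; fork PRs do not receive secrets
    if: github.event.pull_request.head.repo.full_name == github.repository
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      # Never echo secrets; pass them via env, not args, to avoid ps leakage
      - run: npm test
        env:
          API_KEY: ${{ secrets.STAGING_API_KEY }}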

# GitLab CI job
unit-test:
  script:
    - npm test
  rules:
    # Skip every merge request pipeline, including those from forks;
    # run fork MRs in a separate job that has no access to secrets
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      when: never
    - when: on_success

🔒 Security

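Pull requests from forks must not receive org secrets. GitHub withholds secrets from workflows triggered by pull_request on forks by default, but pull_request_target runs with secrets, so never check out and run fork code there. On GitLab, verify the equivalent protected variable rules.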

GitHub vs GitLab secret models

Feature	GitHub Actions	GitLab CI
Scope	Org / repo / environment	Group / project / environment
Fork PRs	Secrets withheld from fork workflows	Protected variables + MR rules
Masking	Auto-redact in logs (best effort)	Masked variables regex
OIDC	id-token: write	id_tokens job keyword
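
The "best effort" caveat in the masking row can be illustrated with a toy masker. It assumes, for illustration, that masking is an exact string replacement of registered values; the mask function and the sample secret are hypothetical, not either platform's implementation.

```python
from urllib.parse import quote


def mask(log_line: str, secret_values: list[str]) -> str:
    """Replace exact occurrences of registered secret values with ***."""
    for value in secret_values:
        log_line = log_line.replace(value, "***")
    return log_line


secret = "p@ss/w0rd+key"  # hypothetical value, for illustration only
masked = mask(f"Authorization: Bearer {secret}", [secret])
# URL-encoding changes the bytes, so the registered value no longer matches.
leaked = mask(f"GET /login?token={quote(secret)}", [secret])
print(masked)
print(leaked)
```

Any transformation of the value (encoding, splitting, truncation) can slip past exact-match masking, which is why the pipeline should avoid printing secrets at all rather than rely on redaction.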

⚠️ Pitfall

Base64-encoding a secret to "hide" it in a workflow YAML is not security—it's encoding. Use platform secret stores.
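
A short demonstration of why encoding is not protection: anyone who can read the file recovers the value with one standard-library call and no key. The sample value is hypothetical.

```python
import base64

# Hypothetical value as it might appear "hidden" in a workflow file.
hidden = base64.b64encode(b"example-only-password").decode()

# Decoding needs no key, only the encoded text itself.
recovered = base64.b64decode(hidden).decode()
print(recovered == "example-only-password")
```

The same applies to the data field of a Kubernetes Secret manifest, which is base64-encoded for transport rather than for confidentiality.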

Pipeline secret security patterns

Production-grade secret hygiene combines OIDC federation, short TTLs, no secrets in URLs, secret scanning on pipeline definitions, and break-glass rotation runbooks that are tested quarterly.

# GitHub Actions workflow
name: Secret hygiene
on:
  push:
    branches: [main]  # production deploys only from main
permissions:
  contents: read
  id-token: write
jobs:
  oidc-deploy:
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.AWS_DEPLOY_ROLE }}
          aws-region: us-east-1
      - run: ./deploy.sh  # no static keys in repo or vars

include:
  - local: .gitlab/ci/secret-scan.yml

production-deploy:
  stage: deploy
  environment:
    name: production
  id_tokens:
    AWS_TOKEN:
      aud: sts.amazonaws.com
  script:
    - ./deploy.sh
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  when: manual

🔒 Security

Run a quarterly game day: rotate a critical secret end to end, then replay the old credential and measure how long detection takes. That interval is your MTTD for use of a revoked credential.

Break-glass rotation runbook

Identify compromised secret scope (which systems, which repos).
Revoke in source system (Vault lease revoke, AWS key deactivate, DB user drop).
Issue new credential with narrower policy if root cause was over-permission.
Redeploy all consumers—stale env vars in running pods keep old secret alive.
Post-incident: add detection rule (CloudTrail anomaly, failed auth spike).
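
The redeploy step in the runbook above depends on knowing which workloads still hold the old value. A minimal sketch, assuming you keep an inventory of consumers and the secret version each one loaded (the Consumer record and version numbers are hypothetical):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consumer:
    """Hypothetical inventory entry: a workload and the secret version it loaded."""
    name: str
    loaded_version: int


def stale_consumers(consumers: list[Consumer], current_version: int) -> list[str]:
    """Names of workloads still holding a pre-rotation version of the secret."""
    return sorted(c.name for c in consumers if c.loaded_version < current_version)


inventory = [
    Consumer("payments-api", 7),
    Consumer("billing-worker", 6),
    Consumer("report-cron", 6),
]
to_redeploy = stale_consumers(inventory, current_version=7)
print(to_redeploy)
```

An empty result is the signal that the old credential can be treated as fully retired.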

sequenceDiagram
  participant CI as CI runner
  participant IdP as GitHub/GitLab OIDC
  participant Cloud as AWS/GCP/Azure
  participant SM as Secret manager
  CI->>IdP: Request OIDC token (aud scoped)
  IdP->>CI: JWT (sub, repo, ref)
  CI->>Cloud: AssumeRoleWithWebIdentity
  Cloud->>CI: Session creds (15m–1h)
  CI->>SM: GetSecret / Vault read
  SM->>CI: Short-lived DB password
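
When debugging the flow above, it helps to see which claims the JWT actually carries. The sketch below decodes the payload segment without verifying the signature, so it is for inspection only and must never be used for authorization. The token is built locally with hypothetical claim values; a real one is issued by the IdP.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments are."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def unverified_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


# Hypothetical token built locally for illustration.
claims = {
    "sub": "repo:example-org/payments-api:ref:refs/heads/main",
    "aud": "sts.amazonaws.com",
}
header = b64url(json.dumps({"alg": "none"}).encode())
token = f"{header}.{b64url(json.dumps(claims).encode())}."
print(unverified_claims(token)["sub"])
```

The sub and aud values shown here are the ones a cloud trust policy matches against, which is why scoping them tightly matters.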

Audit checklist

Inventory all secrets touched by CI—spreadsheet is fine, CMDB is better.
Zero static cloud keys in CI variables (search the org for AKIA-prefixed access key IDs).
Environment protection: required reviewers on production deploy jobs.
Secret scanning on .github/ and .gitlab-ci.yml in pre-commit.

Putting it together — secret architecture decision tree

Choose the right control per environment: dev may use SOPS + GitOps; staging uses ESO + cloud SM; production adds dynamic credentials, OIDC-only CI, and quarterly rotation drills.

flowchart TD
  Q1{"Need secrets in git?"}
  Q1 -->|Yes encrypted| SOPS["SOPS + KMS + GitOps"]
  Q1 -->|No| Q2{"K8s workload?"}
  Q2 -->|Yes| ESO["ESO / CSI Driver"]
  Q2 -->|No| Q3{"CI pipeline?"}
  Q3 -->|Yes| OIDC["OIDC federation\nno static keys"]
  Q3 -->|No| VAULT["Vault / Cloud SM API"]
  ESO --> VAULT
  OIDC --> VAULT
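
The decision tree above can also be written as a small function, which makes the branch order explicit. The function and parameter names are hypothetical; the returned labels are the node names from the flowchart.

```python
def choose_pattern(secrets_in_git: bool, k8s_workload: bool, ci_pipeline: bool) -> list[str]:
    """Walk the decision tree and return the chosen controls in order."""
    if secrets_in_git:
        return ["SOPS + KMS + GitOps"]
    if k8s_workload:
        # ESO and the CSI driver both read from a backing secret manager.
        return ["ESO / CSI Driver", "Vault / Cloud SM API"]
    if ci_pipeline:
        # CI authenticates via OIDC, then reads from the secret manager.
        return ["OIDC federation, no static keys", "Vault / Cloud SM API"]
    return ["Vault / Cloud SM API"]


print(choose_pattern(secrets_in_git=False, k8s_workload=True, ci_pipeline=False))
```

Real systems often need more than one branch at once, for example a CI pipeline that deploys to Kubernetes, so treat the first match as a starting point rather than the full answer.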

Pattern	Best for	Complexity	Risk if skipped
OIDC to cloud	CI deploy jobs	Medium	CRITICAL
External Secrets	K8s apps	Medium	HIGH
Vault dynamic DB	Stateful services	High	HIGH
Sealed Secrets	GitOps teams without SM API	Low	MEDIUM
Pre-commit gitleaks	Every repo	Low	CRITICAL

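# GitHub Actions workflow
name: Org secret standards
on:
  schedule:
    - cron: '0 6 * * 1'  # Mondays at 06:00 UTC
jobs:
  audit-static-keys:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history for gitleaks
      - run: |
          # -l lists matching files only, so the key itself is not printed to the log
          ! grep -rlE 'AKIA[0-9A-Z]{16}' .github/ || (echo "Static AWS key in workflow" && exit 1)
      # gitleaks is not preinstalled on the runner; run it from its container image
      - run: docker run --rm -v "$PWD:/repo" zricethezav/gitleaks:latest detect --source /repo --redact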

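# GitLab CI job ("audit" must be listed under stages:)
weekly-secret-audit:
  stage: audit
  image:
    name: zricethezav/gitleaks:latest
    entrypoint: [""]  # clear the image's gitleaks entrypoint so script runs in a shell
  variables:
    GIT_DEPTH: 0  # full history
  script:
    - gitleaks detect --source . --redact
    - echo "Review protected variables and OIDC audiences quarterly"
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"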

🔒 Security

The north star: no human ever copies a production password. Humans approve deploys; machines fetch short-lived credentials.

🎯 Interview Tip

Draw the OIDC flow on the whiteboard: CI → IdP JWT → cloud STS → secret manager → app. Mention fork PR isolation and rotation MTTD.

30-day rollout plan

Week 1 — Enable gitleaks on all repos; inventory CI variables.
Week 2 — Migrate one service to OIDC deploy; delete static AWS key.
Week 3 — Deploy ESO; migrate one K8s Secret to ExternalSecret.
Week 4 — Run rotation game-day; document MTTD and gaps.

Maturity model

Level	Characteristics	Typical gap
1 — Ad hoc	Shared passwords in wiki; .env in git	No scanning
2 — Centralized	CI variables + K8s Secrets	Static keys, no rotation
3 — Federated	OIDC CI; ESO in cluster	Manual rotation runbooks
4 — Dynamic	Vault TTL; automated revocation	Full MTTD metrics