Secrets Management
Credentials are the keys to your kingdom—and CI/CD is where they concentrate. Learn how to keep secrets out of git, off runner disks, and out of logs using Vault, cloud managers, External Secrets Operator, OIDC, and pipeline patterns that assume breach.
The secrets problem in modern delivery
Secrets are credentials that authenticate machines and humans to systems: API keys, DB passwords, TLS certs, signing keys. Storing them in git, in CI variables that are never rotated, or in K8s Secrets that are only base64-encoded (not encrypted) gives attackers a durable foothold.
flowchart TB
DEV["Developer laptop .env file"] --> GIT["Accidental git commit"]
CI["CI variables long-lived AWS key"] --> LEAK["Log exposure / fork PR"]
K8S["K8s Secret base64 in etcd"] --> RBAC["Over-broad RBAC read"]
GIT --> BOT["Bot harvests in minutes"]
LEAK --> LATERAL["Lateral movement to prod"]
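# GitHub Actions
name: Secret baseline
on: [push]
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history, so older commits are scanned too
      # Assumes the gitleaks binary is already installed on the runner
      - run: gitleaks detect --source . --verbose --redact

# GitLab CI
secret-audit:
  stage: test
  image:
    name: zricethezav/gitleaks:latest
    entrypoint: [""]  # the image's entrypoint is gitleaks itself
  script:
    - gitleaks detect --source . --verbose --redact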
Where secrets leak
| Surface | Failure mode | Severity |
|---|---|---|
| Git history | Committed .env, force-pushed but cached on forks | CRITICAL |
| CI logs | echo $SECRET, curl with token in URL | CRITICAL |
| Container images | ARG/ENV baking credentials into layers | HIGH |
| K8s etcd | Unencrypted Secret objects | HIGH |
| Chat / tickets | Paste for debugging | MEDIUM |
Assume any secret that touched git is compromised—rotate, do not just delete the file in a follow-up commit.
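The 2016 Uber breach began when attackers got into a private GitHub repository used by Uber engineers and found AWS credentials in it; those credentials led to data on roughly 57 million riders and drivers. Private ≠ secure.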
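To make the git-history failure mode concrete, here is a minimal sketch of what a pattern-based scanner does, assuming a single rule for the AWS access key ID shape (AKIA followed by 16 uppercase letters or digits). The function name and sample text are hypothetical; tools such as gitleaks ship many more rules and are the right choice in practice.

```python
import re

# One illustrative rule: long-lived AWS access key IDs start with "AKIA"
# followed by 16 uppercase letters or digits.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def find_aws_key_ids(text: str) -> list[tuple[int, str]]:
    """Return (line_number, redacted_match) pairs for each suspected key ID."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in AWS_KEY_ID.finditer(line):
            key = match.group(0)
            # Redact before reporting so the scanner's own output is safe to log.
            findings.append((lineno, key[:4] + "*" * (len(key) - 4)))
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
sample = "region=us-east-1\naws_access_key_id=AKIAIOSFODNN7EXAMPLE\n"
print(find_aws_key_ids(sample))
```

The redaction step mirrors the --redact flag used in the pipeline jobs: a scanner that prints what it finds becomes a leak surface itself.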
Best practices
- Prefer dynamic secrets with TTL under 1 hour for databases and cloud APIs.
- Never pass secrets as CLI arguments—visible in process list and shell history.
- Use environment: protection rules for production credentials.
- Audit secret access monthly; alert on anomalous read patterns.
- Document rotation runbooks with RTO targets; test them at least twice per year, and quarterly for production-critical secrets.
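The rule about CLI arguments can be sketched as follows. This is an illustration only: the helper name, the variable name DB_PASSWORD, and the placeholder value are assumptions. The child process receives the secret through its environment, so the value never appears in the argument list.

```python
import os
import subprocess
import sys


def run_with_secret(command: list[str], name: str, value: str) -> str:
    """Run command with the secret in its environment, never in argv."""
    env = dict(os.environ)
    env[name] = value  # not part of the argument list that `ps` shows by default
    result = subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# The child reads the secret from its environment and reports only its length.
child = [sys.executable, "-c", "import os; print(len(os.environ['DB_PASSWORD']))"]
print(run_with_secret(child, "DB_PASSWORD", "example-not-a-real-password"))
```

Environment variables are still readable by the same user and by anything that dumps the process environment, so this narrows exposure without eliminating it.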
Anti-patterns
- Shared root AWS keys in a team 1Password folder.
- Same staging and production DB password "for simplicity".
- Mounting K8s Secrets as env vars in 50 microservices—blast radius on one compromise.
- Disabling fork PR pipelines entirely instead of scoped unprivileged workflows.
Zero-trust secret lifecycle
Treat every secret as time-bounded: creation → scoped use → audit → rotation → revocation. Long-lived credentials violate zero trust because compromise detection depends on luck, not architecture.
| Phase | Control | Owner |
|---|---|---|
| Creation | Automated via Vault/database engine—no human-generated passwords | Platform |
| Distribution | OIDC/JWT to CI; ESO/CSI to pods—never email or Slack | DevOps |
| Use | Least privilege IAM policy; read-only where possible | App team |
| Rotation | Calendar + event-driven (employee offboarding, incident) | Security |
| Revocation | Break-glass playbook; kill switch in IdP + cloud | SRE + Security |
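The lifecycle phases in the table can be modeled as a small state check. This is a sketch under assumptions: the class name, the 80% rotation threshold, and the sample secret are hypothetical, not taken from any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class SecretRecord:
    name: str
    created_at: datetime
    ttl: timedelta
    revoked: bool = False

    def expires_at(self) -> datetime:
        return self.created_at + self.ttl

    def state(self, now: datetime, rotate_fraction: float = 0.8) -> str:
        """Classify the secret as revoked, expired, rotate-now, or active."""
        if self.revoked:
            return "revoked"
        if now >= self.expires_at():
            return "expired"
        # Rotate before expiry so consumers never hold a dead credential.
        if now >= self.created_at + self.ttl * rotate_fraction:
            return "rotate-now"
        return "active"


issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
db = SecretRecord("payments-db", issued, ttl=timedelta(hours=1))
for minutes in (10, 50, 61):
    print(minutes, db.state(issued + timedelta(minutes=minutes)))
```

Because every record carries a TTL, "how long has this credential existed?" becomes a query rather than a guess.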
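Through GitHub's secret scanning partner program, providers such as AWS, Azure, and Stripe are notified when their token formats show up in public repositories and can revoke the leaked tokens automatically. Push protection is a separate control that blocks a push containing a detected secret before it lands. Enable both org-wide.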
When discussing the secrets problem in modern delivery, be ready to explain static versus dynamic secrets, rotation, and why OIDC beats long-lived CI keys.
Managed Vault clusters cost money and ops time—but a single leaked production DB password costs more in incident hours and regulatory fines.
HashiCorp Vault & dynamic secrets
Vault is the reference secrets control plane: static secrets with versioning, dynamic database credentials with TTLs, PKI certificate issuance, and encryption as a service through the transit engine.
# GitHub Actions
name: Vault in CI
on: [push]
permissions:
  id-token: write  # required so the job can request the OIDC JWT
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/vault-action@v2
        with:
          url: https://vault.example.com
          method: jwt
          role: github-actions-deploy
          secrets: |
            secret/data/prod/db password | DB_PASSWORD
      - run: ./deploy.sh
        env:
          DB_PASSWORD: ${{ env.DB_PASSWORD }}

# GitLab CI
deploy:
  stage: deploy
  id_tokens:
    VAULT_ID_TOKEN:
      aud: https://vault.example.com
  script:
    # Assumes VAULT_ADDR is set as a CI/CD variable
    - export VAULT_TOKEN="$(vault write -field=token auth/jwt/login role=gitlab-deploy jwt="$VAULT_ID_TOKEN")"
    - export DB_PASSWORD="$(vault kv get -field=password secret/prod/db)"
    - ./deploy.sh
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
Use Vault Agent sidecars in K8s to renew leases—applications should never embed long-lived Vault tokens.
Vault policy example
# payments-api can read only its DB creds
path "database/creds/payments-role" {
  capabilities = ["read"]
}

path "secret/data/payments/*" {
  capabilities = ["read"]
}
# Anything not listed is denied by default
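The default-deny behavior of the policy above can be sketched in a few lines. This is a simplified model, not Vault's evaluator: it handles only exact paths and a trailing `*` prefix glob, and the function names are hypothetical.

```python
# Mirrors the two rules in the policy: path -> granted capabilities.
POLICY = {
    "database/creds/payments-role": {"read"},
    "secret/data/payments/*": {"read"},
}


def path_matches(rule: str, path: str) -> bool:
    """Exact match, or prefix match when the rule ends with a trailing '*'."""
    if rule.endswith("*"):
        return path.startswith(rule[:-1])
    return rule == path


def is_allowed(policy: dict[str, set[str]], path: str, capability: str) -> bool:
    """Default deny: allow only if some matching rule grants the capability."""
    return any(
        capability in capabilities
        for rule, capabilities in policy.items()
        if path_matches(rule, path)
    )


print(is_allowed(POLICY, "secret/data/payments/stripe", "read"))
print(is_allowed(POLICY, "secret/data/billing/stripe", "read"))
print(is_allowed(POLICY, "database/creds/payments-role", "update"))
```

Only the first check passes; the other path and the write capability fall through to the implicit deny.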
Dynamic database credentials
Vault's database secrets engine creates a unique SQL user for each lease, with a TTL attached. When the lease expires or is explicitly revoked, Vault drops the user, so stolen creds stop working without a global password rotation.
| Engine | Use case | TTL typical |
|---|---|---|
| Database | Postgres/MySQL app connections | 1h |
| AWS | Dynamic IAM keys | 15m–1h |
| PKI | mTLS service certs | 24h |
| Transit | Encrypt PII fields in app | N/A (key versioned) |
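The lease model behind dynamic credentials can be illustrated with an in-memory stand-in. Everything here is an assumption made for the sketch (class names, the username format, the one-hour TTL); it does not call Vault or a database, it only shows why a per-lease user limits the damage of a stolen credential.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Lease:
    username: str
    password: str
    expires_at: datetime


class DynamicCredentialIssuer:
    """In-memory stand-in for a secrets engine; it never touches a database."""

    def __init__(self, role: str, ttl: timedelta) -> None:
        self.role = role
        self.ttl = ttl
        self.active: dict[str, Lease] = {}

    def issue(self, now: datetime) -> Lease:
        # Each lease gets its own user, so revoking one never affects another.
        username = f"v-{self.role}-{secrets.token_hex(4)}"
        lease = Lease(username, secrets.token_urlsafe(24), now + self.ttl)
        self.active[username] = lease
        return lease

    def revoke_expired(self, now: datetime) -> list[str]:
        """Drop every user whose lease has run out; return the dropped names."""
        expired = [u for u, lease in self.active.items() if now >= lease.expires_at]
        for username in expired:
            del self.active[username]  # a real engine would drop the SQL user here
        return expired

    def is_valid(self, username: str, now: datetime) -> bool:
        lease = self.active.get(username)
        return lease is not None and now < lease.expires_at


start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
issuer = DynamicCredentialIssuer("payments", ttl=timedelta(hours=1))
lease = issuer.issue(start)
print(issuer.is_valid(lease.username, start + timedelta(minutes=30)))
print(issuer.is_valid(lease.username, start + timedelta(hours=2)))
```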
Cloud-native secret managers
AWS Secrets Manager, GCP Secret Manager, and Azure Key Vault integrate with IAM and workload identity. Prefer short-lived credentials via OIDC over static keys in CI variables.
# GitHub Actions
name: OIDC to AWS
on: [push]
permissions:
  id-token: write
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-deploy
          aws-region: us-east-1
      # Capture the value in a variable; printing it would put it in the job log
      - run: |
          DB_SECRET="$(aws secretsmanager get-secret-value --secret-id prod/db --query SecretString --output text)"

# GitLab CI
deploy-aws:
  stage: deploy
  id_tokens:
    AWS_TOKEN:
      aud: sts.amazonaws.com
  script:
    - |
      export $(printf "AWS_ACCESS_KEY_ID=%s AWS_SECRET_ACCESS_KEY=%s AWS_SESSION_TOKEN=%s" \
        $(aws sts assume-role-with-web-identity \
          --role-arn "$AWS_ROLE_ARN" \
          --web-identity-token "$AWS_TOKEN" \
          --role-session-name gitlab-ci \
          --query "Credentials.[AccessKeyId,SecretAccessKey,SessionToken]" \
          --output text))
    # Capture the value in a variable; printing it would put it in the job log
    - DB_SECRET="$(aws secretsmanager get-secret-value --secret-id prod/db --query SecretString --output text)"
Federation trust policies must scope sub claims to specific repos—wildcard trust enables cross-repo privilege escalation.
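The difference between a scoped and a wildcard trust rule can be sketched as a claim check. This assumes the token signature has already been verified and uses GitHub's `repo:<owner>/<repo>:ref:<ref>` subject format for branch-triggered workflows; the org and repo names are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical trust rules for illustration.
SCOPED_TRUST = ["repo:example-org/payments-api:ref:refs/heads/main"]
WILDCARD_TRUST = ["repo:example-org/*"]


def is_trusted(claims: dict[str, str], allowed_subs: list[str], audience: str) -> bool:
    """Accept a token only if its audience and subject both match the policy."""
    if claims.get("aud") != audience:
        return False
    sub = claims.get("sub", "")
    return any(fnmatchcase(sub, pattern) for pattern in allowed_subs)


# A token minted for a different repository in the same org.
other_repo = {
    "sub": "repo:example-org/other-repo:ref:refs/heads/feature",
    "aud": "sts.amazonaws.com",
}
print(is_trusted(other_repo, WILDCARD_TRUST, "sts.amazonaws.com"))
print(is_trusted(other_repo, SCOPED_TRUST, "sts.amazonaws.com"))
```

The wildcard rule accepts the unrelated repository's token, which is exactly the cross-repo escalation path; the scoped rule rejects it.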
Cloud manager comparison
| Service | Strength | CI integration |
|---|---|---|
| AWS Secrets Manager | Native RDS rotation Lambdas | OIDC → IAM → GetSecretValue |
| GCP Secret Manager | Workload Identity Federation | GitHub OIDC → WIF → accessor |
| Azure Key Vault | HSM-backed keys, CMK | Federated credentials on app registration |
GCP Workload Identity Federation sketch
# GCP: pool provider maps GitHub OIDC iss/sub to service account
# No JSON key file in CI, ever
attributeMapping:
  google.subject: assertion.sub
  attribute.repository: assertion.repository
serviceAccountImpersonation:
  serviceAccount: <service-account>@<project-id>.iam.gserviceaccount.com
Kubernetes secrets & external secret operators
Native Secret objects are convenient but risky. External Secrets Operator (ESO) or Secrets Store CSI Driver sync from Vault/AWS/GCP into pods without landing plaintext in git or CI logs.
External Secrets Operator
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-db
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets
    kind: ClusterSecretStore
  target:
    name: app-db-credentials
  data:
    - secretKey: password
      remoteRef:
        key: prod/database
        property: password
Sealed Secrets for GitOps
# Sealed Secrets: encrypt Secret manifests for GitOps
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: app-db
spec:
  encryptedData:
    password: AgBx...sealed...blob
# GitHub Actions
name: Validate ExternalSecrets
on: [pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: azure/setup-kubectl@v4
      # Server-side dry run needs a kubeconfig for a cluster with the ESO CRDs installed
      - run: kubectl apply --dry-run=server -f k8s/external-secrets/

# GitLab CI
validate-secrets:
  stage: test
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]  # the image's entrypoint is kubectl itself
  script:
    - kubectl apply --dry-run=server -f k8s/external-secrets/
  rules:
    - changes:
        - k8s/external-secrets/**/*
Enable EncryptionConfiguration for secrets in etcd, and restrict who can read Secret objects: namespace-wide get/list on Secrets, as granted by the built-in edit and admin roles, is too permissive for most users and service accounts.
Encryption at rest
Enable the Kubernetes EncryptionConfiguration API with a KMS provider (AWS KMS, GCP Cloud KMS, Vault transit). Without it, anyone with etcd backup access reads Secrets in plaintext.
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - kms:
          name: aws-kms
          endpoint: unix:///var/run/kmsplugin/socket.sock
      - identity: {}
CSI vs ESO
- External Secrets Operator — syncs to native Secret objects; familiar to apps using envFrom.
- Secrets Store CSI — mounts secrets as volumes; no Secret object in etcd; better for zero-k8s-secret footprint.
- SOPS + GitOps — Mozilla SOPS encrypts YAML in git; Flux/Argo decrypt at apply time with KMS.
Secrets in CI/CD platforms
GitHub Encrypted Secrets, GitLab masked/protected variables, and environment-scoped secrets reduce leakage—but forks, log echo, and artifact uploads remain exfil paths. Design pipelines assuming logs are public.
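# GitHub Actions
name: Safe CI secrets
on: [pull_request]
jobs:
  test:
    # Run only for branches of this repository, not for forks
    if: github.event.pull_request.head.repo.full_name == github.repository
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Never echo secrets; pass them via env, not args, to avoid ps leakage
      - run: npm test
        env:
          API_KEY: ${{ secrets.STAGING_API_KEY }}

# GitLab CI
unit-test:
  script:
    - npm test
  rules:
    # Skip this secret-consuming job for merge requests that come from forks
    - if: $CI_PIPELINE_SOURCE == "merge_request_event" && $CI_MERGE_REQUEST_SOURCE_PROJECT_ID != $CI_MERGE_REQUEST_PROJECT_ID
      when: never
    - when: on_success
# Fork MRs skip this job; test them in a separate job that uses no secrets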
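Pull requests from forks must not receive org secrets. GitHub withholds secrets from workflows triggered by pull_request events from forks by default (pull_request_target is the exception and needs care); on GitLab, verify that protected variables and merge request rules give equivalent isolation.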
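Platform log masking works roughly like the filter below: known secret values are replaced before a line is written. This is a sketch only; the class name and the sample token are hypothetical, and like platform masking it is best effort, since an encoded or split secret passes straight through.

```python
import io
import logging


class RedactSecrets(logging.Filter):
    """Replace known secret values in log records before they are emitted."""

    def __init__(self, secret_values: list[str]) -> None:
        super().__init__()
        # Ignore empty strings; replacing "" would corrupt every message.
        self._secrets = [s for s in secret_values if s]

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        for secret in self._secrets:
            message = message.replace(secret, "***")
        # Store the already-formatted, redacted text back on the record.
        record.msg, record.args = message, None
        return True


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(RedactSecrets(["s3cr3t-token"]))
logger = logging.getLogger("ci-demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

logger.info("curl https://api.example.com/?token=%s", "s3cr3t-token")
print(stream.getvalue().strip())
```

Redaction is a safety net, not a control: the pipeline should still avoid putting tokens in URLs or echoing variables in the first place.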
GitHub vs GitLab secret models
| Feature | GitHub Actions | GitLab CI |
|---|---|---|
| Scope | Org / repo / environment | Group / project / environment |
| Fork PRs | Secrets withheld from fork workflows | Protected variables + MR rules |
| Masking | Registered secret values redacted in logs (best effort) | Masked variables; the value must meet GitLab's masking format rules |
| OIDC | id-token: write | id_tokens job keyword |
Base64-encoding a secret to "hide" it in a workflow YAML is not security—it's encoding. Use platform secret stores.
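A short demonstration of why base64 offers no protection, using a placeholder value. The same applies to the `data` field of a Kubernetes Secret manifest, which holds base64-encoded values.

```python
import base64

# Encode a placeholder value the way a Secret manifest's `data` field would.
encoded = base64.b64encode(b"example-password").decode("ascii")

# Anyone who can read the encoded text recovers the value with no key at all.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded)
```

Decoding needs no key and no privilege beyond read access, which is the difference between encoding and encryption.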
Pipeline secret security patterns
Production-grade secret hygiene: OIDC federation, minimum TTL, no secrets in URLs, secret scanning on pipeline definitions, and break-glass rotation runbooks tested quarterly.
# GitHub Actions
name: Secret hygiene
on: [push]
permissions:
  contents: read
  id-token: write
jobs:
  oidc-deploy:
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.AWS_DEPLOY_ROLE }}
          aws-region: us-east-1
      - run: ./deploy.sh  # no static keys in repo or vars

# GitLab CI
include:
  - local: .gitlab/ci/secret-scan.yml

production-deploy:
  stage: deploy
  environment:
    name: production
  id_tokens:
    AWS_TOKEN:
      aud: sts.amazonaws.com
  script:
    - ./deploy.sh
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual
Quarterly game day: rotate a critical secret end to end, then deliberately use the old credential and measure how long detection takes. That interval is your MTTD for secret revocation.
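The MTTD figure from a game day is just the average gap between each use of the revoked credential and the alert that caught it. A minimal sketch, with made-up drill timestamps and a hypothetical function name:

```python
from datetime import datetime, timedelta
from statistics import mean


def mean_time_to_detect(events: list[tuple[datetime, datetime]]) -> timedelta:
    """Average gap between a revoked credential being used and the alert firing."""
    if not events:
        raise ValueError("need at least one (used_at, detected_at) pair")
    gaps = [(detected - used).total_seconds() for used, detected in events]
    return timedelta(seconds=mean(gaps))


# Illustrative data: two attempts detected after 4 and 10 minutes.
drill = [
    (datetime(2024, 3, 1, 10, 0), datetime(2024, 3, 1, 10, 4)),
    (datetime(2024, 3, 1, 11, 0), datetime(2024, 3, 1, 11, 10)),
]
print(mean_time_to_detect(drill))
```

Tracking this number across drills shows whether detection rules added after each exercise are actually shortening the window.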
Break-glass rotation runbook
- Identify compromised secret scope (which systems, which repos).
- Revoke in source system (Vault lease revoke, AWS key deactivate, DB user drop).
- Issue new credential with narrower policy if root cause was over-permission.
- Redeploy all consumers—stale env vars in running pods keep old secret alive.
- Post-incident: add detection rule (CloudTrail anomaly, failed auth spike).
sequenceDiagram
participant CI as CI runner
participant IdP as GitHub/GitLab OIDC
participant Cloud as AWS/GCP/Azure
participant SM as Secret manager
CI->>IdP: Request OIDC token (aud scoped)
IdP->>CI: JWT (sub, repo, ref)
CI->>Cloud: AssumeRoleWithWebIdentity
Cloud->>CI: Session creds (15m–1h)
CI->>SM: GetSecret / Vault read
SM->>CI: Short-lived DB password
Audit checklist
- Inventory all secrets touched by CI—spreadsheet is fine, CMDB is better.
- Zero static cloud root keys in CI variables (grep org for AKIA patterns).
- Environment protection: required reviewers on production deploy jobs.
- Secret scanning on .github/ and .gitlab-ci.yml in pre-commit.
Putting it together — secret architecture decision tree
Choose the right control per environment: dev may use SOPS + GitOps; staging uses ESO + cloud SM; production adds dynamic credentials, OIDC-only CI, and quarterly rotation drills.
flowchart TD
Q1{"Need secrets in git?"}
Q1 -->|Yes encrypted| SOPS["SOPS + KMS + GitOps"]
Q1 -->|No| Q2{"K8s workload?"}
Q2 -->|Yes| ESO["ESO / CSI Driver"]
Q2 -->|No| Q3{"CI pipeline?"}
Q3 -->|Yes| OIDC["OIDC federation\nno static keys"]
Q3 -->|No| VAULT["Vault / Cloud SM API"]
ESO --> VAULT
OIDC --> VAULT
| Pattern | Best for | Complexity | Risk if skipped |
|---|---|---|---|
| OIDC to cloud | CI deploy jobs | Medium | CRITICAL |
| External Secrets | K8s apps | Medium | HIGH |
| Vault dynamic DB | Stateful services | High | HIGH |
| Sealed Secrets | GitOps teams without SM API | Low | MEDIUM |
| Pre-commit gitleaks | Every repo | Low | CRITICAL |
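The decision tree in this section can also be read as a small function. This is a sketch whose parameter names are hypothetical; the returned labels are copied from the flowchart nodes.

```python
def choose_pattern(secrets_in_git: bool, k8s_workload: bool, ci_pipeline: bool) -> str:
    """Walk the decision tree top to bottom and return the first matching pattern."""
    if secrets_in_git:
        return "SOPS + KMS + GitOps"
    if k8s_workload:
        # ESO / CSI still reads from Vault or a cloud secret manager behind it.
        return "ESO / CSI Driver"
    if ci_pipeline:
        # OIDC federation likewise ends at a secret manager API call.
        return "OIDC federation, no static keys"
    return "Vault / Cloud SM API"


print(choose_pattern(secrets_in_git=False, k8s_workload=True, ci_pipeline=False))
print(choose_pattern(secrets_in_git=False, k8s_workload=False, ci_pipeline=True))
```

The order of the checks matters: a workload that is both a Kubernetes app and deployed from CI lands on the ESO / CSI branch first, matching the flowchart.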
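# GitHub Actions
name: Org secret standards
on:
  schedule:
    - cron: '0 6 * * 1'  # Mondays at 06:00 UTC
jobs:
  audit-static-keys:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      # -l lists file names only, so a matched key is not printed to the log
      - run: |
          if grep -rlE 'AKIA[0-9A-Z]{16}' .github/; then
            echo "Static AWS key in workflow"
            exit 1
          fi
      # Assumes the gitleaks binary is already installed on the runner
      - run: gitleaks detect --source . --redact

# GitLab CI
weekly-secret-audit:
  stage: audit  # assumes "audit" is declared under stages:
  image:
    name: zricethezav/gitleaks:latest
    entrypoint: [""]
  script:
    - gitleaks detect --source . --redact
    - echo "Review protected variables and OIDC audiences quarterly"
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"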
The north star: no human ever copies a production password. Humans approve deploys; machines fetch short-lived credentials.
Draw the OIDC flow on the whiteboard: CI → IdP JWT → cloud STS → secret manager → app. Mention fork PR isolation and rotation MTTD.
30-day rollout plan
- Week 1 — Enable gitleaks on all repos; inventory CI variables.
- Week 2 — Migrate one service to OIDC deploy; delete static AWS key.
- Week 3 — Deploy ESO; migrate one K8s Secret to ExternalSecret.
- Week 4 — Run rotation game-day; document MTTD and gaps.
Maturity model
| Level | Characteristics | Typical gap |
|---|---|---|
| 1 — Ad hoc | Shared passwords in wiki; .env in git | No scanning |
| 2 — Centralized | CI variables + K8s Secrets | Static keys, no rotation |
| 3 — Federated | OIDC CI; ESO in cluster | Manual rotation runbooks |
| 4 — Dynamic | Vault TTL; automated revocation | Full MTTD metrics |