Backend State Safety
Load this reference when detected: state backend s3, azurerm, gcs, remote, cloud, pg, consul, or local; or task mentions backend migration, state storage, locking, force-unlock, state backup, or restore.
Detection signals
- backend blocks using `s3`, `azurerm`, `gcs`, `remote`, `cloud`, `pg`, `consul`, or `local`
- commands or discussion around `init -migrate-state`, `init -reconfigure`, `state pull`, `force-unlock`, backup, restore, or backend migration
- CI or production work that creates, changes, or authenticates to state storage
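The first signal can be checked mechanically. As a rough sketch, assuming configuration lives in `*.tf` files and each backend declaration shares a line with its opening brace, a hypothetical helper `detect_backend_blocks` might look like this:

```shell
# Hypothetical helper: list backend/cloud block declarations under a directory.
# Assumes the declaration and its opening brace are on the same line.
detect_backend_blocks() {
  # $1: directory to scan (defaults to the current directory).
  # .terraform is pruned so downloaded modules are not reported.
  find "${1:-.}" -name .terraform -prune -o -type f -name '*.tf' \
    -exec grep -nE \
      '^[[:space:]]*(backend[[:space:]]+"(s3|azurerm|gcs|remote|pg|consul|local)"|cloud)[[:space:]]*\{' \
      /dev/null {} +
}
```

Read the printed `file:line:` matches rather than the exit status, which is non-zero when nothing matches.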
Why this matters
Terraform/OpenTofu state is the source of truth for live resource identity and often contains sensitive values. Backend mistakes can leak secrets, orphan resources, disable locking, or make a routine refactor look like a destructive replacement.
Backend baseline
- Use remote state for every shared, CI, or production environment.
- Require locking on every apply path.
- Encrypt state at rest and in transit.
- Enable state versioning or point-in-time recovery where the backend supports it.
- Keep backend storage and lock primitives in a bootstrap root with a separate lifecycle.
- Never manage the backend bucket/container/table from the same root that uses it as its active backend.
- Keep backend credentials out of checked-in backend config; prefer workload identity or CI-provided partial backend config.
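The last point can be illustrated with partial backend configuration supplied at `init` time. This is a minimal sketch assuming an `s3` backend. The variable names `STATE_BUCKET`, `STATE_KEY`, `STATE_REGION`, and `TF_BIN` are hypothetical and would be provided by the CI system, with credentials coming from workload identity rather than from any file.

```shell
# Sketch: pass backend settings at init time instead of committing them.
# STATE_BUCKET, STATE_KEY, STATE_REGION and TF_BIN are hypothetical names.
init_with_partial_backend() (
  # Fail early if CI did not provide a required value.
  : "${STATE_BUCKET:?must be set by CI}"
  : "${STATE_KEY:?must be set by CI}"
  : "${STATE_REGION:?must be set by CI}"
  "${TF_BIN:-terraform}" init -input=false \
    -backend-config="bucket=${STATE_BUCKET}" \
    -backend-config="key=${STATE_KEY}" \
    -backend-config="region=${STATE_REGION}"
)
```

The checked-in backend block then only needs to name the backend type, and each environment or root gets its own `key`.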
Backend-specific checks
| Backend | Required checks |
|---|---|
| `s3` | Bucket versioning, encryption, public access block, narrow IAM, lock mechanism configured, state key split by environment/root. |
| `azurerm` | Storage account encryption, blob soft delete/versioning where available, lease-based locking, private/network restrictions, narrow data-plane RBAC. |
| `gcs` | Bucket versioning, uniform bucket-level access, encryption policy, narrow IAM, prefix split by environment/root. |
| `remote` / `cloud` | Workspace boundary matches blast radius, state sharing is restricted, sensitive variables are marked, applies use approved execution mode. |
| `pg` | TLS, database backups, least-privilege user, lock behavior verified, connection secrets kept out of code. |
| `consul` | TLS, ACLs, snapshots/backups, highly available quorum, lock/session behavior verified. |
| `local` | Solo prototype only; do not use for shared, CI, or production environments. |
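As one possible way to spot-check part of the `s3` row, the sketch below uses the AWS CLI. It assumes the `aws` command is installed and authenticated. The function name is hypothetical, and it covers only versioning and the presence of a public access block, so encryption, IAM, and locking still need separate review.

```shell
# Sketch only: spot-checks two items from the s3 row with the AWS CLI.
# The function name is hypothetical.
check_s3_state_bucket() (
  bucket=$1
  status=$(aws s3api get-bucket-versioning --bucket "$bucket" \
    --query Status --output text) || exit 1
  if [ "$status" != "Enabled" ]; then
    echo "versioning is '$status', expected 'Enabled'" >&2
    exit 1
  fi
  # Fails when the bucket has no public access block configuration at all;
  # it does not verify that every setting in the block is enabled.
  aws s3api get-public-access-block --bucket "$bucket" >/dev/null || {
    echo "no public access block configured" >&2
    exit 1
  }
  echo "ok: $bucket"
)
```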
Migration guardrails
- Do not combine backend migration with unrelated resource changes.
- Freeze applies for the affected state before migrating.
- Pull and securely store a state backup before `init -migrate-state`; do not commit it.
- Record current backend type, address/key, workspace, runtime version, and actor.
- Migrate the lowest-risk environment first.
- After migration, compare resource addresses before/after and run a no-op plan.
- Keep the old backend retained and access-controlled until restore has been tested or the rollback window has passed.
Use init -migrate-state when moving state between backends. Use init -reconfigure only when intentionally accepting the configured backend without migrating existing state.
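The backup and record-keeping guardrails could be scripted roughly as follows. This is a sketch under assumptions: `TF_BIN` and both function names are hypothetical, the backend type and address are supplied by the operator, and the temporary directory is assumed to sit outside the repository.

```shell
# Sketch: capture a state backup and a migration record before migrating.
# TF_BIN and both function names are hypothetical.
pull_state_backup() (
  umask 077                      # backup readable by the current user only
  dir=$(mktemp -d) || exit 1     # temp dir, normally outside the repository
  "${TF_BIN:-terraform}" state pull > "$dir/state-backup.json" || exit 1
  echo "$dir/state-backup.json"  # print the path so the caller can keep it
)

record_migration_context() (
  # $1: current backend type, $2: backend address or key (operator-supplied)
  tf=${TF_BIN:-terraform}
  printf 'backend_type=%s\n' "$1"
  printf 'backend_address=%s\n' "$2"
  printf 'workspace=%s\n' "$("$tf" workspace show)"
  printf 'runtime_version=%s\n' "$("$tf" version | head -n 1)"
  printf 'actor=%s\n' "$(id -un)"
  printf 'recorded_at=%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
)
```

Run both before `init -migrate-state`, and store the record alongside the backup rather than in the repository.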
Lock handling
- Treat a lock as a safety signal, not an inconvenience.
- Before `force-unlock`, verify the lock holder, CI run, process, and timestamp.
- Never recommend `force-unlock` while an apply may still be running.
- Serialize applies for shared foundation, backend, identity, and network roots.
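One way to make the verification step harder to skip is a thin wrapper around `force-unlock`. The sketch below is illustrative only. `TF_BIN`, the function name, and the `holder-verified` acknowledgement token are hypothetical, and the verification itself remains a human step.

```shell
# Sketch: refuse to force-unlock unless the operator states that the lock
# holder was checked. TF_BIN, the function name, and the acknowledgement
# token are hypothetical.
guarded_force_unlock() (
  lock_id=${1:?usage: guarded_force_unlock LOCK_ID holder-verified}
  ack=${2:-}
  if [ "$ack" != "holder-verified" ]; then
    echo "refusing: confirm lock holder, CI run, process, and timestamp first" >&2
    exit 1
  fi
  # No -force flag, so the runtime still asks for interactive confirmation.
  "${TF_BIN:-terraform}" force-unlock "$lock_id"
)
```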
Access and secret handling
- Treat state readers as secret readers.
- Avoid storing plan/state artifacts in public or broad-access CI logs.
- If a secret entered state, rotate the secret and use the secret remediation playbook; masking output is not enough.
- Keep backend read/write permissions separate when the platform supports it.
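To see why state readers are effectively secret readers, it can help to list which attribute names in a pulled state file look secret-like. This is a rough sketch: the keyword list is an assumption and the function name is hypothetical. It prints names only, never values, and a clean result does not prove the state is free of secrets.

```shell
# Sketch: print attribute *names* in a state file that look secret-like.
# The keyword list is an assumption; values are never printed.
list_secret_like_keys() {
  # $1: path to a state JSON file, e.g. one produced by `state pull`
  grep -Eio '"[a-z0-9_]*(password|secret|token|private_key)[a-z0-9_]*"[[:space:]]*:' "$1" |
    sed 's/[^A-Za-z0-9_]//g' |   # keep only the key name itself
    sort -u
}
```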
LLM mistake checklist
- Suggesting `local` backend for a team, CI, or production stack.
- Creating backend storage inside the same root that uses it.
- Omitting a lock strategy for a shared backend.
- Treating encryption as protection from anyone who can read state.
- Combining backend migration with broad resource refactors.
- Recommending `force-unlock` without proving no apply is active.
- Deleting old backend data immediately after migration.
- Hard-coding backend credentials in HCL or checked-in config.
Validation commands
Use the active runtime (terraform or tofu) consistently:
terraform version
terraform workspace show
terraform state pull > state-backup.json
terraform state list > state-before.txt
terraform init -migrate-state
terraform state list > state-after.txt
diff -u state-before.txt state-after.txt
terraform plan -detailed-exitcode
Store state-backup.json in a secure temporary location outside the repository and delete it only after rollback is no longer needed.
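The final `plan -detailed-exitcode` step returns 0 when there are no changes, 1 on error, and 2 when changes are present. A small wrapper can make that outcome explicit in CI. This is a sketch, with `TF_BIN` and the function name as hypothetical choices.

```shell
# Sketch: turn `plan -detailed-exitcode` into an explicit post-migration check.
# Exit codes: 0 = no changes, 1 = error, 2 = changes present.
# TF_BIN and the function name are hypothetical.
assert_noop_plan() (
  rc=0
  "${TF_BIN:-terraform}" plan -detailed-exitcode -input=false >/dev/null || rc=$?
  case "$rc" in
    0) echo "no-op plan" ;;
    2) echo "plan shows changes; keep applies frozen and investigate" >&2 ;;
    *) echo "plan failed with exit code $rc" >&2 ;;
  esac
  exit "$rc"
)
```

Any non-zero result after a backend migration suggests comparing `state-before.txt` and `state-after.txt` again before applies resume.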