autojanet/skills/terrashark/references/conditional/backend-state-safety.md
Zoë cfec11bb46
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
fix: convert skill submodules to plain directories
stop-slop, taste-skill, terrashark had embedded .git dirs causing
Woodpecker clone to fail on submodule update.
2026-05-30 15:44:44 -07:00

4.7 KiB

Backend State Safety

Load this reference when detected: state backend s3, azurerm, gcs, remote, cloud, pg, consul, or local; or task mentions backend migration, state storage, locking, force-unlock, state backup, or restore.

Detection signals

  • backend blocks using s3, azurerm, gcs, remote, cloud, pg, consul, or local
  • commands or discussion around init -migrate-state, init -reconfigure, state pull, force-unlock, backup, restore, or backend migration
  • CI or production work that creates, changes, or authenticates to state storage

Why this matters

Terraform/OpenTofu state is the source of truth for live resource identity and often contains sensitive values. Backend mistakes can leak secrets, orphan resources, disable locking, or make a routine refactor look like a destructive replacement.

Backend baseline

  • Use remote state for every shared, CI, or production environment.
  • Require locking on every apply path.
  • Encrypt state at rest and in transit.
  • Enable state versioning or point-in-time recovery where the backend supports it.
  • Keep backend storage and lock primitives in a bootstrap root with a separate lifecycle.
  • Never manage the backend bucket/container/table from the same root that uses it as its active backend.
  • Keep backend credentials out of checked-in backend config; prefer workload identity or CI-provided partial backend config.

Backend-specific checks

Backend Required checks
s3 Bucket versioning, encryption, public access block, narrow IAM, lock mechanism configured, state key split by environment/root.
azurerm Storage account encryption, blob soft delete/versioning where available, lease-based locking, private/network restrictions, narrow data-plane RBAC.
gcs Bucket versioning, uniform bucket-level access, encryption policy, narrow IAM, prefix split by environment/root.
remote / cloud Workspace boundary matches blast radius, state sharing is restricted, sensitive variables are marked, applies use approved execution mode.
pg TLS, database backups, least-privilege user, lock behavior verified, connection secrets kept out of code.
consul TLS, ACLs, snapshots/backups, highly available quorum, lock/session behavior verified.
local Solo prototype only; do not use for shared, CI, or production environments.

Migration guardrails

  • Do not combine backend migration with unrelated resource changes.
  • Freeze applies for the affected state before migrating.
  • Pull and securely store a state backup before init -migrate-state; do not commit it.
  • Record current backend type, address/key, workspace, runtime version, and actor.
  • Migrate the lowest-risk environment first.
  • After migration, compare resource addresses before/after and run a no-op plan.
  • Keep the old backend retained and access-controlled until restore has been tested or the rollback window has passed.

Use init -migrate-state when moving state between backends. Use init -reconfigure only when intentionally accepting the configured backend without migrating existing state.

Lock handling

  • Treat a lock as a safety signal, not an inconvenience.
  • Before force-unlock, verify the lock holder, CI run, process, and timestamp.
  • Never recommend force-unlock while an apply may still be running.
  • Serialize applies for shared foundation, backend, identity, and network roots.

Access and secret handling

  • Treat state readers as secret readers.
  • Avoid storing plan/state artifacts in public or broad-access CI logs.
  • If a secret entered state, rotate the secret and use the secret remediation playbook; masking output is not enough.
  • Keep backend read/write permissions separate when the platform supports it.

LLM mistake checklist

  • Suggesting local backend for a team, CI, or production stack.
  • Creating backend storage inside the same root that uses it.
  • Omitting a lock strategy for a shared backend.
  • Treating encryption as protection from anyone who can read state.
  • Combining backend migration with broad resource refactors.
  • Recommending force-unlock without proving no apply is active.
  • Deleting old backend data immediately after migration.
  • Hard-coding backend credentials in HCL or checked-in config.

Validation commands

Use the active runtime (terraform or tofu) consistently:

terraform version
terraform workspace show
terraform state pull > state-backup.json
terraform state list > state-before.txt
terraform init -migrate-state
terraform state list > state-after.txt
diff -u state-before.txt state-after.txt
terraform plan -detailed-exitcode

Store state-backup.json in a secure temporary location outside the repository and delete it only after rollback is no longer needed.