# Terraform Skill Design Philosophy This page describes the architectural decisions and empirical process behind TerraShark's design. ## Failure-Mode-First Architecture TerraShark is built around a single insight: **telling an LLM what good Terraform looks like is less effective than telling it how to think about Terraform problems**. The core `SKILL.md` is not a reference manual. It is a 7-step operational workflow that forces the model to diagnose before it generates. This prevents the most common failure pattern in LLM-assisted IaC: producing syntactically valid but operationally dangerous code. ## Token Efficiency as a Design Constraint Context window space is a finite resource. Every token spent on skill content is a token unavailable for the user's actual codebase, conversation history, and tool results. TerraShark is designed for minimal activation cost: | Metric | TerraShark | Typical Alternative | |---|---|---| | Activation cost | ~600 tokens | ~4,400 tokens | | Reference files | 19 focused files | 6 large files | | Loaded per query | 1-2 small files | Large reference dumps | The core `SKILL.md` is 86 lines containing no HCL examples, no inline code blocks, and no tutorial material. It is purely procedural. Depth lives in 19 granular reference files loaded on demand. ## LLM-Aware Guardrails Every reference file that covers a risk domain includes an **LLM mistake checklist** — a list of specific errors that language models make when generating Terraform code: - Defaulting to `count` instead of `for_each` for collections - Omitting `moved` blocks during refactors, causing destroy/create cycles - Using `sensitive` and assuming the value is safe from state - Proposing plaintext credential defaults "for demo purposes" - Recommending CLI-only `terraform import` instead of declarative import blocks These checklists exist because the model needs to know **what it gets wrong**, not just what is correct. A reference that only shows the right pattern still allows the model to hallucinate the wrong one. A reference that explicitly names the hallucination pattern reduces it. The **Feature Guard Table** in `coding-standards.md` maps Terraform features to their minimum version and the specific LLM error pattern associated with each, letting the model check feature availability before emitting code. ## Output Contracts Every TerraShark response includes a structured output contract: - **Assumptions and version floor** — what the model assumed - **Selected failure modes** — which risks were diagnosed - **Chosen remediation and tradeoffs** — what was recommended and why - **Validation/test plan** — how to verify the output - **Rollback/recovery notes** — how to undo if something goes wrong This makes outputs auditable. A reader can check assumptions, verify failure mode coverage, and validate the rollback path before applying anything. ## Reference Granularity The 19 reference files are organized by concern, not by Terraform concept: | Category | Files | When Loaded | |---|---|---| | **Primary failure modes** | Identity churn, secret exposure, blast radius, CI drift, compliance gates | When that failure mode is diagnosed | | **Structural guidance** | Structure/state, backend state safety, module architecture, coding standards | When designing, refactoring, or changing backends | | **Operational references** | Migration playbooks, testing matrix, CI delivery, security/governance, quick ops | For specific operational tasks | | **Pattern banks** | Good examples, bad examples, neutral examples, do/don't patterns | For review or teaching | | **Integration and meta** | MCP integration, token balance rationale | When relevant | Each file is self-contained. No file depends on another file being loaded simultaneously. ## Deep Hierarchy Model For platform engineering at scale, TerraShark defines a 5-level module hierarchy: | Level | Role | Scope | |---|---|---| | **L0** | Primitives | One resource family, strict contract | | **L1** | Composites | Capability units built from primitives | | **L2** | Domain stacks | Bounded business domains | | **L3** | Environment roots | Env-specific wiring and configuration | | **L4** | Org orchestration | Account/project vending and shared policy | Dependencies flow downward only. Each level owns its state boundary and apply lifecycle. ## Content Inclusion Rules Content enters TerraShark only when at least one condition is met: 1. It materially lowers the probability of destructive or non-compliant changes 2. It prevents common plan/apply surprises 3. It encodes organizational guardrails that general model knowledge cannot infer Content is excluded when: 1. It is generic Terraform/OpenTofu knowledge with low failure impact 2. It is provider-specific deep design that belongs in project docs 3. It duplicates an existing rule without adding a new decision signal ## The Token Experiment The content in TerraShark was empirically tested, not designed by intuition. ### Process 1. **Started large** — broader coverage, more examples, more tutorial material 2. **Built automated test suite** — practical Terraform/OpenTofu task patterns 3. **Measured baseline quality** — correctness, safety, completeness, hallucination rate 4. **Stripped iteratively** — removed sections one at a time, re-running the full test suite 5. **Measured quality impact** — if quality dropped, content was restored; if stable, content was permanently removed 6. **Converged** — continued until every remaining section was load-bearing ### What Survived (Models Need Help With) - Module role boundaries and composition rules - Migration playbooks (moved blocks, count-to-for_each, imports) - Native test caveats (set indexing, computed values, mocked providers) - CI delivery templates (policy checks, artifact integrity, env protection) - Quick troubleshooting (stuck locks, backend migration, provider auth in CI) ### What Was Removed (Models Already Know) - Generic HCL syntax tutorials - Provider-specific resource deep dives - Broad "best practice" prose without failure-mode framing - Duplicate explanations of concepts covered by multiple rules ### Core Design Principle **High signal density.** Every line must earn its token cost by preventing a specific failure mode or encoding knowledge the model demonstrably lacks. Content that merely restates what the model already knows is actively harmful — it burns context window space without improving output quality.