# Project Specification Template

> Copy this file into your project root as `PROJECT-SPEC.md`.
> Fill out each section. The more specific you are, the better the agent performs.
> This is the single source of truth — the agent reads it every iteration.

---

## 1. Project Overview

### What are we building?

### Why does it matter?

### Success criteria

- [ ] Criterion 1
- [ ] Criterion 2
- [ ] Criterion 3

---

## 2. Technical Foundation

### Tech stack

- **Language:**
- **Framework:**
- **Database:**
- **Build system:**
- **Test framework:**
- **Package manager:**

### Project structure

```
project/
├── src/
├── tests/
├── docs/
└── ...
```

### Build & test commands

```bash
# Build
npm run build

# Test
npm test

# Lint
npm run lint
```

### Coding standards

-
-
-

---

## 3. Requirements

### Functional Requirements

> List every feature the system must have. Each should be independently testable.
> Use the format: FR-NNN: Description — Acceptance criteria

#### FR-001: [Feature Name]

**Description:** What it does.

**Acceptance criteria:**
- [ ] Given X, when Y, then Z
- [ ] Edge case: ...
- [ ] Error case: ...

#### FR-002: [Feature Name]

**Description:** What it does.

**Acceptance criteria:**
- [ ] ...

### Non-Functional Requirements

#### NFR-001: Performance
- [ ] Response time < X ms for Y operation
- [ ] Handles Z concurrent users

#### NFR-002: Security
- [ ] No secrets in source code
- [ ] Input validation on all user inputs

#### NFR-003: Testing
- [ ] Minimum 80% code coverage
- [ ] All happy paths tested
- [ ] All error paths tested

---

## 4. Data Model

### Entities

```
Entity: User
- id: UUID (primary key)
- name: string (required, max 100 chars)
- email: string (required, unique, valid email)
- created_at: timestamp (auto-set)
```

### Relationships

### Sample Data

---

## 5. API / Interface Design

### Endpoints / Commands / UI Screens

### Input/Output Examples

---

## 6. Architecture Decisions

### Constraints

> Four categories of constraints.
> Fill all four — even if one is just "none in this category."
> The ESCALATE category is critical for autonomous agents: it defines when to
> stop and ask rather than decide.

- **MUST:** ... (non-negotiable requirements)
- **MUST NOT:** ... (explicit prohibitions — prevents entire failure categories)
- **PREFER:** ... (soft preferences when multiple valid approaches exist)
- **ESCALATE:** ... (conditions where the agent must stop and ask the human rather than decide autonomously)

**Example ESCALATE entries:**
- If a task requires deleting production data → ask first
- If acceptance criteria are ambiguous and the spec doesn't resolve it → ask before implementing
- If a dependency needs to be added that wasn't in the tech stack → confirm before installing
- If the task conflicts with another requirement or constraint → flag the conflict and wait
- If the agent discovers a requirement gap not covered by any FR-NNN → don't fill the gap silently

### Dependencies

### Known Challenges

---

## 7. Phasing (Optional)

> If the project is large, break it into phases. Each phase should be
> independently deployable and testable.

### Phase 1: Foundation
- [ ] Task A
- [ ] Task B

### Phase 2: Core Features
- [ ] Task C
- [ ] Task D

### Phase 3: Polish
- [ ] Task E
- [ ] Task F

---

## 8. Reference Materials

### External docs

### Existing code to learn from

### Anti-patterns

---

## 9. Evaluation Design

> How do you know the agent's output is good? Not "does it look reasonable?"
> but "can you prove it's correct, measurably and consistently?"
>
> Evaluation design is the only thing standing between AI output you can use
> as-is and AI output that requires 40 minutes of cleanup. Build evals before
> you need them — especially before model updates that could silently change behavior.
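One way to read "provably correct": every verification step should be a command whose exit code decides pass/fail, not a human eyeball. A minimal sketch (the command under test here is a stand-in; substitute your own build or CLI invocation):

```shell
#!/bin/sh
# Sketch: a programmatic verification step.
# The check passes or fails by exit code, so it can run unattended.
set -eu

expected="4"
actual=$((2 + 2))   # stand-in for invoking the command under test

if [ "$actual" = "$expected" ]; then
    echo "TC pass: got $actual"
else
    echo "TC fail: expected $expected, got $actual" >&2
    exit 1
fi
```

A script like this slots directly into the **Verification** field of each test case below.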
### Test cases

#### TC-001: [Test Name]

**Input:** What you feed to the agent
**Expected output:** What you should get back (specific, verifiable)
**Verification:** How to check programmatically (command, assertion, manual checklist)

#### TC-002: [Test Name]

**Input:** ...
**Expected output:** ...
**Verification:** ...

### Verification steps

```bash
# Example: verify the build passes
npm test

# Example: check that specific files exist
ls -la src/new-feature/

# Example: verify no regressions
npm run lint
```

### Run-after-update checklist

- [ ] Run all test cases (TC-001 through TC-00N)
- [ ] Compare outputs to expected baselines
- [ ] Check for regression in failure modes (see anti-patterns)
- [ ] Verify ESCALATE triggers still work correctly

### Regression baselines

```json
{
  "tc_001_baseline": "...",
  "tc_002_baseline": "..."
}
```
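The "compare outputs to expected baselines" step can itself be scripted. A minimal sketch, assuming baselines live in a `baselines/` directory and each test run writes its output to `out/` (both directory names are hypothetical, not part of this template; the demo files exist only so the sketch runs standalone):

```shell
#!/bin/sh
# Compare each test-case output against its stored baseline with diff.
# Any mismatch is reported as a regression and the script exits nonzero.
set -eu

mkdir -p baselines out

# Demo data so the sketch runs standalone; in practice your test harness
# populates out/ and baselines/ is committed to version control.
printf 'hello\n' > baselines/TC-001.txt
printf 'hello\n' > out/TC-001.txt

fail=0
for baseline in baselines/TC-*.txt; do
    name=$(basename "$baseline")
    if diff -u "$baseline" "out/$name" > /dev/null; then
        echo "OK: $name matches baseline"
    else
        echo "REGRESSION: $name differs from baseline"
        fail=1
    fi
done
exit "$fail"
```

Run after every model or dependency update; a nonzero exit means at least one output drifted from its recorded baseline.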