# Wave-Based Project Management

> The biggest gap in most agentic projects: **planning only one task at a time.**
> This guide captures the wave-based approach: planning a full stream's worth of work
> before writing a single line of implementation code.
>
> Proven in practice: 44 tasks across 4 waves, 1,254 → 1,597 tests, zero regressions.

---

## The Core Insight: Plan the Stream, Not the Task

The basic harness has you plan one task at a time. This works for small projects. For larger projects, it creates problems:

- **Scope drift:** Agent picks up the next task without understanding how it fits the stream
- **Missing dependencies:** Packet 3 turns out to need something Packet 1 should have built
- **Unknown-answer tests discovered too late:** Financial formulas validated by feel, not by known CRA/ESDC figures
- **No clear "done":** What does stream completion actually mean?

The solution: **write the entire execution board for a stream before implementing any of it.**

```
❌ Old approach:
   Plan task → Implement task → Plan next task → Implement → ...

✅ Wave approach:
   Plan ENTIRE stream → Review plan → Implement packet-by-packet → Close stream
```

---

## The Four Levels of Structure

```
Project
└── Waves (groups of streams, sequenced by dependency)
    └── Streams (a feature or module; has its own branch)
        └── Packets (atomic unit of work; one commit per packet)
            └── Tasks (sub-steps within a packet)
```

### Waves

A wave is a set of streams that logically belong together and can be started in parallel (or have light dependencies between them). Waves are gated: Wave N+1 doesn't start until Wave N is fully merged and green.

**Example:**

- Wave 1: Core data models + calculation engines (everything else depends on this)
- Wave 2: Advisory layer + specialized tools (uses Wave 1 outputs)
- Wave 3: Infrastructure + integrations (can be parallel with Wave 2)
- Wave 4: Future vision / stretch goals

### Streams

A stream is a feature branch with a defined scope.
It has:

- One `execution-board.md` (written before any code)
- 2–6 packets
- One `process-eval.md` (written after merge)
- Validation evidence per packet

### Packets

A packet is the atomic unit: one focused chunk of work that produces a commit. It has:

- A clear goal (one sentence)
- Explicit steps
- Known-answer tests (mandatory for calculation work)
- Programmatically verifiable acceptance criteria
- One validation evidence file

---

## The Execution Board: Your Planning Artifact

The execution board lives at `.harness//execution-board.md`. Copy `EXECUTION-BOARD-TEMPLATE.md` and fill it in.

**The rule:** The board must be complete before you write a single line of implementation.

### What "complete" means:

- Every packet is defined with goal, steps, files, and acceptance criteria
- Known-answer tests are written out (not "TBD") for any calculation
- Dependency order between packets is explicit
- Stream completion criteria are listed

### What happens if you skip it:

- You discover mid-stream that Packet 3 needs something Packet 1 didn't build
- You commit calculation code with no ground-truth validation
- You have no clear definition of "done" for the stream
- The next agent session doesn't know what state the stream is in

---

## Known-Answer Tests: The Most Important Rule

For any stream that touches domain-specific calculations (financial math, scientific formulas, regulatory thresholds, physical constants), every calculation module **must** include at least one known-answer test citing an official source.
```typescript
// ✅ Correct: cites official source, tests exact value
test('CPP at 70 is exactly 42% more than at 65', () => {
  // Source: ESDC https://www.canada.ca/en/services/benefits/publicpensions/cpp/benefit-amount.html
  // Formula: +0.7% per month after 65 × 60 months = +42%
  expect(calculateCPPBenefitAtAge(1000, 70) / calculateCPPBenefitAtAge(1000, 65)).toBeCloseTo(1.42, 5);
});

// ❌ Wrong: no source, tests implementation against itself
test('CPP at 70 returns more than at 65', () => {
  expect(calculateCPPBenefitAtAge(1000, 70)).toBeGreaterThan(calculateCPPBenefitAtAge(1000, 65));
});
```

**Why this matters:** An agent can write a plausible-looking formula that's subtly wrong. Without a known-answer test from an authoritative source, you won't catch it until someone gets incorrect results in production. With known-answer tests, errors are caught immediately.

### What qualifies as a "known-answer source":

- Government publications (CRA, ESDC, IRS, HMRC, etc.)
- Official standards documents (ISO, RFC, IEEE)
- Published academic results
- Regulatory filings with specific numerical requirements
- Product specifications with exact values

### The financial accuracy eval pattern

For financial software, create a separate calibration test suite that lives outside the normal unit tests:

```
evals/
└── code-quality/
    └── financial-accuracy.test.ts   ← Run with: npm run eval:financial-accuracy
```

This suite contains ONLY known-answer tests from official sources. It grows over time as you add calculation modules. Run it independently to verify the app's financial accuracy hasn't drifted.
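
One way to keep such a suite honest is to store each known-answer case as data, so the expected value always travels with its citation. The sketch below is only an illustration under stated assumptions: `cppDeferralFactor`, `KnownAnswerCase`, and `runKnownAnswerCases` are hypothetical names, and the stand-in formula covers only the deferral rule quoted in the example above (+0.7% per month after 65, up to 60 months). The 67-year-old case is derived from that same per-month rule. In a real project the `actual` callbacks would presumably call the project's own calculation modules.

```typescript
// Hypothetical stand-in for a project calculation module.
// Implements only the deferral rule cited above: +0.7% per month after 65, capped at 60 months.
function cppDeferralFactor(startAge: number): number {
  const monthsDeferred = Math.min(60, Math.max(0, Math.round((startAge - 65) * 12)));
  return 1 + 0.007 * monthsDeferred;
}

// One known-answer case: the expected value is stored next to its source.
interface KnownAnswerCase {
  name: string;
  source: string;
  expected: number;
  tolerance: number;
  actual: () => number;
}

const ESDC_SOURCE =
  "ESDC https://www.canada.ca/en/services/benefits/publicpensions/cpp/benefit-amount.html";

const cases: KnownAnswerCase[] = [
  {
    name: "CPP started at 70 is 42% higher than at 65",
    source: ESDC_SOURCE,
    expected: 1.42, // 60 months x 0.7%
    tolerance: 1e-9,
    actual: () => cppDeferralFactor(70),
  },
  {
    name: "CPP started at 67 is 16.8% higher than at 65",
    source: ESDC_SOURCE,
    expected: 1.168, // 24 months x 0.7%, derived from the same rule
    tolerance: 1e-9,
    actual: () => cppDeferralFactor(67),
  },
];

// Returns one message per failing case; an empty array means the suite is clean.
function runKnownAnswerCases(suite: KnownAnswerCase[]): string[] {
  const failures: string[] = [];
  for (const c of suite) {
    const got = c.actual();
    if (Math.abs(got - c.expected) > c.tolerance) {
      failures.push(`${c.name}: expected ${c.expected}, got ${got} (source: ${c.source})`);
    }
  }
  return failures;
}

console.log(runKnownAnswerCases(cases));
```

Keeping the citation in the failure message means a drifted formula is reported together with the figure it was supposed to match, which makes the failure easier to review.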

---

## EXECUTION_MASTER.md: The Project Dashboard

Every project using wave-based management should have a single coordination file (typically `EXECUTION_MASTER.md` or equivalent) that shows:

```markdown
# Project Execution Master

## Wave Status

| Wave | Description | Status |
|------|-------------|--------|
| Wave 1 | Core foundations | ✅ Complete |
| Wave 2 | Advisory layer | 🟡 In progress |
| Wave 3 | Infrastructure | ⏸️ Not started |

## Active Streams

| Stream | Branch | Status | Blocker |
|--------|--------|--------|---------|
| cpp-optimizer | feat/cpp-optimizer | ✅ Merged | None |
| rrsp-meltdown | feat/rrsp-meltdown | 🟡 In progress | None |
| estate-planning | feat/estate-planning | ⏸️ Planned | Needs rrsp-meltdown |

## Parallelism Rules

1. Max 2 active streams simultaneously
2. Shared schema changes are always sequential
3. Integration gate before any merge: full test suite must stay green
```

**Every agent session starts by reading this file.** It immediately knows:

- What wave is active
- Which streams are running
- What's blocked and why
- What can run in parallel

---

## The Wave Gate

Before starting Wave N+1, verify:

```
[ ] All streams in Wave N merged to main
[ ] Full test suite green (count ≥ baseline)
[ ] Domain-specific accuracy suite passing (if applicable)
[ ] All regression baselines saved
[ ] Process evals written for all Wave N streams
[ ] process-eval-history.json updated
[ ] IMPLEMENTATION_PLAN: all Wave N tasks marked [x]
[ ] EXECUTION_MASTER: Wave N status updated to ✅
[ ] Human sign-off: outputs are producing correct/plausible results
```

The gate exists because Wave N+1 often builds on Wave N's outputs. If Wave N has silent bugs, they compound in Wave N+1. Catch them at the gate.
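
Several items on the gate checklist are mechanical, so part of the gate can be sketched as a function that returns the list of blockers. This is a hedged illustration, not part of the harness: `WaveGateStatus`, its fields, and `checkWaveGate` are hypothetical names, and how a project gathers these values (CI output, the dashboard, a person) is left open. The numbers in the example are illustrative.

```typescript
// Hypothetical snapshot of the facts the wave gate checklist asks about.
interface WaveGateStatus {
  unmergedStreams: string[];            // Wave N streams not yet merged to main
  suiteGreen: boolean;                  // full test suite passes
  testCount: number;                    // current number of tests
  baselineTestCount: number;            // count recorded before the wave started
  accuracySuitePassing: boolean | null; // null = not applicable to this project
  streamsMissingProcessEval: string[];  // streams without a process-eval.md
  humanSignOff: boolean;                // a person has reviewed the outputs
}

// Returns every reason the next wave may not start; an empty array means the gate is open.
function checkWaveGate(s: WaveGateStatus): string[] {
  const blockers: string[] = [];
  if (s.unmergedStreams.length > 0) {
    blockers.push(`Unmerged streams: ${s.unmergedStreams.join(", ")}`);
  }
  if (!s.suiteGreen) {
    blockers.push("Full test suite is not green");
  }
  if (s.testCount < s.baselineTestCount) {
    blockers.push(`Test count ${s.testCount} is below baseline ${s.baselineTestCount}`);
  }
  if (s.accuracySuitePassing === false) {
    blockers.push("Domain-specific accuracy suite is failing");
  }
  if (s.streamsMissingProcessEval.length > 0) {
    blockers.push(`Process evals missing for: ${s.streamsMissingProcessEval.join(", ")}`);
  }
  if (!s.humanSignOff) {
    blockers.push("Human sign-off is missing");
  }
  return blockers;
}

// Illustrative status: one stream still open and no sign-off yet.
const exampleStatus: WaveGateStatus = {
  unmergedStreams: ["estate-planning"],
  suiteGreen: true,
  testCount: 1597,
  baselineTestCount: 1254,
  accuracySuitePassing: true,
  streamsMissingProcessEval: [],
  humanSignOff: false,
};

console.log(checkWaveGate(exampleStatus));
```

The sketch covers only a subset of the checklist; regression baselines, `process-eval-history.json`, and the plan and dashboard updates are omitted. Human sign-off stays a manual input by design, since it cannot be derived from test results.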

---

## File Organization

```
/
├── AGENT.md                        ← Agent instructions (adapted from AGENT-INSTRUCTIONS.md)
├── IMPLEMENTATION_PLAN.md          ← Master backlog (tasks 1-N, all waves)
├── PROJECT-SPEC.md                 ← What to build (never changes)
├── DECISIONS.md                    ← Architecture Decision Records
└── .harness/
    ├── EXECUTION_MASTER.md         ← Wave/stream dashboard
    ├── EXECUTION-BOARD-TEMPLATE.md ← Copy this for new streams
    ├── VALIDATION-TEMPLATE.md      ← Copy this for packet evidence
    ├── PROCESS-EVAL-TEMPLATE.md    ← Copy this for stream retrospectives
    ├── regression-baselines/       ← Deterministic output snapshots
    ├── /
    │   ├── execution-board.md      ← Written BEFORE implementation
    │   ├── process-eval.md         ← Written AFTER merge
    │   └── validation/
    │       ├── -validation.md
    │       └── -validation.md
    └── /
        └── ...
```

---

## Adapting for Your Project

### Projects WITHOUT domain-specific calculations

Skip the known-answer tests and financial accuracy eval. Keep everything else.

### Projects with a small scope (< 10 tasks)

Skip waves entirely and just use streams. One execution board per logical feature group.

### Projects with a single developer (no parallelism)

Streams are still valuable for planning discipline even if run sequentially.

### Non-TypeScript / non-test projects

Adapt the commit trailers.
The key trackers are:

- **What model did the work** (for attribution and quality tracking)
- **Test counts** or equivalent quality metric
- **Build / type check status**

---

## Quick Reference: The Discipline in One Page

```
BEFORE CODING:
  ✅ Write execution board for the entire stream
  ✅ Define known-answer tests for ALL calculation modules
  ✅ Make acceptance criteria programmatically verifiable

PER PACKET:
  ✅ Code + tests in same commit
  ✅ Full suite green before moving on
  ✅ Write validation evidence immediately after
  ✅ Commit trailer: Agent / Tests / Tests-Added / TypeScript

PER STREAM:
  ✅ Write process eval honestly
  ✅ Merge with --no-ff
  ✅ Update EXECUTION_MASTER

PER WAVE:
  ✅ Run wave gate checklist before starting next wave
  ✅ Human sign-off on outputs
```

---

## Why This Works

The wave-based approach solves three failure modes common in agent projects:

**1. Scope drift.** The execution board defines the stream's boundaries upfront. Agents can't drift into unrelated work because the plan is explicit.

**2. Hidden inaccuracies.** Known-answer tests with official citations are written in the planning phase, before any implementation. This forces precision in the spec, which translates directly into correct implementations.

**3. No definition of done.** The stream completion criteria (in the execution board) tell every agent, every session: "the stream is done when these boxes are checked." No ambiguity.

---

*This pattern was developed through practice on the Fintrove project (2026-03-31 → 2026-04-01): 4 waves, 11 streams, 44 tasks, 1,254 → 1,597 tests, zero regressions.*