agent-harness/PLAN-MANAGEMENT.md

# Plan Management — The Living Document

> The IMPLEMENTATION_PLAN.md is the agent's "working memory" between iterations.
> It's the only file the agent both reads AND writes.
> Managing it well is the difference between a smooth build and a chaotic mess.

---

## What Is the Plan?

The plan is a **task decomposition** of the PROJECT-SPEC.md, created by the agent during the planning phase. It's a checklist of discrete, testable tasks ordered by dependency.

```markdown
# Implementation Plan

## Phase 1: Foundation
- [x] Project scaffolding (monorepo, tsconfig, build scripts)
- [x] SQLite schema and migration system
- [ ] QFX file parser with field extraction    ← Agent picks this next
- [ ] CSV import with configurable column mapping
- [ ] Category rules engine

## Phase 2: API Layer
- [ ] REST endpoints for CRUD operations
- [ ] Transaction query with filters
- [ ] Spending aggregation endpoints
```

**Key properties:**
- **Checkboxes** track completion (`[x]` = done, `[ ]` = pending)
- **Order matters** — tasks are listed in dependency order
- **One task per iteration** — the agent picks the first unchecked item
- **Agent updates it** after completing each task
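
The "first unchecked item" rule is simple enough to sketch in a few lines. The snippet below is a minimal illustration, not the harness's actual code: `next_task` and `TASK_RE` are hypothetical names, and the regex assumes every task line has exactly the `- [ ] title` shape shown above.

```python
import re
from typing import Optional

# Matches checklist lines such as "- [ ] QFX file parser" or "- [x] Scaffolding".
# Assumption: one task per line, with a single-character mark inside the brackets.
TASK_RE = re.compile(r"^\s*- \[(?P<mark>.)\] (?P<title>.+)$")


def next_task(plan_text: str) -> Optional[str]:
    """Return the title of the first unchecked task, or None if none remain."""
    for line in plan_text.splitlines():
        match = TASK_RE.match(line)
        if match and match.group("mark") == " ":
            return match.group("title").strip()
    return None


plan = """\
# Implementation Plan

## Phase 1: Foundation
- [x] Project scaffolding (monorepo, tsconfig, build scripts)
- [x] SQLite schema and migration system
- [ ] QFX file parser with field extraction
- [ ] CSV import with configurable column mapping
"""

print(next_task(plan))  # QFX file parser with field extraction
```

Only a mark of a single space counts as pending, so any other mark (such as `x`) is skipped.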

---

## The Plan Lifecycle

### 1. Creation (Planning Phase)

The agent reads PROJECT-SPEC.md and decomposes it:

```
Spec → Agent Analysis → IMPLEMENTATION_PLAN.md
```

**What makes a good decomposition:**
- Each task is **independently testable** (has a clear "done" state)
- Tasks are **small enough** to complete in one iteration (30-60 minutes of agent work)
- Dependencies are respected (can't build API before schema exists)
- Each task maps to one or more acceptance criteria from the spec

**What makes a bad decomposition:**
- Tasks too large ("Build the entire backend")
- Tasks too small ("Create the src directory")
- Missing dependencies (trying to write tests before the test framework is set up)
- Vague tasks ("Improve error handling" — which errors? where?)

### 2. Execution (Build Iterations)

Each iteration, the agent:
1. Reads the plan
2. Finds the first `[ ]` task
3. Implements it
4. Marks it `[x]`
5. Commits the code and the updated plan together
6. Exits

**The commit includes the plan update:**
```bash
git add IMPLEMENTATION_PLAN.md src/parser.ts tests/parser.test.ts
git commit -m "feat: implement QFX file parser with field extraction"
```
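
Step 4 of the loop is a one-character edit to the plan. As a rough sketch (the function name `mark_first_pending_done` is hypothetical, and it assumes the checkbox format used throughout this document), it could look like this:

```python
def mark_first_pending_done(plan_text: str) -> str:
    """Flip the first "- [ ] " checkbox to "- [x] "; leave all other lines untouched."""
    lines = plan_text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.lstrip().startswith("- [ ] "):
            # Replace only the first occurrence on this line, preserving indentation.
            lines[i] = line.replace("- [ ] ", "- [x] ", 1)
            break
    return "".join(lines)


before = "- [x] Project scaffolding\n- [ ] QFX file parser\n- [ ] CSV import\n"
print(mark_first_pending_done(before))
```

Because the function changes exactly one line, the plan diff in each commit stays a single flipped checkbox, which keeps the history easy to review.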

### 3. Human Review (Periodic Check-ins)

You should review the plan at these moments:

| When | What to check |
|------|---------------|
| After planning phase | Does the decomposition make sense? Are tasks the right size? |
| After every 3-5 tasks | Is progress on track? Any tasks taking multiple iterations? |
| When agent signals STUCK | What's blocking it? Can you clarify the spec? |
| At phase boundaries | Is the phase complete? Ready for the next phase? |
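
For the phase-boundary check, a per-phase tally can help. The following is a small sketch under stated assumptions: phases are `## ` headings, tasks are `- [x] ` or `- [ ] ` lines, and `phase_progress` is a hypothetical helper name rather than part of any existing tool.

```python
from typing import Dict, Tuple


def phase_progress(plan_text: str) -> Dict[str, Tuple[int, int]]:
    """Map each "## " heading to (done, total), counting only [x] and [ ] tasks."""
    progress: Dict[str, Tuple[int, int]] = {}
    current = None
    for line in plan_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            current = stripped[3:]
            progress[current] = (0, 0)
        elif current is not None and stripped.startswith(("- [x] ", "- [ ] ")):
            done, total = progress[current]
            is_done = stripped.startswith("- [x] ")
            progress[current] = (done + int(is_done), total + 1)
    return progress


plan = """\
## Phase 1: Foundation
- [x] Project scaffolding
- [x] SQLite schema and migration system
- [ ] QFX file parser with field extraction

## Phase 2: API Layer
- [ ] REST endpoints for CRUD operations
"""

for phase, (done, total) in phase_progress(plan).items():
    print(f"{phase}: {done}/{total}")
```

A phase is ready to close when its two numbers match; tasks with any other mark are left out of both counts.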

---

## Intervening in the Plan

Sometimes you need to change the plan mid-flight. Here's how:

### Adding Tasks

New requirement discovered? Add it to the plan:
```markdown
- [ ] 🆕 Handle expired token during API calls (auto-refresh)
```

Use the 🆕 emoji to mark human-added tasks so you can track what changed.

### Removing Tasks

Scope cut? Mark it explicitly:
```markdown
- [~] ~~Monte Carlo visualization~~ (deferred to Phase 4)
```

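Don't just delete the line: the agent might re-derive the task from the spec and add it back. Because `[~]` is not `[ ]`, the first-unchecked-item rule skips the entry, and the strikethrough records that the cut was deliberate.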

### Reprioritizing

Move tasks up or down. Add a note so the agent understands why the order changed:
```markdown
## Phase 1: Foundation
- [x] Project scaffolding
- [ ] REST API framework  ← Moved up: needed for integration testing
- [ ] SQLite schema
```

### Splitting Tasks

If a task keeps failing (the agent can't complete it in one iteration), split it:
```markdown
# Before
- [ ] Transaction import with deduplication

# After
- [ ] QFX file parser (parse file → array of transaction objects)
- [ ] CSV file parser (configurable column mapping)
- [ ] Transaction dedup engine (match on date + amount + payee hash)
- [ ] Import orchestrator (parse → dedup → insert)
```
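
If you split tasks often, the edit can be scripted. This is only a sketch: `split_task` is a hypothetical helper, it matches the task by its exact title, and it assumes the task is still pending.

```python
from typing import List


def split_task(plan_text: str, title: str, subtasks: List[str]) -> str:
    """Replace the pending task `title` with one pending line per subtask."""
    out = []
    for line in plan_text.splitlines():
        if line.strip() == f"- [ ] {title}":
            # Keep the original indentation so nested lists stay nested.
            indent = line[: len(line) - len(line.lstrip())]
            out.extend(f"{indent}- [ ] {sub}" for sub in subtasks)
        else:
            out.append(line)
    return "\n".join(out) + "\n"


plan = "- [x] SQLite schema\n- [ ] Transaction import with deduplication\n"
print(split_task(plan, "Transaction import with deduplication", [
    "QFX file parser (parse file → array of transaction objects)",
    "Transaction dedup engine (match on date + amount + payee hash)",
]))
```

The subtasks land in the same position as the original task, so the dependency order of the rest of the plan is unchanged.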

### Adding Notes

Leave notes for the agent in the plan:
```markdown
- [ ] CLM API integration
  > NOTE: The DTC CLM Demo account returns 401. Try CSA CLM Demo or
  > DocuSign CLM Demo accounts instead. See ACCOUNTS.md for IDs.
```

---

## Plan Patterns

### The Scaffold-First Pattern
```markdown
- [ ] Project scaffolding (dirs, configs, package.json)
- [ ] Build pipeline (TypeScript compilation, scripts)
- [ ] Test infrastructure (framework, first passing test)
- [ ] CI configuration (lint + build + test)
```

Always start with infrastructure. An agent that can't build and run tests has no way to verify its own output, so errors pile up unnoticed.

### The Vertical Slice Pattern
```markdown
- [ ] Single endpoint: GET /api/health (route + handler + test)
- [ ] Single entity CRUD: categories (model + routes + tests)
- [ ] Second entity: transactions (model + routes + tests + import)
```

Build one complete vertical slice first. It establishes patterns for everything else.

### The Test-First Pattern
```markdown
- [ ] Write failing tests for QFX parser
- [ ] Implement QFX parser to pass tests
- [ ] Write failing tests for CSV parser
- [ ] Implement CSV parser to pass tests
```

Pairs well with agents — the failing test IS the acceptance criterion.

### The Dependency Chain Pattern
```markdown
## Layer 0: Shared
- [ ] Type definitions and interfaces

## Layer 1: Data
- [ ] Database schema and migrations
- [ ] Data access layer

## Layer 2: Business Logic
- [ ] Calculation engines (depends on Layer 1)
- [ ] Rule processors (depends on Layer 1)

## Layer 3: Interface
- [ ] CLI commands (depends on Layer 2)
- [ ] API endpoints (depends on Layer 2)
```

Makes dependencies explicit. Because the agent works top to bottom, it can't start a layer before the layers above it in the file are done.
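
The `(depends on Layer N)` annotations are regular enough to check mechanically. The sketch below assumes headings of the form `## Layer N: ...` and annotations written exactly as in the example; `find_layer_violations` is a hypothetical name.

```python
import re
from typing import List

LAYER_RE = re.compile(r"^## Layer (\d+)")
DEP_RE = re.compile(r"\(depends on Layer (\d+)\)")


def find_layer_violations(plan_text: str) -> List[str]:
    """Return task lines that depend on their own layer or on a later one."""
    violations = []
    current = None
    for line in plan_text.splitlines():
        header = LAYER_RE.match(line)
        if header:
            current = int(header.group(1))
            continue
        dep = DEP_RE.search(line)
        if dep and current is not None and int(dep.group(1)) >= current:
            violations.append(line.strip())
    return violations


plan = """\
## Layer 1: Data
- [ ] Data access layer (depends on Layer 2)

## Layer 2: Business Logic
- [ ] Calculation engines (depends on Layer 1)
"""

print(find_layer_violations(plan))
```

Here the Layer 1 task is reported because it points at a layer the agent has not reached yet; the Layer 2 task passes.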

---

## Tracking Progress

### Git Log as Progress Report

The git log tells the story:
```bash
git log --oneline
# a1b2c3d feat: implement Monte Carlo simulation engine
# d4e5f6a feat: add retirement projection calculator
# 7b8c9d0 feat: spending aggregation API endpoints
# 0e1f2a3 feat: transaction import with deduplication
# 3b4c5d6 chore: project scaffolding and build pipeline
```

Each commit = one completed task. If you see multiple commits for one task, the task was too big.

### Velocity Tracking

Count completed tasks per iteration:
- **1 task/iteration** = ideal (the system is working)
- **0 tasks/iteration** = agent is stuck (intervene)
- **2+ tasks/iteration** = tasks are too small (merge them)
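
One way to measure this is to compare the plan before and after an iteration. The sketch below is illustrative only: `count_done` and `classify_iteration` are hypothetical names, and the labels simply mirror the three cases in the list above.

```python
def count_done(plan_text: str) -> int:
    """Count tasks marked "- [x] " in the plan."""
    return sum(
        1 for line in plan_text.splitlines() if line.lstrip().startswith("- [x] ")
    )


def classify_iteration(before: str, after: str) -> str:
    """Label one iteration by how many tasks it newly checked off."""
    completed = count_done(after) - count_done(before)
    if completed <= 0:
        return "stuck"      # 0 tasks: intervene
    if completed == 1:
        return "ideal"      # 1 task: the system is working
    return "too small"      # 2+ tasks: consider merging tasks


before = "- [x] Scaffolding\n- [ ] QFX parser\n- [ ] CSV import\n"
after = "- [x] Scaffolding\n- [x] QFX parser\n- [ ] CSV import\n"
print(classify_iteration(before, after))  # ideal
```

If each iteration ends in exactly one commit and the plan sits at the repository root, the "before" text could come from `git show HEAD~1:IMPLEMENTATION_PLAN.md`; both conditions are assumptions about your setup.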

### Stuck Detection

Signs the agent is stuck:
- Same task attempted 3+ iterations in a row
- Agent outputs `<promise>STUCK</promise>`
- Build or tests failing repeatedly
- Agent is modifying unrelated files

**When stuck, ask:**
1. Is the task too large? → Split it
2. Is the spec unclear? → Clarify acceptance criteria
3. Is there a technical blocker? → Investigate and add notes
4. Is the agent fighting its own previous work? → Reset to last good commit
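
The first two signs are mechanical enough to check automatically. As a minimal sketch, assuming you record which task each iteration attempted and capture the agent's final output (`repeated_task` and `should_intervene` are hypothetical names):

```python
from typing import List

STUCK_SIGNAL = "<promise>STUCK</promise>"


def repeated_task(attempts: List[str], threshold: int = 3) -> bool:
    """True if the last `threshold` iterations all attempted the same task."""
    if threshold < 1 or len(attempts) < threshold:
        return False
    return len(set(attempts[-threshold:])) == 1


def should_intervene(attempts: List[str], last_output: str, threshold: int = 3) -> bool:
    """Flag an explicit STUCK signal or a task repeated `threshold` times in a row."""
    return STUCK_SIGNAL in last_output or repeated_task(attempts, threshold)


history = ["SQLite schema", "QFX parser", "QFX parser", "QFX parser"]
print(should_intervene(history, "tests still failing"))  # True
```

Repeated build failures and edits to unrelated files need more context (CI results, the diff) and are left to human review here.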

---

## Plan Anti-Patterns

### 1. The God Task
```markdown
# Bad
- [ ] Build the backend

# Good
- [ ] Database schema for users and transactions
- [ ] CRUD API for users
- [ ] CRUD API for transactions
- [ ] Transaction query with date/category filters
```

### 2. The Missing Dependency
```markdown
# Bad (schema doesn't exist yet!)
- [ ] REST API for transactions
- [ ] Database schema

# Good
- [ ] Database schema
- [ ] REST API for transactions
```

### 3. The Implicit Task
```markdown
# Bad (who sets up the test framework?)
- [ ] Write tests for parser

# Good
- [ ] Set up test framework (vitest config, first passing test)
- [ ] Write tests for parser
```

### 4. The Never-Done Task
```markdown
# Bad (when is "improve" done?)
- [ ] Improve error handling

# Good
- [ ] Add user-friendly error messages for auth failures (expired token, missing key, network error)
- [ ] Add try/catch with contextual messages to all API endpoints
```

---

_The plan is your control surface. Master it, and you master the agent loop._