Plan Management — The Living Document

The IMPLEMENTATION_PLAN.md is the agent's "working memory" between iterations. It's the only file the agent both reads AND writes. Managing it well is the difference between a smooth build and a chaotic mess.


What Is the Plan?

The plan is a task decomposition of the PROJECT-SPEC.md, created by the agent during the planning phase. It's a checklist of discrete, testable tasks ordered by dependency.

# Implementation Plan

## Phase 1: Foundation
- [x] Project scaffolding (monorepo, tsconfig, build scripts)
- [x] SQLite schema and migration system
- [ ] QFX file parser with field extraction    ← Agent picks this next
- [ ] CSV import with configurable column mapping
- [ ] Category rules engine

## Phase 2: API Layer
- [ ] REST endpoints for CRUD operations
- [ ] Transaction query with filters
- [ ] Spending aggregation endpoints

Key properties:

  • Checkboxes track completion ([x] = done, [ ] = pending)
  • Order matters — tasks are listed in dependency order
  • One task per iteration — the agent picks the first unchecked item
  • Agent updates it after completing each task
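The "first unchecked item" rule can be sketched in a few lines. This is a minimal illustration, not the harness's actual code: the function name `next_task` and the regular expression are assumptions, and the parser only understands the `- [x]` / `- [ ]` checkbox lines shown above.

```python
import re
from typing import Optional

# Matches a checkbox item such as "- [x] Task" or "- [ ] Task".
CHECKBOX = re.compile(r"^\s*- \[(?P<mark>[ x])\] (?P<title>.+?)\s*$")


def next_task(plan_text: str) -> Optional[str]:
    """Return the title of the first pending ([ ]) task, or None if none remain."""
    for line in plan_text.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group("mark") == " ":
            return match.group("title")
    return None


plan = """\
## Phase 1: Foundation
- [x] Project scaffolding
- [x] SQLite schema and migration system
- [ ] QFX file parser with field extraction
- [ ] CSV import with configurable column mapping
"""
print(next_task(plan))  # QFX file parser with field extraction
```

Because the scan is top to bottom, reordering lines in the plan is all it takes to change what the agent works on next.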

The Plan Lifecycle

1. Creation (Planning Phase)

The agent reads PROJECT-SPEC.md and decomposes it:

Spec → Agent Analysis → IMPLEMENTATION_PLAN.md

What makes a good decomposition:

  • Each task is independently testable (has a clear "done" state)
  • Tasks are small enough to complete in one iteration (30-60 minutes of agent work)
  • Dependencies are respected (can't build API before schema exists)
  • Each task maps to one or more acceptance criteria from the spec

What makes a bad decomposition:

  • Tasks too large ("Build the entire backend")
  • Tasks too small ("Create the src directory")
  • Missing dependencies (trying to write tests before the test framework is set up)
  • Vague tasks ("Improve error handling" — which errors? where?)
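Some of these problems can be caught with a rough heuristic before the agent starts building. The sketch below flags vague and oversized task titles; the word lists and the function name `lint_task` are hypothetical, and a keyword match is no substitute for reading the plan yourself.

```python
from typing import List

# Assumed word lists for illustration; tune them to your own plans.
VAGUE_WORDS = {"improve", "enhance", "polish", "cleanup"}
BROAD_WORDS = {"entire", "whole", "everything"}


def lint_task(title: str) -> List[str]:
    """Return warnings for a task title that looks vague or too large."""
    words = {w.strip("().,:;\"'").lower() for w in title.split()}
    warnings = []
    if words & VAGUE_WORDS:
        warnings.append("vague: no clear done state")
    if words & BROAD_WORDS:
        warnings.append("too large: covers a whole subsystem")
    return warnings


print(lint_task("Improve error handling"))
print(lint_task("Build the entire backend"))
print(lint_task("QFX file parser with field extraction"))
```

Tasks that are too small or that hide a missing dependency are harder to detect from the title alone, so this check does not attempt them.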

2. Execution (Build Iterations)

Each iteration, the agent:

  1. Reads the plan
  2. Finds the first [ ] task
  3. Implements it
  4. Marks it [x]
  5. Commits the code changes together with the updated plan
  6. Exits

The commit includes the plan update:

git add IMPLEMENTATION_PLAN.md src/parser.ts tests/parser.test.ts
git commit -m "feat: implement QFX file parser with field extraction"
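Step 4 (marking the task `[x]`) is a one-line text edit. The sketch below shows one way to do it, assuming task titles are unique within the plan; `mark_done` is a hypothetical helper, not part of any existing tool.

```python
def mark_done(plan_text: str, title: str) -> str:
    """Flip the first pending checkbox whose title matches exactly."""
    lines = plan_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == f"- [ ] {title}":
            # Replace only the checkbox, keeping indentation and title intact.
            lines[i] = line.replace("- [ ]", "- [x]", 1)
            return "\n".join(lines) + "\n"
    raise ValueError(f"pending task not found: {title}")


before = "- [x] Project scaffolding\n- [ ] QFX file parser\n- [ ] CSV import\n"
print(mark_done(before, "QFX file parser"))
```

Raising on a missing title is deliberate: a silent no-op would let the agent commit code without recording that the task is finished.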

3. Human Review (Periodic Check-ins)

You should review the plan at these moments:

| When | What to check |
|------|---------------|
| After planning phase | Does the decomposition make sense? Are tasks the right size? |
| After every 3-5 tasks | Is progress on track? Any tasks taking multiple iterations? |
| When agent signals STUCK | What's blocking it? Can you clarify the spec? |
| At phase boundaries | Is the phase complete? Ready for the next phase? |

Intervening in the Plan

Sometimes you need to change the plan mid-flight. Here's how:

Adding Tasks

New requirement discovered? Add it to the plan:

- [ ] 🆕 Handle expired token during API calls (auto-refresh)

Use the 🆕 emoji to mark human-added tasks so you can track what changed.

Removing Tasks

Scope cut? Mark it explicitly:

- [~] ~~Monte Carlo visualization~~ (deferred to Phase 4)

Don't just delete — the agent might re-derive it from the spec.
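Any script that reads the plan then needs to treat `[~]` as a third state, distinct from done and pending. A minimal sketch of a status summary, assuming only the three markers used in this document (`summarize` and the status names are illustrative):

```python
import re
from collections import Counter
from typing import Dict

# Map each checkbox marker to a status name (names are assumptions).
STATUS = {"x": "done", " ": "pending", "~": "deferred"}
ITEM = re.compile(r"^\s*- \[([ x~])\] ")


def summarize(plan_text: str) -> Dict[str, int]:
    """Count plan items by status, keeping deferred items out of 'pending'."""
    counts: Counter = Counter()
    for line in plan_text.splitlines():
        match = ITEM.match(line)
        if match:
            counts[STATUS[match.group(1)]] += 1
    return dict(counts)


plan = """\
- [x] Spending aggregation endpoints
- [ ] Retirement projection calculator
- [~] ~~Monte Carlo visualization~~ (deferred to Phase 4)
- [ ] Export to CSV
"""
print(summarize(plan))
```

A deferred item never counts as pending, so a "first unchecked task" scan that matches only `[ ]` will skip it.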

Reprioritizing

Move tasks up or down. Add a note so the agent understands:

## Phase 1: Foundation
- [x] Project scaffolding
- [ ] REST API framework  ← Moved up: needed for integration testing
- [ ] SQLite schema

Splitting Tasks

If a task keeps failing (agent can't complete it in one iteration), split it:

# Before
- [ ] Transaction import with deduplication

# After
- [ ] QFX file parser (parse file → array of transaction objects)
- [ ] CSV file parser (configurable column mapping)
- [ ] Transaction dedup engine (match on date + amount + payee hash)
- [ ] Import orchestrator (parse → dedup → insert)
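If you split tasks often, the edit can be scripted. The sketch below replaces one pending task with a list of subtasks in the same position, so dependency order is preserved; `split_task` is a hypothetical helper and assumes the title appears exactly once.

```python
from typing import List


def split_task(plan_text: str, title: str, subtasks: List[str]) -> str:
    """Replace a pending task with pending subtasks at the same position."""
    out = []
    replaced = False
    for line in plan_text.splitlines():
        if not replaced and line.strip() == f"- [ ] {title}":
            # Reuse the original indentation for every subtask.
            indent = line[: len(line) - len(line.lstrip())]
            out.extend(f"{indent}- [ ] {sub}" for sub in subtasks)
            replaced = True
        else:
            out.append(line)
    if not replaced:
        raise ValueError(f"pending task not found: {title}")
    return "\n".join(out) + "\n"


plan = "- [x] SQLite schema\n- [ ] Transaction import with deduplication\n- [ ] REST endpoints\n"
print(split_task(plan, "Transaction import with deduplication",
                 ["QFX file parser", "CSV file parser", "Transaction dedup engine"]))
```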

Adding Notes

Leave notes for the agent in the plan:

- [ ] CLM API integration
  > NOTE: The DTC CLM Demo account returns 401. Try CSA CLM Demo or
  > DocuSign CLM Demo accounts instead. See ACCOUNTS.md for IDs.

Plan Patterns

The Scaffold-First Pattern

- [ ] Project scaffolding (dirs, configs, package.json)
- [ ] Build pipeline (TypeScript compilation, scripts)
- [ ] Test infrastructure (framework, first passing test)
- [ ] CI configuration (lint + build + test)

Always start with infrastructure. An agent that can't build and run tests has no way to verify its own work, so mistakes pile up unnoticed across iterations.

The Vertical Slice Pattern

- [ ] Single endpoint: GET /api/health (route + handler + test)
- [ ] Single entity CRUD: categories (model + routes + tests)
- [ ] Second entity: transactions (model + routes + tests + import)

Build one complete vertical slice first. It establishes patterns for everything else.

The Test-First Pattern

- [ ] Write failing tests for QFX parser
- [ ] Implement QFX parser to pass tests
- [ ] Write failing tests for CSV parser
- [ ] Implement CSV parser to pass tests

This pattern pairs well with agents because the failing test is the acceptance criterion: the task is done when the test passes.

The Dependency Chain Pattern

## Layer 0: Shared
- [ ] Type definitions and interfaces

## Layer 1: Data
- [ ] Database schema and migrations
- [ ] Data access layer

## Layer 2: Business Logic
- [ ] Calculation engines (depends on Layer 1)
- [ ] Rule processors (depends on Layer 1)

## Layer 3: Interface
- [ ] CLI commands (depends on Layer 2)
- [ ] API endpoints (depends on Layer 2)

This makes dependencies explicit, so the agent has no reason to skip ahead to a task whose prerequisites are unfinished.
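Explicit annotations can also be checked mechanically. The sketch below assumes the exact `## Layer N:` heading and `(depends on Layer M)` wording used in the example above, and reports any task that depends on its own layer or a later one; `dependency_violations` is an illustrative name.

```python
import re
from typing import List

LAYER = re.compile(r"^## Layer (\d+):")
DEPENDS = re.compile(r"\(depends on Layer (\d+)\)")


def dependency_violations(plan_text: str) -> List[str]:
    """Return task lines that depend on their own layer or a later one."""
    violations = []
    current = None  # Layer number of the section being scanned.
    for line in plan_text.splitlines():
        heading = LAYER.match(line)
        if heading:
            current = int(heading.group(1))
            continue
        dep = DEPENDS.search(line)
        if dep and current is not None and int(dep.group(1)) >= current:
            violations.append(line.strip())
    return violations


bad_plan = """\
## Layer 1: Data
- [ ] Data access layer (depends on Layer 2)

## Layer 2: Business Logic
- [ ] Calculation engines (depends on Layer 1)
"""
print(dependency_violations(bad_plan))
```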


Tracking Progress

Git Log as Progress Report

The git log tells the story:

git log --oneline
# a1b2c3d feat: implement Monte Carlo simulation engine
# d4e5f6a feat: add retirement projection calculator
# e7f8a9b feat: spending aggregation API endpoints
# c0d1e2f feat: transaction import with deduplication
# b3a4c5d chore: project scaffolding and build pipeline

Each commit = one completed task. If you see multiple commits for one task, the task was too big.

Velocity Tracking

Count completed tasks per iteration:

  • 1 task/iteration = ideal (the system is working)
  • 0 tasks/iteration = agent is stuck (intervene)
  • 2+ tasks/iteration = tasks are too small (merge them)
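One way to measure this is to diff the number of `[x]` items in the plan before and after an iteration. The sketch below applies the three thresholds above; the function names and return strings are assumptions for illustration.

```python
def count_done(plan_text: str) -> int:
    """Count completed ([x]) items in the plan."""
    return sum(
        1 for line in plan_text.splitlines() if line.lstrip().startswith("- [x] ")
    )


def classify_iteration(before: str, after: str) -> str:
    """Classify one iteration by how many tasks it completed."""
    completed = count_done(after) - count_done(before)
    if completed <= 0:
        return "stuck: intervene"
    if completed == 1:
        return "ideal"
    return "tasks too small: merge them"


before = "- [x] Scaffolding\n- [ ] QFX parser\n- [ ] CSV import\n"
after = "- [x] Scaffolding\n- [x] QFX parser\n- [ ] CSV import\n"
print(classify_iteration(before, after))  # ideal
```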

Stuck Detection

Signs the agent is stuck:

  • Same task attempted 3+ iterations in a row
  • Agent outputs <promise>STUCK</promise>
  • Build or tests failing repeatedly
  • Agent is modifying unrelated files

When stuck, ask:

  1. Is the task too large? → Split it
  2. Is the spec unclear? → Clarify acceptance criteria
  3. Is there a technical blocker? → Investigate and add notes
  4. Is the agent fighting its own previous work? → Reset to last good commit
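The first two signs are easy to detect automatically. The sketch below assumes your harness records which task each iteration attempted and captures the agent's output as text; both helper names are hypothetical.

```python
from typing import List

STUCK_SIGNAL = "<promise>STUCK</promise>"


def repeated_attempts(attempts: List[str], threshold: int = 3) -> bool:
    """True if the last `threshold` iterations all attempted the same task."""
    if len(attempts) < threshold:
        return False
    recent = attempts[-threshold:]
    return all(task == recent[0] for task in recent)


def signals_stuck(agent_output: str) -> bool:
    """True if the agent explicitly emitted the STUCK signal."""
    return STUCK_SIGNAL in agent_output


history = ["SQLite schema", "QFX file parser", "QFX file parser", "QFX file parser"]
print(repeated_attempts(history))  # True
print(signals_stuck("Tests still failing. <promise>STUCK</promise>"))  # True
```

Repeated build failures and edits to unrelated files need more context (CI results, the diff), so they are left to human review here.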

Plan Anti-Patterns

1. The God Task

# Bad
- [ ] Build the backend

# Good
- [ ] Database schema for users and transactions
- [ ] CRUD API for users
- [ ] CRUD API for transactions
- [ ] Transaction query with date/category filters

2. The Missing Dependency

# Bad (schema doesn't exist yet!)
- [ ] REST API for transactions
- [ ] Database schema

# Good
- [ ] Database schema
- [ ] REST API for transactions

3. The Implicit Task

# Bad (who sets up the test framework?)
- [ ] Write tests for parser

# Good
- [ ] Set up test framework (vitest config, first passing test)
- [ ] Write tests for parser

4. The Never-Done Task

# Bad (when is "improve" done?)
- [ ] Improve error handling

# Good
- [ ] Add user-friendly error messages for auth failures (expired token, missing key, network error)
- [ ] Add try/catch with contextual messages to all API endpoints

The plan is your control surface. Master it, and you master the agent loop.