agent-harness/PROCESS-EVAL-TEMPLATE.md

# Process Eval Template

> Write this file after the stream is fully merged.
> File location: `.harness/<stream>/process-eval.md`
> Be honest — this is a retrospective, not a press release.
> Future agents and sessions will read this to understand what worked.

---

# Process Eval — [STREAM NAME]
- **Completed:** YYYY-MM-DD
- **Agent:** [model name]
- **Packets:** [XX-01, XX-02, ...]
- **Tests added:** NN total
- **Final test count:** NNNN
- **Wall-clock duration:** [estimated]

---

## Packet Summary

| Packet | Est. Effort | Actual | On Time? | Tests Added |
|--------|-------------|--------|----------|-------------|
| XX-01 | N sessions | N sessions | ✅/❌ | NN |
| XX-02 | ... | ... | ... | ... |

---

## Known-Answer Test Results

| Test | Expected | Actual | Pass? |
|------|----------|--------|-------|
| [Description] | [value] | [value] | ✅/❌ |

---

## Process Quality Dimensions

### Task Sizing
- Estimate accuracy: [XX%]
- Packets that overran: [list or "none"]
- Root cause of overruns: [...]
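
One way to fill in the estimate accuracy figure is to take the share of packets whose actual effort stayed within the estimate. The sketch below assumes that definition (this template does not prescribe one) and uses hypothetical packet IDs and session counts for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    packet_id: str           # hypothetical ID, e.g. "XX-01"
    estimated_sessions: int
    actual_sessions: int


def estimate_accuracy(packets: list[Packet]) -> float:
    """Percentage of packets finished within their estimate (assumed definition)."""
    if not packets:
        return 0.0
    on_time = sum(1 for p in packets if p.actual_sessions <= p.estimated_sessions)
    return 100.0 * on_time / len(packets)


def overruns(packets: list[Packet]) -> list[str]:
    """IDs of packets whose actual effort exceeded the estimate."""
    return [p.packet_id for p in packets if p.actual_sessions > p.estimated_sessions]


# Illustrative data only, not taken from a real stream.
example = [
    Packet("XX-01", estimated_sessions=1, actual_sessions=1),
    Packet("XX-02", estimated_sessions=2, actual_sessions=3),
    Packet("XX-03", estimated_sessions=1, actual_sessions=1),
    Packet("XX-04", estimated_sessions=2, actual_sessions=2),
]
print(f"Estimate accuracy: {estimate_accuracy(example):.0f}%")
print(f"Packets that overran: {overruns(example) or 'none'}")
```

If a stream tracks effort in hours instead of sessions, the same comparison applies; only the field names would change.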

### Test-First Discipline
- Tests committed in the same commit as implementation: [XX/NN packets]
- Patches needed after initial commit: [list or "none"]

### Acceptance Criteria Quality
- Programmatically verifiable criteria: [XX/NN]
- Criteria that required human judgment: [list or "none"]

### Known-Answer Coverage
- New calculation modules: N
- Modules with ≥1 known-answer test: N/N
- Any gaps: [list or "none"]
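
For reference, a known-answer test pins a calculation to a value worked out independently of the code. A minimal sketch, assuming a hypothetical `simple_interest` function stands in for a new calculation module:

```python
import math


def simple_interest(principal: float, annual_rate: float, years: float) -> float:
    """Hypothetical calculation under test: interest = principal * rate * years."""
    return principal * annual_rate * years


def test_simple_interest_known_answer() -> None:
    # Expected value computed by hand: 1000 * 0.05 * 3 = 150.
    expected = 150.0
    actual = simple_interest(1000.0, 0.05, 3.0)
    # Compare with a tolerance, since the inputs are floating-point values.
    assert math.isclose(actual, expected, rel_tol=1e-9)


test_simple_interest_known_answer()
```

A module counts toward the "≥1 known-answer test" line once at least one test of this shape exists for it.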

### Architecture Integrity
- Cross-module import violations: [N]
- New shared utilities created: [list]
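
The violation count can be produced mechanically if the codebase is Python and the import rules are written down. The sketch below is an assumption on both counts: the `forbidden` rule map and the module names `pricing` and `reporting` are hypothetical, and a real project may define violations differently.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Top-level package names imported (absolutely) by a piece of Python source."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return names


def count_violations(sources: dict[str, str], forbidden: dict[str, set[str]]) -> int:
    """Count (module, forbidden import) pairs across the given sources."""
    total = 0
    for module, source in sources.items():
        total += len(imported_modules(source) & forbidden.get(module, set()))
    return total


# Hypothetical modules and rule: `pricing` must not import `reporting`.
sources = {
    "pricing": "import math\nfrom reporting.tables import render\n",
    "reporting": "from pricing import quote\n",
}
forbidden = {"pricing": {"reporting"}}
print(count_violations(sources, forbidden))
```

Relative imports are skipped here on purpose; a stricter check would resolve them against the package layout.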

### Regression Protection
- Regression baseline saved: [yes/no — path if yes]

---

## What Went Well
- [Honest list]

## What Was Hard
- [Honest list — useful for planning the next stream]

## What To Do Differently
- [Actionable changes for next time]

## Model Attribution
- Model: [model name]
- Strengths observed: [...]
- Weaknesses observed: [...]