Recipe Manager Agentic Runbook
Last updated: 2026-03-24
Purpose
Operational guide for running the Recipe Manager agent harness reliably.
Core Execution Model
- One task per iteration
- One commit per iteration
- TODO.md is the authoritative queue
- Work only in:
/home/paulh/.openclaw/workspace/projects/recipe-manager
Required Guards (Must Pass Before Coding)
Pre-flight checks
Before any iteration starts, verify these files exist:
- AGENT_INSTRUCTIONS.md
- TODO.md
If missing, fail with:
STUCK: bad working dir or missing harness files at /home/paulh/.openclaw/workspace/projects/recipe-manager
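The pre-flight check above could be sketched as a small shell function. This is a minimal illustration, not the harness's actual guard: the function name `preflight_check` and the `PROJECT_ROOT` variable are hypothetical, while the file names, path, and STUCK message are taken from this runbook.

```shell
#!/usr/bin/env bash
# Project root as given in this runbook.
PROJECT_ROOT="/home/paulh/.openclaw/workspace/projects/recipe-manager"

# preflight_check (hypothetical helper): succeed only if both harness
# files exist in the given directory; otherwise print the STUCK message.
preflight_check() {
  # $1: directory to check (defaults to PROJECT_ROOT)
  local dir="${1:-$PROJECT_ROOT}"
  local f
  for f in AGENT_INSTRUCTIONS.md TODO.md; do
    if [ ! -f "$dir/$f" ]; then
      echo "STUCK: bad working dir or missing harness files at $dir"
      return 1
    fi
  done
  return 0
}

# Usage at the top of an iteration script:
#   preflight_check || exit 1
#   cd "$PROJECT_ROOT"
```

Checking the absolute path rather than the current directory means the guard gives the same answer no matter where the spawner started.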
Monitoring Signals (How we know it's working)
A run is healthy only when all 3 are true:
- Active session updated recently (recipe-v1-iter*)
- New git commits are landing
- TODO checkboxes advance
Known Failure Modes and Fixes
1) Wrong working directory
Symptom
Agent says AGENT_INSTRUCTIONS.md / TODO.md missing in /workspace.
Root cause
Spawner started outside project root.
Fix
- Force absolute project path in every task prompt
- Add mandatory pre-flight guard
- Relaunch fresh iteration
2) False “iteration already running”
Symptom
Auto-iterator repeatedly prints SKIP even when no coding progress occurs.
Root cause
The auto-iterator treated stale historical sessions as active.
Fix
- Treat a session as active only if updated recently (freshness window)
- Use current phase labels only (recipe-v1-iter*)
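One way to express the freshness-window rule is sketched below. The function name `session_is_active` and the 900-second window are assumptions for illustration; the runbook does not state the actual window, and how session labels and timestamps are obtained depends on the session tooling, which is not shown here.

```shell
#!/usr/bin/env bash
# Assumed freshness window in seconds; the runbook does not specify a value.
FRESHNESS_WINDOW_SECS=900

# session_is_active (hypothetical helper): true only for a current-phase
# label whose last update falls inside the freshness window.
session_is_active() {
  # $1: session label
  # $2: last-updated time in epoch seconds
  # $3: current time in epoch seconds (optional, defaults to now)
  local label="$1" updated="$2" now="${3:-$(date +%s)}"
  case "$label" in
    recipe-v1-iter*) ;;   # current phase label, keep checking
    *) return 1 ;;        # other phase or unrelated session
  esac
  [ $(( now - updated )) -le "$FRESHNESS_WINDOW_SECS" ]
}

# Usage:
#   session_is_active "recipe-v1-iter7" "$last_updated" || echo "no active run"
```

With both conditions in one place, a stale historical session or an old-phase label can no longer cause a SKIP.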
3) Label mismatch across phases
Symptom
Monitor reports wrong status or misses active runs.
Root cause
MVP labels (recipe-mvp-*) used during v1 phase.
Fix
- Update monitor + iterator to phase-specific labels
- Standardize naming per phase:
  - MVP: recipe-mvp-iter*
  - v1: recipe-v1-iter*
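To keep the monitor and the iterator on the same labels, the phase-to-label mapping could live in one shared helper. The sketch below is illustrative only: both function names are hypothetical, and the phase keys `mvp` and `v1` are assumed spellings for the two phases named above.

```shell
#!/usr/bin/env bash
# label_pattern_for_phase (hypothetical helper): print the session label
# glob for a phase, or fail on an unknown phase.
label_pattern_for_phase() {
  case "$1" in
    mvp) echo "recipe-mvp-iter*" ;;
    v1)  echo "recipe-v1-iter*" ;;
    *)   echo "unknown phase: $1" >&2; return 1 ;;
  esac
}

# label_matches_phase (hypothetical helper): true if a session label
# belongs to the given phase.
label_matches_phase() {
  # $1: phase key, $2: session label
  local pattern
  pattern="$(label_pattern_for_phase "$1")" || return 1
  # $pattern is left unquoted on purpose so its trailing * acts as a glob.
  case "$2" in
    $pattern) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   label_matches_phase v1 "recipe-v1-iter12" && echo "current phase"
```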
4) Model/provider auth mismatch
Symptom
Cron jobs fail with:
- No API key found for provider openai
- or Copilot cooldown rate-limit errors
Root cause
Using openai/... models without an OpenAI API key.
Fix
- Use OAuth provider model prefix: openai-codex/...
- For this project, prefer: openai-codex/gpt-5.3-codex
5) Environment capability mismatch (Docker)
Symptom
Task fails with docker: command not found.
Root cause
Agent runtime host lacks Docker.
Fix
- Mark as manual host validation task
- Continue with unblocked tasks
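A task that needs a host tool can detect the gap up front instead of failing mid-run. The following is a minimal sketch using the standard `command -v` lookup. The function name `require_tool_or_defer` and the `MANUAL:` message prefix are hypothetical and not part of the harness.

```shell
#!/usr/bin/env bash
# require_tool_or_defer (hypothetical helper): succeed if the command is
# on PATH; otherwise flag the task for manual host validation.
require_tool_or_defer() {
  # $1: command name, $2: short task description
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "MANUAL: '$2' needs $1 on the host; deferring to manual validation"
  return 1
}

# Usage:
#   require_tool_or_defer docker "build container image" || continue_with_next_task
```

In the usage line, `continue_with_next_task` stands in for whatever the iteration does to move on to an unblocked task.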
6) Runtime module mismatch (ESM/CommonJS)
Symptom
Backend runtime error: require is not defined.
Root cause
Using require() in ESM code path.
Fix
- Replace require('fs') calls with ESM imports (for example, import { writeFileSync } from 'fs')
- Build + rerun server
Operational Controls
Pause automation
Disable both jobs:
- Recipe Manager Auto-Iterator
- Recipe Manager Progress Monitor
Resume automation
Enable both jobs, then manually kick one fresh iteration.
Manual override iteration (safe restart)
Spawn one explicit iteration with:
- absolute project path
- pre-flight guard
- one-task/one-commit rule
Completion Definition
A phase is complete when:
- No unchecked tasks remain in that phase section of TODO.md
- Latest iteration exits without STUCK/ERROR
- Commit + TODO update are present
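The first condition can be checked mechanically by counting unchecked boxes inside one phase section. The sketch below assumes that each phase section in TODO.md starts with a markdown heading line such as `## v1` and ends at the next heading of any level. That layout and the function name `unchecked_in_phase` are assumptions, so adjust the heading text to match the real file.

```shell
#!/usr/bin/env bash
# unchecked_in_phase (hypothetical helper): print the number of unchecked
# tasks ("- [ ]") in the section that starts at the given heading line.
unchecked_in_phase() {
  # $1: path to TODO file, $2: exact heading line of the phase section
  awk -v heading="$2" '
    /^#/ { in_section = ($0 == heading) }   # any heading opens or closes the section
    in_section && /^- \[ \]/ { count++ }    # unchecked task inside the section
    END { print count + 0 }
  ' "$1"
}

# Usage:
#   [ "$(unchecked_in_phase TODO.md "## v1")" -eq 0 ] && echo "no unchecked v1 tasks"
```

A count of zero covers only the first condition. The clean exit and the commit still need to be confirmed separately.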
Recommended Cadence
- Auto-iterator: every 15 minutes
- Progress monitor: every 5 minutes (high visibility mode)
If noisy, set monitor to every 10–15 minutes.
Handoff Checklist (Before ending a session)
- Confirm latest commit hash
- Confirm active phase + next unchecked task
- Confirm auto-iterator enabled/disabled status
- Confirm monitor enabled/disabled status
- Confirm no stale active-session false positives
Quick Status Commands
Latest commit
git log -1 --oneline
Next tasks
grep -n "^- \[ \]" TODO.md | head
Recent progress
git log --oneline -5
This runbook should be updated whenever a new failure mode appears.