Recipe Manager Agentic Runbook

Last updated: 2026-03-24

Purpose

Operational guide for running the Recipe Manager agent harness reliably.

Core Execution Model

One task per iteration
One commit per iteration
TODO.md is the authoritative queue
Work only in: /home/paulh/.openclaw/workspace/projects/recipe-manager

Required Guards (Must Pass Before Coding)

Pre-flight checks

Before any iteration starts, verify these files exist:

AGENT_INSTRUCTIONS.md
TODO.md

If missing, fail with: STUCK: bad working dir or missing harness files at /home/paulh/.openclaw/workspace/projects/recipe-manager
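The pre-flight check could be sketched as a small Python helper. This is a minimal illustration, not the harness's actual code: the function name `preflight_error` and the constant names are hypothetical, while the project path, file names, and STUCK message are taken from this runbook.

```python
from pathlib import Path
from typing import Optional

# Path and file names as stated in this runbook.
PROJECT_ROOT = Path("/home/paulh/.openclaw/workspace/projects/recipe-manager")
REQUIRED_FILES = ("AGENT_INSTRUCTIONS.md", "TODO.md")


def preflight_error(root: Path = PROJECT_ROOT) -> Optional[str]:
    """Return the STUCK message if a harness file is missing, else None."""
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    if missing:
        return f"STUCK: bad working dir or missing harness files at {root}"
    return None
```

A caller would print the returned message and abort the iteration whenever the result is not `None`.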

Monitoring Signals (How we know it's working)

A run is healthy only when all three of the following are true:

Active session updated recently (recipe-v1-iter*)
New git commits are landing
TODO checkboxes advance

Known Failure Modes and Fixes

1) Wrong working directory

Symptom

The agent reports that AGENT_INSTRUCTIONS.md or TODO.md is missing in /workspace.

Root cause

The spawner started the agent outside the project root.

Fix

Force absolute project path in every task prompt
Add mandatory pre-flight guard
Relaunch fresh iteration

2) False “iteration already running”

Symptom

Auto-iterator repeatedly prints SKIP even when no coding progress occurs.

Root cause

The auto-iterator treated stale historical sessions as active.

Fix

Treat a session as active only if updated recently (freshness window)
Use current phase labels only (recipe-v1-iter*)
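The two rules above can be combined into one predicate. The sketch below is an assumption-laden illustration: the function name `is_active`, the session fields it takes, and the 30-minute window are all hypothetical, since this runbook does not state the actual freshness window or how sessions are stored. Only the label pattern `recipe-v1-iter*` comes from the text.

```python
from datetime import datetime, timedelta, timezone
from fnmatch import fnmatchcase

# Assumed value; the runbook does not specify the real freshness window.
FRESHNESS_WINDOW = timedelta(minutes=30)
# Current phase label pattern, as given in this runbook.
CURRENT_LABEL_PATTERN = "recipe-v1-iter*"


def is_active(
    label: str,
    updated_at: datetime,
    now: datetime,
    window: timedelta = FRESHNESS_WINDOW,
    pattern: str = CURRENT_LABEL_PATTERN,
) -> bool:
    """A session counts as active only if its label matches the current
    phase pattern and it was updated within the freshness window."""
    if not fnmatchcase(label, pattern):
        return False
    return now - updated_at <= window


now = datetime(2026, 3, 24, 12, 0, tzinfo=timezone.utc)
print(is_active("recipe-v1-iter7", now - timedelta(minutes=5), now))   # True
print(is_active("recipe-v1-iter7", now - timedelta(hours=6), now))     # False (stale)
print(is_active("recipe-mvp-iter3", now - timedelta(minutes=5), now))  # False (old phase)
```

With a check like this, the iterator would print SKIP only when at least one session passes both tests.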

3) Label mismatch across phases

Symptom

The monitor reports the wrong status or misses active runs.

Root cause

MVP labels (recipe-mvp-*) were still in use during the v1 phase.

Fix

Update monitor + iterator to phase-specific labels
Standardize naming per phase:
- MVP: recipe-mvp-iter*
- v1: recipe-v1-iter*

4) Model/provider auth mismatch

Symptom

Cron jobs fail with:

No API key found for provider openai
or Copilot cooldown rate-limit errors

Root cause

Jobs were configured with openai/... models, but no OpenAI API key is available.

Fix

Use the OAuth provider's model prefix instead: openai-codex/...
For this project, prefer: openai-codex/gpt-5.3-codex

5) Environment capability mismatch (Docker)

Symptom

Task fails with docker: command not found.

Root cause

The agent runtime host does not have Docker installed.

Fix

Mark the task as a manual host-validation task
Continue with unblocked tasks

6) Runtime module mismatch (ESM/CommonJS)

Symptom

Backend runtime error: require is not defined.

Root cause

require() was called in an ESM code path, where it is not defined.

Fix

Replace require('fs') calls with ESM imports, e.g. import { writeFileSync } from 'node:fs'
Rebuild and restart the server

Operational Controls

Pause automation

Disable both jobs:

Recipe Manager Auto-Iterator
Recipe Manager Progress Monitor

Resume automation

Enable both jobs, then manually kick one fresh iteration.

Manual override iteration (safe restart)

Spawn one explicit iteration with:

absolute project path
pre-flight guard
one-task/one-commit rule

Completion Definition

A phase is complete when:

No unchecked tasks remain in that phase section of TODO.md
Latest iteration exits without STUCK/ERROR
Commit + TODO update are present
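The first condition can be checked mechanically. The sketch below assumes TODO.md uses markdown headings to delimit phase sections and `- [ ]` for open tasks (the checkbox form matches the grep pattern under Quick Status Commands; the heading layout is an assumption). The function name `unchecked_in_section` is hypothetical, and nested subheadings inside a phase section are not handled.

```python
from typing import List


def unchecked_in_section(todo_text: str, heading: str) -> List[str]:
    """Return the open tasks listed under the given heading.

    Assumes each phase is a flat markdown section: a heading line,
    followed by '- [ ]' / '- [x]' task lines, ended by the next heading.
    """
    tasks = []
    in_section = False
    for line in todo_text.splitlines():
        if line.startswith("#"):
            # Any heading starts a new section; track whether it is ours.
            in_section = line.lstrip("#").strip() == heading
            continue
        if in_section and line.startswith("- [ ]"):
            tasks.append(line[len("- [ ]"):].strip())
    return tasks


sample = """# TODO

## MVP
- [x] scaffold backend
- [x] add recipe model

## v1
- [x] search endpoint
- [ ] import from URL
- [ ] export to PDF
"""
print(unchecked_in_section(sample, "MVP"))  # []
print(unchecked_in_section(sample, "v1"))   # ['import from URL', 'export to PDF']
```

An empty list for the active phase satisfies only the first condition; the exit status and commit checks still apply.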

Recommended Cadence

Auto-iterator: every 15 minutes
Progress monitor: every 5 minutes (high visibility mode)

If the monitor output is too noisy, reduce it to every 10–15 minutes.

Handoff Checklist (Before ending a session)

Confirm latest commit hash
Confirm active phase + next unchecked task
Confirm auto-iterator enabled/disabled status
Confirm monitor enabled/disabled status
Confirm no stale active-session false positives

Quick Status Commands

Latest commit

git log -1 --oneline

Next tasks

grep -n "^- \[ \]" TODO.md | head

Recent progress

git log --oneline -5

This runbook should be updated whenever a new failure mode appears.

See also: INCIDENT_LOG.md for timestamped operational incidents and fixes.
