agent-harness/OPENCLAW-INTEGRATION.md

OpenClaw Integration — Running the Harness in OpenClaw

The ralph-loop.sh script runs locally and drives CLI tools (claude, codex). OpenClaw provides a richer runtime: sub-agents, cron scheduling, cross-session messaging, and a persistent workspace. This guide shows how to run the harness with those features.


Architecture: CLI Loop vs OpenClaw

CLI Approach (ralph-loop.sh)

bash loop → spawn CLI agent → agent reads spec → works → exits → loop restarts
  • Runs on your terminal
  • You watch the output in real-time
  • Agent uses CLI tool (claude -p, codex)
  • Loop manages restarts

OpenClaw Approach

main session → sessions_spawn → sub-agent reads spec → works → exits → result delivered
  • Runs in the background
  • Results delivered to your chat (Telegram, Discord, etc.)
  • Sub-agent gets full workspace access
  • You orchestrate via main session or cron jobs

When to Use Which

| Scenario | Use |
|---|---|
| Sitting at your terminal, watching progress | ralph-loop.sh |
| Overnight autonomous work | OpenClaw sessions_spawn |
| Scheduled recurring tasks | OpenClaw cron |
| Quick one-off agent tasks | OpenClaw sessions_spawn |
| Need to monitor from your phone | OpenClaw (results come to Telegram) |

Method 1: Manual Orchestration (Main Session)

You drive each iteration from your chat with your agent (Cleo, etc.):

You: "Run the next iteration of the agent harness on /path/to/project"
Cleo: *spawns sub-agent* → *sub-agent does one task* → *reports back*
You: "Looks good, run the next one"
Cleo: *spawns another sub-agent* → *reports back*

Pros: Full control; you review between iterations.

Cons: Requires your attention; not autonomous.

How Your Agent Orchestrates

Your main agent (e.g., Cleo) uses sessions_spawn internally:

sessions_spawn(
  task: "Read AGENT.md in /home/user/project/. Follow the core loop: 
         orient, pick one task from IMPLEMENTATION_PLAN.md, implement, 
         verify (build + test), commit, exit. Report what you completed.",
  model: "sonnet",
  timeoutSeconds: 600
)

The sub-agent:

  1. Reads AGENT.md and PROJECT-SPEC.md
  2. Reads IMPLEMENTATION_PLAN.md
  3. Picks the first unchecked task
  4. Implements and tests it
  5. Commits
  6. Updates the plan
  7. Reports back to your main session
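
Step 3 above (picking the first unchecked task) can be sketched in shell. This is a minimal illustration, not the harness's actual logic: the helper name next_task is hypothetical, and it assumes tasks are written as top-level "- [ ]" / "- [x]" checkbox lines, the same format the grep commands elsewhere in this guide rely on.

```shell
# Hypothetical helper: print the text of the first unchecked task in a plan file.
# Assumes top-level checkbox lines such as "- [ ] Parse header".
next_task() {
  local plan="$1"
  # -m 1 stops after the first match; sed strips the "- [ ] " prefix.
  grep -m 1 '^- \[ \]' "$plan" | sed 's/^- \[ \] *//'
}
```

Calling next_task IMPLEMENTATION_PLAN.md prints nothing when every task is already checked, which a caller can treat as "plan finished".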

Method 2: Cron-Based Automation

For true overnight autonomy, use OpenClaw cron jobs:

The Iteration Job

{
  "name": "agent-harness-iteration",
  "schedule": { "kind": "every", "everyMs": 900000 },
  "sessionTarget": "isolated",
  "payload": {
    "kind": "agentTurn",
    "message": "Read AGENT.md in /home/paulh/.openclaw/workspace/projects/my-project/. Follow the core loop: orient → pick ONE task → implement → verify → commit → update plan → exit. If all tasks are done, say DONE. If stuck, say STUCK with details.",
    "model": "sonnet",
    "timeoutSeconds": 600
  },
  "delivery": { "mode": "announce" }
}

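This spawns a fresh agent every 15 minutes (900000 ms). Each one picks the next task, completes it, and reports to your chat. Because timeoutSeconds (600) is shorter than the interval (900 seconds), a run that hits its timeout should end before the next one starts.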
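
The everyMs values in these jobs are plain millisecond counts, so it can help to derive them rather than type them by hand. The sketch below uses two hypothetical helper names (minutes_to_ms, fits_interval); only the everyMs and timeoutSeconds fields come from the job definitions in this guide.

```shell
# Convert minutes to the millisecond value used in "everyMs".
minutes_to_ms() {
  echo $(( $1 * 60 * 1000 ))
}

# Succeed when a run capped at timeout_s seconds ends before the next tick.
fits_interval() {
  local timeout_s="$1" every_ms="$2"
  [ $(( timeout_s * 1000 )) -lt "$every_ms" ]
}

minutes_to_ms 15   # 900000, the iteration job
minutes_to_ms 60   # 3600000, the completion watcher
```

For example, fits_interval 600 900000 succeeds, while a 5-minute interval with the same timeout (fits_interval 600 300000) fails and would risk overlapping runs.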

The Completion Watcher

Add a second job that stops the loop once every task is checked off:

{
  "name": "agent-harness-completion-check",
  "schedule": { "kind": "every", "everyMs": 3600000 },
  "sessionTarget": "isolated",
  "payload": {
    "kind": "agentTurn",
    "message": "Read IMPLEMENTATION_PLAN.md in /home/paulh/.openclaw/workspace/projects/my-project/. Count completed vs total tasks. If all tasks are checked, disable the cron job named 'agent-harness-iteration' and announce 'All tasks complete!' If less than 80% done and no progress in last 3 git commits, announce a warning.",
    "model": "sonnet",
    "timeoutSeconds": 120
  },
  "delivery": { "mode": "announce" }
}
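
The arithmetic the watcher is asked to do (completed vs. total, compared against the 80% threshold in the message) is simple enough to check by hand. The following is a sketch under the assumption that tasks are top-level "- [ ]" / "- [x]" lines; plan_progress is a hypothetical helper, not part of OpenClaw.

```shell
# Hypothetical helper: print "<done> <total> <percent>" for a plan file.
plan_progress() {
  local plan="$1" done_count open_count total
  # grep -c prints 0 but exits non-zero when nothing matches, so ignore the status.
  done_count="$(grep -c '^- \[x\]' "$plan" || true)"
  open_count="$(grep -c '^- \[ \]' "$plan" || true)"
  total=$(( ${done_count:-0} + ${open_count:-0} ))
  if [ "$total" -eq 0 ]; then
    echo "0 0 0"
  else
    # Integer division: the percentage is rounded down.
    echo "${done_count:-0} $total $(( ${done_count:-0} * 100 / total ))"
  fi
}
```

A plan with three checked tasks and one open task yields "3 4 75", which is below the 80% threshold mentioned in the watcher message.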

Starting and Stopping

You: "Start the agent harness loop for my-project"
Cleo: *creates cron job* → iterations begin every 15 min

You: "Pause the harness"  
Cleo: *disables cron job*

You: "Resume"
Cleo: *re-enables cron job*

Method 3: Shell Script Orchestration

Create a script in your workspace that uses OpenClaw's CLI:

#!/usr/bin/env bash
# openclaw-harness.sh — Run agent harness via OpenClaw sub-agents
#
# Usage:
#   ./openclaw-harness.sh /path/to/project [max-iterations]

set -euo pipefail

PROJECT_DIR="${1:?Usage: $0 <project-dir> [max-iterations]}"
MAX="${2:-20}"
PLAN="$PROJECT_DIR/IMPLEMENTATION_PLAN.md"

echo "🔄 Starting OpenClaw agent harness"
echo "   Project: $PROJECT_DIR"
echo "   Max iterations: $MAX"

for i in $(seq 1 "$MAX"); do
  echo ""
  echo "━━━ Iteration $i/$MAX ━━━"

  # Check if plan exists and all tasks are done
  if [[ -f "$PLAN" ]] && ! grep -q '^- \[ \]' "$PLAN"; then
    echo "✅ All tasks complete!"
    exit 0
  fi

  # Spawn sub-agent via OpenClaw
  # Note: This is conceptual — actual invocation depends on OpenClaw CLI
  openclaw sessions spawn \
    --task "Read AGENT.md in $PROJECT_DIR. Follow the core loop: orient, pick ONE task, implement, verify, commit, exit." \
    --model sonnet \
    --timeout 600

  echo "   Iteration $i complete. Waiting 30s..."
  sleep 30
done

echo "⚠️  Reached max iterations ($MAX)"

Note: The exact openclaw CLI syntax may vary. Check openclaw help for current commands. The sessions_spawn tool within an agent session is the most reliable method.


Monitoring Sub-Agent Work

Real-Time: Session List

You: "What sub-agents are running?"
Cleo: *calls sessions_list* → shows active sessions

After Completion: Session History

You: "What did the last sub-agent do?"
Cleo: *calls sessions_history(sessionKey)* → shows full transcript

The transcript includes:

  • Every tool call the agent made (file reads, writes, exec commands)
  • The agent's reasoning
  • Build/test output
  • Git commits made
  • Final summary

Git Log: The Source of Truth

cd /path/to/project
git log --oneline -10

Each iteration should produce exactly one commit. If you see:

  • 0 commits after an iteration → agent was stuck
  • 1 commit → working correctly
  • 2+ commits → task was large or agent fixed its own mistakes
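
The commit-count rule above can be applied mechanically by recording HEAD before an iteration and counting commits afterwards. This is a rough sketch: commits_since and classify_iteration are hypothetical helper names, and the labels they print are illustrative only.

```shell
# Count commits added since a recorded starting commit (requires git).
commits_since() {
  local repo="$1" start_sha="$2"
  git -C "$repo" rev-list --count "${start_sha}..HEAD"
}

# Map a commit count to the interpretation listed above.
classify_iteration() {
  case "$1" in
    0) echo "stuck" ;;
    1) echo "ok" ;;
    *) echo "large-or-self-corrected" ;;
  esac
}

# Typical use around one iteration:
#   before="$(git -C /path/to/project rev-parse HEAD)"
#   ... run the iteration ...
#   classify_iteration "$(commits_since /path/to/project "$before")"
```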

Plan File: Progress at a Glance

# Count completed vs. remaining tasks
grep -c '^- \[x\]' IMPLEMENTATION_PLAN.md  # Done
grep -c '^- \[ \]' IMPLEMENTATION_PLAN.md  # Remaining

Model Selection Strategy

Different tasks need different models:

| Task Type | Recommended Model | Why |
|---|---|---|
| Planning (decomposition) | opus | Needs strong reasoning for task ordering |
| Scaffolding (config, structure) | sonnet | Simple, deterministic work |
| Feature implementation | sonnet | Good balance of speed and quality |
| Complex algorithms | opus | Needs deeper reasoning |
| Bug fixing | sonnet or opus | Depends on bug complexity |
| Documentation | sonnet | Writing, not complex reasoning |

Cost Optimization

Planning:     1 iteration × opus     = $$ (one-time cost)
Building:    15 iterations × sonnet  = $ per iteration
Complex:      3 iterations × opus    = $$ per iteration
Total:       19 iterations           = $$$ (manageable)

vs.

Everything:  19 iterations × opus    = $$$$$$ (expensive!)

Use the model parameter in sessions_spawn to control this per-iteration.
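
If an orchestration script chooses the model, the mapping from the table can live in one small function. This is only a sketch: pick_model and the task-type labels (planning, scaffolding, and so on) are hypothetical names chosen for illustration, and the bug-fixing default of sonnet is an assumption, since the table allows either model.

```shell
# Hypothetical helper: map a task-type label to a model name.
pick_model() {
  case "$1" in
    planning|complex-algorithm) echo "opus" ;;
    scaffolding|feature|docs)   echo "sonnet" ;;
    bugfix)                     echo "sonnet" ;;  # assumption: escalate to opus by hand for hard bugs
    *)                          echo "sonnet" ;;  # unknown types fall back to the cheaper model
  esac
}
```

The result can then be passed as the model value for that iteration, for example pick_model planning for the one-time decomposition step.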


OpenClaw-Specific Agent Instructions

When running in OpenClaw, add these to your AGENT.md:

## OpenClaw Environment Notes

### File Access
- Working directory: /home/paulh/.openclaw/workspace/projects/[project]
- Use absolute paths in all exec commands
- The `~` shorthand does NOT work in exec — use full paths

### Build Commands
- Use `cd /absolute/path && command` pattern (cwd parameter is unreliable)
- Example: `cd /home/paulh/.openclaw/workspace/projects/my-project && npm test`

### Git
- Commit after each completed task
- Use descriptive commit messages: `feat: implement QFX parser with field extraction`
- Don't push — let the human decide when to push

### Communication
- Your output is delivered to the human's chat (Telegram/Discord)
- Keep summaries brief — one paragraph max
- Include: what task you completed, what test results look like, what's next

### Signals
- If all tasks done: include "ALL_TASKS_COMPLETE" in your response
- If stuck: include "STUCK:" followed by a description of the blocker
- If error: include "ERROR:" followed by the error details
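
On the receiving side, a wrapper script can branch on the signals defined in the notes above. The sketch below assumes the sub-agent's final message is available as a string; parse_signal and the labels it prints are hypothetical.

```shell
# Hypothetical helper: classify a sub-agent's final message by its signal.
parse_signal() {
  case "$1" in
    *ALL_TASKS_COMPLETE*) echo "complete" ;;
    *STUCK:*)             echo "stuck" ;;
    *ERROR:*)             echo "error" ;;
    *)                    echo "continue" ;;
  esac
}
```

The patterns are checked in order, so a message containing both ALL_TASKS_COMPLETE and STUCK: is reported as complete; reorder the cases if blockers should take priority.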

Troubleshooting OpenClaw Integration

Sub-agent can't find files

  • Verify the project path is absolute
  • Check that the workspace is accessible (not in a different user's home)
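
A quick guard can catch relative or ~-prefixed paths before they reach a task message. This is a minimal sketch; require_abs_path is a hypothetical helper and checks only the leading slash, not whether the directory exists.

```shell
# Hypothetical helper: succeed only for paths that start with "/".
require_abs_path() {
  case "$1" in
    /*) return 0 ;;
    *)  echo "not an absolute path: $1" >&2; return 1 ;;
  esac
}
```

For example, require_abs_path /home/user/project succeeds, while a quoted '~/project' or projects/my-project fails with a message on stderr.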

Sub-agent uses wrong model

  • Specify model explicitly in sessions_spawn
  • Check session_status to verify which model was used
  • Known issue: isolated sessions may not respect model parameter (use main session sub-agents instead)

Sub-agent times out

  • Default timeout may be too short for complex tasks
  • Set timeoutSeconds: 600 (10 minutes) for implementation tasks
  • Set timeoutSeconds: 900 (15 minutes) for complex tasks
  • Set timeoutSeconds: 120 (2 minutes) for simple checks

Cron job fires but nothing happens

  • Check cron job status: cron list
  • Review recent runs: cron runs --jobId <id>
  • Verify the project path in the cron job message

Results not delivered to chat

  • Check delivery.mode is set to "announce" in cron job config
  • For sessions_spawn from main session, results auto-deliver

OpenClaw turns the agent harness from a terminal-bound tool into an always-available autonomous workforce.