OpenClaw Integration — Running the Harness in OpenClaw

The ralph-loop.sh script runs locally via CLI tools (claude, codex). OpenClaw provides a richer runtime: sub-agents, cron scheduling, cross-session messaging, and a persistent workspace. This guide shows how to run the harness with them.

Architecture: CLI Loop vs OpenClaw

CLI Approach (ralph-loop.sh)

bash loop → spawn CLI agent → agent reads spec → works → exits → loop restarts

Runs on your terminal
You watch the output in real-time
Agent uses CLI tool (claude -p, codex)
Loop manages restarts

OpenClaw Approach

main session → sessions_spawn → sub-agent reads spec → works → exits → result delivered

Runs in the background
Results delivered to your chat (Telegram, Discord, etc.)
Sub-agent gets full workspace access
You orchestrate via main session or cron jobs

When to Use Which

Scenario	Use
Sitting at your terminal, watching progress	ralph-loop.sh
Overnight autonomous work	OpenClaw sessions_spawn
Scheduled recurring tasks	OpenClaw cron
Quick one-off agent tasks	OpenClaw sessions_spawn
Need to monitor from your phone	OpenClaw (results come to Telegram)

Method 1: Manual Orchestration (Main Session)

You drive each iteration from your chat with your agent (Cleo, etc.):

You: "Run the next iteration of the agent harness on /path/to/project"
Cleo: *spawns sub-agent* → *sub-agent does one task* → *reports back*
You: "Looks good, run the next one"
Cleo: *spawns another sub-agent* → *reports back*

Pros: Full control, you review between iterations
Cons: Requires your attention, not autonomous

How Your Agent Orchestrates

Your main agent (e.g., Cleo) uses sessions_spawn internally:

sessions_spawn(
  task: "Read AGENT.md in /home/user/project/. Follow the core loop: 
         orient, pick one task from IMPLEMENTATION_PLAN.md, implement, 
         verify (build + test), commit, exit. Report what you completed.",
  model: "sonnet",
  timeoutSeconds: 600
)

The sub-agent:

Reads AGENT.md and PROJECT-SPEC.md
Reads IMPLEMENTATION_PLAN.md
Picks the first unchecked task
Implements and tests it
Commits
Updates the plan
Reports back to your main session
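
The "picks the first unchecked task" step can be sketched in shell, assuming the plan uses Markdown checkboxes (`- [ ]` open, `- [x]` done), which is the convention the grep checks in this guide rely on. The function name `next_task` is hypothetical and not part of OpenClaw or the harness.

```shell
# Hypothetical helper: print the text of the first unchecked task in a plan file.
# Assumes tasks are Markdown checkboxes at the start of a line: "- [ ] Task text".
next_task() {
  local plan="$1"
  # -m1 stops after the first match; sed strips the "- [ ] " prefix.
  grep -m1 '^- \[ \]' "$plan" | sed 's/^- \[ \] *//'
}
```

Calling `next_task IMPLEMENTATION_PLAN.md` prints nothing when no unchecked task remains, which a caller can treat as "plan finished".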

Method 2: Cron-Based Automation

For true overnight autonomy, use OpenClaw cron jobs:

The Iteration Job

{
  "name": "agent-harness-iteration",
  "schedule": { "kind": "every", "everyMs": 900000 },
  "sessionTarget": "isolated",
  "payload": {
    "kind": "agentTurn",
    "message": "Read AGENT.md in /home/paulh/.openclaw/workspace/projects/my-project/. Follow the core loop: orient → pick ONE task → implement → verify → commit → update plan → exit. If all tasks are done, include ALL_TASKS_COMPLETE in your response. If stuck, include STUCK: followed by a description of the blocker.",
    "model": "sonnet",
    "timeoutSeconds": 600
  },
  "delivery": { "mode": "announce" }
}

This spawns a fresh agent every 15 minutes. Each one picks the next task, does it, and reports to your chat.
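
The `everyMs` value is in milliseconds, so 900000 is 15 × 60 × 1000. As a sketch, the conversion and a sanity check that a run hitting its `timeoutSeconds` still ends before the next run starts might look like the following. Both function names are hypothetical. Whether OpenClaw itself prevents overlapping runs is not stated in this guide.

```shell
# Hypothetical helpers for choosing cron values; not part of OpenClaw.

# Convert minutes to the millisecond value used in "everyMs".
minutes_to_ms() {
  echo $(( $1 * 60 * 1000 ))
}

# Succeed only if a run that hits its timeout still ends before the next one starts.
fits_in_interval() {
  local every_ms="$1" timeout_s="$2"
  [ $(( timeout_s * 1000 )) -lt "$every_ms" ]
}
```

With the values from the job above, `fits_in_interval 900000 600` succeeds, since a 10-minute timeout fits inside a 15-minute interval.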

The Completion Watcher

Add logic to stop the loop when done:

{
  "name": "agent-harness-completion-check",
  "schedule": { "kind": "every", "everyMs": 3600000 },
  "sessionTarget": "isolated",
  "payload": {
    "kind": "agentTurn",
    "message": "Read IMPLEMENTATION_PLAN.md in /home/paulh/.openclaw/workspace/projects/my-project/. Count completed vs total tasks. If all tasks are checked, disable the cron job named 'agent-harness-iteration' and announce 'All tasks complete!' If fewer than 80% of tasks are checked and the last 3 git commits checked off no new tasks, announce a warning.",
    "model": "sonnet",
    "timeoutSeconds": 120
  },
  "delivery": { "mode": "announce" }
}
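
The watcher's decision rule exists only as prose inside the prompt. A deterministic sketch of the counting part, assuming Markdown checkboxes in the plan, is shown here. The function name `plan_status` and its output labels are hypothetical. The check on the last 3 git commits is left out because it depends on how you define progress.

```shell
# Hypothetical helper: summarize plan progress.
# Assumes "- [x]" marks a done task and "- [ ]" an open one, and that the file exists.
plan_status() {
  local plan="$1" done_count open_count total pct
  # grep -c prints 0 but exits non-zero when nothing matches, so ignore the status.
  done_count=$(grep -c '^- \[x\]' "$plan" || true)
  open_count=$(grep -c '^- \[ \]' "$plan" || true)
  total=$(( done_count + open_count ))
  if [ "$total" -eq 0 ]; then
    echo "EMPTY"
    return
  fi
  if [ "$open_count" -eq 0 ]; then
    echo "COMPLETE $done_count/$total"
  else
    pct=$(( done_count * 100 / total ))   # integer percentage, rounded down
    echo "IN_PROGRESS $done_count/$total ${pct}%"
  fi
}
```

A caller could compare the reported percentage against the 80% threshold from the prompt before deciding whether to warn.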

Starting and Stopping

You: "Start the agent harness loop for my-project"
Cleo: *creates cron job* → iterations begin every 15 min

You: "Pause the harness"  
Cleo: *disables cron job*

You: "Resume"
Cleo: *re-enables cron job*

Method 3: Shell Script Orchestration

Create a script in your workspace that uses OpenClaw's CLI:

#!/usr/bin/env bash
# openclaw-harness.sh — Run agent harness via OpenClaw sub-agents
#
# Usage:
#   ./openclaw-harness.sh /path/to/project [max-iterations]

set -euo pipefail

PROJECT_DIR="${1:?Usage: $0 <project-dir> [max-iterations]}"
MAX="${2:-20}"
PLAN="$PROJECT_DIR/IMPLEMENTATION_PLAN.md"

echo "🔄 Starting OpenClaw agent harness"
echo "   Project: $PROJECT_DIR"
echo "   Max iterations: $MAX"

for i in $(seq 1 "$MAX"); do
  echo ""
  echo "━━━ Iteration $i/$MAX ━━━"

  # Check if plan exists and all tasks are done
  if [[ -f "$PLAN" ]] && ! grep -q '^- \[ \]' "$PLAN" 2>/dev/null; then
    echo "✅ All tasks complete!"
    exit 0
  fi

  # Spawn sub-agent via OpenClaw
  # Note: This is conceptual — actual invocation depends on OpenClaw CLI
  openclaw sessions spawn \
    --task "Read AGENT.md in $PROJECT_DIR. Follow the core loop: orient, pick ONE task, implement, verify, commit, exit." \
    --model sonnet \
    --timeout 600

  echo "   Iteration $i complete. Waiting 30s..."
  sleep 30
done

echo "⚠️  Reached max iterations ($MAX)"

Note: The exact openclaw CLI syntax may vary. Check openclaw help for current commands. The sessions_spawn tool within an agent session is the most reliable method.

Monitoring Sub-Agent Work

Real-Time: Session List

You: "What sub-agents are running?"
Cleo: *calls sessions_list* → shows active sessions

After Completion: Session History

You: "What did the last sub-agent do?"
Cleo: *calls sessions_history(sessionKey)* → shows full transcript

The transcript includes:

Every tool call the agent made (file reads, writes, exec commands)
The agent's reasoning
Build/test output
Git commits made
Final summary

Git Log: The Source of Truth

cd /path/to/project
git log --oneline -10

Each iteration should produce exactly one commit. If you see:

0 commits after an iteration → agent was stuck
1 commit → working correctly
2+ commits → task was large or agent fixed its own mistakes
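
One way to apply this rule mechanically is to record HEAD before an iteration and count the commits added afterwards. The sketch below relies on `git rev-parse` and `git rev-list --count`, which are standard Git commands. The function names and label strings are hypothetical.

```shell
# Hypothetical helpers for checking how many commits one iteration produced.

# Print the number of commits between a recorded base revision and HEAD.
commits_since() {
  local repo="$1" base="$2"
  git -C "$repo" rev-list --count "${base}..HEAD"
}

# Map a commit count to the interpretation given above.
classify_iteration() {
  case "$1" in
    0) echo "stuck" ;;
    1) echo "ok" ;;
    *) echo "large-or-self-corrected" ;;
  esac
}

# Example flow (commented out so that sourcing this file has no side effects):
#   before=$(git -C /path/to/project rev-parse HEAD)
#   ... run one iteration ...
#   classify_iteration "$(commits_since /path/to/project "$before")"
```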

Plan File: Progress at a Glance

# Count completed vs remaining
grep -c '^- \[x\]' IMPLEMENTATION_PLAN.md  # Done
grep -c '^- \[ \]' IMPLEMENTATION_PLAN.md  # Remaining

Model Selection Strategy

Different tasks need different models:

Task Type	Recommended Model	Why
Planning (decomposition)	opus	Needs strong reasoning for task ordering
Scaffolding (config, structure)	sonnet	Simple, deterministic work
Feature implementation	sonnet	Good balance of speed and quality
Complex algorithms	opus	Needs deeper reasoning
Bug fixing	sonnet or opus	Depends on bug complexity
Documentation	sonnet	Writing, not complex reasoning

Cost Optimization

Planning:     1 iteration × opus     = $$ (one-time cost)
Building:    15 iterations × sonnet  = $ per iteration
Complex:      3 iterations × opus    = $$ per iteration
Total:       19 iterations           = $$$ (manageable)

vs.

Everything:  19 iterations × opus    = $$$$$$ (expensive!)

Use the model parameter in sessions_spawn to control this per-iteration.
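
If an orchestration script chooses the model, the table above can be encoded as a small lookup. This is a sketch only: the function name `model_for_task` and the task-type labels are made up for illustration, and bug fixing defaults to sonnet here even though the table allows either model.

```shell
# Hypothetical helper: choose a model name from a task-type label.
# The labels are invented for this sketch; the mapping follows the table above.
model_for_task() {
  case "$1" in
    planning|algorithm)       echo "opus" ;;
    scaffolding|feature|docs) echo "sonnet" ;;
    bugfix)                   echo "sonnet" ;;  # table says sonnet or opus; this sketch picks the cheaper one
    *)                        echo "sonnet" ;;  # assumed default for unknown labels
  esac
}
```

The result can then be passed as the model value for that iteration, for example `model_for_task planning` when generating the plan.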

OpenClaw-Specific Agent Instructions

When running in OpenClaw, add these to your AGENT.md:

## OpenClaw Environment Notes

### File Access
- Working directory: /home/paulh/.openclaw/workspace/projects/[project]
- Use absolute paths in all exec commands
- The `~` shorthand does NOT work in exec — use full paths

### Build Commands
- Use `cd /absolute/path && command` pattern (cwd parameter is unreliable)
- Example: `cd /home/paulh/.openclaw/workspace/projects/my-project && npm test`

### Git
- Commit after each completed task
- Use descriptive commit messages: `feat: implement QFX parser with field extraction`
- Don't push — let the human decide when to push

### Communication
- Your output is delivered to the human's chat (Telegram/Discord)
- Keep summaries brief — one paragraph max
- Include: what task you completed, what test results look like, what's next

### Signals
- If all tasks done: include "ALL_TASKS_COMPLETE" in your response
- If stuck: include "STUCK:" followed by a description of the blocker
- If error: include "ERROR:" followed by the error details
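
A script that consumes sub-agent output could look for these signals with a plain substring match. The sketch below reads the agent's message on stdin. The function name `detect_signal` and the labels it prints are hypothetical, and checking ALL_TASKS_COMPLETE first is an arbitrary choice for the case where several signals appear in one message.

```shell
# Hypothetical helper: classify a sub-agent's final message by the signal it contains.
# Reads the message on stdin and prints DONE, STUCK, ERROR, or CONTINUE.
detect_signal() {
  local msg
  msg="$(cat)"
  case "$msg" in
    *ALL_TASKS_COMPLETE*) echo "DONE" ;;
    *STUCK:*)             echo "STUCK" ;;
    *ERROR:*)             echo "ERROR" ;;
    *)                    echo "CONTINUE" ;;
  esac
}
```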

Troubleshooting OpenClaw Integration

Sub-agent can't find files

Verify the project path is absolute
Check that the workspace is accessible (not in a different user's home)

Sub-agent uses wrong model

Specify the model explicitly in sessions_spawn
Check session_status to verify which model was used
Known issue: isolated sessions may not respect the model parameter (use main-session sub-agents instead)

Sub-agent times out

Default timeout may be too short for complex tasks
Set timeoutSeconds: 600 (10 minutes) for implementation tasks
Set timeoutSeconds: 900 (15 minutes) for complex tasks
Set timeoutSeconds: 120 (2 minutes) for simple checks

Cron job fires but nothing happens

Check cron job status: cron list
Review recent runs: cron runs --jobId <id>
Verify the project path in the cron job message

Results not delivered to chat

Check delivery.mode is set to "announce" in cron job config
For sessions_spawn from main session, results auto-deliver

OpenClaw turns the agent harness from a terminal-bound tool into an always-available autonomous workforce.
