# OpenClaw Integration: Running the Harness in OpenClaw

> `ralph-loop.sh` runs locally via CLI tools (claude, codex).
> OpenClaw provides a richer runtime: sub-agents, cron scheduling, cross-session
> messaging, and a persistent workspace. This guide shows how to use them.

---

## Architecture: CLI Loop vs OpenClaw

### CLI Approach (ralph-loop.sh)
```
bash loop → spawn CLI agent → agent reads spec → works → exits → loop restarts
```
- Runs in your terminal
- You watch the output in real time
- The agent uses a CLI tool (`claude -p`, `codex`)
- The loop manages restarts

### OpenClaw Approach
```
main session → sessions_spawn → sub-agent reads spec → works → exits → result delivered
```
- Runs in the background
- Results are delivered to your chat (Telegram, Discord, etc.)
- The sub-agent gets full workspace access
- You orchestrate via the main session or cron jobs

### When to Use Which

| Scenario | Use |
|----------|-----|
| Sitting at your terminal, watching progress | ralph-loop.sh |
| Overnight autonomous work | OpenClaw sessions_spawn |
| Scheduled recurring tasks | OpenClaw cron |
| Quick one-off agent tasks | OpenClaw sessions_spawn |
| Need to monitor from your phone | OpenClaw (results come to Telegram) |

---

## Method 1: Manual Orchestration (Main Session)

You drive each iteration from your chat with your agent (Cleo, etc.):

```
You: "Run the next iteration of the agent harness on /path/to/project"
Cleo: *spawns sub-agent* → *sub-agent does one task* → *reports back*
You: "Looks good, run the next one"
Cleo: *spawns another sub-agent* → *reports back*
```

**Pros:** Full control; you review between iterations

**Cons:** Requires your attention; not autonomous

### How Your Agent Orchestrates

Your main agent (e.g., Cleo) uses `sessions_spawn` internally:

```
sessions_spawn(
  task: "Read AGENT.md in /home/user/project/. Follow the core loop:
         orient, pick one task from IMPLEMENTATION_PLAN.md, implement,
         verify (build + test), commit, exit. Report what you completed.",
  model: "sonnet",
  timeoutSeconds: 600
)
```

The sub-agent:
1. Reads AGENT.md and PROJECT-SPEC.md
2. Reads IMPLEMENTATION_PLAN.md
3. Picks the first unchecked task
4. Implements and tests it
5. Commits
6. Updates the plan
7. Reports back to your main session

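
Step 3 above ("picks the first unchecked task") can be sketched in a few lines of bash. This is only an illustration of the selection rule, assuming the plan uses `- [ ]` / `- [x]` checkbox lines; the function name `next_task` is hypothetical and is not part of OpenClaw or the harness.

```bash
#!/usr/bin/env bash
# Print the text of the first unchecked "- [ ]" task in a plan file.
# Returns non-zero when the file has no unchecked tasks.
# Usage (hypothetical): next_task /path/to/project/IMPLEMENTATION_PLAN.md
next_task() {
  local plan="$1" line
  # -m1 stops at the first match; grep exits 1 if there is none.
  line="$(grep -m1 '^- \[ \]' "$plan")" || return 1
  # Strip the leading checkbox marker and any spaces after it.
  printf '%s\n' "$line" | sed 's/^- \[ \] *//'
}
```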
---

## Method 2: Cron-Based Automation

For true overnight autonomy, use OpenClaw cron jobs:

### The Iteration Job

```json
{
  "name": "agent-harness-iteration",
  "schedule": { "kind": "every", "everyMs": 900000 },
  "sessionTarget": "isolated",
  "payload": {
    "kind": "agentTurn",
    "message": "Read AGENT.md in /home/paulh/.openclaw/workspace/projects/my-project/. Follow the core loop: orient → pick ONE task → implement → verify → commit → update plan → exit. If all tasks are done, say DONE. If stuck, say STUCK with details.",
    "model": "sonnet",
    "timeoutSeconds": 600
  },
  "delivery": { "mode": "announce" }
}
```

This spawns a fresh agent every 15 minutes (`everyMs: 900000`). Each one picks the next task, completes it, and reports to your chat. The 600-second timeout is shorter than the 15-minute interval, which leaves a gap between runs.

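
The `everyMs` value is an interval in milliseconds, so 15 minutes is 15 × 60 × 1000 = 900000. A small helper makes other intervals less error-prone; `minutes_to_ms` is a hypothetical name used only for this sketch.

```bash
#!/usr/bin/env bash
# Convert a whole number of minutes to the millisecond value used by everyMs.
minutes_to_ms() {
  echo $(( $1 * 60 * 1000 ))
}

minutes_to_ms 15   # prints 900000
minutes_to_ms 60   # prints 3600000
```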
### The Completion Watcher

Add a second job that stops the loop when the plan is done:

```json
{
  "name": "agent-harness-completion-check",
  "schedule": { "kind": "every", "everyMs": 3600000 },
  "sessionTarget": "isolated",
  "payload": {
    "kind": "agentTurn",
    "message": "Read IMPLEMENTATION_PLAN.md in /home/paulh/.openclaw/workspace/projects/my-project/. Count completed vs total tasks. If all tasks are checked, disable the cron job named 'agent-harness-iteration' and announce 'All tasks complete!' If less than 80% done and no progress in last 3 git commits, announce a warning.",
    "model": "sonnet",
    "timeoutSeconds": 120
  },
  "delivery": { "mode": "announce" }
}
```

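
The watcher's message asks the agent to count completed versus total tasks and compare against an 80% threshold. The same arithmetic can be sketched in bash, assuming the plan uses `- [ ]` / `- [x]` checkbox lines. The function names are hypothetical, and the "no progress in last 3 git commits" condition is not covered here.

```bash
#!/usr/bin/env bash
# Print "done total percent" for a checkbox-style plan file.
plan_progress() {
  local plan="$1" done_count open_count total
  # grep -c prints 0 but exits 1 when nothing matches, so tolerate that.
  done_count="$(grep -c '^- \[x\]' "$plan" || true)"
  open_count="$(grep -c '^- \[ \]' "$plan" || true)"
  total=$(( done_count + open_count ))
  if (( total == 0 )); then
    echo "0 0 0"
    return
  fi
  # Integer division: the percentage is rounded down.
  echo "$done_count $total $(( done_count * 100 / total ))"
}

# Succeeds (exit 0) when fewer than 80% of tasks are checked off.
below_threshold() {
  local percent
  read -r _ _ percent <<< "$(plan_progress "$1")"
  (( percent < 80 ))
}
```

Because the percentage is rounded down, a plan with 4 of 5 tasks done reports exactly 80 and does not count as below the threshold.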
### Starting and Stopping

```
You: "Start the agent harness loop for my-project"
Cleo: *creates cron job* → iterations begin every 15 min

You: "Pause the harness"
Cleo: *disables cron job*

You: "Resume"
Cleo: *re-enables cron job*
```

---

## Method 3: Shell Script Orchestration

Create a script in your workspace that uses OpenClaw's CLI:

```bash
#!/usr/bin/env bash
# openclaw-harness.sh: run the agent harness via OpenClaw sub-agents
#
# Usage:
#   ./openclaw-harness.sh /path/to/project [max-iterations]

set -euo pipefail

PROJECT_DIR="${1:?Usage: $0 <project-dir> [max-iterations]}"
MAX="${2:-20}"
PLAN="$PROJECT_DIR/IMPLEMENTATION_PLAN.md"

echo "🔄 Starting OpenClaw agent harness"
echo " Project: $PROJECT_DIR"
echo " Max iterations: $MAX"

for i in $(seq 1 "$MAX"); do
  echo ""
  echo "━━━ Iteration $i/$MAX ━━━"

  # Stop once the plan exists and has no unchecked "- [ ]" tasks left
  if [[ -f "$PLAN" ]] && ! grep -q '^- \[ \]' "$PLAN" 2>/dev/null; then
    echo "✅ All tasks complete!"
    exit 0
  fi

  # Spawn a sub-agent via OpenClaw
  # Note: this is conceptual; the actual invocation depends on the OpenClaw CLI
  openclaw sessions spawn \
    --task "Read AGENT.md in $PROJECT_DIR. Follow the core loop: orient, pick ONE task, implement, verify, commit, exit." \
    --model sonnet \
    --timeout 600

  echo " Iteration $i complete. Waiting 30s..."
  sleep 30
done

echo "⚠️ Reached max iterations ($MAX)"
```

**Note:** The exact `openclaw` CLI syntax may vary. Check `openclaw help` for current commands. The `sessions_spawn` tool within an agent session is the most reliable method.

---

## Monitoring Sub-Agent Work

### Real-Time: Session List
```
You: "What sub-agents are running?"
Cleo: *calls sessions_list* → shows active sessions
```

### After Completion: Session History
```
You: "What did the last sub-agent do?"
Cleo: *calls sessions_history(sessionKey)* → shows full transcript
```

The transcript includes:
- Every tool call the agent made (file reads, writes, exec commands)
- The agent's reasoning
- Build/test output
- Git commits made
- Final summary

### Git Log: The Source of Truth
```bash
cd /path/to/project
git log --oneline -10
```

Each iteration should produce exactly one commit. If you see:
- **0 commits** after an iteration → the agent was stuck
- **1 commit** → working correctly
- **2+ commits** → the task was large or the agent fixed its own mistakes

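
The commit-count rule above can be checked mechanically. The sketch below assumes you record the commit hash before an iteration starts and count new commits afterwards. `classify_commits` and `commits_since` are hypothetical helper names, and the status labels are illustrative.

```bash
#!/usr/bin/env bash
# Map the number of commits made during one iteration to a status label.
classify_commits() {
  local count="$1"
  if (( count == 0 )); then
    echo "stuck"
  elif (( count == 1 )); then
    echo "ok"
  else
    echo "large-or-self-corrected"
  fi
}

# Count commits added since a previously recorded commit hash.
# Must be run inside the project's git repository.
commits_since() {
  git rev-list --count "$1..HEAD"
}

# Example (inside the repository):
#   before="$(git rev-parse HEAD)"
#   ... run one iteration ...
#   classify_commits "$(commits_since "$before")"
```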
### Plan File: Progress at a Glance
```bash
# Count completed vs. remaining tasks
grep -c '^- \[x\]' IMPLEMENTATION_PLAN.md   # Done
grep -c '^- \[ \]' IMPLEMENTATION_PLAN.md   # Remaining
```

---

## Model Selection Strategy

Different tasks need different models:

| Task Type | Recommended Model | Why |
|-----------|------------------|-----|
| Planning (decomposition) | opus | Needs strong reasoning for task ordering |
| Scaffolding (config, structure) | sonnet | Simple, deterministic work |
| Feature implementation | sonnet | Good balance of speed and quality |
| Complex algorithms | opus | Needs deeper reasoning |
| Bug fixing | sonnet or opus | Depends on bug complexity |
| Documentation | sonnet | Writing, not complex reasoning |

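
If you script the orchestration, the table above can be encoded as a lookup. This is a minimal sketch: the task-type keywords and the function name `model_for_task` are hypothetical labels chosen for illustration, and the fallback for unknown types is an assumption rather than something the table specifies.

```bash
#!/usr/bin/env bash
# Return the model suggested by the table for a given task type.
model_for_task() {
  case "$1" in
    planning|complex-algorithm)        echo "opus" ;;
    scaffolding|feature|documentation) echo "sonnet" ;;
    bugfix)                            echo "sonnet" ;;  # table says "sonnet or opus"; start with the cheaper one
    *)                                 echo "sonnet" ;;  # assumed default, not from the table
  esac
}
```

The result can then be passed as the `model` value when spawning a sub-agent.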
### Cost Optimization

```
Planning:   1 iteration  × opus   = $$ (one-time cost)
Building:  15 iterations × sonnet = $ per iteration
Complex:    3 iterations × opus   = $$ per iteration
Total:     19 iterations          = $$$ (manageable)
```

vs.

```
Everything: 19 iterations × opus = $$$$$$ (expensive!)
```

Use the `model` parameter in `sessions_spawn` to control this per iteration.

---

## OpenClaw-Specific Agent Instructions

When running in OpenClaw, add these notes to your AGENT.md:

```markdown
## OpenClaw Environment Notes

### File Access
- Working directory: /home/paulh/.openclaw/workspace/projects/[project]
- Use absolute paths in all exec commands
- The `~` shorthand does NOT work in exec; use full paths

### Build Commands
- Use the `cd /absolute/path && command` pattern (the cwd parameter is unreliable)
- Example: `cd /home/paulh/.openclaw/workspace/projects/my-project && npm test`

### Git
- Commit after each completed task
- Use descriptive commit messages: `feat: implement QFX parser with field extraction`
- Don't push; let the human decide when to push

### Communication
- Your output is delivered to the human's chat (Telegram/Discord)
- Keep summaries brief: one paragraph max
- Include: which task you completed, the test results, and what's next

### Signals
- If all tasks done: include "ALL_TASKS_COMPLETE" in your response
- If stuck: include "STUCK:" followed by a description of the blocker
- If error: include "ERROR:" followed by the error details
```

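
On the orchestrating side, the three signal strings can be detected with a simple pattern match. The sketch below assumes the sub-agent's reply is available as a single string. The function name `classify_reply` and the returned labels are hypothetical.

```bash
#!/usr/bin/env bash
# Classify a sub-agent's reply by the signal strings from the Signals list.
# Usage (hypothetical): status="$(classify_reply "$reply_text")"
classify_reply() {
  local reply="$1"
  case "$reply" in
    *ALL_TASKS_COMPLETE*) echo "complete" ;;
    *STUCK:*)             echo "stuck" ;;
    *ERROR:*)             echo "error" ;;
    *)                    echo "continue" ;;  # no signal found: keep iterating
  esac
}
```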
---

## Troubleshooting OpenClaw Integration

### Sub-agent can't find files
- Verify the project path is absolute
- Check that the workspace is accessible (not in a different user's home)

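
Both checks above can be run before spawning a sub-agent. This is a minimal sketch with a hypothetical function name, `check_project_dir`. It only tests that the path starts with `/` and that the directory exists and is readable by the current user.

```bash
#!/usr/bin/env bash
# Succeed only if the argument is an absolute path to an existing, readable directory.
check_project_dir() {
  local dir="$1"
  # A quoted "~/..." string is not expanded, so it is rejected here as well.
  if [[ "$dir" != /* ]]; then
    echo "not absolute: $dir" >&2
    return 1
  fi
  if [[ ! -d "$dir" || ! -r "$dir" ]]; then
    echo "missing or unreadable: $dir" >&2
    return 1
  fi
}
```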
### Sub-agent uses wrong model
- Specify `model` explicitly in `sessions_spawn`
- Check `session_status` to verify which model was used
- Known issue: isolated sessions may not respect the `model` parameter (use main-session sub-agents instead)

### Sub-agent times out
- The default timeout may be too short for complex tasks
- Set `timeoutSeconds: 600` (10 minutes) for implementation tasks
- Set `timeoutSeconds: 900` (15 minutes) for complex tasks
- Set `timeoutSeconds: 120` (2 minutes) for simple checks

### Cron job fires but nothing happens
- Check cron job status: `cron list`
- Review recent runs: `cron runs --jobId <id>`
- Verify the project path in the cron job message

### Results not delivered to chat
- Check that `delivery.mode` is set to `"announce"` in the cron job config
- For `sessions_spawn` from the main session, results auto-deliver

---

_OpenClaw turns the agent harness from a terminal-bound tool into an always-available autonomous workforce._