# Claude Code — Donkeycar RL Project Instructions
## Session Startup — Do This First
At the start of every session:
1. Read `agent/SESSION_HANDOFF.md`
2. Check if an experiment is running: `ss -tnp | grep 9091`
3. If running, **immediately arm a background monitor** on the experiment log before anything else
4. Report current status to the user
Do not wait for the user to ask. Arming the monitor is the first action.
---
## Autonomy Instruction
Continue the Donkeycar RL/sim work autonomously. Rebuild, sync, relaunch, run
diagnostics, patch code, and restart experiments as needed. Keep going until you
either have a verified fix and a running experiment, or a concrete blocker that
truly requires the user. Only pause for: risk of data loss, destructive actions,
missing credentials, or major strategy tradeoffs.
If the user says only `continue`, use the instruction above.
---
## Long-Running Task Workflow
A common failure mode: the model starts a long-running task, returns to the
prompt, and waits, which defeats the purpose. Claude Code has two mechanisms to
avoid this.
### 1. Condition-based wakeup (background Bash task)
Use when you want to be woken when something happens (log line appears, file
changes, process exits).
```python
# Pattern: until <condition>; do sleep N; done && <show results>
Bash(
    command="until grep -q 'Checkpoint saved' /path/to/run.log; do sleep 15; done "
            "&& tail -20 /path/to/run.log",
    run_in_background=True
)
```
When the shell command exits, the runtime delivers a `<task-notification>` into
the conversation. This wakes Claude up automatically — no polling needed on the
model side. The model then reads the output and continues work.
**Key rules:**
- Use `until <check>; do sleep N; done` — never `sleep 120 && check` (blocked)
- The notification arrives as a message in the conversation context
- Multiple background tasks can run in parallel and each fires independently
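
The `until ... done && tail` pattern above is easy to get wrong once the log path or search pattern contains spaces or quotes. The following is a minimal sketch of a helper that assembles the command string; `build_wait_command` and its parameters are hypothetical names introduced for illustration, not part of Claude Code or this repo.

```python
import shlex


def build_wait_command(pattern: str, log_path: str,
                       interval_s: int = 15, tail_lines: int = 20) -> str:
    """Build a shell command that blocks until `pattern` appears in `log_path`.

    Hypothetical helper: it only assembles the string that would be passed
    as the `command` argument of a background Bash task.
    """
    # shlex.quote keeps spaces and quotes in the arguments from breaking the shell line.
    quoted_pattern = shlex.quote(pattern)
    quoted_log = shlex.quote(log_path)
    return (
        f"until grep -q -e {quoted_pattern} {quoted_log}; "
        f"do sleep {int(interval_s)}; done "
        f"&& tail -n {int(tail_lines)} {quoted_log}"
    )


print(build_wait_command("Checkpoint saved", "/path/to/run.log"))
```

The resulting string follows the `until <check>; do sleep N; done` rule, so the shell exits (and the notification fires) only once the pattern is present in the log.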
### 2. Time-based wakeup (ScheduleWakeup tool)
Use when you want to be woken after a fixed delay (e.g. "check back in 20 min").
```python
ScheduleWakeup(
    delaySeconds=1200,  # 20 minutes
    reason="checking 100k checkpoint results",
    prompt="<<autonomous-loop-dynamic>>"  # or the original /loop prompt
)
```
The runtime fires the wakeup after the delay, re-entering the conversation so
the model can continue. Use this for idle waits (waiting for a build, a slow
process, etc.) when there's no clear log-file signal to watch.
**Cache timing guidance:**
- Under 270s: prompt cache stays warm (cheap)
- Over 300s: cache miss on wakeup (more expensive but fine for long waits)
- Default idle: 1200-1800s (20-30 min)
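
As a quick illustration of the timing thresholds, here is a small sketch that labels a proposed delay. The function name and the "borderline" label for the 270 to 300 second range are assumptions added for illustration; the guidance above only speaks to delays under 270s and over 300s.

```python
CACHE_WARM_BELOW_S = 270  # under this, the prompt cache stays warm (from the guidance above)
CACHE_MISS_ABOVE_S = 300  # over this, expect a cache miss on wakeup


def cache_state(delay_s: int) -> str:
    """Classify a wakeup delay as 'warm', 'cold', or 'borderline' (hypothetical labels)."""
    if delay_s < CACHE_WARM_BELOW_S:
        return "warm"
    if delay_s > CACHE_MISS_ABOVE_S:
        return "cold"
    # The guidance does not say what happens between the two thresholds.
    return "borderline"


for delay in (120, 285, 1200):
    print(delay, cache_state(delay))
```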
### 3. Combining both
For long experiments, use condition-based tasks for each checkpoint, and
ScheduleWakeup as a fallback heartbeat:
```python
# Fire when checkpoint N arrives
Bash(command="until [ $(grep -c 'Checkpoint' log) -ge 5 ]; do sleep 15; done && tail log",
     run_in_background=True)
# Also wake in 30 min regardless, to check for errors
ScheduleWakeup(delaySeconds=1800, reason="exp27 progress check", prompt="continue")
```
---
## Experiment Monitoring Commands
```bash
# Watch live training log
tail -f agent/models/exp27-random-roads/run_*.log
# Check all checkpoints and evals
grep "Checkpoint\|Eval\|NEW BEST" agent/models/exp27-random-roads/run_*.log
# Verify sim is up
python3 -c "import socket; s=socket.socket(); s.settimeout(3); s.connect(('127.0.0.1',9091)); print('OK'); s.close()"
# Check what's connected to sim
ss -tnp | grep 9091
```
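
When a plain `grep` dump is too noisy, the same markers can be tallied in Python. This is a minimal sketch that assumes only that relevant log lines contain the substrings `Checkpoint`, `Eval`, or `NEW BEST`, as in the grep command above. The function name and the sample lines are hypothetical, since the actual log format is not shown here.

```python
MARKERS = ("Checkpoint", "Eval", "NEW BEST")


def summarize_log(lines):
    """Count how many lines mention each marker (substring match, like grep)."""
    counts = {marker: 0 for marker in MARKERS}
    for line in lines:
        for marker in MARKERS:
            if marker in line:
                counts[marker] += 1
    return counts


# Hypothetical sample lines; the real log format may differ.
sample = [
    "step 1000 reward 12.5",
    "Checkpoint saved at step 10000",
    "Eval mean reward 41.0 NEW BEST",
    "Checkpoint saved at step 20000",
]
print(summarize_log(sample))
```

For a real run, pass an open file object instead of the sample list, for example `with open(path) as f: summarize_log(f)`.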
## Session Handoff
Full experiment history, current state, and important paths:
`agent/SESSION_HANDOFF.md`