# Execution Master — DonkeyCar RL Autoresearch
## Wave Status
| Wave | Description | Status |
|------|-------------|--------|
| Wave 1 | Real Training Foundation | 🟠 In progress |
| Wave 2 | Multi-Track Generalization | ⏸️ Not started |
| Wave 3 | Racing / Speed Optimization | ⏸️ Not started |
## Active Streams
| Stream | Branch | Status | Blocker |
|--------|--------|--------|---------|
| 1A: Core Runner Rebuild | main | 🟠 In progress | None |
| 1B: Tests | main | ⏸️ Planned | 1A-01 must complete first |
| 1C: First Real Autoresearch | main | ⏸️ Planned | 1A + 1B complete, sim running |
## Wave 1 Gate Criteria
Before starting Wave 2, ALL must be true:
- [ ] All 1A, 1B, 1C tasks checked off in IMPLEMENTATION_PLAN.md
- [ ] `pytest tests/ -v` — all tests green
- [ ] Champion model exists at `agent/models/champion/model.zip`
- [ ] Champion mean_reward > 100 on training track
- [ ] `champion_manifest.json` exists and is valid
- [ ] Regression baseline saved
- [ ] Wave 1 process eval written: `.harness/wave1/process-eval.md`
- [ ] All results pushed to Gitea
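The artifact-based gate criteria above (model present, manifest valid, reward threshold) can be checked mechanically. Below is a minimal sketch of such a check; the paths come from the checklist, but the manifest field name `mean_reward` and the helper itself are assumptions, not part of the harness.

```python
import json
from pathlib import Path

# Paths from the Wave 1 gate checklist.
CHAMPION_MODEL = Path("agent/models/champion/model.zip")
MANIFEST = Path("agent/models/champion/champion_manifest.json")

def wave1_gate_passes(model_path=CHAMPION_MODEL, manifest_path=MANIFEST,
                      reward_threshold=100.0):
    """Return True only if the champion artifacts satisfy the Wave 1 gate.

    Hypothetical helper: the manifest is assumed to be JSON with a
    top-level "mean_reward" field; adjust to the real schema.
    """
    if not model_path.is_file():
        return False
    try:
        manifest = json.loads(manifest_path.read_text())
    except (OSError, json.JSONDecodeError):
        return False  # missing or invalid manifest fails the gate
    return manifest.get("mean_reward", float("-inf")) > reward_threshold
```

A check like this could run as one more test under `pytest tests/`, so the gate review is a single green test run rather than a manual audit.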
## Parallelism Rules
1. 1A and 1B can run in parallel (1B mocks the env)
2. 1C cannot start until 1A AND 1B are complete
3. Wave 2 cannot start until Wave 1 gate passes
4. Only one stream touches the sim at a time (1C has exclusive sim access)
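Rule 4 (exclusive sim access) can be enforced with an atomic lockfile rather than convention alone. The sketch below is one way to do it; the lock path `.harness/sim.lock` and the helper name are assumptions, not an established harness convention.

```python
import os
from contextlib import contextmanager

# Assumed lock location; not defined anywhere in the harness docs.
SIM_LOCK = ".harness/sim.lock"

@contextmanager
def exclusive_sim_access(lock_path=SIM_LOCK):
    """Acquire an exclusive sim lock, or fail fast if another stream holds it.

    O_CREAT | O_EXCL makes creation atomic: exactly one process can
    create the lockfile, so only one stream touches the sim at a time.
    """
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError("sim is already in use by another stream")
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)  # release the sim for the next stream
```

Stream 1C would wrap its sim sessions in `with exclusive_sim_access(): ...`; any other stream that tries the same fails immediately instead of corrupting a run.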
## Current Agent Context
- **Active task:** 1A-01 — Rebuild donkeycar_sb3_runner.py with real PPO training
- **Read:** PROJECT-SPEC.md, DECISIONS.md, then pick from IMPLEMENTATION_PLAN.md
- **Important:** The existing `autoresearch_results.jsonl` contains RANDOM POLICY data; do not mix it with Phase 1 real training results. New results go to `autoresearch_results_phase1.jsonl`.
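To keep the phase separation above enforceable, each new record can be tagged with its phase as it is appended. A minimal sketch, assuming a JSONL append with one object per line; the record fields shown (`run_id`, `mean_reward`) are illustrative, not a fixed schema.

```python
import json
import time
from pathlib import Path

# Phase 1 results file named in the note above.
PHASE1_RESULTS = Path("autoresearch_results_phase1.jsonl")

def append_result(record, path=PHASE1_RESULTS):
    """Append one result as a single JSON line, tagged with phase 1.

    Tagging every record means a later reader can verify it never
    ingests random-policy (pre-phase-1) data by accident.
    """
    record = {**record, "phase": 1, "timestamp": time.time()}
    with path.open("a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```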