PHASE 1 MILESTONE:
- Champion model drives the track for 599 steps (mean_reward=1022.78, std=0.45)
- Path efficiency 96-100% throughout: genuine forward motion confirmed
- Navigates first right-hand curve successfully
- Fails at S-curve (right->left) at step ~560: speed too high for tight corners
- Root cause: only 4787 training timesteps, so the model never sees the S-curve often enough to learn it

PHASE 2 CONFIG (corner learning):
- timesteps: 10,000-50,000 (up to ~10x more than Phase 1; the model must experience the S-curve many times)
- learning_rate: 0.00005-0.002 (tightened around Phase 1 winning region)
- eval_episodes: 5 (more reliable corner stats)
- JOB_TIMEOUT: 3600s (50k steps on CPU needs time)
- Results: autoresearch_results_phase2.jsonl (clean separation from Phase 1)

Research documentation:
- Phase 1 milestone added to docs/RESEARCH_LOG.md
- Full trajectory analysis: start -> first corner -> S-curve crash position logged
- Reward shaping v3 path efficiency victory documented
- evaluate_champion.py added for visual + diagnostic evaluation

Agent: pi/claude-sonnet
Tests: 40/40 passing
Tests-Added: 0
TypeScript: N/A
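The Phase 2 search space described in the commit message can be sketched as a small Python helper. This is a minimal illustration, not the project's actual code: the function names `sample_trial` and `append_result`, the record field names, and the choice of log-uniform sampling for the learning rate are all assumptions. Only the ranges, the episode count, the timeout, and the results filename come from the notes above.

```python
import json
import math
import random
from pathlib import Path

# Values taken from the Phase 2 notes in the commit message.
TIMESTEPS_RANGE = (10_000, 50_000)
LEARNING_RATE_RANGE = (0.00005, 0.002)
EVAL_EPISODES = 5
JOB_TIMEOUT_S = 3600
RESULTS_PATH = Path("autoresearch_results_phase2.jsonl")


def sample_trial(rng: random.Random) -> dict:
    """Draw one hypothetical Phase 2 trial configuration."""
    lo, hi = LEARNING_RATE_RANGE
    # Assumption: log-uniform sampling, a common choice for learning rates.
    learning_rate = math.exp(rng.uniform(math.log(lo), math.log(hi)))
    # Clamp to guard against floating-point rounding at the range edges.
    learning_rate = min(max(learning_rate, lo), hi)
    return {
        "timesteps": rng.randint(*TIMESTEPS_RANGE),
        "learning_rate": learning_rate,
        "eval_episodes": EVAL_EPISODES,
        "job_timeout_s": JOB_TIMEOUT_S,
    }


def append_result(path: Path, trial: dict, mean_reward: float, std_reward: float) -> None:
    """Append one trial record to a JSONL file, one JSON object per line."""
    record = {**trial, "mean_reward": mean_reward, "std_reward": std_reward}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
```

A caller would draw a configuration with `sample_trial(random.Random(seed))`, run training and evaluation, then pass the measured reward statistics to `append_result(RESULTS_PATH, ...)`. Writing Phase 2 records to their own file keeps them separate from the Phase 1 results, as the commit message intends.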
| Name |
|---|
| .harness |
| agent |
| docs |
| tests |
| .gitignore |
| AGENT.md |
| DECISIONS.md |
| IMPLEMENTATION_PLAN.md |
| PROJECT-KICKOFF.md |
| PROJECT-SPEC.md |
| README.md |
| create_gitea_repo.py |
| ralph-loop.sh |
README.md
donkeycar-rl-autoresearch
Purpose
Status
- Scaffolded with the agent harness
- Spec not filled in yet
Runbook
- Fill in PROJECT-SPEC.md
- Create IMPLEMENTATION_PLAN.md from the spec
- Start the implementation loop