Paul Huliganga
|
7b8830f0cb
|
milestone: Phase 1 complete — genuine driving confirmed; launch Phase 2 corner learning
PHASE 1 MILESTONE:
- Champion model drives the track for 599 steps (mean_reward=1022.78, std=0.45)
- Path efficiency 96-100% throughout — genuine forward motion confirmed
- Navigates first right-hand curve successfully
- Fails at S-curve (right->left) at step ~560: speed too high for tight corners
- Root cause: only 4787 training timesteps — model never sees S-curve enough to learn it
PHASE 2 CONFIG (corner learning):
- timesteps: 10,000-50,000 (10x more — model must experience S-curve many times)
- learning_rate: 0.00005-0.002 (tightened around Phase 1 winning region)
- eval_episodes: 5 (more reliable corner stats)
- JOB_TIMEOUT: 3600s (50k steps on CPU needs time)
- Results: autoresearch_results_phase2.jsonl (clean separation from Phase 1)
Research documentation:
- Phase 1 milestone added to docs/RESEARCH_LOG.md
- Full trajectory analysis: start -> first corner -> S-curve crash position logged
- Reward shaping v3 path efficiency victory documented
- evaluate_champion.py added for visual + diagnostic evaluation
Agent: pi/claude-sonnet
Tests: 40/40 passing
Tests-Added: 0
TypeScript: N/A
|
2026-04-13 19:33:06 -04:00 |