PHASE 2 MILESTONE DOCUMENTED:
All 3 top models complete the full track with distinct driving styles:
- Trial 20 (n_steer=3): Right lane, stable steering — CHAMPION ✅
- Trial 8 (n_steer=4): Left/center lane, oscillating (still completes!)
- Trial 18 (n_steer=3): Right shoulder, very accurate line following
Key finding: fewer steering bins drive better; both n_steer=3 models are stable, while the n_steer=4 model oscillates (counterintuitive)
CTE symmetry explains left/right preference: the reward is symmetric in CTE, so random NN init determines which side a model settles on
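
To make the n_steer comparison concrete, here is a minimal sketch of mapping a continuous steering value onto n_steer discrete bins. It assumes the bins are evenly spaced over [-1.0, 1.0]; that layout and the names steering_bins and discretize_steering are hypothetical and not taken from the repository's discretize_action.

```python
def steering_bins(n_steer: int) -> list[float]:
    """Evenly spaced steering values covering [-1.0, 1.0] (assumed layout)."""
    if n_steer < 2:
        raise ValueError("n_steer must be at least 2")
    step = 2.0 / (n_steer - 1)
    return [-1.0 + i * step for i in range(n_steer)]


def discretize_steering(steer: float, n_steer: int) -> int:
    """Return the index of the bin closest to a continuous steering value."""
    bins = steering_bins(n_steer)
    # Clip first so out-of-range inputs map to the outermost bins.
    clipped = max(-1.0, min(1.0, steer))
    return min(range(n_steer), key=lambda i: abs(bins[i] - clipped))


if __name__ == "__main__":
    print(steering_bins(3))
    print(steering_bins(4))
```

Under this assumed layout, n_steer=3 includes an exact straight-ahead action (0.0) and n_steer=4 does not. That may contribute to the oscillation seen with four bins, but the source does not confirm it.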
BEHAVIORAL REWARD WRAPPERS (agent/behavioral_wrappers.py):
- LanePositionWrapper: target a specific CTE offset (control left/right preference)
- AntiOscillationWrapper: penalise rapid steering changes (targets the oscillation seen in Model 2, Trial 8)
- AsymmetricCTEWrapper: enforce right-lane rule (penalise left-of-centre more)
- CombinedBehavioralWrapper: all three combined in one wrapper
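
The three shaping terms above can be sketched as plain functions, independent of any environment wrapper class. This is an illustration only: the function names, the weights, the way the terms are combined, and the sign convention (negative CTE means left of centre) are all assumptions, not the code in agent/behavioral_wrappers.py.

```python
def lane_position_reward(cte: float, target_cte: float, max_cte: float) -> float:
    """1.0 at the target CTE offset, falling linearly to 0.0 at max_cte away."""
    return max(0.0, 1.0 - abs(cte - target_cte) / max_cte)


def oscillation_penalty(prev_steer: float, steer: float, weight: float) -> float:
    """Penalty proportional to the change in steering between two steps."""
    return weight * abs(steer - prev_steer)


def asymmetric_cte_penalty(cte: float, left_weight: float, right_weight: float) -> float:
    """Penalise distance from centre, more heavily on the left.

    Assumption: negative cte means left of centre.
    """
    weight = left_weight if cte < 0 else right_weight
    return weight * abs(cte)


def shaped_reward(base_reward: float, cte: float, prev_steer: float, steer: float,
                  *, target_cte: float = 0.0, max_cte: float = 2.0,
                  osc_weight: float = 0.1, left_weight: float = 0.2,
                  right_weight: float = 0.1) -> float:
    """One possible way to combine the three terms (hypothetical weighting)."""
    reward = base_reward * lane_position_reward(cte, target_cte, max_cte)
    reward -= oscillation_penalty(prev_steer, steer, osc_weight)
    reward -= asymmetric_cte_penalty(cte, left_weight, right_weight)
    return reward
```

Keeping each term as a pure function makes it straightforward to unit test the shaping logic without starting the simulator.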
ENHANCED EVALUATOR (agent/evaluate_champion.py):
- Full metrics: reward, lap time, oscillation score, CTE distribution, lane position
- --compare flag: runs all top Phase 2 models side by side with comparison table
- Saves eval summary to outerloop-results/eval_summary.jsonl
- Detects lap completion events from sim info dict
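
As a rough illustration of the evaluator outputs listed above, the sketch below computes one possible oscillation score (mean absolute change in steering between consecutive steps) and appends a summary record to a JSONL file. The score definition, the function names, and the record fields are assumptions; the actual metrics in agent/evaluate_champion.py may be defined differently.

```python
import json
from pathlib import Path


def oscillation_score(steering: list[float]) -> float:
    """Mean absolute steering change between consecutive steps (assumed metric)."""
    if len(steering) < 2:
        return 0.0
    deltas = [abs(b - a) for a, b in zip(steering, steering[1:])]
    return sum(deltas) / len(deltas)


def append_summary(path: Path, record: dict) -> None:
    """Append one evaluation record as a single JSON line."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


if __name__ == "__main__":
    # Hypothetical record; field names and values are for illustration only.
    steering_trace = [0.0, 1.0, 0.0, -1.0]
    append_summary(
        Path("outerloop-results/eval_summary.jsonl"),
        {"trial": 0, "oscillation_score": oscillation_score(steering_trace)},
    )
```

Appending one JSON object per line keeps earlier evaluation runs intact and lets the file be read back line by line.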
IMPLEMENTATION PLAN updated: Wave 3 streams defined
RESEARCH LOG updated: Phase 2 milestone, behavioral analysis, next steps
Champion updated to Trial 20 (Phase 2)
Agent: pi/claude-sonnet
Tests: 53/53 passing (+13 behavioral wrapper tests)
Tests-Added: +13
TypeScript: N/A
- Rebuilt donkeycar_sb3_runner.py: real PPO/DQN model.learn() + evaluate_policy() + model.save()
- Added SpeedRewardWrapper: reward = speed * (1 - |cte|/max_cte)
- Added ChampionTracker: tracks best model across all trials, writes manifest.json
- Rebuilt autoresearch_controller.py: Phase 1 results separated from random-policy data
- Added timesteps to GP search space
- Added --push-every N for automatic git push
- Added 37 passing tests: discretize_action, reward_wrapper, autoresearch_controller, runner_integration
- Scaffolded project with agent harness (large mode): PROJECT-SPEC, DECISIONS, IMPLEMENTATION_PLAN, EXECUTION_MASTER
- Fixed: model.save() is no longer called before model is defined (this was the root cause of all prior NameError crashes)
- Fixed: evaluation now runs the real trained policy instead of a random policy
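
A minimal sketch of two items from the list above: the SpeedRewardWrapper formula as a plain function, and a best-model tracker that writes a manifest file. The formula is taken from the bullet as written, with no clamping added. The class name BestModelTracker, its method, and the manifest fields are hypothetical and do not describe the actual ChampionTracker.

```python
import json
from pathlib import Path
from typing import Optional


def speed_reward(speed: float, cte: float, max_cte: float) -> float:
    """reward = speed * (1 - |cte| / max_cte), as stated above.

    Not clamped: the result is negative when |cte| exceeds max_cte.
    """
    return speed * (1.0 - abs(cte) / max_cte)


class BestModelTracker:
    """Keeps the best trial seen so far and writes it to a JSON manifest."""

    def __init__(self, manifest_path: Path) -> None:
        self.manifest_path = manifest_path
        self.best: Optional[dict] = None

    def update(self, trial: int, mean_reward: float, model_path: str) -> bool:
        """Record the trial if it beats the current best; return True if it did."""
        if self.best is not None and mean_reward <= self.best["mean_reward"]:
            return False
        # Manifest fields are assumptions made for this sketch.
        self.best = {
            "trial": trial,
            "mean_reward": mean_reward,
            "model_path": model_path,
        }
        self.manifest_path.write_text(json.dumps(self.best, indent=2), encoding="utf-8")
        return True
```

Because the reward depends only on |cte|, it is identical for equal offsets left and right of centre.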
Agent: pi/claude-sonnet
Tests: 37/37 passing
Tests-Added: +37
TypeScript: N/A