donkeycar-rl-autoresearch

Paul Huliganga beb04f3ebe fix: reward v6 — efficiency gate prevents circular driving, stuck_steps 80→40		2026-04-19 12:02:55 -04:00

v5 dropped the efficiency term to get gradient signal on hills, but this re-enabled circular driving (observed in Exp 11). v6 adds efficiency back as a GATE (not multiplier): if efficiency < 0.15, reward = 0. Otherwise reward = speed × CTE_quality (same as v5).

Gate vs multiplier: v4 used efficiency as a multiplier which killed gradient on hills (all terms → 0 simultaneously). v6's gate passes when efficiency is above threshold (car moving forward, even slowly on hill) and only blocks when car is truly circling.

Also reduced stuck_steps from 80 to 40 (~2.5s vs ~5s); user reported car stuck against barriers for ~10s which is too long with DummyVecEnv.
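
The gate-versus-multiplier distinction in the commit message can be sketched as below. This is a minimal illustration, not the repository's actual wrapper. The function names, the `max_cte` parameter, and the linear form of `cte_quality` are assumptions made for the example. The commit states only the 0.15 threshold and the `speed × CTE_quality` product.

```python
# Hypothetical sketch of the v4 (multiplier) and v6 (gate) reward shapes.
# Names and the cte_quality formula are illustrative assumptions.

def cte_quality(cte: float, max_cte: float) -> float:
    """Assumed lane-keeping score: 1.0 on the centerline, 0.0 at max_cte or beyond."""
    return max(0.0, 1.0 - abs(cte) / max_cte)


def reward_v4(speed: float, cte: float, max_cte: float, efficiency: float) -> float:
    """Efficiency as a multiplier: low efficiency shrinks the whole reward."""
    return speed * cte_quality(cte, max_cte) * efficiency


def reward_v6(speed: float, cte: float, max_cte: float, efficiency: float,
              min_efficiency: float = 0.15) -> float:
    """Efficiency as a gate: zero below the threshold, otherwise unscaled."""
    if efficiency < min_efficiency:
        return 0.0  # car is judged to be circling; block the reward entirely
    return speed * cte_quality(cte, max_cte)  # same as v5 once the gate passes


# Slow climb on a hill: low but above-threshold efficiency.
hill_v4 = reward_v4(speed=0.5, cte=0.0, max_cte=2.0, efficiency=0.2)
hill_v6 = reward_v6(speed=0.5, cte=0.0, max_cte=2.0, efficiency=0.2)

# Circling: efficiency below the 0.15 threshold, so the gate closes.
circling_v6 = reward_v6(speed=1.0, cte=0.0, max_cte=2.0, efficiency=0.10)
```

With these inputs the hill case keeps its full `speed × cte_quality` value under v6 while v4 scales it down by the efficiency factor, and the circling case earns nothing under v6.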
__init__.py	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
test_autoresearch_controller.py	fix: reward v4 — full sim bypass kills circular driving at root	2026-04-13 20:56:32 -04:00
test_behavioral_wrappers.py	feat: Phase 3 — behavioral control, enhanced evaluator, 53 tests	2026-04-14 09:28:43 -04:00
test_discretize_action.py	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
test_end_to_end.py	Wave 4: scratch training on generated_track + mountain_track, zero-shot mini_monaco	2026-04-14 22:40:38 -04:00
test_reward_wrapper.py	fix: reward v6 — efficiency gate prevents circular driving, stuck_steps 80→40	2026-04-19 12:02:55 -04:00
test_runner_integration.py	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
test_wave3.py	fix: StuckTerminationWrapper + deque import + 102 tests	2026-04-15 09:17:27 -04:00