v5 dropped the efficiency term to restore gradient signal on hills, but that re-enabled circular driving (observed in Exp 11). v6 adds efficiency back as a gate, not a multiplier: if efficiency < 0.15, reward = 0; otherwise reward = speed × CTE_quality (same as v5). Gate vs. multiplier: v4 used efficiency as a multiplier, which killed the gradient on hills because all terms went to 0 simultaneously. v6's gate passes whenever efficiency is at or above the threshold (car moving forward, even slowly on a hill) and blocks only when the car is truly circling. Also reduced stuck_steps from 80 to 40 (~2.5 s instead of ~5 s): the user reported the car stuck against barriers for ~10 s, which is too long with DummyVecEnv.

| Name | | |
|---|---|---|
| __init__.py | ||
| test_autoresearch_controller.py | ||
| test_behavioral_wrappers.py | ||
| test_discretize_action.py | ||
| test_end_to_end.py | ||
| test_reward_wrapper.py | ||
| test_runner_integration.py | ||
| test_wave3.py | ||
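
The gate-versus-multiplier distinction in the commit message above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's reward wrapper: the function names, the argument names, and the exact v4 form (a plain product of the three terms) are assumptions. Only the 0.15 threshold and the `speed × CTE_quality` product come from the commit message.

```python
# Hypothetical sketch of the reward variants described in the commit message.
# Function and parameter names are illustrative, not the project's API.

EFFICIENCY_GATE = 0.15  # threshold stated in the commit message


def reward_v4(speed: float, cte_quality: float, efficiency: float) -> float:
    """Assumed v4 form: efficiency scales the reward as a multiplier.

    On a hill, speed and efficiency shrink together, so the product
    collapses toward 0 and carries little gradient signal.
    """
    return speed * cte_quality * efficiency


def reward_v5(speed: float, cte_quality: float) -> float:
    """v5: no efficiency term, so circling is not penalised."""
    return speed * cte_quality


def reward_v6(
    speed: float,
    cte_quality: float,
    efficiency: float,
    gate: float = EFFICIENCY_GATE,
) -> float:
    """v6: efficiency acts only as a gate on the v5 reward."""
    if efficiency < gate:
        return 0.0  # blocked: the car is treated as circling
    return reward_v5(speed, cte_quality)  # passed: efficiency does not scale it
```

Under these assumptions, a slow climb with efficiency 0.2 earns the full v5 reward under v6 but only a fifth of it under the assumed v4 form, while any step with efficiency below 0.15 earns nothing under v6.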