Paul Huliganga beb04f3ebe	2026-04-19 12:02:55 -04:00
fix: reward v6 - efficiency gate prevents circular driving, stuck_steps 80→40

v5 dropped the efficiency term to get gradient signal on hills, but this re-enabled circular driving (observed in Exp 11). v6 adds efficiency back as a GATE (not a multiplier): if efficiency < 0.15, reward = 0. Otherwise reward = speed × CTE_quality (same as v5).

Gate vs multiplier: v4 used efficiency as a multiplier, which killed the gradient on hills (all terms → 0 simultaneously). v6's gate passes whenever efficiency is above the threshold (car moving forward, even slowly on a hill) and only blocks when the car is truly circling.

Also reduced stuck_steps from 80 to 40 (~2.5s vs ~5s): the user reported the car stuck against barriers for ~10s, which is too long with DummyVecEnv.
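
The gate-versus-multiplier distinction described in this commit can be sketched as below. This is a minimal illustration, not the project's reward code: the function names are hypothetical, `speed`, `cte_quality` and `efficiency` are assumed to be precomputed values in roughly [0, 1], and the exact v4 formula is an assumption (the commit only says efficiency was a multiplier). The 0.15 threshold and the v5/v6 formula come from the commit message. The ~16 steps per second figure is inferred from "40 steps ≈ 2.5s".

```python
EFFICIENCY_GATE = 0.15  # threshold stated in the commit message


def reward_v4_style(speed, cte_quality, efficiency):
    """Assumed v4 shape: efficiency scales the reward as a multiplier.

    On a hill both speed and efficiency are small, so the product
    collapses toward zero and gives little learning signal.
    """
    return speed * cte_quality * efficiency


def reward_v5(speed, cte_quality):
    """v5: no efficiency term, so circling at speed is still rewarded."""
    return speed * cte_quality


def reward_v6(speed, cte_quality, efficiency, gate=EFFICIENCY_GATE):
    """v6: efficiency only decides whether any reward is paid at all."""
    if efficiency < gate:
        return 0.0  # car is circling: block the reward entirely
    return speed * cte_quality  # same value as v5 once the gate passes


def stuck_seconds(stuck_steps, steps_per_second=16.0):
    """Convert a stuck_steps limit to seconds (step rate is inferred)."""
    return stuck_steps / steps_per_second
```

For a slow hill climb such as `speed=0.2, cte_quality=0.9, efficiency=0.3`, the gated v6 reward stays at the v5 value while the multiplier form shrinks it by the efficiency factor. For fast circling such as `speed=1.0, cte_quality=0.9, efficiency=0.05`, v5 pays nearly full reward and v6 pays nothing.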
README.md	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
exp6_mountain_v5_proper.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
exp7_mountain_proper.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
exp8_mountain_clean.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
exp9_mountain_v5_throttle02.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
exp10_two_tracks.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
exp11_parallel_envs.py	fix: reward v6 — efficiency gate prevents circular driving, stuck_steps 80→40	2026-04-19 12:02:55 -04:00
mountain_continue.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
mountain_high_throttle.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
mountain_v5.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
overnight.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
wave5_train.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00

README.md

Experiment Scripts

These scripts were used to run individual training experiments. Each corresponds to an entry in docs/TEST_HISTORY.md.

Script	Experiment	Key change
mountain_v5.py	Exp 5	v5 reward + throttle_min=0.5, direct model.learn()
mountain_continue.py	Exp 4	Continued Exp3 training
mountain_high_throttle.py	Exp 3	throttle_min=0.5, old v4 reward
exp6_mountain_v5_proper.py	Exp 6	v5 + termination, wrong steps_per_switch (=total)
exp7_mountain_proper.py	Exp 7	v5 + termination, correct steps_per_switch=6000, had phantom car issue
exp8_mountain_clean.py	Exp 8	v5 + throttle_min=0.5, single connection, correct checkpointing
exp9_mountain_v5_throttle02.py	Exp 9	v5 + throttle_min=0.2, OUR BEST MODEL
exp10_two_tracks.py	Exp 10	Two tracks via custom script (abandoned — used multitrack_runner.py instead)
overnight.py	Overnight runs	mountain-only and Trial9-repeat experiments
wave5_train.py	Wave 5	generated_track only with throttle_min=0.2

Rule going forward

ALL experiment scripts must be saved here and committed to git BEFORE running. Scripts in /tmp are lost on reboot.

Running experiments

Use multitrack_runner.py directly for two-track training:

    python3 multitrack_runner.py --total-timesteps 90000 --steps-per-switch 6000 ...
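
To see what these two numbers imply, the sketch below lays out a switching schedule. It assumes the runner rotates through the tracks round-robin every `steps_per_switch` steps. That behaviour is an assumption about multitrack_runner.py, not something confirmed here, and the function and track names are hypothetical.

```python
def switch_schedule(total_timesteps, steps_per_switch, tracks):
    """Return a list of (track, steps) segments, rotating round-robin.

    Hypothetical helper: it only illustrates the arithmetic behind
    --total-timesteps and --steps-per-switch.
    """
    if steps_per_switch <= 0:
        raise ValueError("steps_per_switch must be positive")
    segments = []
    done = 0
    index = 0
    while done < total_timesteps:
        # The last segment may be shorter if the total is not a multiple.
        steps = min(steps_per_switch, total_timesteps - done)
        segments.append((tracks[index % len(tracks)], steps))
        done += steps
        index += 1
    return segments


# Placeholder track names, not the project's actual track identifiers.
TRACKS = ["track_a", "track_b"]

good = switch_schedule(90_000, 6_000, TRACKS)  # 15 segments, alternating
bad = switch_schedule(90_000, 90_000, TRACKS)  # 1 segment, never switches
```

Under that assumption, 90000 total steps with a switch every 6000 gives 15 segments. Setting steps_per_switch equal to the total, as in Exp 6, gives a single segment on the first track.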

For single-track experiments, use the pattern from exp8/exp9:

- VecTransposeImage(DummyVecEnv([make_env])) for env creation
- Direct model.learn() loop with manual checkpointing
- No close_and_switch() for single track
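
The single-track pattern above might look roughly like the following. It is a sketch under assumptions, not the code from exp8/exp9: `make_env` stands for the project's environment factory, the function names, checkpoint interval and file naming are hypothetical, and the model is passed in because the algorithm is not stated here. `DummyVecEnv`, `VecTransposeImage`, `learn(total_timesteps=..., reset_num_timesteps=...)` and `save(path)` are Stable-Baselines3 calls.

```python
from pathlib import Path


def build_single_track_env(make_env):
    """Wrap one env factory the way the pattern above describes."""
    # Imported lazily so the checkpoint loop can be used without SB3 loaded.
    from stable_baselines3.common.vec_env import DummyVecEnv, VecTransposeImage

    # A single connection: exactly one factory in the DummyVecEnv list.
    return VecTransposeImage(DummyVecEnv([make_env]))


def learn_with_checkpoints(model, total_timesteps, checkpoint_every,
                           save_dir, prefix="model"):
    """Call model.learn() in chunks and save after each chunk.

    Returns the list of checkpoint paths that were written.
    """
    save_dir = Path(save_dir)
    save_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    done = 0
    while done < total_timesteps:
        chunk = min(checkpoint_every, total_timesteps - done)
        # reset_num_timesteps=False keeps the step counter and logging
        # continuous across repeated learn() calls.
        model.learn(total_timesteps=chunk, reset_num_timesteps=False)
        done += chunk
        path = save_dir / f"{prefix}_{done}_steps"
        model.save(str(path))
        saved.append(path)
    return saved
```

With an on-policy algorithm the number of steps actually collected per `learn()` call may round up to the rollout size, so the step count in the file name is the requested total rather than an exact measurement.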