Paul Huliganga 2d52bb4ffc fix(core): replace exploit bandaids with solid physics barriers + clean reward

Root cause: barriers were zero-thickness MeshCollider planes with no CCD on the car. The car tunnelled through between frames. Every Python patch was trying to catch in code what physics should enforce.

Unity (source only; build in progress):
- RoadBuilder.cs: CreateBarrier() now makes BoxCollider-per-segment with real 3D volume (barrierThickness=1.0m default) + half-thickness overlap at corners to seal gaps. CreateEndCap() seals open ends of non-looping tracks (generated_road).
- Car.cs: rb.collisionDetectionMode = Continuous in Awake(), which prevents tunneling.

Python:
- reward_wrapper.py v7: removed CTE-patience termination, high-CTE negative reward, solid_hit monitoring, low-speed/wedge detection. Kept: efficiency gate, no-progress (active_node) termination, lap exploit guard. Reward = speed×CTE_quality.
- exp23_generated_road_clean.py: single track, no warm-start, 200k steps, clean reward, MAX_EPISODE_SECONDS=120 as safety net only.
- tests: 17 tests covering clean reward properties.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>		2026-05-05 15:56:00 -04:00
README.md	fix: exp18 — fix circular exploit in parallel training (window=200, min_lap=12s)	2026-04-28 09:00:42 -04:00
exp6_mountain_v5_proper.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
exp7_mountain_proper.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
exp8_mountain_clean.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
exp9_mountain_v5_throttle02.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
exp10_two_tracks.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
exp11_parallel_envs.py	fix: reward v6 — efficiency gate prevents circular driving, stuck_steps 80→40	2026-04-19 12:02:55 -04:00
exp11b_parallel_v6.py	feat: Exp 11b — parallel DummyVecEnv + v6 reward (anti-circle gate) + built-in eval	2026-04-19 12:03:46 -04:00
exp11c_parallel_v6_250k.py	feat: Exp 11c — parallel DummyVecEnv + v6 reward, extended to 250k steps	2026-04-19 13:27:38 -04:00
exp11d_parallel_v61.py	fix: exp11d remove progress_patience — grass fix only per ADR-020	2026-04-19 16:18:17 -04:00
exp12_mountain_single.py	fix: reward v6.1 — active_node progress terminator kills circle/stuck exploits	2026-04-19 17:01:41 -04:00
exp13_gentrack_v4.py	feat: Exp 13 — generated_track, v4 reward, back to basics (no extra heuristics)	2026-04-19 17:33:17 -04:00
exp14_finetune_v5.py	fix: import json, use make_env_base in phase switch, and run eval sequentially to avoid second concurrent sim car	2026-04-19 20:37:25 -04:00
exp14_mountain_v5.py	fix: exp14 — proper track switch via exit_scene before connecting to mountain_track	2026-04-19 19:18:33 -04:00
exp15_gentrack_from_mountain.py	fix: force scene reset before exp15 generated-track warm-start so sim actually loads generated_track	2026-04-20 16:36:00 -04:00
exp16_mountain_from_gentrack.py	feat: add cross-track warm-start experiments for mountain->generated and generated->mountain	2026-04-20 16:34:24 -04:00
exp17_parallel_450k.py	feat: add exp17 parallel DummyVecEnv 450k training + strategy docs	2026-04-28 02:42:20 -04:00
exp18_parallel_450k_v2.py	fix: exp18 — fix circular exploit in parallel training (window=200, min_lap=12s)	2026-04-28 09:00:42 -04:00
exp19_parallel_450k_v3.py	fix: exp19 — hard episode time limit to stop minutes-long stuck cars	2026-04-28 09:18:04 -04:00
exp20_parallel_450k_v5.py	feat(exp22): add solid-hit/wedge/high-CTE exploit fixes and generated-pair warm experiments	2026-05-05 14:46:13 -04:00
exp21_generated_pair_warm_v4.py	feat(exp22): add solid-hit/wedge/high-CTE exploit fixes and generated-pair warm experiments	2026-05-05 14:46:13 -04:00
exp22_generated_pair_warm_v6.py	feat(exp22): add solid-hit/wedge/high-CTE exploit fixes and generated-pair warm experiments	2026-05-05 14:46:13 -04:00
exp23_generated_road_clean.py	fix(core): replace exploit bandaids with solid physics barriers + clean reward	2026-05-05 15:56:00 -04:00
mountain_continue.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
mountain_high_throttle.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
mountain_v5.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
overnight.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00
wave5_train.py	save: all experiment scripts moved from /tmp to agent/experiments/	2026-04-18 21:30:08 -04:00

README.md

Experiment Scripts

These scripts were used to run individual training experiments. Each corresponds to an entry in docs/TEST_HISTORY.md.

Script	Experiment	Key change
exp18_parallel_450k_v2.py	Exp 18	Exp 17 + exploit fix: window=200, min_lap_time=12s
exp17_parallel_450k.py	Exp 17	Parallel DummyVecEnv, 450k steps, v6 reward — circular exploit dominated
mountain_v5.py	Exp 5	v5 reward + throttle_min=0.5, direct model.learn()
mountain_continue.py	Exp 4	Continued Exp3 training
mountain_high_throttle.py	Exp 3	throttle_min=0.5, old v4 reward
exp6_mountain_v5_proper.py	Exp 6	v5 + termination, wrong steps_per_switch (=total)
exp7_mountain_proper.py	Exp 7	v5 + termination, correct steps_per_switch=6000, had phantom car issue
exp8_mountain_clean.py	Exp 8	v5 + throttle_min=0.5, single connection, correct checkpointing
exp9_mountain_v5_throttle02.py	Exp 9	v5 + throttle_min=0.2, OUR BEST MODEL
exp10_two_tracks.py	Exp 10	Two tracks via custom script (abandoned — used multitrack_runner.py instead)
overnight.py	Overnight runs	mountain-only and Trial9-repeat experiments
wave5_train.py	Wave 5	generated_track only with throttle_min=0.2

Rule going forward

ALL experiment scripts must be saved here and committed to git BEFORE running. Scripts in /tmp are lost on reboot.
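
One way to enforce this rule is a small pre-run guard that refuses to start an experiment whose script is untracked or has uncommitted changes. The sketch below is illustrative only: `is_committed` and `require_committed` are hypothetical helper names that do not exist in this repo, and the check relies only on the standard `git ls-files --error-unmatch` and `git status --porcelain` commands.

```python
import subprocess
import sys


def is_committed(script_path, run=subprocess.run):
    """Return True if script_path is tracked by git and has no uncommitted changes.

    `run` is injectable so the check can be exercised without a real repository.
    """
    # Exits non-zero if the file is untracked (or we are not inside a git repo).
    tracked = run(
        ["git", "ls-files", "--error-unmatch", script_path],
        capture_output=True,
        text=True,
    )
    if tracked.returncode != 0:
        return False
    # Prints one line per modified/staged path; empty output means a clean file.
    status = run(
        ["git", "status", "--porcelain", "--", script_path],
        capture_output=True,
        text=True,
    )
    return status.returncode == 0 and status.stdout.strip() == ""


def require_committed(script_path, run=subprocess.run):
    """Abort the process unless script_path is committed."""
    if not is_committed(script_path, run=run):
        sys.exit(f"Refusing to run: commit {script_path} to git first.")
```

An experiment script could call `require_committed(__file__)` as its first statement, assuming it is launched from inside the repository.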

Running experiments

Use multitrack_runner.py directly for two-track training:

    python3 multitrack_runner.py --total-timesteps 90000 --steps-per-switch 6000 ...

For single-track experiments, use the pattern from exp8/exp9:

- VecTransposeImage(DummyVecEnv([make_env])) for env creation
- Direct model.learn() loop with manual checkpointing
- No close_and_switch() for single track
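
The single-track pattern above can be sketched roughly as follows. This is a minimal illustration, not the code from exp8/exp9: `chunk_sizes`, `train_with_checkpoints`, the checkpoint file naming, and the choice of PPO in the comments are all assumptions. The loop only needs an object with Stable-Baselines3-style `learn()` and `save()` methods.

```python
from pathlib import Path


def chunk_sizes(total_timesteps, checkpoint_every):
    """Yield learn() chunk sizes that sum to exactly total_timesteps."""
    if total_timesteps < 0 or checkpoint_every <= 0:
        raise ValueError("total_timesteps must be >= 0 and checkpoint_every > 0")
    done = 0
    while done < total_timesteps:
        step = min(checkpoint_every, total_timesteps - done)
        yield step
        done += step


def train_with_checkpoints(model, total_timesteps, checkpoint_every, out_dir):
    """Call model.learn() in chunks and save a checkpoint after each chunk."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    done = 0
    for step in chunk_sizes(total_timesteps, checkpoint_every):
        # reset_num_timesteps=False keeps the step counter running across calls.
        model.learn(total_timesteps=step, reset_num_timesteps=False)
        done += step
        path = out_dir / f"checkpoint_{done}_steps"
        model.save(str(path))
        saved.append(path)
    return saved


# Hypothetical wiring (assumes stable_baselines3 is installed and make_env is
# the repo's environment factory; PPO is an assumed algorithm choice):
#
#   from stable_baselines3 import PPO
#   from stable_baselines3.common.vec_env import DummyVecEnv, VecTransposeImage
#
#   env = VecTransposeImage(DummyVecEnv([make_env]))  # single sim connection
#   model = PPO("CnnPolicy", env)
#   train_with_checkpoints(model, 200_000, 10_000, "checkpoints/")
```

With on-policy algorithms the real step count per chunk may overshoot slightly, because rollouts are collected in fixed-size batches, so the names of the saved checkpoints reflect requested steps rather than exact ones.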