The StuckTerminationWrapper wall-clock timer could be reset by barrier-sliding: a car drifting 0.5 m along a wall repeatedly reset the 12 s timer. At low sim fps (1-2 fps when both cars are stuck), the 40-step check also takes minutes. Fix: added max_episode_seconds=30, a hard wall-clock limit per episode that is independent of position and sim fps, so no episode can run longer than 30 s. Also adds monitor_training.sh, an independent shell process that checks every 5 minutes and appends status to /tmp/training_monitor.log; it works without Claude being active. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
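A minimal sketch of the hard wall-clock limit described in the commit message, not the repository's actual implementation. The class name `WallClockLimit`, the `clock` parameter, the `wall_clock_timeout` info key, and the Gymnasium-style five-value `step()` return are all assumptions made for illustration. The point it shows is that the timer starts only in `reset()`, so nothing the car does during the episode (such as sliding along a barrier) can restart it.

```python
import time


class WallClockLimit:
    """Hypothetical wrapper: truncate any episode after max_episode_seconds.

    Assumes the wrapped env follows the Gymnasium-style API where step()
    returns (obs, reward, terminated, truncated, info).
    """

    def __init__(self, env, max_episode_seconds=30.0, clock=time.monotonic):
        self.env = env
        self.max_episode_seconds = max_episode_seconds
        self._clock = clock      # injectable so the limit can be tested
        self._start = None

    def reset(self, **kwargs):
        # The only place the timer is (re)started.
        self._start = self._clock()
        return self.env.reset(**kwargs)

    def step(self, action):
        if self._start is None:  # step() called before reset()
            self._start = self._clock()
        obs, reward, terminated, truncated, info = self.env.step(action)
        elapsed = self._clock() - self._start
        if elapsed >= self.max_episode_seconds:
            # Independent of car position and of how many sim steps ran.
            truncated = True
            info = dict(info, wall_clock_timeout=True)
        return obs, reward, terminated, truncated, info
```

Because the check uses elapsed real time rather than a step count, it fires after the same 30 s whether the simulator runs at 20 fps or at 1-2 fps.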
| Name |
|---|
| README.md |
| exp6_mountain_v5_proper.py |
| exp7_mountain_proper.py |
| exp8_mountain_clean.py |
| exp9_mountain_v5_throttle02.py |
| exp10_two_tracks.py |
| exp11_parallel_envs.py |
| exp11b_parallel_v6.py |
| exp11c_parallel_v6_250k.py |
| exp11d_parallel_v61.py |
| exp12_mountain_single.py |
| exp13_gentrack_v4.py |
| exp14_finetune_v5.py |
| exp14_mountain_v5.py |
| exp15_gentrack_from_mountain.py |
| exp16_mountain_from_gentrack.py |
| exp17_parallel_450k.py |
| exp18_parallel_450k_v2.py |
| exp19_parallel_450k_v3.py |
| mountain_continue.py |
| mountain_high_throttle.py |
| mountain_v5.py |
| overnight.py |
| wave5_train.py |

README.md
Experiment Scripts
These scripts were used to run individual training experiments. Each corresponds to an entry in docs/TEST_HISTORY.md.
| Script | Experiment | Key change |
|---|---|---|
| exp18_parallel_450k_v2.py | Exp 18 | Exp 17 + exploit fix: window=200, min_lap_time=12s |
| exp17_parallel_450k.py | Exp 17 | Parallel DummyVecEnv, 450k steps, v6 reward — circular exploit dominated |
| mountain_v5.py | Exp 5 | v5 reward + throttle_min=0.5, direct model.learn() |
| mountain_continue.py | Exp 4 | Continued Exp3 training |
| mountain_high_throttle.py | Exp 3 | throttle_min=0.5, old v4 reward |
| exp6_mountain_v5_proper.py | Exp 6 | v5 + termination, steps_per_switch wrongly set equal to the total step count |
| exp7_mountain_proper.py | Exp 7 | v5 + termination, correct steps_per_switch=6000, had phantom car issue |
| exp8_mountain_clean.py | Exp 8 | v5 + throttle_min=0.5, single connection, correct checkpointing |
| exp9_mountain_v5_throttle02.py | Exp 9 | v5 + throttle_min=0.2, OUR BEST MODEL |
| exp10_two_tracks.py | Exp 10 | Two tracks via custom script (abandoned — used multitrack_runner.py instead) |
| overnight.py | Overnight runs | mountain-only and Trial9-repeat experiments |
| wave5_train.py | Wave 5 | generated_track only with throttle_min=0.2 |
Rule going forward
ALL experiment scripts must be saved here and committed to git BEFORE running. Scripts in /tmp are lost on reboot.
Running experiments
Use multitrack_runner.py directly for two-track training:

    python3 multitrack_runner.py --total-timesteps 90000 --steps-per-switch 6000 ...

For single-track experiments, use the pattern from exp8/exp9:
- VecTransposeImage(DummyVecEnv([make_env])) for env creation
- Direct model.learn() loop with manual checkpointing
- No close_and_switch() for single track
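The single-track pattern above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in exp8/exp9: the helper name `train_with_checkpoints`, the `ckpt` file prefix, and `StubModel` are hypothetical, and the use of PPO with `CnnPolicy` in the comments is a guess. The only library behaviour relied on is the Stable-Baselines3 `learn(total_timesteps=..., reset_num_timesteps=...)` and `save(path)` methods, so the loop runs against the stub without a simulator.

```python
import os

# With Stable-Baselines3 installed, the environment and model would be built
# along these lines (make_env is the project's own factory; PPO is assumed):
#
#   from stable_baselines3 import PPO
#   from stable_baselines3.common.vec_env import DummyVecEnv, VecTransposeImage
#   env = VecTransposeImage(DummyVecEnv([make_env]))
#   model = PPO("CnnPolicy", env)


def train_with_checkpoints(model, total_timesteps, checkpoint_every,
                           out_dir, prefix="ckpt"):
    """Call model.learn() in chunks and save a checkpoint after each chunk."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    done = 0
    while done < total_timesteps:
        chunk = min(checkpoint_every, total_timesteps - done)
        # reset_num_timesteps=False keeps the step counter continuous
        # across successive learn() calls.
        model.learn(total_timesteps=chunk, reset_num_timesteps=False)
        done += chunk
        path = os.path.join(out_dir, f"{prefix}_{done}_steps")
        model.save(path)
        saved.append(path)
    return saved


class StubModel:
    """Hypothetical stand-in that mimics the learn()/save() interface."""

    def __init__(self):
        self.num_timesteps = 0

    def learn(self, total_timesteps, reset_num_timesteps=True):
        if reset_num_timesteps:
            self.num_timesteps = 0
        self.num_timesteps += total_timesteps
        return self

    def save(self, path):
        # Write a small marker file where a real model would write its archive.
        with open(path + ".zip", "w") as fh:
            fh.write(str(self.num_timesteps))
```

With a real on-policy model, each `learn()` call collects whole rollouts, so the number of steps actually run per chunk may be somewhat larger than `checkpoint_every`; the names in the checkpoint paths reflect the requested step counts, not the exact ones.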