Warm-starts from wave4-trial-0009/model.zip (best mini-monaco model, completed laps). Fine-tunes on the generated track with the continuous Box action space preserved (no DiscretizedActionWrapper) at LR=0.00005. Runs 50k steps, checkpointing every 5k, with a zero-shot mini-monaco eval at the end. Tests whether additional generated-track exposure improves corner handling on mini-monaco without catastrophic forgetting of driving skill.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
||
|---|---|---|
| README.md | ||
| eval_best_models.py | ||
| eval_gentrack_on_minimonaco.py | ||
| exp6_mountain_v5_proper.py | ||
| exp7_mountain_proper.py | ||
| exp8_mountain_clean.py | ||
| exp9_mountain_v5_throttle02.py | ||
| exp10_two_tracks.py | ||
| exp11_parallel_envs.py | ||
| exp11b_parallel_v6.py | ||
| exp11c_parallel_v6_250k.py | ||
| exp11d_parallel_v61.py | ||
| exp12_mountain_single.py | ||
| exp13_gentrack_v4.py | ||
| exp14_finetune_v5.py | ||
| exp14_mountain_v5.py | ||
| exp15_gentrack_from_mountain.py | ||
| exp16_mountain_from_gentrack.py | ||
| exp17_parallel_450k.py | ||
| exp18_parallel_450k_v2.py | ||
| exp19_parallel_450k_v3.py | ||
| exp20_parallel_450k_v5.py | ||
| exp21_generated_pair_warm_v4.py | ||
| exp22_generated_pair_warm_v6.py | ||
| exp23_generated_road_clean.py | ||
| exp24_generated_road_discrete.py | ||
| exp25_wheel_collision_fix.py | ||
| exp26_generated_road_warmstart.py | ||
| exp27_random_roads.py | ||
| exp28_gentrack_finetune.py | ||
| exp29_wave4_gentrack_finetune.py | ||
| mountain_continue.py | ||
| mountain_high_throttle.py | ||
| mountain_v5.py | ||
| overnight.py | ||
| verify_minimap_refresh.py | ||
| verify_road_regen.py | ||
| wave5_train.py | ||
Experiment Scripts
These scripts were used to run individual training experiments. Each corresponds to an entry in docs/TEST_HISTORY.md.
| Script | Experiment | Key change |
|---|---|---|
| exp18_parallel_450k_v2.py | Exp 18 | Exp 17 + exploit fix: window=200, min_lap_time=12s |
| exp17_parallel_450k.py | Exp 17 | Parallel DummyVecEnv, 450k steps, v6 reward — circular exploit dominated |
| mountain_v5.py | Exp 5 | v5 reward + throttle_min=0.5, direct model.learn() |
| mountain_continue.py | Exp 4 | Continued Exp 3 training |
| mountain_high_throttle.py | Exp 3 | throttle_min=0.5, old v4 reward |
| exp6_mountain_v5_proper.py | Exp 6 | v5 + termination, steps_per_switch wrongly set equal to the total step count |
| exp7_mountain_proper.py | Exp 7 | v5 + termination, correct steps_per_switch=6000, had phantom car issue |
| exp8_mountain_clean.py | Exp 8 | v5 + throttle_min=0.5, single connection, correct checkpointing |
| exp9_mountain_v5_throttle02.py | Exp 9 | v5 + throttle_min=0.2, OUR BEST MODEL |
| exp10_two_tracks.py | Exp 10 | Two tracks via custom script (abandoned — used multitrack_runner.py instead) |
| overnight.py | Overnight runs | mountain-only and Trial9-repeat experiments |
| wave5_train.py | Wave 5 | generated_track only with throttle_min=0.2 |
Rule going forward
ALL experiment scripts must be saved here and committed to git BEFORE running. Scripts in /tmp are lost on reboot.
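One way to enforce this rule is a pre-flight check at the top of each script. The sketch below is an assumption, not something the existing scripts contain: the helper names `porcelain_is_clean` and `assert_script_committed` are hypothetical, and the check relies only on the standard `git ls-files --error-unmatch` and `git status --porcelain` commands.

```python
import subprocess
from pathlib import Path


def porcelain_is_clean(porcelain_output: str) -> bool:
    """True when `git status --porcelain` printed nothing for the path."""
    return porcelain_output.strip() == ""


def assert_script_committed(script_path: str) -> None:
    """Raise RuntimeError unless the script is tracked and has no pending changes.

    Hypothetical helper, intended to be called before training starts.
    """
    path = Path(script_path).resolve()
    repo_dir = str(path.parent)

    # Exits non-zero when the file is untracked or outside a repository
    # (for example a script living in /tmp).
    tracked = subprocess.run(
        ["git", "ls-files", "--error-unmatch", "--", str(path)],
        cwd=repo_dir, capture_output=True, text=True,
    )
    if tracked.returncode != 0:
        raise RuntimeError(f"{path} is not tracked by git")

    # Any output here means staged or unstaged modifications to the script.
    status = subprocess.run(
        ["git", "status", "--porcelain", "--", str(path)],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    if not porcelain_is_clean(status.stdout):
        raise RuntimeError(f"{path} has uncommitted changes: {status.stdout.strip()}")
```

Calling `assert_script_committed(__file__)` as the first statement of an experiment script would abort the run whenever the script is untracked or modified.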
Running experiments
Use multitrack_runner.py directly for two-track training: python3 multitrack_runner.py --total-timesteps 90000 --steps-per-switch 6000 ...
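To see how `--total-timesteps` and `--steps-per-switch` interact, the sketch below computes a training schedule. It is an illustration only: it assumes the runner alternates tracks round-robin every `steps_per_switch` steps, which this README does not state, and the function name and track labels are hypothetical.

```python
def switch_schedule(total_timesteps, steps_per_switch, tracks):
    """Return (track, steps) segments, assuming round-robin track switching."""
    if steps_per_switch <= 0:
        raise ValueError("steps_per_switch must be positive")
    segments = []
    steps_done = 0
    while steps_done < total_timesteps:
        # The last segment may be shorter if the total is not a multiple.
        chunk = min(steps_per_switch, total_timesteps - steps_done)
        track = tracks[len(segments) % len(tracks)]
        segments.append((track, chunk))
        steps_done += chunk
    return segments


# Placeholder track labels, not names taken from the repository.
schedule = switch_schedule(90_000, 6_000, ["track_a", "track_b"])
print(len(schedule))  # 15 segments of 6000 steps each
```

Under the same assumption, setting `steps_per_switch` equal to the total (the Exp 6 mistake) yields a single segment, so the second track is never visited.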
For single-track experiments, use the pattern from exp8/exp9:
- VecTransposeImage(DummyVecEnv([make_env])) for env creation
- Direct model.learn() loop with manual checkpointing
- No close_and_switch() for single track
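The single-track pattern above can be sketched as a chunked training loop. This is a minimal illustration, not the code from exp8/exp9: the function name `train_with_checkpoints`, the checkpoint file naming, and the `StubModel` class are hypothetical. The loop only needs an object with `learn(total_timesteps=..., reset_num_timesteps=...)` and `save(path)`, which Stable-Baselines3 algorithms provide.

```python
import os


def train_with_checkpoints(model, total_timesteps, checkpoint_every, save_dir):
    """Call model.learn() in chunks and save a checkpoint after each chunk."""
    saved_paths = []
    steps_done = 0
    while steps_done < total_timesteps:
        chunk = min(checkpoint_every, total_timesteps - steps_done)
        # reset_num_timesteps=False keeps the step counter running across chunks.
        model.learn(total_timesteps=chunk, reset_num_timesteps=False)
        steps_done += chunk
        path = os.path.join(save_dir, f"checkpoint_{steps_done}")
        model.save(path)
        saved_paths.append(path)
    return saved_paths


class StubModel:
    """Stand-in so the sketch runs without a simulator; it only records calls."""

    def __init__(self):
        self.learn_calls = []
        self.saved = []

    def learn(self, total_timesteps, reset_num_timesteps=True):
        self.learn_calls.append(total_timesteps)
        return self

    def save(self, path):
        self.saved.append(path)


# With a real setup the env would be built as described above (assumed wiring):
#   from stable_baselines3.common.vec_env import DummyVecEnv, VecTransposeImage
#   env = VecTransposeImage(DummyVecEnv([make_env]))
#   model = PPO("CnnPolicy", env)  # the algorithm choice is an assumption
stub = StubModel()
paths = train_with_checkpoints(stub, 50_000, 5_000, "checkpoints")
print(len(paths))  # 10
```

With a real on-policy model, each `learn()` call collects whole rollouts, so a chunk may run slightly past the requested step count; the checkpoint names here reflect requested steps, not the model's internal counter.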