donkeycar-rl-autoresearch/agent/experiments
Paul Huliganga 813f888502 fix: reward v6.1 — active_node progress terminator kills circle/stuck exploits
User's insight: a circling car stays near the same track waypoints, so
active_node (sim's track progress indicator) never advances. Track the
maximum active_node reached this episode. If it hasn't increased in
progress_patience=60 steps (~3.3s), terminate.

This catches:
  - Circular driving (active_node oscillates, max never increases)
  - Stuck on cone/barrier (active_node frozen)
  - NOT triggered by: legitimate cornering, slow forward progress, lap resets

On lap completion, active_node wraps to 0 — reset max_node_seen and counter.
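
A minimal sketch of the progress terminator described above, not the repository's actual code. The class name `ProgressTerminator`, its method names, and the `lap_completed` flag are hypothetical. How a lap completion is detected is not stated here, so the caller is assumed to supply that signal.

```python
class ProgressTerminator:
    """Flags an episode for termination when track progress stalls.

    Hypothetical helper: only active_node, max_node_seen and
    progress_patience come from the commit message.
    """

    def __init__(self, progress_patience: int = 60):
        self.progress_patience = progress_patience
        self.reset()

    def reset(self) -> None:
        """Call at the start of every episode."""
        self.max_node_seen = -1        # any real node index counts as progress
        self.steps_since_progress = 0

    def step(self, active_node: int, lap_completed: bool = False) -> bool:
        """Return True when the episode should be terminated."""
        if lap_completed:
            # active_node wraps to 0 on a new lap, so restart the tracking
            # instead of treating the drop as a stall.
            self.max_node_seen = active_node
            self.steps_since_progress = 0
            return False

        if active_node > self.max_node_seen:
            # New furthest waypoint this episode: real forward progress.
            self.max_node_seen = active_node
            self.steps_since_progress = 0
        else:
            # Circling (node oscillates) or stuck (node frozen).
            self.steps_since_progress += 1

        return self.steps_since_progress >= self.progress_patience
```

Slow forward progress keeps the counter low because any new maximum resets it; only the absence of a new maximum for the full patience window ends the episode.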

Also: Exp 12 — single track mountain training with lap-based stopping criterion.
Train until 3 consecutive laps in eval, not fixed step count.
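
The stopping rule can be sketched as a loop that alternates training and evaluation. This is an illustration only: `train_until_laps`, the `train_chunk` and `eval_laps` callables, and the `max_chunks` safety cap are assumptions, not names from exp12_mountain_single.py.

```python
from typing import Callable, Optional


def train_until_laps(
    train_chunk: Callable[[], None],
    eval_laps: Callable[[], int],
    required_laps: int = 3,
    max_chunks: int = 100,
) -> Optional[int]:
    """Train in chunks until evaluation shows enough consecutive laps.

    train_chunk: runs one block of training (hypothetical callable).
    eval_laps:   returns the number of consecutive laps completed in an
                 evaluation run (hypothetical callable).
    max_chunks:  assumed safety cap so a model that never laps cannot
                 train forever.

    Returns the number of chunks trained, or None if the cap was hit.
    """
    for chunk in range(1, max_chunks + 1):
        train_chunk()
        if eval_laps() >= required_laps:
            return chunk
    return None
```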
2026-04-19 17:01:41 -04:00
README.md save: all experiment scripts moved from /tmp to agent/experiments/ 2026-04-18 21:30:08 -04:00
exp6_mountain_v5_proper.py save: all experiment scripts moved from /tmp to agent/experiments/ 2026-04-18 21:30:08 -04:00
exp7_mountain_proper.py save: all experiment scripts moved from /tmp to agent/experiments/ 2026-04-18 21:30:08 -04:00
exp8_mountain_clean.py save: all experiment scripts moved from /tmp to agent/experiments/ 2026-04-18 21:30:08 -04:00
exp9_mountain_v5_throttle02.py save: all experiment scripts moved from /tmp to agent/experiments/ 2026-04-18 21:30:08 -04:00
exp10_two_tracks.py save: all experiment scripts moved from /tmp to agent/experiments/ 2026-04-18 21:30:08 -04:00
exp11_parallel_envs.py fix: reward v6 — efficiency gate prevents circular driving, stuck_steps 80→40 2026-04-19 12:02:55 -04:00
exp11b_parallel_v6.py feat: Exp 11b — parallel DummyVecEnv + v6 reward (anti-circle gate) + built-in eval 2026-04-19 12:03:46 -04:00
exp11c_parallel_v6_250k.py feat: Exp 11c — parallel DummyVecEnv + v6 reward, extended to 250k steps 2026-04-19 13:27:38 -04:00
exp11d_parallel_v61.py fix: exp11d remove progress_patience — grass fix only per ADR-020 2026-04-19 16:18:17 -04:00
exp12_mountain_single.py fix: reward v6.1 — active_node progress terminator kills circle/stuck exploits 2026-04-19 17:01:41 -04:00
mountain_continue.py save: all experiment scripts moved from /tmp to agent/experiments/ 2026-04-18 21:30:08 -04:00
mountain_high_throttle.py save: all experiment scripts moved from /tmp to agent/experiments/ 2026-04-18 21:30:08 -04:00
mountain_v5.py save: all experiment scripts moved from /tmp to agent/experiments/ 2026-04-18 21:30:08 -04:00
overnight.py save: all experiment scripts moved from /tmp to agent/experiments/ 2026-04-18 21:30:08 -04:00
wave5_train.py save: all experiment scripts moved from /tmp to agent/experiments/ 2026-04-18 21:30:08 -04:00

README.md

Experiment Scripts

These scripts were used to run individual training experiments. Each corresponds to an entry in docs/TEST_HISTORY.md.

| Script | Experiment | Key change |
|---|---|---|
| mountain_v5.py | Exp 5 | v5 reward + throttle_min=0.5, direct model.learn() |
| mountain_continue.py | Exp 4 | Continued Exp3 training |
| mountain_high_throttle.py | Exp 3 | throttle_min=0.5, old v4 reward |
| exp6_mountain_v5_proper.py | Exp 6 | v5 + termination, wrong steps_per_switch (=total) |
| exp7_mountain_proper.py | Exp 7 | v5 + termination, correct steps_per_switch=6000, had phantom car issue |
| exp8_mountain_clean.py | Exp 8 | v5 + throttle_min=0.5, single connection, correct checkpointing |
| exp9_mountain_v5_throttle02.py | Exp 9 | v5 + throttle_min=0.2, OUR BEST MODEL |
| exp10_two_tracks.py | Exp 10 | Two tracks via custom script (abandoned; used multitrack_runner.py instead) |
| overnight.py | Overnight runs | mountain-only and Trial9-repeat experiments |
| wave5_train.py | Wave 5 | generated_track only with throttle_min=0.2 |

Rule going forward

ALL experiment scripts must be saved here and committed to git BEFORE running. Scripts in /tmp are lost on reboot.

Running experiments

Use multitrack_runner.py directly for two-track training: python3 multitrack_runner.py --total-timesteps 90000 --steps-per-switch 6000 ...

For single-track experiments, use the pattern from exp8/exp9:

  • VecTransposeImage(DummyVecEnv([make_env])) for env creation
  • Direct model.learn() loop with manual checkpointing
  • No close_and_switch() for single track
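
The checkpointing part of this pattern can be sketched as follows. It is a hedged illustration, not the code from exp8/exp9: `train_with_checkpoints`, `checkpoint_every`, and the checkpoint file naming are hypothetical. The `model` is assumed to be a Stable-Baselines3-style object exposing `learn()` and `save()`, created on an env built as `VecTransposeImage(DummyVecEnv([make_env]))`.

```python
from pathlib import Path


def train_with_checkpoints(model, total_timesteps, checkpoint_every, save_dir):
    """Call model.learn() in chunks and save a checkpoint after each one.

    model is assumed to follow the Stable-Baselines3 interface:
    learn(total_timesteps=..., reset_num_timesteps=...) and save(path).
    Returns the list of checkpoint paths that were written.
    """
    save_dir = Path(save_dir)
    save_dir.mkdir(parents=True, exist_ok=True)

    saved = []
    trained = 0
    while trained < total_timesteps:
        chunk = min(checkpoint_every, total_timesteps - trained)
        # reset_num_timesteps=False keeps the step counter running across
        # successive learn() calls instead of restarting from zero.
        model.learn(total_timesteps=chunk, reset_num_timesteps=False)
        trained += chunk

        path = save_dir / f"checkpoint_{trained}"  # hypothetical naming
        model.save(str(path))
        saved.append(path)
    return saved
```

On-policy algorithms collect whole rollouts, so the real step count per chunk may overshoot `checkpoint_every` slightly; the file names here reflect requested steps, not the model's internal counter.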