Project State — April 16, 2026

The Goal

Train a DonkeyCar model that generalises to any road-surface track (outdoor, asphalt, lane markings) — demonstrated by driving a never-seen track without crashing.


Models On Disk — The Ones That Matter

| Model | Path | Trained on | Steps | Notes |
|---|---|---|---|---|
| Phase 2 champion | models/champion/model.zip | generated_road | 13k | PPO, confirmed drives generated_road |
| Wave 4 Trial 3 | models/wave4-trial-0003/model.zip | generated_track + mountain_track | 157k | "Amazing" laps observed Apr 15 morning; not cleanly verified |
| Wave 4 Trial 9 | models/wave4-trial-0009/model.zip | generated_track + mountain_track | 90k | Genuine laps in training log; scored 1435 on mini_monaco (unverified) |
| Wave 4 Trial 14 | models/wave4-trial-0014/model.zip | generated_track + mountain_track | 69k | Scored 1573 on mini_monaco (unverified) |
| Wave 4 Trial 25 | models/wave4-trial-0025/model.zip | generated_track + mountain_track | ~63k | Scored 1543 on mini_monaco (unverified) |

What We Know With Certainty

  • Phase 2 champion drives generated_road — confirmed by observation + test
  • Phase 2 champion fails on generated_track — confirmed by multitrack_eval
  • Warm-start from Phase 2 champion causes catastrophic forgetting on multi-track — confirmed (Wave 3)
  • 90k steps per trial is the reliable maximum before the 2-hour timeout at 16 steps/sec (90k steps take about 94 minutes; the theoretical ceiling is 115,200 steps)
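
The step budget in the last point follows from simple arithmetic. A minimal sketch, assuming throughput stays constant at 16 steps/sec for the whole trial; the constant and function names are illustrative and not taken from the project code:

```python
# Step-budget arithmetic for one trial (illustrative names, not project code).
STEPS_PER_SEC = 16           # throughput quoted above
TIMEOUT_SEC = 2 * 60 * 60    # 2-hour trial timeout


def max_steps(steps_per_sec: float, timeout_sec: float) -> int:
    """Theoretical ceiling on steps if every second is spent stepping."""
    return int(steps_per_sec * timeout_sec)


def minutes_for(steps: int, steps_per_sec: float) -> float:
    """Wall-clock minutes needed to run `steps` at a constant rate."""
    return steps / steps_per_sec / 60


ceiling = max_steps(STEPS_PER_SEC, TIMEOUT_SEC)        # 115200 steps
budget_minutes = minutes_for(90_000, STEPS_PER_SEC)    # 93.75 minutes
headroom_minutes = TIMEOUT_SEC / 60 - budget_minutes   # 26.25 minutes
print(ceiling, budget_minutes, headroom_minutes)
```

At this rate a 90k-step trial leaves roughly 26 minutes of headroom under the timeout for overhead that is not spent stepping; the exact breakdown of that overhead is not recorded here.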

What We Do NOT Know

  • Whether the Wave 4 Trial 3 model genuinely drives generated_track or was exploiting the reward
  • Whether the 1435/1573/1543 mini_monaco scores reflect genuine driving or a shuttle exploit
  • Whether any Wave 4 model can drive generated_road (never tested)

Full Wave 4 Results (25 trials, exploit-patched reward)

| Trial | LR | Switch | mini_monaco | Verdict |
|---|---|---|---|---|
| 1 | 0.000300 | 6,000 | 42 | Crashes fast |
| 2 | 0.001000 | 6,000 | 93 | Crashes |
| 3 | 0.000816 | 8,441 | timeout | Lost |
| 4 | 0.000209 | 19,927 | timeout | Lost |
| 5 | 0.000752 | 9,368 | 32 | Crashes fast |
| 6 | 0.001622 | 5,524 | 177 | Crashes |
| 7 | 0.000307 | 14,103 | 81 | Crashes |
| 8 | 0.000848 | 14,326 | 116 | Crashes |
| 9 | 0.000725 | 6,851 | 1435 | ⚠️ Unverified; test candidate |
| 10 | 0.001058 | 4,587 | 141 | Crashes |
| 11 | 0.000445 | 6,345 | 85 | Crashes |
| 12 | 0.000860 | 6,936 | 132 | Crashes |
| 13 | 0.001912 | 3,574 | 87 | Crashes |
| 14 | 0.000339 | 5,448 | 1573 | ⚠️ Unverified; test candidate |
| 15 | 0.000399 | 7,747 | 111 | Crashes |
| 16 | 0.000403 | 3,490 | 60 | Crashes fast |
| 17 | 0.000725 | 5,286 | 106 | Crashes |
| 18 | 0.000474 | 5,999 | 116 | Crashes |
| 19 | 0.000629 | 8,211 | 231 | Best honest result |
| 20 | 0.000199 | 3,037 | 21 | Crashes immediately |
| 21 | 0.000524 | 7,044 | 86 | Crashes |
| 22 | 0.001104 | 8,756 | 193 | Crashes |
| 23 | 0.000313 | 4,507 | 151 | Crashes |
| 24 | 0.001925 | 4,185 | 38 | Crashes fast |
| 25 | 0.000313 | 6,836 | 1543 | ⚠️ Unverified; test candidate |

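Median (excluding the 3 unverified outliers and the 2 timeouts): 106, taken as the upper of the two middle values of the 20 remaining scores (the interpolated median is 99.5). No upward trend. GP did not converge.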
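
The summary figure can be re-derived from the Wave 4 table. A minimal sketch using only the Python standard library; the variable names are illustrative. With Trials 3 and 4 (timeouts, no score) and Trials 9, 14 and 25 (unverified outliers) left out, 20 scores remain: 106 is the upper of the two middle values, and the interpolated median is 99.5.

```python
from statistics import median, median_high

# mini_monaco scores copied from the Wave 4 table, keyed by trial number.
# Trials 3 and 4 timed out, so they have no score and are omitted.
scores = {
    1: 42, 2: 93, 5: 32, 6: 177, 7: 81, 8: 116, 9: 1435, 10: 141,
    11: 85, 12: 132, 13: 87, 14: 1573, 15: 111, 16: 60, 17: 106,
    18: 116, 19: 231, 20: 21, 21: 86, 22: 193, 23: 151, 24: 38, 25: 1543,
}
UNVERIFIED = {9, 14, 25}  # the three outliers flagged as test candidates

# Scores of the trials whose result is not in doubt.
honest = [s for trial, s in scores.items() if trial not in UNVERIFIED]

print(len(honest))          # 20
print(median(honest))       # 99.5 (mean of the two middle values, 93 and 106)
print(median_high(honest))  # 106 (upper middle value)
print(max(honest))          # 231 (Trial 19)
```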


Pending Tests (agreed, to be run now)

| # | Model | Track | Purpose |
|---|---|---|---|
| 1 | Phase 2 champion | generated_road | Sanity baseline |
| 2 | Wave 4 Trial 3 | generated_track | Was the "amazing" driving real? |
| 3 | Wave 4 Trial 9 | generated_track | Were those 10-40s laps real? |
| 4 | Wave 4 Trial 9 | mini_monaco | Is 1435 genuine or exploit? |
| 5 | Wave 4 Trial 14 | mini_monaco | Is 1573 genuine or exploit? |
| 6 | Wave 4 Trial 25 | mini_monaco | Is 1543 genuine or exploit? |

Pass criterion (agreed): Drives 3 laps without crashing, observed by user.
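
One way to record the outcome of each pending test against this criterion is sketched below. The `RunResult` record, its field names, and the `passes` helper are hypothetical and do not come from the project code. Reading "3 laps" as "at least 3" is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """Outcome of one observed evaluation run (hypothetical record format)."""
    model: str
    track: str
    laps_completed: int
    crashes: int
    observed_by_user: bool


REQUIRED_LAPS = 3  # from the agreed pass criterion


def passes(result: RunResult) -> bool:
    """True only for a crash-free run of at least 3 laps that the user watched."""
    return (
        result.observed_by_user
        and result.crashes == 0
        and result.laps_completed >= REQUIRED_LAPS
    )


# The six pending tests as (model, track) pairs, in table order.
PENDING = [
    ("Phase 2 champion", "generated_road"),
    ("Wave 4 Trial 3", "generated_track"),
    ("Wave 4 Trial 9", "generated_track"),
    ("Wave 4 Trial 9", "mini_monaco"),
    ("Wave 4 Trial 14", "mini_monaco"),
    ("Wave 4 Trial 25", "mini_monaco"),
]

# Placeholder values only, to exercise the helper; these are not real results.
example = RunResult("example-model", "example-track",
                    laps_completed=3, crashes=0, observed_by_user=True)
print(passes(example))
```

Keeping `observed_by_user` as an explicit field means a high score alone can never count as a pass, which matches the concern that the mini_monaco scores may come from an exploit.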