Project State — April 16, 2026
The Goal
Train a DonkeyCar model that generalises to any road-surface track (outdoor, asphalt, lane markings) — demonstrated by driving a never-seen track without crashing.
Models On Disk — The Ones That Matter
| Model | Path | Trained on | Steps | Notes |
|---|---|---|---|---|
| Phase 2 champion | models/champion/model.zip | generated_road | 13k | PPO, confirmed drives generated_road |
| Wave 4 Trial 3 | models/wave4-trial-0003/model.zip | generated_track + mountain_track | 157k | "Amazing" laps observed Apr 15 morning; unverified cleanly |
| Wave 4 Trial 9 | models/wave4-trial-0009/model.zip | generated_track + mountain_track | 90k | Genuine laps in training log; scored 1435 on mini_monaco; unverified |
| Wave 4 Trial 14 | models/wave4-trial-0014/model.zip | generated_track + mountain_track | 69k | Scored 1573 on mini_monaco; unverified |
| Wave 4 Trial 25 | models/wave4-trial-0025/model.zip | generated_track + mountain_track | ~63k | Scored 1543 on mini_monaco; unverified |
What We Know With Certainty
- Phase 2 champion drives generated_road — confirmed by observation + test
- Phase 2 champion fails on generated_track — confirmed by multitrack_eval
- Warm-start from Phase 2 champion causes catastrophic forgetting on multi-track — confirmed (Wave 3)
- 90k steps per trial is the reliable maximum before the 2-hour timeout at 16 steps/sec
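
The step ceiling in the last bullet can be checked with simple arithmetic. The following is a minimal sketch, assuming the 2-hour timeout covers the whole trial and that 16 steps/sec is an average rate; the function and variable names are illustrative and not taken from the project code.

```python
TIMEOUT_S = 2 * 60 * 60   # 2-hour trial timeout, in seconds
STEPS_PER_SEC = 16        # training throughput quoted above


def theoretical_max_steps(timeout_s: int, steps_per_sec: float) -> int:
    """Upper bound on steps if the entire timeout were spent stepping."""
    return int(timeout_s * steps_per_sec)


def trial_minutes(steps: int, steps_per_sec: float) -> float:
    """Wall-clock minutes needed to run `steps` at the given rate."""
    return steps / steps_per_sec / 60


ceiling = theoretical_max_steps(TIMEOUT_S, STEPS_PER_SEC)  # 115200
minutes_for_90k = trial_minutes(90_000, STEPS_PER_SEC)     # 93.75
headroom_min = TIMEOUT_S / 60 - minutes_for_90k            # 26.25
```

At this rate the hard ceiling is 115,200 steps, so a 90k-step trial leaves roughly 26 minutes of slack. What that slack is actually spent on (simulator start-up, resets, evaluation) is an assumption here, not something the notes above record.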
What We Do NOT Know
- Whether the Wave 4 Trial 3 model genuinely drives generated_track or was exploiting the reward
- Whether the 1435/1573/1543 mini_monaco scores reflect genuine driving or a shuttle exploit
- Whether any Wave 4 model can drive generated_road (never tested)
Full Wave 4 Results (25 trials, exploit-patched reward)
| Trial | Learning rate | Switch | mini_monaco score | Verdict |
|---|---|---|---|---|
| 1 | 0.000300 | 6,000 | 42 | Crashes fast |
| 2 | 0.001000 | 6,000 | 93 | Crashes |
| 3 | 0.000816 | 8,441 | timeout | Lost |
| 4 | 0.000209 | 19,927 | timeout | Lost |
| 5 | 0.000752 | 9,368 | 32 | Crashes fast |
| 6 | 0.001622 | 5,524 | 177 | Crashes |
| 7 | 0.000307 | 14,103 | 81 | Crashes |
| 8 | 0.000848 | 14,326 | 116 | Crashes |
| 9 | 0.000725 | 6,851 | 1435 | ⚠️ Unverified — test candidate |
| 10 | 0.001058 | 4,587 | 141 | Crashes |
| 11 | 0.000445 | 6,345 | 85 | Crashes |
| 12 | 0.000860 | 6,936 | 132 | Crashes |
| 13 | 0.001912 | 3,574 | 87 | Crashes |
| 14 | 0.000339 | 5,448 | 1573 | ⚠️ Unverified — test candidate |
| 15 | 0.000399 | 7,747 | 111 | Crashes |
| 16 | 0.000403 | 3,490 | 60 | Crashes fast |
| 17 | 0.000725 | 5,286 | 106 | Crashes |
| 18 | 0.000474 | 5,999 | 116 | Crashes |
| 19 | 0.000629 | 8,211 | 231 | Best honest result |
| 20 | 0.000199 | 3,037 | 21 | Crashes immediately |
| 21 | 0.000524 | 7,044 | 86 | Crashes |
| 22 | 0.001104 | 8,756 | 193 | Crashes |
| 23 | 0.000313 | 4,507 | 151 | Crashes |
| 24 | 0.001925 | 4,185 | 38 | Crashes fast |
| 25 | 0.000313 | 6,836 | 1543 | ⚠️ Unverified — test candidate |
Median (excluding 3 outliers): 106. No upward trend. GP did not converge.
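
The summary statistic can be re-derived from the mini_monaco column. This is a small sketch using Python's standard `statistics` module, with scores copied from the table. Leaving out the two timed-out trials (3 and 4) along with the three unverified outliers (9, 14, 25) is an assumption about how the figure was computed.

```python
from statistics import median, median_high

# mini_monaco scores keyed by trial number, copied from the results table.
# Trials 3 and 4 timed out and therefore have no score.
scores = {
    1: 42, 2: 93, 5: 32, 6: 177, 7: 81, 8: 116, 9: 1435, 10: 141,
    11: 85, 12: 132, 13: 87, 14: 1573, 15: 111, 16: 60, 17: 106,
    18: 116, 19: 231, 20: 21, 21: 86, 22: 193, 23: 151, 24: 38, 25: 1543,
}
UNVERIFIED = {9, 14, 25}  # the three outliers flagged as test candidates

honest = [s for trial, s in scores.items() if trial not in UNVERIFIED]

print(len(honest))          # 20
print(median(honest))       # 99.5
print(median_high(honest))  # 106
print(max(honest))          # 231
```

With 20 remaining scores the two middle values are 93 and 106, so the quoted 106 matches the upper median, while the interpolated median is 99.5. Either way, the typical trial sits an order of magnitude below the three unverified candidates.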
Pending Tests (agreed, to be run now)
| # | Model | Track | Purpose |
|---|---|---|---|
| 1 | Phase 2 champion | generated_road | Sanity baseline |
| 2 | Wave 4 Trial 3 | generated_track | Was the "amazing" driving real? |
| 3 | Wave 4 Trial 9 | generated_track | Were those 10-40s laps real? |
| 4 | Wave 4 Trial 9 | mini_monaco | Is 1435 genuine or exploit? |
| 5 | Wave 4 Trial 14 | mini_monaco | Is 1573 genuine or exploit? |
| 6 | Wave 4 Trial 25 | mini_monaco | Is 1543 genuine or exploit? |
Pass criterion (agreed): Drives 3 laps without crashing, observed by user.
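
To keep the six verdicts consistent, the pass criterion can be written down as a single check. This is only a sketch: the `Observation` record and its field names are hypothetical, and the example entries are made up for illustration, not real results.

```python
from dataclasses import dataclass

REQUIRED_LAPS = 3  # from the agreed pass criterion


@dataclass
class Observation:
    """One observed test run (hypothetical record format)."""
    model: str
    track: str
    laps_completed: int
    crashed: bool
    observed_by_user: bool


def passes(obs: Observation) -> bool:
    """Apply the agreed criterion: 3 laps, no crash, watched by the user."""
    return (
        obs.observed_by_user
        and not obs.crashed
        and obs.laps_completed >= REQUIRED_LAPS
    )


# Placeholder entries for illustration only; not real results.
example_ok = Observation("model-A", "track-X", 3, False, True)
example_unobserved = Observation("model-A", "track-X", 5, False, False)
example_crash = Observation("model-B", "track-X", 3, True, True)
```

The check deliberately takes no score as input: a high mini_monaco number alone cannot satisfy it, which is the point of the criterion given the open question about exploits.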