# Project State — April 16, 2026

## The Goal

Train a DonkeyCar model that generalises to any road-surface track (outdoor, asphalt, lane markings) — demonstrated by driving a never-seen track without crashing.

---

## Models On Disk — The Ones That Matter

| Model | Path | Trained on | Steps | Notes |
|---|---|---|---|---|
| Phase 2 champion | `models/champion/model.zip` | generated_road | 13k | PPO, confirmed drives generated_road |
| Wave 4 Trial 3 | `models/wave4-trial-0003/model.zip` | generated_track + mountain_track | 157k | "Amazing" laps observed Apr 15 morning — unverified cleanly |
| Wave 4 Trial 9 | `models/wave4-trial-0009/model.zip` | generated_track + mountain_track | 90k | Genuine laps in training log; scored 1435 on mini_monaco — unverified |
| Wave 4 Trial 14 | `models/wave4-trial-0014/model.zip` | generated_track + mountain_track | 69k | Scored 1573 on mini_monaco — unverified |
| Wave 4 Trial 25 | `models/wave4-trial-0025/model.zip` | generated_track + mountain_track | ~63k | Scored 1543 on mini_monaco — unverified |

---

## What We Know With Certainty

- Phase 2 champion drives **generated_road** — confirmed by observation + test
- Phase 2 champion **fails** on generated_track — confirmed by multitrack_eval
- Warm-start from Phase 2 champion causes catastrophic forgetting on multi-track — confirmed (Wave 3)
- 90k steps / trial is the reliable max before 2-hour timeout at 16 steps/sec

## What We Do NOT Know

- Whether the Wave 4 Trial 3 model genuinely drives generated_track or was exploiting
- Whether the 1435/1573/1543 mini_monaco scores are genuine driving or shuttle exploit
- Whether any Wave 4 model can drive generated_road (never tested)

---

## Full Wave 4 Results (25 trials, exploit-patched reward)

| Trial | LR | Switch | mini_monaco | Verdict |
|---|---|---|---|---|
| 1 | 0.000300 | 6,000 | 42 | Crashes fast |
| 2 | 0.001000 | 6,000 | 93 | Crashes |
| 3 | 0.000816 | 8,441 | timeout | Lost |
| 4 | 0.000209 | 19,927 | timeout | Lost |
| 5 | 0.000752 | 9,368 | 32 | Crashes fast |
| 6 | 0.001622 | 5,524 | 177 | Crashes |
| 7 | 0.000307 | 14,103 | 81 | Crashes |
| 8 | 0.000848 | 14,326 | 116 | Crashes |
| **9** | **0.000725** | **6,851** | **1435** | **⚠️ Unverified — test candidate** |
| 10 | 0.001058 | 4,587 | 141 | Crashes |
| 11 | 0.000445 | 6,345 | 85 | Crashes |
| 12 | 0.000860 | 6,936 | 132 | Crashes |
| 13 | 0.001912 | 3,574 | 87 | Crashes |
| **14** | **0.000339** | **5,448** | **1573** | **⚠️ Unverified — test candidate** |
| 15 | 0.000399 | 7,747 | 111 | Crashes |
| 16 | 0.000403 | 3,490 | 60 | Crashes fast |
| 17 | 0.000725 | 5,286 | 106 | Crashes |
| 18 | 0.000474 | 5,999 | 116 | Crashes |
| 19 | 0.000629 | 8,211 | 231 | Best honest result |
| 20 | 0.000199 | 3,037 | 21 | Crashes immediately |
| 21 | 0.000524 | 7,044 | 86 | Crashes |
| 22 | 0.001104 | 8,756 | 193 | Crashes |
| 23 | 0.000313 | 4,507 | 151 | Crashes |
| 24 | 0.001925 | 4,185 | 38 | Crashes fast |
| **25** | **0.000313** | **6,836** | **1543** | **⚠️ Unverified — test candidate** |

Median (excluding 3 outliers): **106**. No upward trend. GP did not converge.

---

## Pending Tests (agreed, to be run now)

| # | Model | Track | Purpose |
|---|---|---|---|
| 1 | Phase 2 champion | generated_road | Sanity baseline |
| 2 | Wave 4 Trial 3 | generated_track | Was the "amazing" driving real? |
| 3 | Wave 4 Trial 9 | generated_track | Were those 10-40s laps real? |
| 4 | Wave 4 Trial 9 | mini_monaco | Is 1435 genuine or exploit? |
| 5 | Wave 4 Trial 14 | mini_monaco | Is 1573 genuine or exploit? |
| 6 | Wave 4 Trial 25 | mini_monaco | Is 1543 genuine or exploit? |

**Pass criterion (agreed):** Drives 3 laps without crashing, observed by user.
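
---

## Appendix: Checking the Summary Arithmetic

Two figures quoted above (the median over the non-outlier Wave 4 trials, and the 90k-step budget against the 2-hour timeout) can be re-derived from the numbers already in this note. The sketch below is only a sanity check: the variable names are illustrative, the scores are copied by hand from the Wave 4 table, and the step budget assumes throughput stays constant at the quoted 16 steps/sec.

```python
import statistics

# mini_monaco scores copied from the Wave 4 results table, keyed by trial number.
# Trials 3 and 4 timed out and have no score, so they are left out.
SCORES = {
    1: 42, 2: 93, 5: 32, 6: 177, 7: 81, 8: 116, 9: 1435, 10: 141,
    11: 85, 12: 132, 13: 87, 14: 1573, 15: 111, 16: 60, 17: 106,
    18: 116, 19: 231, 20: 21, 21: 86, 22: 193, 23: 151, 24: 38, 25: 1543,
}
OUTLIERS = {9, 14, 25}  # the three unverified high scorers

# Scores of the trials that produced a number and are not flagged as outliers.
honest = sorted(s for trial, s in SCORES.items() if trial not in OUTLIERS)

median = statistics.median(honest)             # mean of the two middle values
upper_median = statistics.median_high(honest)  # larger of the two middle values

# Step budget, assuming a constant 16 steps/sec for the whole trial.
STEPS_PER_SEC = 16
TIMEOUT_SEC = 2 * 60 * 60
TRIAL_STEPS = 90_000

max_steps_before_timeout = STEPS_PER_SEC * TIMEOUT_SEC
trial_minutes = TRIAL_STEPS / STEPS_PER_SEC / 60

print(f"scored non-outlier trials: {len(honest)}")
print(f"median: {median}, upper median: {upper_median}")
print(f"steps possible in 2 h: {max_steps_before_timeout}")
print(f"90k steps take about {trial_minutes} min")
```

With 20 scored non-outlier trials, the conventional median comes out at 99.5, while 106 is the upper of the two middle values, so the quoted figure appears to be the upper median. On the budget side, 16 steps/sec allows about 115k steps in 2 hours, and a 90k-step trial needs roughly 94 minutes, leaving about 26 minutes of headroom for resets and evaluation.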