# Project State — April 16, 2026

## The Goal
Train a DonkeyCar model that generalises to any road-surface track
(outdoor, asphalt, lane markings), demonstrated by driving a
never-seen track without crashing.

---

## Models On Disk — The Ones That Matter

| Model | Path | Trained on | Steps | Notes |
|---|---|---|---|---|
| Phase 2 champion | `models/champion/model.zip` | generated_road | 13k | PPO; confirmed to drive generated_road |
| Wave 4 Trial 3 | `models/wave4-trial-0003/model.zip` | generated_track + mountain_track | 157k | "Amazing" laps observed Apr 15 morning; not yet cleanly verified |
| Wave 4 Trial 9 | `models/wave4-trial-0009/model.zip` | generated_track + mountain_track | 90k | Genuine laps in training log; scored 1435 on mini_monaco (unverified) |
| Wave 4 Trial 14 | `models/wave4-trial-0014/model.zip` | generated_track + mountain_track | 69k | Scored 1573 on mini_monaco (unverified) |
| Wave 4 Trial 25 | `models/wave4-trial-0025/model.zip` | generated_track + mountain_track | ~63k | Scored 1543 on mini_monaco (unverified) |

---

## What We Know With Certainty

- Phase 2 champion drives **generated_road**: confirmed by observation and test
- Phase 2 champion **fails** on generated_track: confirmed by multitrack_eval
- Warm-starting from the Phase 2 champion causes catastrophic forgetting on multi-track training: confirmed (Wave 3)
- 90k steps per trial is the reliable maximum before the 2-hour timeout at 16 steps/sec

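The step budget in the last bullet can be checked with a little arithmetic. The sketch below assumes the 16 steps/sec rate is a steady average and that the 2-hour limit is wall-clock time; the constant and function names are illustrative and not part of the project code.

```python
TIMEOUT_S = 2 * 60 * 60   # 2-hour trial timeout, in seconds
STEPS_PER_SEC = 16        # training throughput (assumed steady)


def max_steps(timeout_s: float = TIMEOUT_S, rate: float = STEPS_PER_SEC) -> int:
    """Upper bound on the steps that fit inside the timeout at the given rate."""
    return int(timeout_s * rate)


def minutes_needed(steps: int, rate: float = STEPS_PER_SEC) -> float:
    """Wall-clock minutes needed to run `steps` at the given rate."""
    return steps / rate / 60


print(max_steps())             # 115200: hard ceiling at exactly 16 steps/sec
print(minutes_needed(90_000))  # 93.75: leaves about 26 minutes of headroom
```

At a steady 16 steps/sec the hard ceiling is about 115k steps, so 90k keeps roughly a quarter of the window free for slow-downs and evaluation.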
## What We Do NOT Know

- Whether the Wave 4 Trial 3 model genuinely drives generated_track or was exploiting the reward
- Whether the 1435/1573/1543 mini_monaco scores reflect genuine driving or the shuttle exploit
- Whether any Wave 4 model can drive generated_road (never tested)

---

## Full Wave 4 Results (25 trials, exploit-patched reward)

| Trial | LR | Switch | mini_monaco score | Verdict |
|---|---|---|---|---|
| 1 | 0.000300 | 6,000 | 42 | Crashes fast |
| 2 | 0.001000 | 6,000 | 93 | Crashes |
| 3 | 0.000816 | 8,441 | timeout | Lost |
| 4 | 0.000209 | 19,927 | timeout | Lost |
| 5 | 0.000752 | 9,368 | 32 | Crashes fast |
| 6 | 0.001622 | 5,524 | 177 | Crashes |
| 7 | 0.000307 | 14,103 | 81 | Crashes |
| 8 | 0.000848 | 14,326 | 116 | Crashes |
| **9** | **0.000725** | **6,851** | **1435** | **⚠️ Unverified: test candidate** |
| 10 | 0.001058 | 4,587 | 141 | Crashes |
| 11 | 0.000445 | 6,345 | 85 | Crashes |
| 12 | 0.000860 | 6,936 | 132 | Crashes |
| 13 | 0.001912 | 3,574 | 87 | Crashes |
| **14** | **0.000339** | **5,448** | **1573** | **⚠️ Unverified: test candidate** |
| 15 | 0.000399 | 7,747 | 111 | Crashes |
| 16 | 0.000403 | 3,490 | 60 | Crashes fast |
| 17 | 0.000725 | 5,286 | 106 | Crashes |
| 18 | 0.000474 | 5,999 | 116 | Crashes |
| 19 | 0.000629 | 8,211 | 231 | Best honest result |
| 20 | 0.000199 | 3,037 | 21 | Crashes immediately |
| 21 | 0.000524 | 7,044 | 86 | Crashes |
| 22 | 0.001104 | 8,756 | 193 | Crashes |
| 23 | 0.000313 | 4,507 | 151 | Crashes |
| 24 | 0.001925 | 4,185 | 38 | Crashes fast |
| **25** | **0.000313** | **6,836** | **1543** | **⚠️ Unverified: test candidate** |

Median (excluding the 3 outliers and the 2 timeouts): **106**, taking the upper of the two middle scores (93 and 106; their midpoint is 99.5). No upward trend. GP did not converge.
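The median figure can be reproduced from the table. The sketch below hard-codes the 20 scored mini_monaco results, leaving out Trials 9, 14 and 25 as outliers and Trials 3 and 4 as timeouts; that exclusion rule is an assumption about how the figure was computed. With an even number of values, 106 is the upper median, while the interpolated median is 99.5.

```python
from statistics import median, median_high

# mini_monaco scores copied from the results table, keyed by trial number.
# Trials 3 and 4 (timeout) and 9, 14, 25 (unverified outliers) are left out.
scores = {
    1: 42, 2: 93, 5: 32, 6: 177, 7: 81, 8: 116, 10: 141, 11: 85,
    12: 132, 13: 87, 15: 111, 16: 60, 17: 106, 18: 116, 19: 231,
    20: 21, 21: 86, 22: 193, 23: 151, 24: 38,
}

values = sorted(scores.values())
print(len(values))          # 20 scored, non-outlier trials
print(median(values))       # 99.5: mean of the two middle scores, 93 and 106
print(median_high(values))  # 106: the upper of the two middle scores
```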

---

## Pending Tests (agreed, to be run now)

| # | Model | Track | Purpose |
|---|---|---|---|
| 1 | Phase 2 champion | generated_road | Sanity baseline |
| 2 | Wave 4 Trial 3 | generated_track | Was the "amazing" driving real? |
| 3 | Wave 4 Trial 9 | generated_track | Were those 10-40 s laps real? |
| 4 | Wave 4 Trial 9 | mini_monaco | Is 1435 genuine or an exploit? |
| 5 | Wave 4 Trial 14 | mini_monaco | Is 1573 genuine or an exploit? |
| 6 | Wave 4 Trial 25 | mini_monaco | Is 1543 genuine or an exploit? |

**Pass criterion (agreed):** Drives 3 laps without crashing, observed by user.
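One way to keep the six pending runs and the pass criterion together is a small record per run. This is only a sketch: `PendingTest`, `passed`, and the `laps` and `crashed` fields are hypothetical names, and the values are meant to be filled in by hand from what the observer sees, not read from the simulator.

```python
from dataclasses import dataclass
from typing import Optional

REQUIRED_LAPS = 3  # pass criterion: 3 laps without crashing, observed by user


@dataclass
class PendingTest:
    model: str
    track: str
    laps: Optional[int] = None      # laps completed, filled in after the run
    crashed: Optional[bool] = None  # True if the car crashed during the run

    def passed(self) -> Optional[bool]:
        """None until the run is recorded, then True/False per the criterion."""
        if self.laps is None or self.crashed is None:
            return None
        return self.laps >= REQUIRED_LAPS and not self.crashed


tests = [
    PendingTest("Phase 2 champion", "generated_road"),
    PendingTest("Wave 4 Trial 3", "generated_track"),
    PendingTest("Wave 4 Trial 9", "generated_track"),
    PendingTest("Wave 4 Trial 9", "mini_monaco"),
    PendingTest("Wave 4 Trial 14", "mini_monaco"),
    PendingTest("Wave 4 Trial 25", "mini_monaco"),
]

# Example with made-up observations, purely to show how a result is recorded.
example = PendingTest("example model", "example track", laps=3, crashed=False)
print(example.passed())   # True
print(tests[0].passed())  # None: not run yet
```

Leaving `laps` and `crashed` unset until a run has actually been watched keeps "not yet tested" distinct from "failed", which matters here because most of the open questions are about unverified results.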