# Project State — April 16, 2026 (post-testing)

## The Goal

Train a DonkeyCar model that generalises to any road-surface track
(outdoor, asphalt, lane markings), demonstrated by driving a
never-seen track without crashing.

---

## Confirmed Working Models (tested today, observed by user)

### ✅ Phase 2 Champion — generated_road
- **Path:** `models/champion/model.zip`
- **Trained on:** generated_road only, ~13k steps, lr=0.000225
- **Test result:** Drove full 2000 steps, 2013 reward. User: "driving very well, stayed in right-hand lane, very very good"
- **Other tracks:** Confirmed fails on generated_track (old multitrack_eval)

### ✅ Wave 4 Trial 9 — generated_track AND mini_monaco
- **Path:** `models/wave4-trial-0009/model.zip`
- **Trained on:** generated_track + mountain_track from scratch, ~90k steps, lr=0.000725, switch=6,851
- **Test on generated_track:** 3/3 episodes drove full 2000 steps, 13–16 second genuine laps
- **Test on mini_monaco:** Full 2000 steps, 40-second genuine laps (zero-shot: never seen during training)
- **This is our best model**

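
The per-episode figures above (steps survived against a 2000-step cap) could be gathered with a loop along the following lines. This is a minimal sketch, not the project's actual test script. It assumes a classic Gym-style environment whose `reset()` returns an observation and whose `step()` returns `(obs, reward, done, info)`, plus a `policy` callable that maps an observation to an action. The names `run_episodes`, `policy`, and `StubEnv` are hypothetical; `StubEnv` only stands in for the simulator so the sketch runs on its own.

```python
from typing import Any, Callable, List, Tuple


def run_episodes(env: Any, policy: Callable[[Any], Any],
                 episodes: int = 3, max_steps: int = 2000) -> List[Tuple[int, float]]:
    """Run several episodes and return (steps_survived, total_reward) for each.

    Assumes the classic Gym API: reset() -> obs, step(a) -> (obs, reward, done, info).
    """
    results = []
    for _ in range(episodes):
        obs = env.reset()
        steps, total_reward = 0, 0.0
        while steps < max_steps:
            obs, reward, done, _info = env.step(policy(obs))
            steps += 1
            total_reward += reward
            if done:  # crash or simulator-side termination
                break
        results.append((steps, total_reward))
    return results


class StubEnv:
    """Hypothetical stand-in for the simulator: ends the episode after `crash_at` steps."""

    def __init__(self, crash_at: int):
        self.crash_at = crash_at
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # dummy observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.crash_at
        return 0.0, 1.0, done, {}  # dummy observation, reward of 1 per step


if __name__ == "__main__":
    print(run_episodes(StubEnv(crash_at=104), policy=lambda obs: 0.0))
```

With a real simulator, `policy` would wrap the loaded model's prediction call and `env` would be the track under test; both are left abstract here because the source does not name the libraries involved.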
### ✅ Wave 4 Trial 19 — generated_track (mostly)
- **Path:** `models/wave4-trial-0019/model.zip`
- **Trained on:** generated_track + mountain_track from scratch, ~74k steps, lr=0.000629, switch=8,211
- **Test on generated_track:** 2/3 episodes drove full 2000 steps, 14–17 second genuine laps. 1 crash.
- **mini_monaco score during training:** 231 (best "honest" result from Wave 4)

---

## Key Finding: Generated Track Lighting Variation
The generated_track changes lighting conditions (sun angle, shadows) on every
`env.reset()` due to procedural generation. During training, every episode
therefore showed a different visual appearance of the same track. The model was
pushed to learn track-geometry features (road edges, markings) rather than
lighting-specific patterns. This visual robustness is the most likely reason
Trial 9 can zero-shot generalise to mini_monaco.
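
One way to sanity-check the lighting claim is to compare the overall brightness of the first frame across several resets. The sketch below is illustrative only. `first_frames` is a hypothetical list of frames captured after each `env.reset()`, each already flattened to a sequence of pixel intensities (for a NumPy image, `frame.ravel().tolist()` would produce one). The toy values are made up for the demonstration and are not measurements from the simulator.

```python
from statistics import mean, pstdev
from typing import Sequence


def brightness_spread(first_frames: Sequence[Sequence[float]]) -> float:
    """Standard deviation of mean frame brightness across resets.

    A value near zero suggests identical lighting on every reset; a larger
    value suggests the lighting changes between episodes.
    """
    per_reset = [mean(frame) for frame in first_frames]
    return pstdev(per_reset)


if __name__ == "__main__":
    # Toy data standing in for flattened camera frames (values 0-255).
    same_lighting = [[100, 120, 140]] * 4
    varied_lighting = [[60, 80, 100], [100, 120, 140], [140, 160, 180], [90, 110, 130]]
    print("same lighting:  ", brightness_spread(same_lighting))
    print("varied lighting:", brightness_spread(varied_lighting))
```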

---

## Full Test Results — April 16

| Test | Model | Track | Laps | Steps | Verdict |
|---|---|---|---|---|---|
| 1 | Phase 2 champion | generated_road | n/a (not a loop) | 2000/2000 | ✅ DRIVES |
| 2 | Wave 4 Trial 3 | generated_track | n/a | n/a | ❌ MODEL CORRUPTED |
| 3 | Wave 4 Trial 9 | generated_track | 6 laps × 3 eps | 2000/2000 | ✅ DRIVES |
| 4 | Wave 4 Trial 9 | mini_monaco | 2 laps per ep | 2000/2000 | ✅ DRIVES (zero-shot) |
| 5 | Wave 4 Trial 14 | mini_monaco | 1 lap ep2 only | 257/901/253 | ⚠️ INCONSISTENT |
| 6 | Wave 4 Trial 25 | mini_monaco | 0 | ~147/eps | ❌ CRASHES |
| + | Wave 4 Trial 19 | generated_track | 5-6 laps × 2 eps | crash/2000/2000 | ✅ MOSTLY |
| + | Wave 4 Trial 22 | generated_track | 0 | ~110/eps | ❌ SAME SPOT |
| + | Wave 4 Trial 2 | generated_track | 0 | ~76/eps | ❌ CRASHES |
| + | Trial 3 (recovered) | generated_track | 0 | ~104/eps | ❌ CRASHES |
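
The Steps column can be condensed into a few comparable numbers per model. The sketch below is one possible way to do that, under stated assumptions: the 2000-step cap is taken from the results above, while `summarise` and the keys it returns are illustrative names. It deliberately assigns no verdict, because the verdicts in the table also rely on observed laps, which step counts alone do not capture.

```python
from typing import Dict, List

MAX_STEPS = 2000  # episode cap used in the tests above


def summarise(steps_per_episode: List[int], max_steps: int = MAX_STEPS) -> Dict[str, float]:
    """Count episodes that reached the step cap and average the steps survived."""
    episodes = len(steps_per_episode)
    completed = sum(1 for steps in steps_per_episode if steps >= max_steps)
    return {
        "completed": completed,
        "episodes": episodes,
        "mean_steps": sum(steps_per_episode) / episodes,
    }


if __name__ == "__main__":
    print("Trial 9, generated_track:", summarise([2000, 2000, 2000]))
    print("Trial 14, mini_monaco:  ", summarise([257, 901, 253]))
```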

---

## What We Know Now

1. **Trial 9 is a genuine multi-track model.** It drives generated_track
   consistently (3/3) with clean laps, and it generalises zero-shot to
   mini_monaco (never seen in training). This is real progress.

2. **The "amazing" overnight model (Trial 3) is lost.** The model.zip has
   a corrupted optimizer file. Policy weights were recovered, but the
   recovered model crashes at ~104 steps, which indicates the "amazing"
   driving came from an intermediate training checkpoint, not the final
   saved model.

3. **Most Wave 4 high scores were not exploits; they were real.**
   Trials 5, 6, and 14 showed inconsistent results (crashing in some
   episodes, completing a lap in others). The models were genuinely
   learning, but unreliably. Only the original very high scores of
   Trials 14 and 25 (1573, 1543) appear to have been exploits in the
   original training eval.

4. **Lighting variation on generated_track is a feature, not a bug.**
   Procedural generation changes sun angle / shadows each episode, forcing
   the model to learn geometry rather than appearance. This may be the key
   to Trial 9's generalisation ability.

5. **Mountain_track training: unknown contribution.** We don't know if
   mountain_track training helped or hurt. Trial 9 drives generated_track
   and mini_monaco; whether it can drive mountain_track is untested.

---

## Open Questions for Strategy Discussion

1. Can Trial 9 also drive mountain_track? (untested)
2. Can Trial 9 drive generated_road? (untested; this would be zero-shot on the Phase 2 training track)
3. Why does Trial 9 drive mini_monaco when other models with similar
   mini_monaco scores (Trial 14: 193, Trial 22: 193) do not do so reliably?
4. Would more training steps with Trial 9's hyperparameters produce
   an even better model?
5. Is mountain_track necessary, or could we get Trial 9's results by
   training on generated_track alone?

---

## Models Available

| Model | Path | Status |
|---|---|---|
| Phase 2 champion | models/champion/model.zip | ✅ Good |
| Wave 4 Trial 9 | models/wave4-trial-0009/model.zip | ✅ Best model |
| Wave 4 Trial 19 | models/wave4-trial-0019/model.zip | ✅ Good |
| Wave 4 Trial 3 | models/wave4-trial-0003/model.zip | ❌ Corrupted |
| Wave 4 Trials 1,2,5-8,10-25 | models/wave4-trial-XXXX/ | Available, mostly crash on generated_track |
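
Since the Trial 3 checkpoint turned out to be corrupted, it may be worth checking the other archives before relying on them. The sketch below assumes each `model.zip` is an ordinary zip archive (as Stable-Baselines3 checkpoints are, although this document does not say which library wrote them) and uses only the Python standard library. `check_model_zip` is a hypothetical helper name. It detects unreadable archives and CRC mismatches only; it cannot tell whether the weights inside drive well.

```python
import zipfile
import zlib
from pathlib import Path
from typing import List, Optional, Tuple


def check_model_zip(path: Path) -> Tuple[bool, Optional[str], List[str]]:
    """Return (ok, first_bad_member, member_names) for a zip archive.

    ok is False if the file is not a readable zip or if any member fails
    its CRC check; first_bad_member names the failing member when known.
    """
    try:
        with zipfile.ZipFile(path) as archive:
            names = archive.namelist()
            bad = archive.testzip()  # None when every member's CRC matches
            return bad is None, bad, names
    except (zipfile.BadZipFile, zlib.error, EOFError, OSError):
        return False, None, []


if __name__ == "__main__":
    # Directory layout assumed from the paths listed in the table above.
    for model_zip in sorted(Path("models").glob("*/model.zip")):
        ok, bad, names = check_model_zip(model_zip)
        status = "ok" if ok else f"problem (first bad member: {bad})"
        print(f"{model_zip}: {status}, {len(names)} members")
```

Running the script from the project root would print one line per archive found under `models/`; an archive that passes this check can still contain a poorly performing policy, as the recovered Trial 3 weights show.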