docs: TEST_HISTORY Exp10 plan added
Exp10: generated_track + mountain_track, v5 reward, throttle_min=0.2.
Same as Exp9 but with visual diversity from a second track.
Agent: pi
parent b19dcc8b80
commit fecba1dd35
@@ -225,3 +225,18 @@ Train on mountain_track + generated_track together (v5 reward, throttle_min=0.2)
Both tracks have random lighting each episode → model forced to learn lighting-invariant features.
Goal: model that is reliable on both training tracks, then test generalisation to generated_road and mini_monaco.

### Exp 10 — Two tracks: generated_track + mountain_track, v5 reward, throttle_min=0.2
- **Change from Exp9:** Added generated_track as second training track
- **Reward:** v5 (speed × CTE) — unchanged
- **throttle_min:** 0.2 — unchanged from Exp9
- **Training tracks:** generated_track + mountain_track (round-robin, switch every 6,000 steps)
- **Total steps:** 90,000 | Steps per switch: 6,000 | ~7.5 rotations through both tracks
- **lr:** 0.000725 — unchanged
- **Hypothesis:** Adding generated_track's visual diversity forces the model to learn
  lighting-invariant road-following features, while mountain_track teaches throttle control on hills.
  Together they should generalise better to generated_road and potentially mini_monaco.
- **Expected results:** mountain_track reliable, generated_track reliable,
  generated_road improved, mini_monaco TBD
- **This is essentially Trial 9 repeated with:** v5 reward + throttle_min=0.2 + proper checkpointing + exploit fix
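
The round-robin numbers in the plan above (90,000 total steps, a switch every 6,000 steps, ~7.5 rotations) can be checked with a small sketch. This is only an illustration of the arithmetic, not the project's training loop: the function names `track_for_step` and `build_schedule` are hypothetical, and the assumption that generated_track runs first is mine, not stated in the plan.

```python
from collections import Counter

TOTAL_STEPS = 90_000
STEPS_PER_SWITCH = 6_000
# Assumption: generated_track goes first; the plan does not state the order.
TRACKS = ["generated_track", "mountain_track"]


def track_for_step(step, tracks=TRACKS, steps_per_switch=STEPS_PER_SWITCH):
    """Return the track active at a 0-indexed global training step."""
    return tracks[(step // steps_per_switch) % len(tracks)]


def build_schedule(total_steps=TOTAL_STEPS, tracks=TRACKS,
                   steps_per_switch=STEPS_PER_SWITCH):
    """Return (start_step, end_step, track) segments covering the whole run."""
    segments = []
    for start in range(0, total_steps, steps_per_switch):
        end = min(start + steps_per_switch, total_steps)
        segments.append((start, end, track_for_step(start, tracks, steps_per_switch)))
    return segments


segments = build_schedule()

# Count how many steps each track receives over the full run.
steps_per_track = Counter()
for start, end, track in segments:
    steps_per_track[track] += end - start

# One rotation = one segment on every track.
rotations = TOTAL_STEPS / (STEPS_PER_SWITCH * len(TRACKS))

print(f"segments: {len(segments)}, rotations: {rotations}")
print(dict(steps_per_track))
```

Under these assumptions the run splits into 15 segments, so whichever track goes first gets 8 segments (48,000 steps) and the other gets 7 (42,000 steps). If an even split matters, a total of 84,000 or 96,000 steps would give whole rotations.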