diff --git a/docs/TEST_HISTORY.md b/docs/TEST_HISTORY.md
index 18330ed..22f6f5c 100644
--- a/docs/TEST_HISTORY.md
+++ b/docs/TEST_HISTORY.md
@@ -225,3 +225,18 @@
 Train on mountain_track + generated_track together (v5 reward, throttle_min=0.2)
 Both tracks have random lighting each episode → model forced to learn lighting-invariant features.
 Goal: model that is reliable on both training tracks, then test generalisation to generated_road and mini_monaco.
+
+### Exp 10 — Two tracks: generated_track + mountain_track, v5 reward, throttle_min=0.2
+- **Change from Exp9:** Added generated_track as second training track
+- **Reward:** v5 (speed × CTE) — unchanged
+- **throttle_min:** 0.2 — unchanged from Exp9
+- **Training tracks:** generated_track + mountain_track (round-robin, switch every 6,000 steps)
+- **Total steps:** 90,000 | Steps per switch: 6,000 | ~7.5 rotations through both tracks
+- **lr:** 0.000725 — unchanged
+- **Hypothesis:** Adding generated_track visual diversity forces model to learn
+  lighting-invariant road-following features. Mountain_track teaches hill throttle.
+  Together should generalise better to generated_road and potentially mini_monaco.
+- **Expected results:** mountain_track reliable, generated_track reliable,
+  generated_road improved, mini_monaco TBD
+- **This is essentially Trial 9 repeated with:** v5 reward + throttle_min=0.2
+  proper checkpointing + exploit fix
+
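
The round-robin figures in the Exp 10 entry (90,000 total steps, a switch every 6,000 steps, ~7.5 rotations) can be checked with a small sketch. This is not the project's training code: `round_robin_schedule` is a hypothetical helper, and the assumption that training starts on generated_track is mine, since the entry does not state the starting track.

```python
from itertools import cycle


def round_robin_schedule(tracks, total_steps, steps_per_switch):
    """Return (start_step, end_step, track) segments for a round-robin run."""
    segments = []
    track_cycle = cycle(tracks)
    start = 0
    while start < total_steps:
        # The last segment is clipped if total_steps is not a multiple
        # of steps_per_switch.
        end = min(start + steps_per_switch, total_steps)
        segments.append((start, end, next(track_cycle)))
        start = end
    return segments


# Assumption: generated_track goes first; the log does not say which track starts.
TRACKS = ["generated_track", "mountain_track"]

schedule = round_robin_schedule(TRACKS, total_steps=90_000, steps_per_switch=6_000)
rotations = len(schedule) / len(TRACKS)
steps_per_track = {
    track: sum(end - start for start, end, name in schedule if name == track)
    for track in TRACKS
}

print(f"segments: {len(schedule)}, rotations: {rotations}")
print(steps_per_track)
```

Under these assumptions the run has 15 segments, which is 7.5 rotations. Because 15 is odd, whichever track starts first receives 48,000 steps and the other 42,000, so the two tracks are not trained for exactly equal time.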