diff --git a/docs/TEST_HISTORY.md b/docs/TEST_HISTORY.md
index cffbbac..07f0dc2 100644
--- a/docs/TEST_HISTORY.md
+++ b/docs/TEST_HISTORY.md
@@ -154,3 +154,44 @@ All experiments: mountain_track only, lr=0.000725, throttle_min varies, 90k step
 3. **Test combined model** on all 4 tracks — can it generalise to mini_monaco like Trial 9 did?
 4. **If yes:** We have reproduced Trial 9 reliably with a better reward function
 
+
+### Exp 8 — Mountain track, v5 reward, throttle_min=0.5, CORRECT checkpointing ✅ COMPLETED
+- **Reward:** v5 (speed × CTE-quality)
+- **throttle_min:** 0.5
+- **Method:** Direct model.learn() loop, single TCP connection, NO close_and_switch
+- **Steps:** 90,000 total | 6,000 per segment | 15 checkpoints
+- **Circle exploit fix:** Short-lap terminates episode immediately
+- **Peak segment:** Seg 10 (step 60,000) — 567 reward / 2000 steps (FULL EVAL on mountain_track!)
+- **Policy diverged:** Seg 11-15 (31, 20 reward) — best_model.zip captured the peak correctly
+- **Checkpoints saved:** checkpoint_0006000.zip through checkpoint_0090000.zip + best_model.zip
+- **Final eval results using best_model.zip (step 60k weights):**
+
+| Track | Ep1 steps | Ep2 steps | Ep3 steps | Mean steps | Result |
+|---|---|---|---|---|---|
+| mountain_track (training) | 382 | 529 | 182 | 364 | ❌ crashes |
+| generated_track (zero-shot) | 63 | 61 | 61 | 62 | ❌ crashes |
+| mini_monaco (zero-shot) | 154 | 155 | 104 | 138 | ❌ crashes at one corner |
+| generated_road (zero-shot) | 41 | 42 | 41 | 41 | ❌ crashes |
+
+- **Throttle test:** mini_monaco at throttle_min=0.5 over 5 episodes: 93/94/79/95/94 steps (mean=91). The tight spread is consistent with crashing at the same corner every time. A throttle_min=0.2 test was impossible because the action space is fixed at training time.
+- **Key findings:**
+  1. ✅ Circle exploit fully eliminated — no short laps observed
+  2. ✅ Best model saving worked — captured step 60k peak, not step 90k drift
+  3. ✅ Genuine 20-22 second laps during training from step ~18k onward
+  4. ❌ Model crashes at exactly the same corner on mini_monaco every time (too fast)
+  5. ❌ throttle_min=0.5 baked into action space — model cannot output throttle < 0.5, cannot slow for corners
+  6. 🔑 INSIGHT: (throttle_min=0.2, v4) failed because the v4 reward gradient is 0 on the hill. The v5 gradient is non-zero, so the model can learn to apply high throttle when needed even with a 0.2 floor
+
+### Exp 9 — Mountain track, v5 reward, throttle_min=0.2 (RUNNING)
+- **Change from Exp 8:** throttle_min: 0.5 → **0.2** (only change)
+- **Reward:** v5 (speed × CTE-quality) — UNCHANGED
+- **Hypothesis:** The v5 reward provides a non-zero gradient signal on the hill (∂reward/∂speed is non-zero),
+  so the model can learn to output high throttle there. With a 0.2 floor, the model has the full range [0.2, 1.0]
+  and can apply lower throttle in corners, potentially solving the mini_monaco corner crash.
+- **What we never tested:** As (throttle_min, reward) pairs: (0.2, v4) failed; (0.5, v5) produced genuine laps in training (Exp 8); (0.2, v5) was never tried.
+- **Risk:** The model may still stall on the hill if learning converges slowly early in training.
+  StuckTermination (-1.0) and the v5 speed gradient together should push it toward higher throttle.
+- **Next test (Exp 10):** Add a track_progress bonus to the reward (v6), still changing one variable at a time.
+- **Save dir:** models/exp9-mountain-v5-throttle02/
+- **Watch:** tail -f /tmp/exp9.log
+