docs: capture robust mountain finetune winner at 36k and preserve eval comparison

2026-04-20 00:43:27 -04:00 · 0da04327ef
parent 2b90de2fba
commit 0da04327ef
7 changed files with 407 additions and 1 deletion
--- a/DECISIONS.md
+++ b/DECISIONS.md
@@ -480,3 +480,70 @@ The car exits through the gap, CTE quickly exceeds 16m, hits `pass` — episode
 - Resets counter when car returns to within 4m (brief excursions allowed)

 **Note:** We cannot fix the Unity sim code directly.
+
+---
+
+## ADR-022: Promote Mountain Finetune Checkpoint by Robustness, Not Raw Lap Time
+
+**Date:** 2026-04-19
+**Status:** Accepted
+
+**Context:** The mountain finetune (`exp14_finetune_v5.py`) produced several early
+checkpoints with very fast lap times under a temporary `throttle_floor=0.4`, but
+those checkpoints were extremely fragile in repeated deterministic evaluation.
+Later checkpoints degraded badly and often failed to complete any laps.
+
+**Decision:** Select the best mountain finetune checkpoint using a combined
+speed + robustness criterion, not raw best lap alone.
+
+**Chosen model:**
+- canonical checkpoint: `agent/models/exp14-mountain-v5-finetune/checkpoint_0036000.zip`
+- promoted copy: `agent/models/exp14-mountain-v5-finetune/best_robust_model_0036000.zip`
+
+**Why:** In a fresh 9-episode deterministic mountain evaluation, the 36k
+checkpoint achieved:
+- 9/9 successful episodes
+- 25 total laps
+- mean lap 27.93s
+- best lap 26.16s
+
+This beat:
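+- the original exp14 mountain champion in robustness (7/9 successful episodes, 16 laps)
+- the 6k/24k/30k finetune checkpoints in robustness (1/9, 4/9, and 1/9 successful episodes)
+- the later 42k/48k+ checkpoints in both robustness and stability (8/9 and 6/9 for 42k and 48k)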
+
+**Consequence:** Future mountain work should warm-start from the promoted 36k
+robust checkpoint, not from the final finetune model and not from the fastest
+but fragile 0.4-floor checkpoints.
+
+---
+
+## ADR-023: Mountain Track Likely Has a Real Unity Physics / Traction Problem
+
+**Date:** 2026-04-19
+**Status:** Accepted
+
+**Context:** The user observed that on mountain hills, the DonkeyCar often tries
+to move forward while the wheels visibly spin and the car barely advances.
+This looked like traction loss rather than a pure policy error.
+
+**Evidence from Unity source:**
+- Unity repo path: `/mnt/c/Users/Paul/Documents/projects/sdsandbox`
+- `sdsim/Assets/Scripts/WheelPhys.cs` scales wheel friction by the hit collider's
+  physics material static friction:
+  - `hit.collider.material.staticFriction * originalForwardStiffness`
+- `sdsim/Assets/Scenes/mountain_track.unity` contains 4 explicit `Slippery`
+  material assignments on colliders from the imported `long_road` FBX instance
+- physics material values:
+  - `Slippery.staticFriction = 0.1`
+  - `Road.staticFriction = 0.5`
+  - `Grippy.staticFriction = 0.66`
+
+**Decision:** Treat mountain wheelspin / poor uphill progress as a likely real
+sim-physics issue, not just an RL reward or hyperparameter issue.
+
+**Consequence:** Before trusting further mountain finetuning results, investigate
+and likely patch the Unity scene on branch:
+- `investigate-mountain-friction`
+
+This should be prioritized over adding more reward heuristics.
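Review note (not part of the commit): the expression quoted from `WheelPhys.cs` implies that forward grip scales linearly with the surface's static friction. The sketch below only restates that product in Python to compare the three materials. It assumes the product is the only surface-dependent term, which the ADR does not confirm. The function names and the placeholder stiffness of `1.0` are illustrative and are not taken from the Unity source.

```python
# Static friction values quoted in ADR-023.
STATIC_FRICTION = {"Slippery": 0.1, "Road": 0.5, "Grippy": 0.66}


def scaled_stiffness(material: str, original_forward_stiffness: float = 1.0) -> float:
    """Restate `staticFriction * originalForwardStiffness` for one material.

    The default stiffness is a placeholder; the real value lives in the sim.
    """
    return STATIC_FRICTION[material] * original_forward_stiffness


def relative_grip(material: str, baseline: str = "Road") -> float:
    """Grip of `material` as a multiple of the baseline material's grip."""
    return scaled_stiffness(material) / scaled_stiffness(baseline)


for name in STATIC_FRICTION:
    print(f"{name}: {relative_grip(name):.2f}x Road")
```

Under these assumptions a `Slippery` collider gives one fifth of the forward grip of a `Road` collider, which is at least consistent with the wheelspin described in the ADR.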
--- a/agent/models/exp14-mountain-v5-finetune/best_robust_model_0036000.zip
+++ b/agent/models/exp14-mountain-v5-finetune/best_robust_model_0036000.zip
--- a/agent/outerloop-results/mountain_candidate_eval_2026-04-19.jsonl
+++ b/agent/outerloop-results/mountain_candidate_eval_2026-04-19.jsonl
@@ -0,0 +1,63 @@
+{"label": "exp14_base", "model_path": "models/exp14-mountain-v5/best_model.zip", "throttle_floor": 0.2, "episode": 1, "steps": 814, "laps": 1, "lap_times": [31.09375], "reward": 132.39361070096493}
+{"label": "exp14_base", "model_path": "models/exp14-mountain-v5/best_model.zip", "throttle_floor": 0.2, "episode": 2, "steps": 2000, "laps": 3, "lap_times": [29.046875, 28.390625, 28.71875], "reward": 345.59998378881755}
+{"label": "exp14_base", "model_path": "models/exp14-mountain-v5/best_model.zip", "throttle_floor": 0.2, "episode": 3, "steps": 1859, "laps": 3, "lap_times": [30.109375, 27.875, 27.015625], "reward": 323.1350573108066}
+{"label": "exp14_base", "model_path": "models/exp14-mountain-v5/best_model.zip", "throttle_floor": 0.2, "episode": 4, "steps": 2000, "laps": 3, "lap_times": [30.484375, 27.953125, 28.890625], "reward": 338.7155503882095}
+{"label": "exp14_base", "model_path": "models/exp14-mountain-v5/best_model.zip", "throttle_floor": 0.2, "episode": 5, "steps": 190, "laps": 0, "lap_times": [], "reward": 34.7614578341836}
+{"label": "exp14_base", "model_path": "models/exp14-mountain-v5/best_model.zip", "throttle_floor": 0.2, "episode": 6, "steps": 689, "laps": 0, "lap_times": [], "reward": 89.68411724013276}
+{"label": "exp14_base", "model_path": "models/exp14-mountain-v5/best_model.zip", "throttle_floor": 0.2, "episode": 7, "steps": 1164, "laps": 1, "lap_times": [29.34375], "reward": 193.3484125709192}
+{"label": "exp14_base", "model_path": "models/exp14-mountain-v5/best_model.zip", "throttle_floor": 0.2, "episode": 8, "steps": 2000, "laps": 3, "lap_times": [29.125, 36.109375, 28.0], "reward": 325.9044730809983}
+{"label": "exp14_base", "model_path": "models/exp14-mountain-v5/best_model.zip", "throttle_floor": 0.2, "episode": 9, "steps": 1275, "laps": 2, "lap_times": [28.484375, 27.140625], "reward": 230.20822774680255}
+{"label": "ft_006k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0006000.zip", "throttle_floor": 0.4, "episode": 1, "steps": 177, "laps": 0, "lap_times": [], "reward": 36.94930865149945}
+{"label": "ft_006k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0006000.zip", "throttle_floor": 0.4, "episode": 2, "steps": 126, "laps": 0, "lap_times": [], "reward": 27.089253417694636}
+{"label": "ft_006k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0006000.zip", "throttle_floor": 0.4, "episode": 3, "steps": 161, "laps": 0, "lap_times": [], "reward": 35.617936324328184}
+{"label": "ft_006k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0006000.zip", "throttle_floor": 0.4, "episode": 4, "steps": 372, "laps": 0, "lap_times": [], "reward": 81.32034226437099}
+{"label": "ft_006k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0006000.zip", "throttle_floor": 0.4, "episode": 5, "steps": 348, "laps": 0, "lap_times": [], "reward": 80.90182103098687}
+{"label": "ft_006k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0006000.zip", "throttle_floor": 0.4, "episode": 6, "steps": 351, "laps": 0, "lap_times": [], "reward": 83.38630702279897}
+{"label": "ft_006k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0006000.zip", "throttle_floor": 0.4, "episode": 7, "steps": 769, "laps": 1, "lap_times": [21.359375], "reward": 181.13429199228995}
+{"label": "ft_006k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0006000.zip", "throttle_floor": 0.4, "episode": 8, "steps": 356, "laps": 0, "lap_times": [], "reward": 82.55838803868119}
+{"label": "ft_006k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0006000.zip", "throttle_floor": 0.4, "episode": 9, "steps": 357, "laps": 0, "lap_times": [], "reward": 83.45234980319492}
+{"label": "ft_024k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0024000.zip", "throttle_floor": 0.4, "episode": 1, "steps": 395, "laps": 0, "lap_times": [], "reward": 92.56352121913824}
+{"label": "ft_024k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0024000.zip", "throttle_floor": 0.4, "episode": 2, "steps": 813, "laps": 1, "lap_times": [22.5625], "reward": 174.79000809043646}
+{"label": "ft_024k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0024000.zip", "throttle_floor": 0.4, "episode": 3, "steps": 776, "laps": 1, "lap_times": [21.9375], "reward": 168.13511967613886}
+{"label": "ft_024k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0024000.zip", "throttle_floor": 0.4, "episode": 4, "steps": 1248, "laps": 2, "lap_times": [20.53125, 21.21875], "reward": 268.3056740048578}
+{"label": "ft_024k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0024000.zip", "throttle_floor": 0.4, "episode": 5, "steps": 382, "laps": 0, "lap_times": [], "reward": 80.95213605418394}
+{"label": "ft_024k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0024000.zip", "throttle_floor": 0.4, "episode": 6, "steps": 555, "laps": 1, "lap_times": [21.640625], "reward": 127.2610695830399}
+{"label": "ft_024k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0024000.zip", "throttle_floor": 0.4, "episode": 7, "steps": 149, "laps": 0, "lap_times": [], "reward": 29.529473824529305}
+{"label": "ft_024k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0024000.zip", "throttle_floor": 0.4, "episode": 8, "steps": 438, "laps": 0, "lap_times": [], "reward": 91.48110114145402}
+{"label": "ft_024k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0024000.zip", "throttle_floor": 0.4, "episode": 9, "steps": 416, "laps": 0, "lap_times": [], "reward": 91.06411632476375}
+{"label": "ft_030k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0030000.zip", "throttle_floor": 0.4, "episode": 1, "steps": 72, "laps": 0, "lap_times": [], "reward": 10.501809980045437}
+{"label": "ft_030k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0030000.zip", "throttle_floor": 0.4, "episode": 2, "steps": 451, "laps": 0, "lap_times": [], "reward": 87.36528500499116}
+{"label": "ft_030k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0030000.zip", "throttle_floor": 0.4, "episode": 3, "steps": 1012, "laps": 2, "lap_times": [22.34375, 20.71875], "reward": 224.14087196643231}
+{"label": "ft_030k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0030000.zip", "throttle_floor": 0.4, "episode": 4, "steps": 157, "laps": 0, "lap_times": [], "reward": 36.810340652579725}
+{"label": "ft_030k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0030000.zip", "throttle_floor": 0.4, "episode": 5, "steps": 167, "laps": 0, "lap_times": [], "reward": 38.044355134905345}
+{"label": "ft_030k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0030000.zip", "throttle_floor": 0.4, "episode": 6, "steps": 167, "laps": 0, "lap_times": [], "reward": 34.502623422992656}
+{"label": "ft_030k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0030000.zip", "throttle_floor": 0.4, "episode": 7, "steps": 314, "laps": 0, "lap_times": [], "reward": 64.56299563837547}
+{"label": "ft_030k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0030000.zip", "throttle_floor": 0.4, "episode": 8, "steps": 183, "laps": 0, "lap_times": [], "reward": 39.53914854944924}
+{"label": "ft_030k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0030000.zip", "throttle_floor": 0.4, "episode": 9, "steps": 329, "laps": 0, "lap_times": [], "reward": 65.17448510392569}
+{"label": "ft_036k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0036000.zip", "throttle_floor": 0.2, "episode": 1, "steps": 811, "laps": 1, "lap_times": [29.921875], "reward": 134.79952276336735}
+{"label": "ft_036k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0036000.zip", "throttle_floor": 0.2, "episode": 2, "steps": 2000, "laps": 3, "lap_times": [27.890625, 26.390625, 26.78125], "reward": 355.31346766664956}
+{"label": "ft_036k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0036000.zip", "throttle_floor": 0.2, "episode": 3, "steps": 2000, "laps": 3, "lap_times": [28.765625, 27.765625, 27.46875], "reward": 344.61911652631443}
+{"label": "ft_036k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0036000.zip", "throttle_floor": 0.2, "episode": 4, "steps": 1833, "laps": 3, "lap_times": [27.5625, 27.375, 26.15625], "reward": 330.4355698036379}
+{"label": "ft_036k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0036000.zip", "throttle_floor": 0.2, "episode": 5, "steps": 1928, "laps": 3, "lap_times": [29.125, 28.109375, 27.6875], "reward": 332.7582264354696}
+{"label": "ft_036k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0036000.zip", "throttle_floor": 0.2, "episode": 6, "steps": 2000, "laps": 3, "lap_times": [29.46875, 26.8125, 28.5], "reward": 342.77918509633764}
+{"label": "ft_036k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0036000.zip", "throttle_floor": 0.2, "episode": 7, "steps": 2000, "laps": 3, "lap_times": [28.515625, 26.4375, 28.046875], "reward": 356.1632144622272}
+{"label": "ft_036k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0036000.zip", "throttle_floor": 0.2, "episode": 8, "steps": 2000, "laps": 3, "lap_times": [29.4375, 28.140625, 26.546875], "reward": 351.6208579884842}
+{"label": "ft_036k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0036000.zip", "throttle_floor": 0.2, "episode": 9, "steps": 2000, "laps": 3, "lap_times": [29.484375, 27.078125, 28.71875], "reward": 346.25121597342695}
+{"label": "ft_042k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0042000.zip", "throttle_floor": 0.2, "episode": 1, "steps": 727, "laps": 1, "lap_times": [27.65625], "reward": 127.74538866006424}
+{"label": "ft_042k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0042000.zip", "throttle_floor": 0.2, "episode": 2, "steps": 1082, "laps": 1, "lap_times": [29.53125], "reward": 180.6854192542378}
+{"label": "ft_042k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0042000.zip", "throttle_floor": 0.2, "episode": 3, "steps": 2000, "laps": 3, "lap_times": [28.796875, 27.171875, 31.359375], "reward": 316.97187187187683}
+{"label": "ft_042k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0042000.zip", "throttle_floor": 0.2, "episode": 4, "steps": 436, "laps": 0, "lap_times": [], "reward": 76.15429569082335}
+{"label": "ft_042k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0042000.zip", "throttle_floor": 0.2, "episode": 5, "steps": 1255, "laps": 2, "lap_times": [29.375, 27.84375], "reward": 214.58350544204313}
+{"label": "ft_042k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0042000.zip", "throttle_floor": 0.2, "episode": 6, "steps": 2000, "laps": 3, "lap_times": [29.265625, 27.09375, 28.046875], "reward": 346.9328897984469}
+{"label": "ft_042k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0042000.zip", "throttle_floor": 0.2, "episode": 7, "steps": 2000, "laps": 3, "lap_times": [29.609375, 28.25, 29.0625], "reward": 325.0631527120613}
+{"label": "ft_042k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0042000.zip", "throttle_floor": 0.2, "episode": 8, "steps": 1139, "laps": 1, "lap_times": [36.640625], "reward": 174.01025118848793}
+{"label": "ft_042k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0042000.zip", "throttle_floor": 0.2, "episode": 9, "steps": 2000, "laps": 3, "lap_times": [28.421875, 31.546875, 27.59375], "reward": 317.8546572496798}
+{"label": "ft_048k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0048000.zip", "throttle_floor": 0.2, "episode": 1, "steps": 2000, "laps": 2, "lap_times": [41.96875, 29.703125], "reward": 295.66797001392115}
+{"label": "ft_048k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0048000.zip", "throttle_floor": 0.2, "episode": 2, "steps": 209, "laps": 0, "lap_times": [], "reward": 39.03441974380985}
+{"label": "ft_048k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0048000.zip", "throttle_floor": 0.2, "episode": 3, "steps": 1353, "laps": 2, "lap_times": [31.84375, 28.46875], "reward": 227.3901946817232}
+{"label": "ft_048k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0048000.zip", "throttle_floor": 0.2, "episode": 4, "steps": 2000, "laps": 3, "lap_times": [30.5, 31.171875, 29.15625], "reward": 332.7062042630514}
+{"label": "ft_048k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0048000.zip", "throttle_floor": 0.2, "episode": 5, "steps": 201, "laps": 0, "lap_times": [], "reward": 38.25372101787252}
+{"label": "ft_048k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0048000.zip", "throttle_floor": 0.2, "episode": 6, "steps": 2000, "laps": 3, "lap_times": [29.875, 28.90625, 28.3125], "reward": 332.71756463419115}
+{"label": "ft_048k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0048000.zip", "throttle_floor": 0.2, "episode": 7, "steps": 1365, "laps": 2, "lap_times": [30.859375, 29.921875], "reward": 235.64415748882038}
+{"label": "ft_048k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0048000.zip", "throttle_floor": 0.2, "episode": 8, "steps": 184, "laps": 0, "lap_times": [], "reward": 32.21486316375649}
+{"label": "ft_048k", "model_path": "models/exp14-mountain-v5-finetune/checkpoint_0048000.zip", "throttle_floor": 0.2, "episode": 9, "steps": 830, "laps": 1, "lap_times": [34.25], "reward": 131.9192705488067}
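Review note (not part of the commit): the per-episode rows above can be reduced to the summary columns used elsewhere in this change with a short aggregation. The sketch below is one way to do that, not the script that produced the committed table. It assumes a "successful" episode is one that logged at least one lap, and a "full" episode is one that reached the 2000-step cap; `load_rows` and `summarize` are hypothetical helper names.

```python
import json
from collections import defaultdict


def load_rows(path):
    """Read one JSON object per non-empty line of a .jsonl file."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]


def summarize(rows, max_steps=2000):
    """Group episode rows by label and compute per-model summary statistics."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row["label"]].append(row)

    summary = {}
    for label, eps in by_label.items():
        lap_times = [t for ep in eps for t in ep["lap_times"]]
        summary[label] = {
            "episodes": len(eps),
            # Assumption: "success" means the episode logged at least one lap.
            "success_eps": sum(1 for ep in eps if ep["laps"] >= 1),
            # Assumption: "full" means the episode reached the step cap.
            "full_eps": sum(1 for ep in eps if ep["steps"] >= max_steps),
            "total_laps": sum(ep["laps"] for ep in eps),
            "mean_lap": sum(lap_times) / len(lap_times) if lap_times else None,
            "best_lap": min(lap_times) if lap_times else None,
            "avg_steps": sum(ep["steps"] for ep in eps) / len(eps),
        }
    return summary


# Two ft_030k rows copied from the file above (episodes 2 and 3).
sample = [
    {"label": "ft_030k", "episode": 2, "steps": 451, "laps": 0, "lap_times": []},
    {"label": "ft_030k", "episode": 3, "steps": 1012, "laps": 2,
     "lap_times": [22.34375, 20.71875]},
]
print(summarize(sample)["ft_030k"])
```

To run it on the full file, pass `load_rows("agent/outerloop-results/mountain_candidate_eval_2026-04-19.jsonl")` to `summarize`.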
--- a/agent/outerloop-results/mountain_candidate_eval_2026-04-19.md
+++ b/agent/outerloop-results/mountain_candidate_eval_2026-04-19.md
@@ -0,0 +1,29 @@
+# Mountain candidate checkpoint evaluation — 2026-04-19
+
+Deterministic eval on `mountain_track`, 9 episodes per model, max 2000 steps/episode.
+
+| model | floor | success eps (>=1 lap) | full 2k-step eps | avg laps/ep | total laps | mean lap, all laps (s) | best lap (s) | avg steps | notes |
+|---|---:|---:|---:|---:|---:|---:|---:|---:|---|
+| exp14_base | 0.2 | 7/9 | 3/9 | 1.78 | 16 | 29.24 | 27.02 | 1332 | original champion |
+| ft_006k | 0.4 | 1/9 | 0/9 | 0.11 | 1 | 21.36 | 21.36 | 335 | very fast when it works, extremely fragile |
+| ft_024k | 0.4 | 4/9 | 0/9 | 0.56 | 5 | 21.58 | 20.53 | 575 | fast but fragile |
+| ft_030k | 0.4 | 1/9 | 0/9 | 0.22 | 2 | 21.53 | 20.72 | 317 | very fast but extremely fragile |
+| ft_036k | 0.2 | 9/9 | 6/9 | 2.78 | 25 | 27.93 | 26.16 | 1841 | best balance: fastest robust candidate |
+| ft_042k | 0.2 | 8/9 | 4/9 | 1.89 | 17 | 29.25 | 27.09 | 1404 | decent, but worse than 36k |
+| ft_048k | 0.2 | 6/9 | 3/9 | 1.44 | 13 | 31.15 | 28.31 | 1127 | degraded |
+
+## Recommendation
+
+Best overall candidate:
+- `models/exp14-mountain-v5-finetune/checkpoint_0036000.zip`
+
+Reason:
+- 9/9 successful episodes
+- 25 total laps across 9 episodes
+- mean lap 27.93s
+- best lap 26.16s
+- clearly more robust than the original exp14 champion and later finetune checkpoints
+
+## Raw result file
+
+- `agent/outerloop-results/mountain_candidate_eval_2026-04-19.jsonl`
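Review note (not part of the commit): the recommendation ranks candidates by robustness before speed, but the exact rule is not written down. The sketch below assumes a lexicographic ordering (more successful episodes, then more total laps, then lower mean lap time) and applies it to the table values above. That ordering is an illustration of a robustness-first criterion, not necessarily the rule the authors applied.

```python
# (label, success_eps, total_laps, mean_lap_s, best_lap_s) from the summary table.
CANDIDATES = [
    ("exp14_base", 7, 16, 29.24, 27.02),
    ("ft_006k", 1, 1, 21.36, 21.36),
    ("ft_024k", 4, 5, 21.58, 20.53),
    ("ft_030k", 1, 2, 21.53, 20.72),
    ("ft_036k", 9, 25, 27.93, 26.16),
    ("ft_042k", 8, 17, 29.25, 27.09),
    ("ft_048k", 6, 13, 31.15, 28.31),
]


def robustness_first_key(row):
    """Sort key: most successful episodes, then most laps, then fastest mean lap."""
    _label, success_eps, total_laps, mean_lap, _best_lap = row
    return (-success_eps, -total_laps, mean_lap)


ranked = [row[0] for row in sorted(CANDIDATES, key=robustness_first_key)]
fastest_single_lap = min(CANDIDATES, key=lambda row: row[4])[0]

print("robustness-first ranking:", ranked)
print("fastest single lap:", fastest_single_lap)
```

With this ordering `ft_036k` ranks first, while selecting on raw best lap alone would pick `ft_024k`, which completed a lap in only 4 of 9 episodes.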
--- a/agent/tmp_eval_mountain_candidates.py
+++ b/agent/tmp_eval_mountain_candidates.py
@@ -0,0 +1,149 @@
+import os, json, time
+from datetime import datetime
+
+from stable_baselines3 import PPO
+from stable_baselines3.common.vec_env import DummyVecEnv, VecTransposeImage
+import gymnasium as gym
+import numpy as np
+
+from donkeycar_sb3_runner import ThrottleClampWrapper
+
+HOST='10.0.0.55'
+PORT=9091
+TRACK_ID='donkey-mountain-track-v0'
+MAX_STEPS=2000
+EPISODES=9
+OUT='outerloop-results/mountain_candidate_eval_2026-04-19.jsonl'
+
+CANDIDATES = [
+    ('exp14_base', 'models/exp14-mountain-v5/best_model.zip', 0.2),
+    ('ft_006k', 'models/exp14-mountain-v5-finetune/checkpoint_0006000.zip', 0.4),
+    ('ft_024k', 'models/exp14-mountain-v5-finetune/checkpoint_0024000.zip', 0.4),
+    ('ft_030k', 'models/exp14-mountain-v5-finetune/checkpoint_0030000.zip', 0.4),
+    ('ft_036k', 'models/exp14-mountain-v5-finetune/checkpoint_0036000.zip', 0.2),
+    ('ft_042k', 'models/exp14-mountain-v5-finetune/checkpoint_0042000.zip', 0.2),
+    ('ft_048k', 'models/exp14-mountain-v5-finetune/checkpoint_0048000.zip', 0.2),
+]
+
+class V5RewardWrapper(gym.Wrapper):
+    def __init__(self, env, max_cte=8.0, min_lap_time=5.0):
+        super().__init__(env)
+        self.max_cte = max_cte
+        self.min_lap_time = min_lap_time
+        self._last_lc = 0
+    def reset(self, **kwargs):
+        self._last_lc = 0
+        return self.env.reset(**kwargs)
+    def step(self, action):
+        result = self.env.step(action)
+        if len(result) == 5:
+            obs, _sim, terminated, truncated, info = result
+            done = terminated or truncated
+        else:
+            obs, _sim, done, info = result
+            terminated, truncated = done, False
+        try:
+            cte = float(info.get('cte', 0.0) or 0.0)
+        except Exception:
+            cte = 0.0
+        cte_quality = 1.0 - min(abs(cte) / self.max_cte, 1.0)
+        try:
+            speed = max(0.0, float(info.get('speed', 0.0) or 0.0))
+        except Exception:
+            speed = 0.0
+        speed_norm = min(speed / 10.0, 1.0)
+        reward = cte_quality * speed_norm
+        try:
+            current_lc = int(info.get('lap_count', 0) or 0)
+        except Exception:
+            current_lc = self._last_lc
+        force_terminate = False
+        if current_lc > self._last_lc:
+            self._last_lc = current_lc
+            try:
+                lap_time = float(info.get('last_lap_time', 999.0) or 999.0)
+            except Exception:
+                lap_time = 999.0
+            if lap_time < self.min_lap_time:
+                reward = -10.0 * (self.min_lap_time / max(lap_time, 0.1))
+                force_terminate = True
+        if len(result) == 5:
+            return obs, reward, terminated or force_terminate, truncated, info
+        return obs, reward, terminated or force_terminate, info
+
+
+def make_env(base_throttle=0.2, throttle_floor=None):
+    def _init():
+        raw = gym.make(TRACK_ID, conf={'host': HOST, 'port': PORT})
+        env = ThrottleClampWrapper(raw, throttle_min=base_throttle)
+        if throttle_floor is not None:
+            class ThrottleFloorWrapper(gym.Wrapper):
+                def __init__(self, env, floor):
+                    super().__init__(env)
+                    self.floor = floor
+                def step(self, action):
+                    act = np.array(action)
+                    try:
+                        act[1] = max(act[1], self.floor)
+                    except Exception:
+                        pass
+                    return self.env.step(act)
+                def reset(self, **kwargs):
+                    return self.env.reset(**kwargs)
+            env = ThrottleFloorWrapper(env, throttle_floor)
+        env = V5RewardWrapper(env)
+        return env
+    return _init
+
+os.makedirs(os.path.dirname(OUT), exist_ok=True)
+
+all_rows = []
+for label, model_path, floor in CANDIDATES:
+    print(f'\n=== Evaluating {label} floor={floor} path={model_path}', flush=True)
+    env = VecTransposeImage(DummyVecEnv([make_env(0.2, floor)]))
+    model = PPO.load(model_path, device='cpu')
+    model.set_env(env)
+    episodes = []
+    for ep in range(EPISODES):
+        obs = env.reset()
+        steps = 0
+        laps = 0
+        prev_lc = 0
+        lap_times = []
+        total_reward = 0.0
+        while steps < MAX_STEPS:
+            action, _ = model.predict(obs, deterministic=True)
+            obs, r, d, info = env.step(action)
+            inf = info[0] if isinstance(info, (list, tuple)) else info
+            total_reward += float(r[0])
+            steps += 1
+            lc = int(inf.get('lap_count', 0) or 0)
+            if lc > prev_lc:
+                try:
+                    lap_times.append(float(inf.get('last_lap_time', 0) or 0))
+                except Exception:
+                    lap_times.append(0.0)
+                prev_lc = lc
+                laps = lc
+            if bool(d[0]):
+                break
+        row = {
+            'label': label,
+            'model_path': model_path,
+            'throttle_floor': floor,
+            'episode': ep + 1,
+            'steps': steps,
+            'laps': laps,
+            'lap_times': lap_times,
+            'reward': total_reward,
+        }
+        episodes.append(row)
+        all_rows.append(row)
+        print(f"  ep{ep+1}: steps={steps} laps={laps} lap_times={lap_times}", flush=True)
+    env.close()
+    time.sleep(2)
+
+with open(OUT, 'w') as f:
+    for row in all_rows:
+        f.write(json.dumps(row) + '\n')
+print('\nSaved to', OUT)
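Review note (not part of the commit): the reward logic inside `V5RewardWrapper.step` is easier to check when separated from the environment plumbing. The sketch below restates the same arithmetic as two pure functions with the wrapper's default constants. The function names and the `speed_scale` parameter are illustrative; in the script the divisor `10.0` is hard-coded.

```python
def v5_step_reward(cte, speed, max_cte=8.0, speed_scale=10.0):
    """Per-step reward: centerline quality times normalized speed.

    Both factors are clamped to [0, 1], so the product is also in [0, 1].
    """
    cte_quality = 1.0 - min(abs(cte) / max_cte, 1.0)
    speed_norm = min(max(0.0, speed) / speed_scale, 1.0)
    return cte_quality * speed_norm


def short_lap_penalty(lap_time, min_lap_time=5.0):
    """Penalty applied when a lap finishes faster than `min_lap_time`.

    The wrapper also force-terminates the episode in this case.
    """
    return -10.0 * (min_lap_time / max(lap_time, 0.1))


print(v5_step_reward(cte=0.0, speed=10.0))  # on the centerline at full speed
print(v5_step_reward(cte=4.0, speed=5.0))   # halfway to max_cte at half speed
print(short_lap_penalty(2.5))               # implausibly short lap
```

Because the reward is a product, a car that is fast but at or beyond `max_cte` earns nothing, and so does a car that is centered but stationary.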
--- a/docs/SESSION_LOG_2026-04-19.md
+++ b/docs/SESSION_LOG_2026-04-19.md
@@ -120,6 +120,59 @@ parallel envs are working.
  - Exp 11c (v6 reward, 250k): aborted — grass exploit found on generated_track
  - Exp 11d: pending fixes before re-run

+## Mountain Track Finetune + Physics Investigation (2026-04-19 late session)
+
+### Finetune outcome summary
+- Created and ran `agent/experiments/exp14_finetune_v5.py`
+- Warm-start source: `agent/models/exp14-mountain-v5/best_model.zip`
+- Schedule used:
+  - phase 1: runtime throttle floor `0.4`
+  - phase 2: runtime throttle floor `0.2`
+- Training degraded badly after the 36k checkpoint; later checkpoints became poor / unstable
+- Best usable finetune checkpoint was **not** the final model
+
+### Robust checkpoint comparison on mountain_track
+We ran a deterministic mountain-only comparison over 9 episodes per candidate.
+Results saved to:
+- `agent/outerloop-results/mountain_candidate_eval_2026-04-19.jsonl`
+- `agent/outerloop-results/mountain_candidate_eval_2026-04-19.md`
+
+Winner:
+- `agent/models/exp14-mountain-v5-finetune/checkpoint_0036000.zip`
+- promoted copy:
+  - `agent/models/exp14-mountain-v5-finetune/best_robust_model_0036000.zip`
+
+Key result:
+- **ft_036k** achieved:
+  - 9/9 successful episodes
+  - 25 total laps across 9 episodes
+  - mean lap **27.93s**
+  - best lap **26.16s**
+- This beat:
+  - original mountain champion for robustness
+  - earlier `0.4`-floor checkpoints for robustness
+  - later finetune checkpoints, which had degraded badly
+
+### Mountain physics discovery in Unity sim
+Unity source path confirmed:
+- `/mnt/c/Users/Paul/Documents/projects/sdsandbox`
+
+We found a likely real root cause for hill wheelspin:
+- `sdsim/Assets/Scripts/WheelPhys.cs` scales wheel friction by the hit collider's physics material:
+  - `hit.collider.material.staticFriction * originalForwardStiffness`
+- `mountain_track.unity` contains **4 explicit `Slippery` physics-material assignments** on the imported `long_road` FBX instance
+- `Slippery.staticFriction = 0.1`
+- `Road.staticFriction = 0.5`
+- `Grippy.staticFriction = 0.66`
+
+Interpretation:
+- mountain road traction is likely much lower than normal road tracks
+- this matches observed wheelspin / poor uphill progress / getting stuck on hills
+
+We created a dedicated Unity investigation branch before changing anything:
+- repo: `/mnt/c/Users/Paul/Documents/projects/sdsandbox`
+- branch: `investigate-mountain-friction`
+
 ## Critical Known Facts (DO NOT LOSE)

 ### throttle_min history (from Exp 1-9)
--- a/docs/TEST_HISTORY.md
+++ b/docs/TEST_HISTORY.md
@@ -1,6 +1,6 @@
 # Test History — DonkeyCar RL Autoresearch

-Last updated: 2026-04-18
+Last updated: 2026-04-19

 This document records every significant training experiment, what was
 changed, what was observed, and what was learned.  Use this to make
@@ -400,3 +400,48 @@ camera angle) to achieve generalization instead?
 - Try v5 reward with parallel envs but longer training (accept some circling)
 - Check if efficiency gate triggers too aggressively during normal cornering

+---
+
+## Exp 14b — Mountain finetune from exp14 champion (2026-04-19)
+
+- **Script:** `agent/experiments/exp14_finetune_v5.py`
+- **Warm start:** `agent/models/exp14-mountain-v5/best_model.zip`
+- **Schedule:**
+  - phase 1: runtime throttle floor `0.4`
+  - phase 2: runtime throttle floor `0.2`
+- **Goal:** improve hill climbing, robustness, and lap time on `mountain_track`
+
+### Important outcome
+The finetune run **did not improve monotonically**. It briefly improved, then later degraded badly.
+This means the final/latest checkpoint is **not** the model we want to keep.
+
+### Candidate checkpoint comparison
+We ran a fresh deterministic comparison on mountain only:
+- **9 episodes per model**
+- **2000 step cap**
+- Results saved to:
+  - `agent/outerloop-results/mountain_candidate_eval_2026-04-19.jsonl`
+  - `agent/outerloop-results/mountain_candidate_eval_2026-04-19.md`
+
+| Model | Floor | Success eps | Full 2k eps | Avg laps/ep | Total laps | Mean lap | Best lap | Avg steps | Verdict |
+|---|---:|---:|---:|---:|---:|---:|---:|---:|---|
+| exp14_base | 0.2 | 7/9 | 3/9 | 1.78 | 16 | 29.24s | 27.02s | 1332 | Original champion |
+| ft_006k | 0.4 | 1/9 | 0/9 | 0.11 | 1 | 21.36s | 21.36s | 335 | Very fast but unusably fragile |
+| ft_024k | 0.4 | 4/9 | 0/9 | 0.56 | 5 | 21.58s | 20.53s | 575 | Fast but fragile |
+| ft_030k | 0.4 | 1/9 | 0/9 | 0.22 | 2 | 21.53s | 20.72s | 317 | Very fast but unusably fragile |
+| **ft_036k** | **0.2** | **9/9** | **6/9** | **2.78** | **25** | **27.93s** | **26.16s** | **1841** | **Best overall balance** |
+| ft_042k | 0.2 | 8/9 | 4/9 | 1.89 | 17 | 29.25s | 27.09s | 1404 | Decent, but worse than 36k |
+| ft_048k | 0.2 | 6/9 | 3/9 | 1.44 | 13 | 31.15s | 28.31s | 1127 | Degraded |
+
+### Best model captured
+Best overall checkpoint from the finetune:
+- `agent/models/exp14-mountain-v5-finetune/checkpoint_0036000.zip`
+
+Promoted copy saved as:
+- `agent/models/exp14-mountain-v5-finetune/best_robust_model_0036000.zip`
+
+### Key learning
+- Early `0.4`-floor checkpoints can produce very fast laps, but are too fragile to trust.
+- The best mountain finetune model is the **36k checkpoint after switching back to 0.2 floor**, not the later checkpoints.
+- Later finetune checkpoints collapsed badly, matching the user's visual observation of wheelspin / poor driving.
+