Commit Graph

11 Commits

Author SHA1 Message Date
Paul Huliganga b1ec14e3cb fix: exp14 — proper track switch via exit_scene before connecting to mountain_track 2026-04-19 19:18:33 -04:00
Paul Huliganga 1405a88699 feat: Exp 14 — mountain_track, v5 reward, lap-based stopping
v5 required for mountain hills (v4 gives zero gradient on hills - documented Exp 1).
Same simple approach as Exp 13 which worked: single track, minimal wrappers,
lap-based stopping. ThrottleClamp + V5Reward only.
2026-04-19 19:15:00 -04:00
Paul Huliganga 5a1693b4ec feat: Exp 13 — generated_track, v4 reward, back to basics (no extra heuristics)
Return to Wave 4 setup that produced Trial 9 (2000/2000 on generated_track).
v4 reward: base x efficiency x speed. Circles give ~0 reward naturally.
No StuckTerminationWrapper, no CTE patience, no progress terminator.
Just ThrottleClamp + V4Reward. Lap-based stopping criterion.
2026-04-19 17:33:17 -04:00
Paul Huliganga 813f888502 fix: reward v6.1 — active_node progress terminator kills circle/stuck exploits
User's insight: a circling car stays near the same track waypoints, so
active_node (sim's track progress indicator) never advances. Track the
maximum active_node reached this episode. If it hasn't increased in
progress_patience=60 steps (~3.3s), terminate.

This catches:
  - Circular driving (active_node oscillates, max never increases)
  - Stuck on cone/barrier (active_node frozen)
  - NOT triggered by: legitimate cornering, slow forward progress, lap resets

On lap completion, active_node wraps to 0 — reset max_node_seen and counter.

Also: Exp 12 — single track mountain training with lap-based stopping criterion.
Train until 3 consecutive laps in eval, not fixed step count.
2026-04-19 17:01:41 -04:00
Paul Huliganga dc563e2b6c fix: exp11d remove progress_patience — grass fix only per ADR-020 2026-04-19 16:18:17 -04:00
Paul Huliganga f730a2e0ba docs: ADR-020/021 + session log — throttle/hill history and grass exploit root cause
Critical facts documented permanently:
- throttle_min=0.5 bakes into action space (too fast for corners)
- throttle_min=0.2 + v5 reward CAN learn hill (proved Exp 9, mountain only 90k)
- Mountain failure in parallel is contamination from grass exploit, not throttle
- Grass exploit root cause: sim determine_episode_over() passes when CTE>16m
- DO NOT confuse mountain rollback with stuck issue
- DO NOT change throttle_min as first response to mountain failure
2026-04-19 16:14:28 -04:00
Paul Huliganga 16bd379e95 feat: Exp 11c — parallel DummyVecEnv + v6 reward, extended to 250k steps 2026-04-19 13:27:38 -04:00
Paul Huliganga 91ce8fc1fa feat: Exp 11b — parallel DummyVecEnv + v6 reward (anti-circle gate) + built-in eval 2026-04-19 12:03:46 -04:00
Paul Huliganga beb04f3ebe fix: reward v6 — efficiency gate prevents circular driving, stuck_steps 80→40
v5 dropped the efficiency term to get gradient signal on hills, but this
re-enabled circular driving (observed in Exp 11). v6 adds efficiency back
as a GATE (not multiplier): if efficiency < 0.15, reward = 0. Otherwise
reward = speed × CTE_quality (same as v5).

Gate vs multiplier: v4 used efficiency as a multiplier which killed gradient
on hills (all terms → 0 simultaneously). v6's gate passes when efficiency
is above threshold (car moving forward, even slowly on hill) and only
blocks when car is truly circling.

Also reduced stuck_steps from 80 to 40 (~2.5s vs ~5s) — user reported
car stuck against barriers for ~10s which is too long with DummyVecEnv.
2026-04-19 12:02:55 -04:00
Paul Huliganga 21addf268e feat: Exp 11 — parallel DummyVecEnv multi-track training (two sim instances) 2026-04-19 11:05:22 -04:00
Paul Huliganga 6e9546cd22 save: all experiment scripts moved from /tmp to agent/experiments/
Scripts in /tmp are lost on reboot and not reproducible.
All experiment scripts now committed to git with README.

Exp5 script was already gone (lost before this fix).
All others (Exp6-Exp10, overnight, wave5, etc.) now preserved.

Rule going forward: scripts saved to agent/experiments/ and committed
BEFORE running, not after.

Agent: pi
Tests: 102 passed
Tests-Added: 0
TypeScript: N/A
2026-04-18 21:30:08 -04:00