Commit Graph

8 Commits

Author SHA1 Message Date
Paul Huliganga 147198e681 docs(handoff): update for exp27 running state, document all session fixes
Records exp27 running status (PID 1094759, self-intersection fix deployed),
full results table through 280k steps, and all critical fixes from this
session: regen_road null-TrainingManager bug, MapOverlay minimap refresh,
BrakeOnUpdateCallback, PathManager self-intersection retry loop, and the
DLL rsync source gotcha (Builds/ not Library/ScriptAssemblies/).

Also includes supplementary exp23 log files from the same session.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 15:32:49 -04:00
Paul Huliganga bb889ab4a1 docs(handoff): exp24 complete, exp25 running with wheel fix
exp24 final: 19/19 full episodes, best 365.5r@170k, 3-road mean 346.5r.
exp25 launched on patched sim (wheel OverlapSphere), log file confirmed writing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 22:13:57 -04:00
Paul Huliganga f784fdebd1 feat(exp25): wheel OverlapSphere collision fix + auto-transition
Car.cs (sdsandbox): per-wheel OverlapSphereNonAlloc in FixedUpdate catches barrier
contact from any angle and at any throttle; the forward raycast only covered nose-first contact.
Built, rsync'd, sim restart pending exp24 completion.
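
The per-wheel overlap idea above can be sketched outside Unity as a sphere-versus-box test. This is a Python stand-in for illustration only, not the Car.cs code: the axis-aligned barrier box, the wheel positions, and the function names are all assumptions.

```python
from typing import Iterable, Tuple

Vec3 = Tuple[float, float, float]


def sphere_overlaps_box(center: Vec3, radius: float, box_min: Vec3, box_max: Vec3) -> bool:
    """Return True if a sphere overlaps an axis-aligned box."""
    # Closest point on the box to the sphere centre, found by clamping per axis.
    closest = tuple(min(max(c, lo), hi) for c, lo, hi in zip(center, box_min, box_max))
    dist_sq = sum((c - p) ** 2 for c, p in zip(center, closest))
    return dist_sq <= radius * radius


def any_wheel_touching(wheels: Iterable[Vec3], radius: float, box_min: Vec3, box_max: Vec3) -> bool:
    """Check every wheel, so contact is detected regardless of approach angle."""
    return any(sphere_overlaps_box(w, radius, box_min, box_max) for w in wheels)


# Hypothetical layout: a barrier slab to the right of the car, four wheel centres.
barrier_min, barrier_max = (2.0, 0.0, -5.0), (3.0, 1.0, 5.0)
wheels = [(0.0, 0.3, 1.0), (1.8, 0.3, 1.0), (0.0, 0.3, -1.0), (1.8, 0.3, -1.0)]
side_contact = any_wheel_touching(wheels, 0.35, barrier_min, barrier_max)
```

A single forward ray from the nose would miss this sideways contact, while the per-wheel test reports it.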

exp25 script: identical to exp24 params, fresh weights, patched Unity binary.
Auto-transition monitor armed: kills sim, restarts with new binary, launches exp25
when exp24 finishes (~22:00 EDT).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 20:28:31 -04:00
Paul Huliganga c6a18e7fee chore(exp24): launch exp24, fix logging setup, update handoff
- Exp23 complete (mean 2000 steps / 408.6r, high variance confirms nose-first stuck issue)
- Unity 6000.4.4f1 rebuild done: Assembly-CSharp.dll updated with Car.cs raycast fix
- Rsync'd to runtime folder, sim restarted on port 9091
- Exp24 launched (PID 733053) — discrete(7), speed stuck, road regen
- Fix logging.basicConfig no-op: use file_log.addHandler() directly
- Monitor via /tmp/exp24.out (log file was 0 bytes with old approach)
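
The `logging.basicConfig` no-op noted above is standard-library behaviour: the call does nothing when the root logger already has handlers (unless `force=True` is passed, Python 3.8+). A minimal sketch of attaching the handler directly; the logger name, log path, and format string are hypothetical, since the real `file_log` setup is not shown in this commit.

```python
import logging
import os
import tempfile


def attach_file_handler(logger_name: str, path: str) -> logging.Logger:
    """Attach a FileHandler directly instead of relying on basicConfig()."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


# Hypothetical names, for illustration only.
log_path = os.path.join(tempfile.mkdtemp(), "exp_demo.log")
file_log = attach_file_handler("exp_demo", log_path)
file_log.info("training started")
```

Because the handler is attached to a named logger, it works even if another library has already configured the root logger.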

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 20:12:36 -04:00
Paul Huliganga 78d81827b7 docs(handoff): update SESSION_HANDOFF for exp24 readiness
Document WheelCollider root cause, Car.cs raycast fix requirement, road
generation behavior, exp24 launch steps, and updated success criteria.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 17:59:32 -04:00
Paul Huliganga 75f7857250 chore(exp23): launched — clean barriers verified, training started
Exp23 running as PID 647921 on generated_road:9091.
- Barriers visually confirmed by Paul (3D box barriers, both sides, end caps visible)
- Unity build synced to runtime folders
- Fresh PPO, 200k steps, v7 clean reward

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 16:04:21 -04:00
Paul Huliganga 2d52bb4ffc fix(core): replace exploit bandaids with solid physics barriers + clean reward
Root cause: barriers were zero-thickness MeshCollider planes with no CCD on the
car. The car tunnelled through between frames. Every Python patch was trying to
catch in code what physics should enforce.
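
The tunnelling argument above can be checked with rough arithmetic. Under discrete collision detection a body is only tested at its sampled positions, so a point moving farther per physics step than the barrier is thick can land on the far side without ever overlapping it. The sketch below is a simplified 1D point model; the 0.02 s step is Unity's default fixed timestep, and the helper name is an assumption rather than anything in this repository.

```python
def max_safe_speed(barrier_thickness_m: float, fixed_dt_s: float) -> float:
    """Highest speed (m/s) at which a sampled point cannot skip over the barrier.

    Simplified 1D model: the per-step displacement must not exceed the thickness.
    """
    return barrier_thickness_m / fixed_dt_s


DT = 0.02  # Unity's default fixed timestep, in seconds

plane_limit = max_safe_speed(0.0, DT)  # zero-thickness plane: no safe speed at all
box_limit = max_safe_speed(1.0, DT)    # 1.0 m box barrier: about 50 m/s
```

In this model a zero-thickness plane offers no margin at any speed, which is consistent with giving the barriers real volume and enabling continuous collision detection on the car.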

Unity (source only — build in progress):
- RoadBuilder.cs: CreateBarrier() now makes BoxCollider-per-segment with real 3D
  volume (barrierThickness=1.0m default) + half-thickness overlap at corners to
  seal gaps. CreateEndCap() seals open ends of non-looping tracks (generated_road).
- Car.cs: rb.collisionDetectionMode = Continuous in Awake() to prevent tunnelling.

Python:
- reward_wrapper.py v7: removed CTE-patience termination, high-CTE negative
  reward, solid_hit monitoring, low-speed/wedge detection. Kept: efficiency gate,
  no-progress (active_node) termination, lap exploit guard. Reward = speed×CTE_quality.
- exp23_generated_road_clean.py: single track, no warm-start, 200k steps, clean
  reward, MAX_EPISODE_SECONDS=120 as safety net only.
- tests: 17 tests covering clean reward properties.
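
The "Reward = speed×CTE_quality" rule above can be sketched as follows. The commit does not define `CTE_quality`, so the linear falloff, the `max_cte` default, and the clamping of negative speed are assumptions made for illustration, not the contents of reward_wrapper.py.

```python
def cte_quality(cte: float, max_cte: float) -> float:
    """Assumed shaping: 1.0 on the centre line, falling linearly to 0.0 at max_cte."""
    return max(0.0, 1.0 - abs(cte) / max_cte)


def clean_reward(speed: float, cte: float, max_cte: float = 2.0) -> float:
    """Reward = speed x CTE_quality; reversing earns nothing in this sketch."""
    return max(0.0, speed) * cte_quality(cte, max_cte)


centred = clean_reward(5.0, 0.0)   # full credit for speed on the centre line
halfway = clean_reward(5.0, 1.0)   # half credit halfway to max_cte
off_road = clean_reward(5.0, 2.5)  # no credit beyond max_cte
```

A multiplicative form means speed is only rewarded while the car stays near the centre line, so there is nothing to gain from fast driving far off the road.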

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 15:56:00 -04:00
Paul Huliganga 138c65270f feat(exp22): add solid-hit/wedge/high-CTE exploit fixes and generated-pair warm experiments
- reward_wrapper: detect barrier/wall/tree solid hits, terminate on head-on impact
  or 4 sustained solid-hit frames; prevents car wedging against invisible barriers
- reward_wrapper: add low-speed/wedge termination — kills episode when car is pinned
  motionless (below threshold, no displacement) after grace period
- reward_wrapper: high-CTE exploit fix — return -0.25 immediately when CTE >
  max_cte_terminate (not after patience), so PPO cannot collect positive speed
  rewards while driving the large outside-road circle
- tests: 23 passing unit tests covering all new termination paths
- exp20/21/22: add parallel DummyVecEnv experiments on generated_road+generated_track
  with warm-start from champion model; exp22 is current active run
- SESSION_HANDOFF.md: live handoff doc for next session continuity
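
The termination rules listed above can be sketched as a small per-step guard. The -0.25 penalty and the 4-frame threshold come from the commit message; the class name, the method signature, and the choice not to terminate on high CTE in this sketch are assumptions (the patience-based termination is omitted).

```python
from dataclasses import dataclass
from typing import Tuple

HIGH_CTE_PENALTY = -0.25  # value stated in the commit message
SOLID_HIT_FRAMES = 4      # value stated in the commit message


@dataclass
class ExploitGuard:
    """Hypothetical per-episode guard; not the actual reward_wrapper code."""

    max_cte_terminate: float
    solid_hit_streak: int = 0

    def step(self, base_reward: float, cte: float, solid_hit: bool,
             head_on: bool = False) -> Tuple[float, bool]:
        """Return (reward, terminated) for one environment step."""
        # Count consecutive solid-hit frames; any clean frame resets the streak.
        self.solid_hit_streak = self.solid_hit_streak + 1 if solid_hit else 0
        terminated = (solid_hit and head_on) or self.solid_hit_streak >= SOLID_HIT_FRAMES
        # Penalise immediately when off-road, instead of waiting for patience to expire.
        reward = HIGH_CTE_PENALTY if abs(cte) > self.max_cte_terminate else base_reward
        return reward, terminated


guard = ExploitGuard(max_cte_terminate=3.0)
off_road_step = guard.step(base_reward=1.0, cte=5.0, solid_hit=False)
```

Replacing the reward on the very first off-road step removes the window in which positive speed reward could still be collected.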

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 14:46:13 -04:00