Root cause: barriers were zero-thickness MeshCollider planes, and the car's
Rigidbody had no continuous collision detection (CCD), so the car tunnelled
through them between physics steps. Every Python patch was trying to catch in
code what the physics engine should enforce.
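The tunnelling argument can be checked with rough arithmetic. The sketch below
is a simplified point model, not the project's code: it assumes Unity's default
fixed timestep of 0.02 s and treats the car as a point, and the function names
are hypothetical.

```python
def travel_per_step(speed_mps: float, fixed_dt: float = 0.02) -> float:
    """Distance covered in one physics step (fixed_dt is an assumed value)."""
    return speed_mps * fixed_dt


def can_skip_barrier(speed_mps: float, barrier_thickness: float,
                     fixed_dt: float = 0.02) -> bool:
    """True if a point moving at speed_mps can jump clean over the barrier.

    With discrete collision detection, positions are only sampled at step
    boundaries, so a barrier thinner than one step of travel can be missed.
    """
    return travel_per_step(speed_mps, fixed_dt) > barrier_thickness


# A zero-thickness plane can be skipped at any non-zero speed; a 1.0 m box
# cannot be skipped at 20 m/s under these assumptions.
plane_skipped = can_skip_barrier(20.0, 0.0)
box_skipped = can_skip_barrier(20.0, 1.0)
```

In this model a 1.0 m barrier is only skipped above 50 m/s, which is why real
volume plus CCD is the more robust fix than per-frame checks in Python.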
Unity (source only — build in progress):
- RoadBuilder.cs: CreateBarrier() now creates one BoxCollider per segment with
real 3D volume (barrierThickness defaults to 1.0 m), plus a half-thickness
overlap at corners to seal gaps. CreateEndCap() seals the open ends of
non-looping tracks (generated_road).
- Car.cs: sets rb.collisionDetectionMode = CollisionDetectionMode.Continuous in
Awake() to prevent tunnelling.
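The per-segment box layout can be illustrated in Python. This is a sketch of
one possible reading of "half-thickness overlap", not the RoadBuilder.cs
implementation: it assumes each box is extended by half the thickness at both
ends, works in the XZ plane, and uses a hypothetical helper name.

```python
import math


def barrier_boxes(points, thickness=1.0):
    """Return (center_x, center_z, length, yaw_degrees) per polyline segment.

    Assumption: each box is lengthened by thickness / 2 at both ends so that
    neighbouring boxes overlap at corners instead of leaving a wedge gap.
    """
    boxes = []
    for (x0, z0), (x1, z1) in zip(points, points[1:]):
        dx, dz = x1 - x0, z1 - z0
        seg_len = math.hypot(dx, dz)
        if seg_len == 0.0:
            continue  # skip duplicate points, they have no direction
        length = seg_len + thickness  # half a thickness added at each end
        center_x, center_z = (x0 + x1) / 2, (z0 + z1) / 2
        # Yaw about the Y axis, with 0 degrees pointing along +Z (assumed).
        yaw = math.degrees(math.atan2(dx, dz))
        boxes.append((center_x, center_z, length, yaw))
    return boxes


# An L-shaped barrier: 10 m along +Z, then 10 m along +X.
corner = barrier_boxes([(0, 0), (0, 10), (10, 10)], thickness=1.0)
```

Both boxes come out 11 m long, so each reaches 0.5 m past the shared corner
point and the outside of the corner is covered.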
Python:
- reward_wrapper.py v7: removed cross-track-error (CTE) patience termination,
high-CTE negative reward, solid_hit monitoring, and low-speed/wedge detection.
Kept: efficiency gate, no-progress (active_node) termination, lap exploit
guard. Reward = speed×CTE_quality.
- exp23_generated_road_clean.py: single track, no warm-start, 200k steps, clean
reward, MAX_EPISODE_SECONDS=120 as safety net only.
- tests: 17 tests covering clean reward properties.
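A minimal sketch of the speed×CTE_quality shape, for illustration only. The
commit does not define CTE_quality, so the linear falloff, the normalisation,
and the max_cte and max_speed defaults below are all assumptions rather than
the values used in reward_wrapper.py.

```python
def cte_quality(cte: float, max_cte: float) -> float:
    """Assumed form: 1.0 on the centre line, falling linearly to 0.0 at max_cte."""
    return max(0.0, 1.0 - abs(cte) / max_cte)


def clean_reward(speed: float, cte: float,
                 max_cte: float = 2.0, max_speed: float = 10.0) -> float:
    """Reward = normalised speed x CTE quality (hypothetical parameters)."""
    speed_term = min(max(speed, 0.0), max_speed) / max_speed
    return speed_term * cte_quality(cte, max_cte)


# Half speed at half the allowed cross-track error.
example = clean_reward(speed=5.0, cte=1.0)
```

Because the two factors multiply, the reward is zero whenever the car is
stationary or at the CTE limit, so no separate negative penalty is needed for
those cases in this sketch.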
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- reward_wrapper: detect barrier/wall/tree solid hits, terminate on head-on impact
or 4 sustained solid-hit frames; prevents car wedging against invisible barriers
- reward_wrapper: add low-speed/wedge termination, which ends the episode when
the car is pinned motionless (speed below threshold, no displacement) after a
grace period
- reward_wrapper: high-CTE exploit fix: return -0.25 immediately when CTE >
max_cte_terminate (not after patience), so PPO cannot collect positive speed
rewards while driving a large circle outside the road
- tests: 23 passing unit tests covering all new termination paths
- exp20/21/22: add parallel DummyVecEnv experiments on generated_road+generated_track
with warm-start from champion model; exp22 is current active run
- SESSION_HANDOFF.md: live handoff doc for next session continuity
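The solid-hit rule (terminate on a head-on impact or on 4 sustained hit frames)
can be sketched as a small counter. The class name, the closing_speed input,
and the head-on threshold are hypothetical; only the 4-frame figure comes from
this commit.

```python
from dataclasses import dataclass


@dataclass
class SolidHitMonitor:
    max_sustained_frames: int = 4   # from the commit message
    head_on_speed: float = 2.0      # assumed threshold, m/s
    _streak: int = 0                # consecutive frames in contact

    def update(self, hit: bool, closing_speed: float = 0.0) -> bool:
        """Return True when the episode should terminate."""
        if not hit:
            self._streak = 0        # contact broken, reset the counter
            return False
        if closing_speed >= self.head_on_speed:
            return True             # head-on impact ends the episode at once
        self._streak += 1
        return self._streak >= self.max_sustained_frames


monitor = SolidHitMonitor()
sustained = [monitor.update(True) for _ in range(4)]
```

Resetting the streak on any contact-free frame means a glancing touch does not
accumulate toward termination; only continuous contact does.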
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>