donkeycar-rl-autoresearch/tests
Paul Huliganga 2d52bb4ffc fix(core): replace exploit bandaids with solid physics barriers + clean reward
Root cause: barriers were zero-thickness MeshCollider planes with no CCD on the
car. The car tunnelled through between frames. Every Python patch was trying to
catch in code what physics should enforce.

Unity (source only — build in progress):
- RoadBuilder.cs: CreateBarrier() now makes BoxCollider-per-segment with real 3D
  volume (barrierThickness=1.0m default) + half-thickness overlap at corners to
  seal gaps. CreateEndCap() seals open ends of non-looping tracks (generated_road).
- Car.cs: rb.collisionDetectionMode = Continuous in Awake() — prevents tunneling.
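
The tunneling failure and the two fixes above can be illustrated with a one-dimensional sketch in Python. This is not the Unity code: the function names, the point-sized car, and the 20 m/s speed are assumptions made for illustration. The 0.02 s step matches Unity's default fixed timestep, and the 1.0 m thickness is the barrierThickness default mentioned above.

```python
def discrete_hit(x0, v, dt, steps, wall_start, thickness):
    """Check for contact only at sampled positions (one check per physics step)."""
    x = x0
    for _ in range(steps):
        x += v * dt
        # A hit is registered only if the sampled point lies inside the wall.
        if wall_start <= x <= wall_start + thickness:
            return True
    return False


def swept_hit(x0, v, dt, steps, wall_start, thickness):
    """Check the whole path travelled during each step (continuous-style detection)."""
    x = x0
    for _ in range(steps):
        nxt = x + v * dt
        # The segment [x, nxt] overlaps the wall interval (car moves in +x).
        if x <= wall_start + thickness and nxt >= wall_start:
            return True
        x = nxt
    return False


# Hypothetical numbers: a car at 20 m/s with a 0.02 s step moves 0.4 m per step.
V, DT, STEPS, WALL = 20.0, 0.02, 10, 1.0

zero_thickness_discrete = discrete_hit(0.0, V, DT, STEPS, WALL, thickness=0.0)
thick_wall_discrete = discrete_hit(0.0, V, DT, STEPS, WALL, thickness=1.0)
zero_thickness_swept = swept_hit(0.0, V, DT, STEPS, WALL, thickness=0.0)
```

In this toy model the zero-thickness wall is stepped over by the sampled check, while either a 1.0 m wall or a swept check registers the hit. Thickness alone only helps while the per-step displacement stays below the wall thickness, which is one reason to combine it with continuous detection.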

Python:
- reward_wrapper.py v7: removed CTE-patience termination, high-CTE negative
  reward, solid_hit monitoring, low-speed/wedge detection. Kept: efficiency gate,
  no-progress (active_node) termination, lap exploit guard. Reward = speed×CTE_quality.
- exp23_generated_road_clean.py: single track, no warm-start, 200k steps, clean
  reward, MAX_EPISODE_SECONDS=120 as safety net only.
- tests: 17 tests covering clean reward properties.
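
As a rough illustration of the Reward = speed×CTE_quality shape, here is a minimal Python sketch. The commit message does not define CTE_quality, so the linear falloff, the max_cte parameter, and the names cte_quality and clean_reward are hypothetical; this is not the contents of reward_wrapper.py.

```python
def cte_quality(cte: float, max_cte: float) -> float:
    """Assumed quality term: 1.0 on the centerline, falling linearly to 0.0 at max_cte."""
    if max_cte <= 0:
        raise ValueError("max_cte must be positive")
    return max(0.0, 1.0 - abs(cte) / max_cte)


def clean_reward(speed: float, cte: float, max_cte: float = 2.0) -> float:
    """speed x CTE_quality; under these assumptions the result is never negative."""
    # Reverse motion (negative speed) is clamped to zero reward in this sketch.
    return max(0.0, speed) * cte_quality(cte, max_cte)
```

With these assumed definitions, clean_reward(2.0, 0.0) gives 2.0 on the centerline, clean_reward(2.0, 1.0) gives 1.0 halfway to max_cte, and any cross-track error at or beyond max_cte gives 0.0.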

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 15:56:00 -04:00
__init__.py feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
test_autoresearch_controller.py fix: reward v4 — full sim bypass kills circular driving at root 2026-04-13 20:56:32 -04:00
test_behavioral_wrappers.py feat: Phase 3 — behavioral control, enhanced evaluator, 53 tests 2026-04-14 09:28:43 -04:00
test_discretize_action.py feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
test_end_to_end.py Wave 4: scratch training on generated_track + mountain_track, zero-shot mini_monaco 2026-04-14 22:40:38 -04:00
test_reward_wrapper.py fix(core): replace exploit bandaids with solid physics barriers + clean reward 2026-05-05 15:56:00 -04:00
test_runner_integration.py feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
test_wave3.py fix: StuckTerminationWrapper — wall-clock timeout (12s) prevents 1min+ stuck episodes 2026-04-19 16:30:50 -04:00