Commit Graph

5 Commits

Author SHA1 Message Date
Paul Huliganga f9f6a09744 fix: StuckTerminationWrapper + deque import + 102 tests
StuckTerminationWrapper added to wrap_env stack (between ThrottleClamp
and SpeedReward):
- Terminates episode after stuck_steps=80 steps with <0.5m displacement
- Handles slow barrier contact that Unity hit detection misses
- Handles off-lap-line circles (efficiency→0 gave zero reward but no
  termination; now gives -1.0 after 80 steps = ~4s of non-progress)
- Wrapper stack: ThrottleClamp → StuckTermination → SpeedReward

Also: missing deque import in multitrack_runner.py caused NameError.

Phase 4 results cleared again (Trial 1 ran without StuckTermination).

Tests: 2 new stuck-termination tests, 102 total.

Agent: pi
Tests: 102 passed
Tests-Added: 2
TypeScript: N/A
2026-04-15 09:17:27 -04:00
Paul Huliganga 7534527722 Wave 4: scratch training on generated_track + mountain_track, zero-shot mini_monaco
Strategy change driven by Trial 1 data analysis:
- generated_road removed: too similar to generated_track, and Phase-2
  warm-start caused catastrophic forgetting (reward 2388→37 in one rotation)
- mountain_track mean reward was only 17 — model never converged there
- mini_monaco score 24.9 (37 steps) — model was outputting degenerate actions

Wave 4 approach:
- NO warm-start: fresh random weights every trial
- Train: generated_track + mountain_track (visually distinct backgrounds,
  both have road markings — forces model to learn general mark-following)
- Test (zero-shot): mini_monaco only (never seen during training)
- Wider LR search: [1e-4, 2e-3] (scratch model needs different range)
- Larger step budgets: 60k-250k total (fresh model needs more time)
- Seed params: lr=0.0003 and lr=0.001 (diverse from the start)

Files:
- multitrack_runner.py: 2 training tracks, no warm-start auto-detection
- wave4_controller.py: new Wave 4 GP+UCB controller
- tests updated: TRAINING_TRACKS assertion, seed param tests → wave4
- 96 tests passing

ADR-013 to follow.

Agent: pi
Tests: 96 passed
Tests-Added: 0
TypeScript: N/A
2026-04-14 22:40:38 -04:00
Paul Huliganga 7ed2456896 fix: remove Warren from test set — indoor carpet, broken done condition
Warren track surface is green carpet (not outdoor road), and the
episode-done condition (|CTE| > max_cte) does not fire when the car
crosses the INSIDE boundary.  Car can drive off-track and bump into
chairs indefinitely, making scores meaningless as a test metric.

Changes:
- multitrack_runner.py: TEST_TRACKS now mini_monaco only
- wave3_controller.py: drop warren_reward from parse/save/champion paths
- tests/test_wave3.py: update assertions to match single test track
- All 83 tests pass

Track classification (final):
  TRAIN : generated_road, generated_track, mountain_track
  TEST  : mini_monaco (outdoor, proper road, correct done condition)
  SKIP  : warren, warehouse, robo_racing_league, waveshare, circuit_launch
  SKIP  : avc_sparkfun (orange markings)

ADR-010 to be updated.

Agent: pi
Tests: 83 passed
Tests-Added: 0
TypeScript: N/A
2026-04-14 13:47:28 -04:00
Paul Huliganga 86657a26b8 wave3: fix track-switch bug (viewer not raw socket) + shorten trial budgets
Bug: send_exit_scene_raw() opened a NEW TCP connection, creating a second
phantom vehicle. The sim sent exit_scene to the phantom, leaving the real
training connection stuck on generated_road for the entire run.

Fix: _send_exit_scene() now calls env.unwrapped.viewer.exit_scene() on the
EXISTING TCP connection that the training env already holds. This is the
only reliable way to switch scenes mid-session (matches track_switcher.py).

Also:
- Removed send_exit_scene_raw() import from multitrack_runner.py
- Simplified initial connection (no spurious exit_scene at startup)
- Reduced search space: total_timesteps 80k-400k -> 30k-150k
- Reduced seed params: 150k/300k -> 45k/90k (~35-45 min per trial)
- Added test: test_close_and_switch_uses_viewer_not_raw_socket

83 tests passing

Agent: pi
Tests: 83 passed
Tests-Added: 1
TypeScript: N/A
2026-04-14 13:29:49 -04:00
Paul Huliganga 4ca5304a71 wave3: add multi-track autoresearch system (83 tests passing)
New files:
- agent/multitrack_runner.py: trains PPO round-robin across generated_road,
  generated_track, mountain_track; zero-shot evaluates on mini_monaco + warren
- agent/wave3_controller.py: GP+UCB outer loop optimising combined test score
- tests/test_wave3.py: 30 new tests (83 total)

Track classification (from visual analysis of all 10 screenshots):
  Training  : generated_road, generated_track, mountain_track
  Test (ZSL): mini_monaco, warren (pseudo-outdoor — proper road markings)
  Skip      : warehouse, robo_racing_league, waveshare, circuit_launch (indoor floor)
              avc_sparkfun (orange markings — different visual domain)

Key design decisions:
  ADR-010: Warren = pseudo-outdoor track (proper road lines, not floor marks)
  ADR-011: Test tracks NEVER used in training; GP optimises test score only
  ADR-012: All trials warm-start from Phase 2 champion model
  Switching: env.close() + send_exit_scene_raw() + 4s wait + gym.make()

Pre-Wave-3 baseline: 1/10 tracks drivable (0/2 held-out test tracks)
Wave 3 goal: 2/2 test tracks drivable (mini_monaco + warren)

Agent: pi
Tests: 83 passed
Tests-Added: 30
TypeScript: N/A
2026-04-14 12:47:12 -04:00