donkeycar-rl-autoresearch

Paul Huliganga 86657a26b8 wave3: fix track-switch bug (viewer not raw socket) + shorten trial budgets	2026-04-14 13:29:49 -04:00

Bug: send_exit_scene_raw() opened a NEW TCP connection, creating a second phantom vehicle. The sim sent exit_scene to the phantom, leaving the real training connection stuck on generated_road for the entire run.

Fix: _send_exit_scene() now calls env.unwrapped.viewer.exit_scene() on the EXISTING TCP connection that the training env already holds. This is the only reliable way to switch scenes mid-session (matches track_switcher.py).

Also:
- Removed send_exit_scene_raw() import from multitrack_runner.py
- Simplified initial connection (no spurious exit_scene at startup)
- Reduced search space: total_timesteps 80k-400k -> 30k-150k
- Reduced seed params: 150k/300k -> 45k/90k (~35-45 min per trial)
- Added test: test_close_and_switch_uses_viewer_not_raw_socket

83 tests passing

Agent: pi
Tests: 83 passed
Tests-Added: 1
TypeScript: N/A
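
The fix described in this commit can be illustrated with a minimal sketch. It is not the repository's code: FakeViewer, FakeEnv, FakeWrapper and send_exit_scene are hypothetical stand-ins, and the only detail taken from the commit message is that the scene exit goes through env.unwrapped.viewer.exit_scene() on the connection the training env already holds, rather than through a newly opened socket.

```python
class FakeViewer:
    """Hypothetical stand-in for the sim client owned by the training env."""

    def __init__(self):
        self.exit_scene_calls = 0

    def exit_scene(self):
        # The real viewer would send exit_scene over its existing TCP connection.
        self.exit_scene_calls += 1


class FakeEnv:
    """Hypothetical base env that holds the one and only viewer/connection."""

    def __init__(self):
        self.viewer = FakeViewer()

    @property
    def unwrapped(self):
        return self


class FakeWrapper:
    """Hypothetical wrapper layer, mimicking how wrappers hide the base env."""

    def __init__(self, env):
        self.env = env

    @property
    def unwrapped(self):
        return self.env.unwrapped


def send_exit_scene(env):
    """Ask the sim to leave the current scene using the env's own viewer.

    No new connection is opened here, so no second (phantom) vehicle
    can appear on the simulator side.
    """
    viewer = getattr(env.unwrapped, "viewer", None)
    if viewer is None:
        raise RuntimeError("env has no viewer; cannot switch scenes")
    viewer.exit_scene()


# Usage: the call reaches the base env's viewer even through a wrapper.
env = FakeWrapper(FakeEnv())
send_exit_scene(env)
```

Going through unwrapped matters in this sketch because a wrapper does not necessarily forward the viewer attribute; the base env is the object that owns the connection.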
model-000	Initial commit: stable RL sweep runner, legacy and new scripts, full docs included	2026-04-12 22:57:50 -04:00
model-001	Initial commit: stable RL sweep runner, legacy and new scripts, full docs included	2026-04-12 22:57:50 -04:00
model-002	Initial commit: stable RL sweep runner, legacy and new scripts, full docs included	2026-04-12 22:57:50 -04:00
model-003	Initial commit: stable RL sweep runner, legacy and new scripts, full docs included	2026-04-12 22:57:50 -04:00
autoresearch_log.txt	AUTORESEARCH: 300 total trials complete - best mean_reward=141.85 at n_steer=8, n_throttle=5, lr=0.00202	2026-04-13 01:56:06 -04:00
autoresearch_phase1_log.txt	milestone: Phase 1 complete — genuine driving confirmed; launch Phase 2 corner learning	2026-04-13 19:33:06 -04:00
autoresearch_phase1_log_CORRUPTED_circular_driving.txt	fix: path-efficiency reward (v3) defeats circular driving exploit	2026-04-13 13:36:17 -04:00
autoresearch_phase1_log_CORRUPTED_reward_hacking.txt	fix: hack-proof reward shaping + reward hacking detection + research log	2026-04-13 12:27:48 -04:00
autoresearch_phase2_log.txt	wave3: fix track-switch bug (viewer not raw socket) + shorten trial budgets	2026-04-14 13:29:49 -04:00
autoresearch_phase3_log.txt	wave3: fix track-switch bug (viewer not raw socket) + shorten trial budgets	2026-04-14 13:29:49 -04:00
autoresearch_results.jsonl	AUTORESEARCH: 300 total trials complete - best mean_reward=141.85 at n_steer=8, n_throttle=5, lr=0.00202	2026-04-13 01:56:06 -04:00
autoresearch_results_phase1.jsonl	autoresearch: phase1 trial 50 results	2026-04-13 19:17:56 -04:00
autoresearch_results_phase1_CORRUPTED_circular_driving.jsonl	fix: path-efficiency reward (v3) defeats circular driving exploit	2026-04-13 13:36:17 -04:00
autoresearch_results_phase1_CORRUPTED_reward_hacking.jsonl	fix: hack-proof reward shaping + reward hacking detection + research log	2026-04-13 12:27:48 -04:00
autoresearch_results_phase2.jsonl	autoresearch: phase1 trial 20 results	2026-04-14 04:35:45 -04:00
clean_sweep_results.jsonl	AUTORESEARCH: Full Karpathy-style GP+UCB meta-controller, clean base data, fixed all paths, ready to run	2026-04-13 00:52:00 -04:00
eval_summary.jsonl	fix: track switching via unwrapped viewer.exit_scene() — automatic scene changes work	2026-04-14 10:04:15 -04:00
multitrack_results.jsonl	results: complete multi-track generalization baseline — 1/10 tracks drivable pre-Wave3	2026-04-14 11:31:08 -04:00
nohup_outerloop.log	Initial commit: stable RL sweep runner, legacy and new scripts, full docs included	2026-04-12 22:57:50 -04:00
outer_monitor.log	Initial commit: stable RL sweep runner, legacy and new scripts, full docs included	2026-04-12 22:57:50 -04:00
sweep_results.jsonl	Initial commit: stable RL sweep runner, legacy and new scripts, full docs included	2026-04-12 22:57:50 -04:00