donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_phase4_log.txt

[2026-04-14 22:40:44] =================================================================
[2026-04-14 22:40:44] [Wave4] Multi-Track Autoresearch — GP+UCB Generalization Search
[2026-04-14 22:40:44] [Wave4] Training tracks : generated_track, mountain_track (no generated_road, no warm-start)
[2026-04-14 22:40:44] [Wave4] Test tracks : mini_monaco only (zero-shot; warren removed — broken done condition)
[2026-04-14 22:40:44] [Wave4] Max trials : 25 | kappa=2.0 | push every 5
[2026-04-14 22:40:44] [Wave4] Results file : /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase4.jsonl
[2026-04-14 22:40:44] [Wave4] Champion dir : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-champion
[2026-04-14 22:40:44] [Wave4] Warm start : NONE (training from scratch each trial)
[2026-04-14 22:40:44] =================================================================
[2026-04-14 22:40:44] [Wave4] Loaded 0 existing Phase 3 results.
[2026-04-14 22:40:44] [Wave4] No Wave 3 champion yet.
[2026-04-14 22:40:44] [Wave4] Starting from trial 1.
[2026-04-14 22:40:44] [Wave4] ========== Trial 1/25 ==========
[2026-04-14 22:40:44] [Wave4] Seed trial 1/2: using hardcoded params.
[2026-04-14 22:40:44] [Wave4] Proposed params: {'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-14 22:40:46] [Wave4] Launching trial 1: {'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-14 22:40:46] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 80000 --steps-per-switch 6000 --learning-rate 0.0003 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0001
[2026-04-14 22:44:24] =================================================================
[2026-04-14 22:44:24] [Wave4] Multi-Track Autoresearch — GP+UCB Generalization Search
[2026-04-14 22:44:24] [Wave4] Training tracks : generated_track, mountain_track (no generated_road, no warm-start)
[2026-04-14 22:44:24] [Wave4] Test tracks : mini_monaco only (zero-shot; warren removed — broken done condition)
[2026-04-14 22:44:24] [Wave4] Max trials : 25 | kappa=2.0 | push every 5
[2026-04-14 22:44:24] [Wave4] Results file : /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase4.jsonl
[2026-04-14 22:44:24] [Wave4] Champion dir : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-champion
[2026-04-14 22:44:24] [Wave4] Warm start : NONE (training from scratch each trial)
[2026-04-14 22:44:24] =================================================================
[2026-04-14 22:44:24] [Wave4] Loaded 0 existing Phase 3 results.
[2026-04-14 22:44:24] [Wave4] No Wave 3 champion yet.
[2026-04-14 22:44:24] [Wave4] Starting from trial 1.
[2026-04-14 22:44:24] [Wave4] ========== Trial 1/25 ==========
[2026-04-14 22:44:24] [Wave4] Seed trial 1/2: using hardcoded params.
[2026-04-14 22:44:24] [Wave4] Proposed params: {'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-14 22:44:26] [Wave4] Launching trial 1: {'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-14 22:44:26] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 80000 --steps-per-switch 6000 --learning-rate 0.0003 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0001
[2026-04-15 00:02:45] [Wave4] Trial 1 finished in 4699.3s, rc=0
[2026-04-15 00:02:45] [Wave4] Parsed: combined=45.6693 mini_monaco=45.6693
[2026-04-15 00:02:45] [Champion] 🏆 NEW BEST! Trial 1: score=45.67 (mini_monaco=45.7) params={'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 00:02:45] [Wave4] ===== Trial 1 Summary =====
[2026-04-15 00:02:45] GP data points : 1
[2026-04-15 00:02:45] Wave4 Champion: trial=1 score=45.67 params={'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 00:02:45] Top 5:
[2026-04-15 00:02:45] score=45.67 params={'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 00:02:47] [Wave4] ========== Trial 2/25 ==========
[2026-04-15 00:02:47] [Wave4] Seed trial 2/2: using hardcoded params.
[2026-04-15 00:02:47] [Wave4] Proposed params: {'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 00:02:49] [Wave4] Launching trial 2: {'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 00:02:49] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 80000 --steps-per-switch 6000 --learning-rate 0.001 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0002
[2026-04-15 01:21:38] [Wave4] Trial 2 finished in 4728.4s, rc=0
[2026-04-15 01:21:38] [Wave4] Parsed: combined=222.0731 mini_monaco=222.0731
[2026-04-15 01:21:38] [Champion] 🏆 NEW BEST! Trial 2: score=222.07 (mini_monaco=222.1) params={'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 01:21:38] [Wave4] ===== Trial 2 Summary =====
[2026-04-15 01:21:38] GP data points : 2
[2026-04-15 01:21:38] Wave4 Champion: trial=2 score=222.07 params={'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 01:21:38] Top 5:
[2026-04-15 01:21:38] score=222.07 params={'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 01:21:38] score=45.67 params={'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 01:21:40] [Wave4] ========== Trial 3/25 ==========
[2026-04-15 01:21:40] [Wave4] Only 2 results — using random proposal.
[2026-04-15 01:21:40] [Wave4] Proposed params: {'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 01:21:42] [Wave4] Launching trial 3: {'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 01:21:42] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 157743 --steps-per-switch 17499 --learning-rate 0.0006852550685205609 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0003
[2026-04-15 03:15:46] [Wave4] Trial 3 finished in 6843.7s, rc=0
[2026-04-15 03:15:46] [Wave4] Parsed: combined=1943.1038 mini_monaco=1943.1038
[2026-04-15 03:15:46] [Champion] 🏆 NEW BEST! Trial 3: score=1943.10 (mini_monaco=1943.1) params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 03:15:46] [Wave4] ===== Trial 3 Summary =====
[2026-04-15 03:15:46] GP data points : 3
[2026-04-15 03:15:46] Wave4 Champion: trial=3 score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 03:15:46] Top 5:
[2026-04-15 03:15:46] score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 03:15:46] score=222.07 params={'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 03:15:46] score=45.67 params={'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 03:15:48] [Wave4] ========== Trial 4/25 ==========
[2026-04-15 03:15:48] [Wave4] GP UCB top-5 proposals:
[2026-04-15 03:15:48] UCB=2.4560 mu=0.8788 σ=0.7886 params={'learning_rate': 0.0003250095463348546, 'steps_per_switch': 19054, 'total_timesteps': 197116}
[2026-04-15 03:15:48] UCB=2.4518 mu=0.8393 σ=0.8062 params={'learning_rate': 0.00121703003154963, 'steps_per_switch': 16951, 'total_timesteps': 180865}
[2026-04-15 03:15:48] UCB=2.4512 mu=0.7637 σ=0.8437 params={'learning_rate': 0.00036067077082995895, 'steps_per_switch': 16532, 'total_timesteps': 211219}
[2026-04-15 03:15:48] UCB=2.4501 mu=0.9283 σ=0.7609 params={'learning_rate': 0.0005325315186424085, 'steps_per_switch': 18992, 'total_timesteps': 205595}
[2026-04-15 03:15:48] UCB=2.4492 mu=0.9106 σ=0.7693 params={'learning_rate': 0.001163360064352729, 'steps_per_switch': 19652, 'total_timesteps': 151744}
[2026-04-15 03:15:48] [Wave4] Proposed params: {'learning_rate': 0.0003250095463348546, 'steps_per_switch': 19054, 'total_timesteps': 197116}
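Note on the proposal list above: the logged UCB values are consistent with the usual upper-confidence-bound rule, UCB = mu + kappa * σ, using the kappa=2.0 printed in the run banner. The short check below is only a sketch built from the numbers in this log. The function name `ucb` is illustrative and is not taken from the project's code, and the scale of mu and σ (which looks normalized relative to the raw scores) is not stated in the log.

```python
# Values copied from the trial 4 proposal list: (mu, sigma, logged UCB).
KAPPA = 2.0  # from the run banner: "kappa=2.0"

proposals = [
    (0.8788, 0.7886, 2.4560),
    (0.8393, 0.8062, 2.4518),
    (0.7637, 0.8437, 2.4512),
    (0.9283, 0.7609, 2.4501),
    (0.9106, 0.7693, 2.4492),
]


def ucb(mu, sigma, kappa=KAPPA):
    """Upper confidence bound: predicted mean plus kappa standard deviations."""
    return mu + kappa * sigma


# The logged values are rounded to 4 decimals, so allow a small tolerance.
matches = [abs(ucb(mu, sigma) - logged) < 5e-4 for mu, sigma, logged in proposals]
print(matches)
```

With kappa=2.0 the σ term dominates the ranking here, which fits the proposals clustering in regions the GP is least certain about.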
[2026-04-15 03:15:50] [Wave4] Launching trial 4: {'learning_rate': 0.0003250095463348546, 'steps_per_switch': 19054, 'total_timesteps': 197116}
[2026-04-15 03:15:50] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 197116 --steps-per-switch 19054 --learning-rate 0.0003250095463348546 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0004
[2026-04-15 05:15:51] [Wave4] Trial 4 TIMED OUT — killing runner.
[2026-04-15 05:15:51] [Wave4] Trial 4 finished in 7200.5s, rc=-9
[2026-04-15 05:15:51] [Wave4] Parsed: combined=None mini_monaco=None
[2026-04-15 05:15:51] [Wave4] ⚠️ No test score parsed — defaulting to 0.0
[2026-04-15 05:15:51] [Wave4] combined_test_score=0 — excluded from GP (crash/timeout).
[2026-04-15 05:15:51] [Wave4] ===== Trial 4 Summary =====
[2026-04-15 05:15:51] GP data points : 3
[2026-04-15 05:15:51] Wave4 Champion: trial=3 score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 05:15:51] Top 5:
[2026-04-15 05:15:51] score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 05:15:51] score=222.07 params={'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 05:15:51] score=45.67 params={'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
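Note on trial 4 above: it shows the timeout path (rc=-9, no score parsed, result excluded from the GP). As a hedged sketch, outcomes like this could be pulled out of a log in this format with two regular expressions. The helper name `summarize` and the tuple layout are assumptions made for illustration; they do not come from the project's code.

```python
import re

# Patterns matched against the message text seen in this log.
FINISHED = re.compile(r"Trial (\d+) finished in ([\d.]+)s, rc=(-?\d+)")
PARSED = re.compile(r"Parsed: combined=(\S+)")


def summarize(lines):
    """Return (trial, seconds, rc, score) tuples; score is None when unparsed."""
    results = []
    pending = None
    for line in lines:
        m = FINISHED.search(line)
        if m:
            pending = (int(m.group(1)), float(m.group(2)), int(m.group(3)))
            continue
        m = PARSED.search(line)
        if m and pending is not None:
            raw = m.group(1)
            score = None if raw == "None" else float(raw)
            results.append(pending + (score,))
            pending = None
    return results


# Sample lines copied from trials 3 and 4 of this log.
sample = [
    "[2026-04-15 03:15:46] [Wave4] Trial 3 finished in 6843.7s, rc=0",
    "[2026-04-15 03:15:46] [Wave4] Parsed: combined=1943.1038 mini_monaco=1943.1038",
    "[2026-04-15 05:15:51] [Wave4] Trial 4 finished in 7200.5s, rc=-9",
    "[2026-04-15 05:15:51] [Wave4] Parsed: combined=None mini_monaco=None",
]
rows = summarize(sample)
timed_out = [trial for trial, _, rc, score in rows if rc != 0 and score is None]
print(rows, timed_out)
```

Filtering on a nonzero rc together with a missing score separates killed runs from runs that finished with a low score.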
[2026-04-15 05:15:53] [Wave4] ========== Trial 5/25 ==========
[2026-04-15 05:15:53] [Wave4] GP UCB top-5 proposals:
[2026-04-15 05:15:53] UCB=2.4597 mu=0.8329 σ=0.8134 params={'learning_rate': 0.0003927960467617446, 'steps_per_switch': 19892, 'total_timesteps': 201785}
[2026-04-15 05:15:53] UCB=2.4568 mu=0.8585 σ=0.7991 params={'learning_rate': 0.0011330710879806035, 'steps_per_switch': 18089, 'total_timesteps': 193054}
[2026-04-15 05:15:53] UCB=2.4560 mu=0.7832 σ=0.8364 params={'learning_rate': 0.0006110661120319741, 'steps_per_switch': 17141, 'total_timesteps': 219583}
[2026-04-15 05:15:53] UCB=2.4560 mu=0.8338 σ=0.8111 params={'learning_rate': 0.000602366907571214, 'steps_per_switch': 16527, 'total_timesteps': 215069}
[2026-04-15 05:15:53] UCB=2.4522 mu=0.8120 σ=0.8201 params={'learning_rate': 0.0004035684210100053, 'steps_per_switch': 16067, 'total_timesteps': 208387}
[2026-04-15 05:15:53] [Wave4] Proposed params: {'learning_rate': 0.0003927960467617446, 'steps_per_switch': 19892, 'total_timesteps': 201785}
[2026-04-15 05:15:55] [Wave4] Launching trial 5: {'learning_rate': 0.0003927960467617446, 'steps_per_switch': 19892, 'total_timesteps': 201785}
[2026-04-15 05:15:55] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 201785 --steps-per-switch 19892 --learning-rate 0.0003927960467617446 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0005
[2026-04-15 07:15:57] [Wave4] Trial 5 TIMED OUT — killing runner.
[2026-04-15 07:15:57] [Wave4] Trial 5 finished in 7202.3s, rc=-9
[2026-04-15 07:15:57] [Wave4] Parsed: combined=None mini_monaco=None
[2026-04-15 07:15:57] [Wave4] ⚠️ No test score parsed — defaulting to 0.0
[2026-04-15 07:15:57] [Wave4] combined_test_score=0 — excluded from GP (crash/timeout).
[2026-04-15 07:15:57] [Wave4] ===== Trial 5 Summary =====
[2026-04-15 07:15:57] GP data points : 3
[2026-04-15 07:15:57] Wave4 Champion: trial=3 score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 07:15:57] Top 5:
[2026-04-15 07:15:57] score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 07:15:57] score=222.07 params={'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 07:15:57] score=45.67 params={'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 07:15:59] [Wave4] ✅ Git push complete after trial 5
[2026-04-15 07:16:01] [Wave4] ========== Trial 6/25 ==========
[2026-04-15 07:16:01] [Wave4] GP UCB top-5 proposals:
[2026-04-15 07:16:01] UCB=2.4565 mu=0.8712 σ=0.7926 params={'learning_rate': 0.0011062087200910864, 'steps_per_switch': 18318, 'total_timesteps': 194470}
[2026-04-15 07:16:01] UCB=2.4485 mu=0.9338 σ=0.7573 params={'learning_rate': 0.0004307107164246544, 'steps_per_switch': 19141, 'total_timesteps': 199878}
[2026-04-15 07:16:01] UCB=2.4478 mu=0.8840 σ=0.7819 params={'learning_rate': 0.00041215765557335777, 'steps_per_switch': 16229, 'total_timesteps': 203707}
[2026-04-15 07:16:01] UCB=2.4468 mu=0.8283 σ=0.8092 params={'learning_rate': 0.0009928039664024839, 'steps_per_switch': 19629, 'total_timesteps': 113788}
[2026-04-15 07:16:01] UCB=2.4456 mu=0.9298 σ=0.7579 params={'learning_rate': 0.0002412156295150517, 'steps_per_switch': 19116, 'total_timesteps': 179367}
[2026-04-15 07:16:01] [Wave4] Proposed params: {'learning_rate': 0.0011062087200910864, 'steps_per_switch': 18318, 'total_timesteps': 194470}
[2026-04-15 07:16:03] [Wave4] Launching trial 6: {'learning_rate': 0.0011062087200910864, 'steps_per_switch': 18318, 'total_timesteps': 194470}
[2026-04-15 07:16:03] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 194470 --steps-per-switch 18318 --learning-rate 0.0011062087200910864 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0006
[2026-04-15 09:06:26] =================================================================
[2026-04-15 09:06:26] [Wave4] Multi-Track Autoresearch — GP+UCB Generalization Search
[2026-04-15 09:06:26] [Wave4] Training tracks : generated_track, mountain_track (no generated_road, no warm-start)
[2026-04-15 09:06:26] [Wave4] Test tracks : mini_monaco only (zero-shot; warren removed — broken done condition)
[2026-04-15 09:06:26] [Wave4] Max trials : 25 | kappa=2.0 | push every 5
[2026-04-15 09:06:26] [Wave4] Results file : /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase4.jsonl
[2026-04-15 09:06:26] [Wave4] Champion dir : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-champion
[2026-04-15 09:06:26] [Wave4] Warm start : NONE (training from scratch each trial)
[2026-04-15 09:06:26] =================================================================
[2026-04-15 09:06:26] [Wave4] Loaded 0 existing Phase 3 results.
[2026-04-15 09:06:26] [Wave4] Wave4 Champion: trial=3 score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 09:06:26] [Wave4] Starting from trial 1.
[2026-04-15 09:06:26] [Wave4] ========== Trial 1/25 ==========
[2026-04-15 09:06:26] [Wave4] Seed trial 1/2: using hardcoded params.
[2026-04-15 09:06:26] [Wave4] Proposed params: {'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 09:06:28] [Wave4] Launching trial 1: {'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 09:06:28] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 80000 --steps-per-switch 6000 --learning-rate 0.0003 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0001
[2026-04-15 09:17:28] =================================================================
[2026-04-15 09:17:28] [Wave4] Multi-Track Autoresearch — GP+UCB Generalization Search
[2026-04-15 09:17:28] [Wave4] Training tracks : generated_track, mountain_track (no generated_road, no warm-start)
[2026-04-15 09:17:28] [Wave4] Test tracks : mini_monaco only (zero-shot; warren removed — broken done condition)
[2026-04-15 09:17:28] [Wave4] Max trials : 25 | kappa=2.0 | push every 5
[2026-04-15 09:17:28] [Wave4] Results file : /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase4.jsonl
[2026-04-15 09:17:28] [Wave4] Champion dir : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-champion
[2026-04-15 09:17:28] [Wave4] Warm start : NONE (training from scratch each trial)
[2026-04-15 09:17:28] =================================================================
[2026-04-15 09:17:28] [Wave4] Loaded 0 existing Phase 3 results.
[2026-04-15 09:17:28] [Wave4] Wave4 Champion: trial=3 score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 09:17:28] [Wave4] Starting from trial 1.
[2026-04-15 09:17:28] [Wave4] ========== Trial 1/25 ==========
[2026-04-15 09:17:28] [Wave4] Seed trial 1/2: using hardcoded params.
[2026-04-15 09:17:28] [Wave4] Proposed params: {'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 09:17:30] [Wave4] Launching trial 1: {'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 09:17:30] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 80000 --steps-per-switch 6000 --learning-rate 0.0003 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0001
[2026-04-15 10:33:27] [Wave4] Trial 1 finished in 4557.0s, rc=0
[2026-04-15 10:33:27] [Wave4] Parsed: combined=42.2964 mini_monaco=42.2964
[2026-04-15 10:33:27] [Wave4] ===== Trial 1 Summary =====
[2026-04-15 10:33:27] GP data points : 1
[2026-04-15 10:33:27] Wave4 Champion: trial=3 score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 10:33:27] Top 5:
[2026-04-15 10:33:27] score=42.30 params={'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 10:33:29] [Wave4] ========== Trial 2/25 ==========
[2026-04-15 10:33:29] [Wave4] Seed trial 2/2: using hardcoded params.
[2026-04-15 10:33:29] [Wave4] Proposed params: {'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 10:33:31] [Wave4] Launching trial 2: {'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 10:33:31] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 80000 --steps-per-switch 6000 --learning-rate 0.001 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0002
[2026-04-15 11:39:05] [Wave4] Trial 2 finished in 3934.0s, rc=0
[2026-04-15 11:39:05] [Wave4] Parsed: combined=93.3894 mini_monaco=93.3894
[2026-04-15 11:39:05] [Wave4] ===== Trial 2 Summary =====
[2026-04-15 11:39:05] GP data points : 2
[2026-04-15 11:39:05] Wave4 Champion: trial=3 score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 11:39:05] Top 5:
[2026-04-15 11:39:05] score=93.39 params={'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 11:39:05] score=42.30 params={'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 11:39:07] [Wave4] ========== Trial 3/25 ==========
[2026-04-15 11:39:07] [Wave4] Only 2 results — using random proposal.
[2026-04-15 11:39:07] [Wave4] Proposed params: {'learning_rate': 0.0008162408849407889, 'steps_per_switch': 8441, 'total_timesteps': 140634}
[2026-04-15 11:39:09] [Wave4] Launching trial 3: {'learning_rate': 0.0008162408849407889, 'steps_per_switch': 8441, 'total_timesteps': 140634}
[2026-04-15 11:39:09] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 140634 --steps-per-switch 8441 --learning-rate 0.0008162408849407889 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0003
[2026-04-15 13:39:32] [Wave4] Trial 3 TIMED OUT — killing runner.
[2026-04-15 13:39:32] [Wave4] Trial 3 finished in 7222.7s, rc=-9
[2026-04-15 13:39:32] [Wave4] Parsed: combined=None mini_monaco=None
[2026-04-15 13:39:32] [Wave4] ⚠️ No test score parsed — defaulting to 0.0
[2026-04-15 13:39:32] [Wave4] combined_test_score=0 — excluded from GP (crash/timeout).
[2026-04-15 13:39:32] [Wave4] ===== Trial 3 Summary =====
[2026-04-15 13:39:32] GP data points : 2
[2026-04-15 13:39:32] Wave4 Champion: trial=3 score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 13:39:32] Top 5:
[2026-04-15 13:39:32] score=93.39 params={'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 13:39:32] score=42.30 params={'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 13:39:34] [Wave4] ========== Trial 4/25 ==========
[2026-04-15 13:39:34] [Wave4] Only 2 results — using random proposal.
[2026-04-15 13:39:34] [Wave4] Proposed params: {'learning_rate': 0.00020853884350577402, 'steps_per_switch': 19927, 'total_timesteps': 138928}
[2026-04-15 13:39:36] [Wave4] Launching trial 4: {'learning_rate': 0.00020853884350577402, 'steps_per_switch': 19927, 'total_timesteps': 138928}
[2026-04-15 13:39:36] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 138928 --steps-per-switch 19927 --learning-rate 0.00020853884350577402 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0004
[2026-04-15 15:39:37] [Wave4] Trial 4 TIMED OUT — killing runner.
[2026-04-15 15:39:37] [Wave4] Trial 4 finished in 7200.7s, rc=-9
[2026-04-15 15:39:37] [Wave4] Parsed: combined=None mini_monaco=None
[2026-04-15 15:39:37] [Wave4] ⚠️ No test score parsed — defaulting to 0.0
[2026-04-15 15:39:37] [Wave4] combined_test_score=0 — excluded from GP (crash/timeout).
[2026-04-15 15:39:37] [Wave4] ===== Trial 4 Summary =====
[2026-04-15 15:39:37] GP data points : 2
[2026-04-15 15:39:37] Wave4 Champion: trial=3 score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 15:39:37] Top 5:
[2026-04-15 15:39:37] score=93.39 params={'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 15:39:37] score=42.30 params={'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 15:39:39] [Wave4] ========== Trial 5/25 ==========
[2026-04-15 15:39:39] [Wave4] Only 2 results — using random proposal.
[2026-04-15 15:39:39] [Wave4] Proposed params: {'learning_rate': 0.0007517877668650138, 'steps_per_switch': 9368, 'total_timesteps': 104878}
[2026-04-15 15:39:41] [Wave4] Launching trial 5: {'learning_rate': 0.0007517877668650138, 'steps_per_switch': 9368, 'total_timesteps': 104878}
[2026-04-15 15:39:41] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 104878 --steps-per-switch 9368 --learning-rate 0.0007517877668650138 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0005
[2026-04-15 17:08:50] [Wave4] Trial 5 finished in 5348.8s, rc=0
[2026-04-15 17:08:50] [Wave4] Parsed: combined=31.73 mini_monaco=31.73
[2026-04-15 17:08:50] [Wave4] ===== Trial 5 Summary =====
[2026-04-15 17:08:50] GP data points : 3
[2026-04-15 17:08:50] Wave4 Champion: trial=3 score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 17:08:50] Top 5:
[2026-04-15 17:08:50] score=93.39 params={'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 17:08:50] score=42.30 params={'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 17:08:50] score=31.73 params={'learning_rate': 0.0007517877668650138, 'steps_per_switch': 9368, 'total_timesteps': 104878}
[2026-04-15 17:08:51] [Wave4] ✅ Git push complete after trial 5
[2026-04-15 17:08:53] [Wave4] ========== Trial 6/25 ==========
[2026-04-15 17:08:53] [Wave4] GP UCB top-5 proposals:
[2026-04-15 17:08:53] UCB=2.9352 mu=1.3419 σ=0.7966 params={'learning_rate': 0.0016223486895735558, 'steps_per_switch': 5524, 'total_timesteps': 79532}
[2026-04-15 17:08:53] UCB=2.8442 mu=1.0827 σ=0.8808 params={'learning_rate': 0.0017357820530198068, 'steps_per_switch': 5009, 'total_timesteps': 87890}
[2026-04-15 17:08:53] UCB=2.8258 mu=1.0947 σ=0.8655 params={'learning_rate': 0.001668571948240882, 'steps_per_switch': 4814, 'total_timesteps': 101589}
[2026-04-15 17:08:53] UCB=2.8168 mu=0.9976 σ=0.9096 params={'learning_rate': 0.001788300003253932, 'steps_per_switch': 4215, 'total_timesteps': 80751}
[2026-04-15 17:08:53] UCB=2.8147 mu=1.5860 σ=0.6144 params={'learning_rate': 0.0012975326127189415, 'steps_per_switch': 3961, 'total_timesteps': 100527}
[2026-04-15 17:08:53] [Wave4] Proposed params: {'learning_rate': 0.0016223486895735558, 'steps_per_switch': 5524, 'total_timesteps': 79532}
[2026-04-15 17:08:55] [Wave4] Launching trial 6: {'learning_rate': 0.0016223486895735558, 'steps_per_switch': 5524, 'total_timesteps': 79532}
[2026-04-15 17:08:55] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 79532 --steps-per-switch 5524 --learning-rate 0.0016223486895735558 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0006
[2026-04-15 18:19:30] [Wave4] Trial 6 finished in 4235.3s, rc=0
[2026-04-15 18:19:30] [Wave4] Parsed: combined=176.6721 mini_monaco=176.6721
[2026-04-15 18:19:31] [Wave4] ===== Trial 6 Summary =====
[2026-04-15 18:19:31] GP data points : 4
[2026-04-15 18:19:31] Wave4 Champion: trial=3 score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 18:19:31] Top 5:
[2026-04-15 18:19:31] score=176.67 params={'learning_rate': 0.0016223486895735558, 'steps_per_switch': 5524, 'total_timesteps': 79532}
[2026-04-15 18:19:31] score=93.39 params={'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 18:19:31] score=42.30 params={'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 18:19:31] score=31.73 params={'learning_rate': 0.0007517877668650138, 'steps_per_switch': 9368, 'total_timesteps': 104878}
[2026-04-15 18:19:33] [Wave4] ========== Trial 7/25 ==========
[2026-04-15 18:19:33] [Wave4] GP UCB top-5 proposals:
[2026-04-15 18:19:33] UCB=2.5520 mu=1.1319 σ=0.7101 params={'learning_rate': 0.001779556625962812, 'steps_per_switch': 4226, 'total_timesteps': 123651}
[2026-04-15 18:19:33] UCB=2.5115 mu=0.7590 σ=0.8763 params={'learning_rate': 0.0017725367196782225, 'steps_per_switch': 3941, 'total_timesteps': 145375}
[2026-04-15 18:19:33] UCB=2.4988 mu=1.1176 σ=0.6906 params={'learning_rate': 0.0018568268138302447, 'steps_per_switch': 6910, 'total_timesteps': 119000}
[2026-04-15 18:19:33] UCB=2.4973 mu=0.8571 σ=0.8201 params={'learning_rate': 0.0019597767383017994, 'steps_per_switch': 9245, 'total_timesteps': 113699}
[2026-04-15 18:19:33] UCB=2.4874 mu=1.2498 σ=0.6188 params={'learning_rate': 0.001739834862935009, 'steps_per_switch': 4764, 'total_timesteps': 117423}
[2026-04-15 18:19:33] [Wave4] Proposed params: {'learning_rate': 0.001779556625962812, 'steps_per_switch': 4226, 'total_timesteps': 123651}
[2026-04-15 18:19:35] [Wave4] Launching trial 7: {'learning_rate': 0.001779556625962812, 'steps_per_switch': 4226, 'total_timesteps': 123651}
[2026-04-15 18:19:35] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 123651 --steps-per-switch 4226 --learning-rate 0.001779556625962812 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0007
[2026-04-15 20:19:38] [Wave4] Trial 7 TIMED OUT — killing runner.
[2026-04-15 20:19:38] [Wave4] Trial 7 finished in 7203.4s, rc=-9
[2026-04-15 20:19:38] [Wave4] Parsed: combined=None mini_monaco=None
[2026-04-15 20:19:38] [Wave4] ⚠️ No test score parsed — defaulting to 0.0
[2026-04-15 20:19:38] [Wave4] combined_test_score=0 — excluded from GP (crash/timeout).
[2026-04-15 20:19:38] [Wave4] ===== Trial 7 Summary =====
[2026-04-15 20:19:38] GP data points : 4
[2026-04-15 20:19:38] Wave4 Champion: trial=3 score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 20:19:38] Top 5:
[2026-04-15 20:19:38] score=176.67 params={'learning_rate': 0.0016223486895735558, 'steps_per_switch': 5524, 'total_timesteps': 79532}
[2026-04-15 20:19:38] score=93.39 params={'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 20:19:38] score=42.30 params={'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 20:19:38] score=31.73 params={'learning_rate': 0.0007517877668650138, 'steps_per_switch': 9368, 'total_timesteps': 104878}
[2026-04-15 20:19:40] [Wave4] ========== Trial 8/25 ==========
[2026-04-15 20:19:40] [Wave4] GP UCB top-5 proposals:
[2026-04-15 20:19:40] UCB=2.5481 mu=1.2081 σ=0.6700 params={'learning_rate': 0.0019521225364821895, 'steps_per_switch': 4185, 'total_timesteps': 110996}
[2026-04-15 20:19:40] UCB=2.5357 mu=0.8887 σ=0.8235 params={'learning_rate': 0.001901474046587741, 'steps_per_switch': 5568, 'total_timesteps': 136179}
[2026-04-15 20:19:40] UCB=2.5188 mu=1.1408 σ=0.6890 params={'learning_rate': 0.0018359364507444984, 'steps_per_switch': 5746, 'total_timesteps': 122356}
[2026-04-15 20:19:40] UCB=2.4994 mu=0.7140 σ=0.8927 params={'learning_rate': 0.0019730433529852004, 'steps_per_switch': 4578, 'total_timesteps': 144668}
[2026-04-15 20:19:40] UCB=2.4934 mu=1.4169 σ=0.5382 params={'learning_rate': 0.0018644207617691767, 'steps_per_switch': 3071, 'total_timesteps': 75515}
[2026-04-15 20:19:40] [Wave4] Proposed params: {'learning_rate': 0.0019521225364821895, 'steps_per_switch': 4185, 'total_timesteps': 110996}
[2026-04-15 20:19:42] [Wave4] Launching trial 8: {'learning_rate': 0.0019521225364821895, 'steps_per_switch': 4185, 'total_timesteps': 110996}
[2026-04-15 20:19:42] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 110996 --steps-per-switch 4185 --learning-rate 0.0019521225364821895 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0008
[2026-04-15 22:19:42] [Wave4] Trial 8 TIMED OUT — killing runner.
[2026-04-15 22:19:42] [Wave4] Trial 8 finished in 7200.2s, rc=-9
[2026-04-15 22:19:42] [Wave4] Parsed: combined=None mini_monaco=None
[2026-04-15 22:19:42] [Wave4] ⚠️ No test score parsed — defaulting to 0.0
[2026-04-15 22:19:42] [Wave4] combined_test_score=0 — excluded from GP (crash/timeout).
[2026-04-15 22:19:42] [Wave4] ===== Trial 8 Summary =====
[2026-04-15 22:19:42] GP data points : 4
[2026-04-15 22:19:42] Wave4 Champion: trial=3 score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 22:19:42] Top 5:
[2026-04-15 22:19:42] score=176.67 params={'learning_rate': 0.0016223486895735558, 'steps_per_switch': 5524, 'total_timesteps': 79532}
[2026-04-15 22:19:42] score=93.39 params={'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 22:19:42] score=42.30 params={'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 22:19:42] score=31.73 params={'learning_rate': 0.0007517877668650138, 'steps_per_switch': 9368, 'total_timesteps': 104878}
[2026-04-15 22:19:44]
[Wave4] ========== Trial 9/25 ==========
[2026-04-15 22:19:44] [Wave4] GP UCB top-5 proposals:
[2026-04-15 22:19:44] UCB=2.5432 mu=1.2960 σ=0.6236 params={'learning_rate': 0.001989433235306402, 'steps_per_switch': 3575, 'total_timesteps': 65457}
[2026-04-15 22:19:44] UCB=2.5396 mu=1.1027 σ=0.7185 params={'learning_rate': 0.001979671928972082, 'steps_per_switch': 8884, 'total_timesteps': 73116}
[2026-04-15 22:19:44] UCB=2.5350 mu=1.2006 σ=0.6672 params={'learning_rate': 0.0019228314496482347, 'steps_per_switch': 4960, 'total_timesteps': 115468}
[2026-04-15 22:19:44] UCB=2.5256 mu=1.1804 σ=0.6726 params={'learning_rate': 0.0016231147459723914, 'steps_per_switch': 3259, 'total_timesteps': 115708}
[2026-04-15 22:19:44] UCB=2.5201 mu=0.9834 σ=0.7683 params={'learning_rate': 0.0016588035055714473, 'steps_per_switch': 4314, 'total_timesteps': 131749}
[2026-04-15 22:19:44] [Wave4] Proposed params: {'learning_rate': 0.001989433235306402, 'steps_per_switch': 3575, 'total_timesteps': 65457}
[2026-04-15 22:19:46] [Wave4] Launching trial 9: {'learning_rate': 0.001989433235306402, 'steps_per_switch': 3575, 'total_timesteps': 65457}
[2026-04-15 22:19:46] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 65457 --steps-per-switch 3575 --learning-rate 0.001989433235306402 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0009
[2026-04-15 22:23:21] [Wave4] Trial 9 finished in 215.0s, rc=101
[2026-04-15 22:23:21] [Wave4] Parsed: combined=None mini_monaco=None
[2026-04-15 22:23:21] [Wave4] ⚠️ No test score parsed — defaulting to 0.0
[2026-04-15 22:23:21] [Wave4] combined_test_score=0 — excluded from GP (crash/timeout).
[2026-04-15 22:23:21] [Wave4] ===== Trial 9 Summary =====
[2026-04-15 22:23:21] GP data points : 4
[2026-04-15 22:23:21] Wave4 Champion: trial=3 score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 22:23:21] Top 5:
[2026-04-15 22:23:21] score=176.67 params={'learning_rate': 0.0016223486895735558, 'steps_per_switch': 5524, 'total_timesteps': 79532}
[2026-04-15 22:23:21] score=93.39 params={'learning_rate': 0.001, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 22:23:21] score=42.30 params={'learning_rate': 0.0003, 'steps_per_switch': 6000, 'total_timesteps': 80000}
[2026-04-15 22:23:21] score=31.73 params={'learning_rate': 0.0007517877668650138, 'steps_per_switch': 9368, 'total_timesteps': 104878}
[2026-04-15 22:23:23]
[Wave4] ========== Trial 10/25 ==========
[2026-04-15 22:23:23] [Wave4] GP UCB top-5 proposals:
[2026-04-15 22:23:23] UCB=2.5794 mu=0.9857 σ=0.7969 params={'learning_rate': 0.00192547022313727, 'steps_per_switch': 3237, 'total_timesteps': 124659}
[2026-04-15 22:23:23] UCB=2.5191 mu=1.3579 σ=0.5806 params={'learning_rate': 0.0019414376395480834, 'steps_per_switch': 3402, 'total_timesteps': 69220}
[2026-04-15 22:23:23] UCB=2.5097 mu=0.7258 σ=0.8919 params={'learning_rate': 0.0019051112417148412, 'steps_per_switch': 3607, 'total_timesteps': 144368}
[2026-04-15 22:23:23] UCB=2.4894 mu=1.2599 σ=0.6148 params={'learning_rate': 0.001905194185221269, 'steps_per_switch': 5874, 'total_timesteps': 111439}
[2026-04-15 22:23:23] UCB=2.4776 mu=1.1168 σ=0.6804 params={'learning_rate': 0.0017822503576577222, 'steps_per_switch': 6596, 'total_timesteps': 121681}
[2026-04-15 22:23:23] [Wave4] Proposed params: {'learning_rate': 0.00192547022313727, 'steps_per_switch': 3237, 'total_timesteps': 124659}
[2026-04-15 22:23:25] [Wave4] Launching trial 10: {'learning_rate': 0.00192547022313727, 'steps_per_switch': 3237, 'total_timesteps': 124659}
[2026-04-15 22:23:25] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 124659 --steps-per-switch 3237 --learning-rate 0.00192547022313727 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0010
[2026-04-15 22:26:54] =================================================================
[2026-04-15 22:26:54] [Wave4] Multi-Track Autoresearch — GP+UCB Generalization Search
[2026-04-15 22:26:54] [Wave4] Training tracks : generated_track, mountain_track (no generated_road, no warm-start)
[2026-04-15 22:26:54] [Wave4] Test tracks : mini_monaco only (zero-shot; warren removed — broken done condition)
[2026-04-15 22:26:54] [Wave4] Max trials : 25 | kappa=2.0 | push every 5
[2026-04-15 22:26:54] [Wave4] Results file : /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase4.jsonl
[2026-04-15 22:26:54] [Wave4] Champion dir : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-champion
[2026-04-15 22:26:54] [Wave4] Warm start : NONE (training from scratch each trial)
[2026-04-15 22:26:54] =================================================================
[2026-04-15 22:26:54] [Wave4] Loaded 5 existing Phase 3 results.
[2026-04-15 22:26:54] [Wave4] Wave4 Champion: trial=3 score=1943.10 params={'learning_rate': 0.0006852550685205609, 'steps_per_switch': 17499, 'total_timesteps': 157743}
[2026-04-15 22:26:54] [Wave4] Starting from trial 6.
[2026-04-15 22:26:54]
[Wave4] ========== Trial 6/25 ==========
[2026-04-15 22:26:54] [Wave4] GP UCB top-5 proposals:
[2026-04-15 22:26:54] UCB=2.8029 mu=1.3217 σ=0.7406 params={'learning_rate': 0.0009434282949002715, 'steps_per_switch': 14966, 'total_timesteps': 83094}
[2026-04-15 22:26:54] UCB=2.7637 mu=1.4556 σ=0.6540 params={'learning_rate': 0.001016649027182601, 'steps_per_switch': 14757, 'total_timesteps': 85809}
[2026-04-15 22:26:54] UCB=2.7344 mu=1.1173 σ=0.8085 params={'learning_rate': 0.000525489856531106, 'steps_per_switch': 14503, 'total_timesteps': 81150}
[2026-04-15 22:26:54] UCB=2.7210 mu=1.0163 σ=0.8523 params={'learning_rate': 0.000448503297396427, 'steps_per_switch': 14723, 'total_timesteps': 80477}
[2026-04-15 22:26:54] UCB=2.6726 mu=0.9116 σ=0.8805 params={'learning_rate': 0.0011227428004033503, 'steps_per_switch': 14832, 'total_timesteps': 81442}
[2026-04-15 22:26:54] [Wave4] Proposed params: {'learning_rate': 0.0009434282949002715, 'steps_per_switch': 14966, 'total_timesteps': 83094}
[2026-04-15 22:26:56] [Wave4] Launching trial 6: {'learning_rate': 0.0009434282949002715, 'steps_per_switch': 14966, 'total_timesteps': 83094}
[2026-04-15 22:26:56] [Wave4] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 83094 --steps-per-switch 14966 --learning-rate 0.0009434282949002715 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave4-trial-0006