donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_phase3_log.txt


[2026-04-14 12:44:38] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 12:44:38] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 12:44:38] [Wave3] Only 0 results — using random proposal.
[2026-04-14 12:44:38] [Champion] 🏆 NEW BEST! Trial 3: combined=1500.00 (mini_monaco=900.0, warren=600.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 12:44:38] [Champion] 🏆 NEW BEST! Trial 1: combined=2000.00 (mini_monaco=1200.0, warren=800.0) params={}
[2026-04-14 12:45:00] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 12:45:00] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 12:45:00] [Wave3] Only 0 results — using random proposal.
[2026-04-14 12:45:00] [Champion] 🏆 NEW BEST! Trial 3: combined=1500.00 (mini_monaco=900.0, warren=600.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 12:45:00] [Champion] 🏆 NEW BEST! Trial 1: combined=2000.00 (mini_monaco=1200.0, warren=800.0) params={}
[2026-04-14 12:45:27] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 12:45:27] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 12:45:27] [Wave3] Only 0 results — using random proposal.
[2026-04-14 12:45:27] [Champion] 🏆 NEW BEST! Trial 3: combined=1500.00 (mini_monaco=900.0, warren=600.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 12:45:27] [Champion] 🏆 NEW BEST! Trial 1: combined=2000.00 (mini_monaco=1200.0, warren=800.0) params={}
[2026-04-14 12:45:39] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 12:45:39] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 12:45:39] [Wave3] Only 0 results — using random proposal.
[2026-04-14 12:45:39] [Champion] 🏆 NEW BEST! Trial 3: combined=1500.00 (mini_monaco=900.0, warren=600.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 12:45:39] [Champion] 🏆 NEW BEST! Trial 1: combined=2000.00 (mini_monaco=1200.0, warren=800.0) params={}
[2026-04-14 12:47:25] =================================================================
[2026-04-14 12:47:25] [Wave3] Multi-Track Autoresearch — GP+UCB Generalization Search
[2026-04-14 12:47:25] [Wave3] Training tracks : generated_road, generated_track, mountain_track
[2026-04-14 12:47:25] [Wave3] Test tracks : mini_monaco, warren (zero-shot)
[2026-04-14 12:47:25] [Wave3] Max trials : 25 | kappa=2.0 | push every 5
[2026-04-14 12:47:25] [Wave3] Results file : /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase3.jsonl
[2026-04-14 12:47:25] [Wave3] Champion dir : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-champion
[2026-04-14 12:47:25] [Wave3] Warm start : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 12:47:25] =================================================================
[2026-04-14 12:47:25] [Wave3] Loaded 0 existing Phase 3 results.
[2026-04-14 12:47:25] [Wave3] No Wave 3 champion yet.
[2026-04-14 12:47:25] [Wave3] Starting from trial 1.
[2026-04-14 12:47:25]
[Wave3] ========== Trial 1/25 ==========
[2026-04-14 12:47:25] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 12:47:25] [Wave3] Proposed params: {'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 150000}
[2026-04-14 12:47:27] [Wave3] Launching trial 1: {'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 150000}
[2026-04-14 12:47:27] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 150000 --steps-per-switch 10000 --learning-rate 0.000225 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0001 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 13:28:47] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 13:28:47] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 13:28:47] [Wave3] Only 0 results — using random proposal.
[2026-04-14 13:28:47] [Champion] 🏆 NEW BEST! Trial 3: combined=1500.00 (mini_monaco=900.0, warren=600.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 13:28:47] [Champion] 🏆 NEW BEST! Trial 1: combined=2000.00 (mini_monaco=1200.0, warren=800.0) params={}
[2026-04-14 13:29:08] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 13:29:08] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 13:29:08] [Wave3] Only 0 results — using random proposal.
[2026-04-14 13:29:08] [Champion] 🏆 NEW BEST! Trial 3: combined=1500.00 (mini_monaco=900.0, warren=600.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 13:29:08] [Champion] 🏆 NEW BEST! Trial 1: combined=2000.00 (mini_monaco=1200.0, warren=800.0) params={}
[2026-04-14 13:29:34] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 13:29:34] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 13:29:34] [Wave3] Only 0 results — using random proposal.
[2026-04-14 13:29:34] [Champion] 🏆 NEW BEST! Trial 3: combined=1500.00 (mini_monaco=900.0, warren=600.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 13:29:34] [Champion] 🏆 NEW BEST! Trial 1: combined=2000.00 (mini_monaco=1200.0, warren=800.0) params={}
[2026-04-14 13:36:58] =================================================================
[2026-04-14 13:36:58] [Wave3] Multi-Track Autoresearch — GP+UCB Generalization Search
[2026-04-14 13:36:58] [Wave3] Training tracks : generated_road, generated_track, mountain_track
[2026-04-14 13:36:58] [Wave3] Test tracks : mini_monaco, warren (zero-shot)
[2026-04-14 13:36:58] [Wave3] Max trials : 25 | kappa=2.0 | push every 5
[2026-04-14 13:36:58] [Wave3] Results file : /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase3.jsonl
[2026-04-14 13:36:58] [Wave3] Champion dir : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-champion
[2026-04-14 13:36:58] [Wave3] Warm start : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 13:36:58] =================================================================
[2026-04-14 13:36:58] [Wave3] Loaded 0 existing Phase 3 results.
[2026-04-14 13:36:58] [Wave3] No Wave 3 champion yet.
[2026-04-14 13:36:58] [Wave3] Starting from trial 1.
[2026-04-14 13:36:58]
[Wave3] ========== Trial 1/25 ==========
[2026-04-14 13:36:58] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 13:36:58] [Wave3] Proposed params: {'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 13:37:00] [Wave3] Launching trial 1: {'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 13:37:00] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 45000 --steps-per-switch 5000 --learning-rate 0.000225 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0001 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 13:47:17] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 13:47:17] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 13:47:17] [Wave3] Only 0 results — using random proposal.
[2026-04-14 13:47:17] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 13:47:17] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-14 13:47:34] =================================================================
[2026-04-14 13:47:34] [Wave3] Multi-Track Autoresearch — GP+UCB Generalization Search
[2026-04-14 13:47:34] [Wave3] Training tracks : generated_road, generated_track, mountain_track
[2026-04-14 13:47:34] [Wave3] Test tracks : mini_monaco only (zero-shot; warren removed — broken done condition)
[2026-04-14 13:47:34] [Wave3] Max trials : 25 | kappa=2.0 | push every 5
[2026-04-14 13:47:34] [Wave3] Results file : /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase3.jsonl
[2026-04-14 13:47:34] [Wave3] Champion dir : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-champion
[2026-04-14 13:47:34] [Wave3] Warm start : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 13:47:34] =================================================================
[2026-04-14 13:47:34] [Wave3] Loaded 0 existing Phase 3 results.
[2026-04-14 13:47:34] [Wave3] No Wave 3 champion yet.
[2026-04-14 13:47:34] [Wave3] Starting from trial 1.
[2026-04-14 13:47:34]
[Wave3] ========== Trial 1/25 ==========
[2026-04-14 13:47:34] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 13:47:34] [Wave3] Proposed params: {'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 13:47:36] [Wave3] Launching trial 1: {'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 13:47:36] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 45000 --steps-per-switch 5000 --learning-rate 0.000225 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0001 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 14:34:25] [Wave3] Trial 1 finished in 2808.7s, rc=0
[2026-04-14 14:34:25] [Wave3] Parsed: combined=24.7695 mini_monaco=24.7695
[2026-04-14 14:34:25] [Champion] 🏆 NEW BEST! Trial 1: score=24.77 (mini_monaco=24.8) params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 14:34:25] [Wave3] ===== Trial 1 Summary =====
[2026-04-14 14:34:25] GP data points : 1
[2026-04-14 14:34:25] Wave3 Champion: trial=1 score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 14:34:25] Top 5:
[2026-04-14 14:34:25] score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 14:34:27]
[Wave3] ========== Trial 2/25 ==========
[2026-04-14 14:34:27] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 14:34:27] [Wave3] Proposed params: {'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 90000}
[2026-04-14 14:34:29] [Wave3] Launching trial 2: {'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 90000}
[2026-04-14 14:34:29] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 90000 --steps-per-switch 10000 --learning-rate 0.000225 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0002 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 15:12:53] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 15:12:53] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 15:12:53] [Wave3] Only 0 results — using random proposal.
[2026-04-14 15:12:53] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 15:12:53] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-14 15:13:16] =================================================================
[2026-04-14 15:13:16] [Wave3] Multi-Track Autoresearch — GP+UCB Generalization Search
[2026-04-14 15:13:16] [Wave3] Training tracks : generated_road, generated_track, mountain_track
[2026-04-14 15:13:16] [Wave3] Test tracks : mini_monaco only (zero-shot; warren removed — broken done condition)
[2026-04-14 15:13:16] [Wave3] Max trials : 25 | kappa=2.0 | push every 5
[2026-04-14 15:13:16] [Wave3] Results file : /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase3.jsonl
[2026-04-14 15:13:16] [Wave3] Champion dir : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-champion
[2026-04-14 15:13:16] [Wave3] Warm start : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 15:13:16] =================================================================
[2026-04-14 15:13:16] [Wave3] Loaded 1 existing Phase 3 results.
[2026-04-14 15:13:16] [Wave3] Wave3 Champion: trial=1 score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 15:13:16] [Wave3] Starting from trial 2.
[2026-04-14 15:13:16]
[Wave3] ========== Trial 2/25 ==========
[2026-04-14 15:13:16] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 15:13:16] [Wave3] Proposed params: {'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 90000}
[2026-04-14 15:13:18] [Wave3] Launching trial 2: {'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 90000}
[2026-04-14 15:13:18] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 90000 --steps-per-switch 10000 --learning-rate 0.000225 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0002 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 16:33:15] [Wave3] Trial 2 finished in 4797.4s, rc=0
[2026-04-14 16:33:15] [Wave3] Parsed: combined=14.61 mini_monaco=14.61
[2026-04-14 16:33:15] [Wave3] ===== Trial 2 Summary =====
[2026-04-14 16:33:15] GP data points : 2
[2026-04-14 16:33:15] Wave3 Champion: trial=1 score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 16:33:15] Top 5:
[2026-04-14 16:33:15] score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 16:33:15] score=14.61 params={'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 90000}
[2026-04-14 16:33:17]
[Wave3] ========== Trial 3/25 ==========
[2026-04-14 16:33:17] [Wave3] Only 2 results — using random proposal.
[2026-04-14 16:33:17] [Wave3] Proposed params: {'learning_rate': 0.0004302041414294587, 'steps_per_switch': 6993, 'total_timesteps': 31040}
[2026-04-14 16:33:19] [Wave3] Launching trial 3: {'learning_rate': 0.0004302041414294587, 'steps_per_switch': 6993, 'total_timesteps': 31040}
[2026-04-14 16:33:19] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 31040 --steps-per-switch 6993 --learning-rate 0.0004302041414294587 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0003 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 17:07:08] [Wave3] Trial 3 finished in 2029.0s, rc=0
[2026-04-14 17:07:08] [Wave3] Parsed: combined=27.6387 mini_monaco=27.6387
[2026-04-14 17:07:09] [Champion] 🏆 NEW BEST! Trial 3: score=27.64 (mini_monaco=27.6) params={'learning_rate': 0.0004302041414294587, 'steps_per_switch': 6993, 'total_timesteps': 31040}
[2026-04-14 17:07:09] [Wave3] ===== Trial 3 Summary =====
[2026-04-14 17:07:09] GP data points : 3
[2026-04-14 17:07:09] Wave3 Champion: trial=3 score=27.64 params={'learning_rate': 0.0004302041414294587, 'steps_per_switch': 6993, 'total_timesteps': 31040}
[2026-04-14 17:07:09] Top 5:
[2026-04-14 17:07:09] score=27.64 params={'learning_rate': 0.0004302041414294587, 'steps_per_switch': 6993, 'total_timesteps': 31040}
[2026-04-14 17:07:09] score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 17:07:09] score=14.61 params={'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 90000}
[2026-04-14 17:07:11]
[Wave3] ========== Trial 4/25 ==========
[2026-04-14 17:07:11] [Wave3] GP UCB top-5 proposals:
[2026-04-14 17:07:11] UCB=2.2675 mu=0.3935 σ=0.9370 params={'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 17:07:11] UCB=2.2527 mu=0.5001 σ=0.8763 params={'learning_rate': 0.0006709448038149618, 'steps_per_switch': 4056, 'total_timesteps': 46285}
[2026-04-14 17:07:11] UCB=2.2459 mu=0.6168 σ=0.8145 params={'learning_rate': 0.0006364798923287126, 'steps_per_switch': 4185, 'total_timesteps': 39120}
[2026-04-14 17:07:11] UCB=2.2458 mu=0.6302 σ=0.8078 params={'learning_rate': 0.0006330433971389486, 'steps_per_switch': 4177, 'total_timesteps': 37673}
[2026-04-14 17:07:11] UCB=2.2435 mu=0.5485 σ=0.8475 params={'learning_rate': 0.0005264768982970893, 'steps_per_switch': 2587, 'total_timesteps': 32853}
[2026-04-14 17:07:11] [Wave3] Proposed params: {'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 17:07:13] [Wave3] Launching trial 4: {'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 17:07:13] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 31110 --steps-per-switch 2400 --learning-rate 0.0006723430224246657 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0004 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 17:52:55] [Wave3] Trial 4 finished in 2742.5s, rc=0
[2026-04-14 17:52:55] [Wave3] Parsed: combined=28.0934 mini_monaco=28.0934
[2026-04-14 17:52:56] [Champion] 🏆 NEW BEST! Trial 4: score=28.09 (mini_monaco=28.1) params={'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 17:52:56] [Wave3] ===== Trial 4 Summary =====
[2026-04-14 17:52:56] GP data points : 4
[2026-04-14 17:52:56] Wave3 Champion: trial=4 score=28.09 params={'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 17:52:56] Top 5:
[2026-04-14 17:52:56] score=28.09 params={'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 17:52:56] score=27.64 params={'learning_rate': 0.0004302041414294587, 'steps_per_switch': 6993, 'total_timesteps': 31040}
[2026-04-14 17:52:56] score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 17:52:56] score=14.61 params={'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 90000}
[2026-04-14 17:52:58]
[Wave3] ========== Trial 5/25 ==========
[2026-04-14 17:52:58] [Wave3] GP UCB top-5 proposals:
[2026-04-14 17:52:58] UCB=2.2084 mu=0.4374 σ=0.8855 params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 17:52:58] UCB=2.1978 mu=0.3485 σ=0.9246 params={'learning_rate': 0.0008923041447868616, 'steps_per_switch': 7241, 'total_timesteps': 43058}
[2026-04-14 17:52:58] UCB=2.1976 mu=0.3420 σ=0.9278 params={'learning_rate': 0.0008833374140842175, 'steps_per_switch': 7589, 'total_timesteps': 42051}
[2026-04-14 17:52:58] UCB=2.1805 mu=0.3449 σ=0.9178 params={'learning_rate': 0.0009537044683873197, 'steps_per_switch': 6193, 'total_timesteps': 33355}
[2026-04-14 17:52:58] UCB=2.1634 mu=0.2762 σ=0.9436 params={'learning_rate': 0.0008911867335423042, 'steps_per_switch': 6839, 'total_timesteps': 57412}
[2026-04-14 17:52:58] [Wave3] Proposed params: {'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 17:53:00] [Wave3] Launching trial 5: {'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 17:53:00] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 31625 --steps-per-switch 7847 --learning-rate 0.0008293130840877947 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0005 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 18:22:44] [Wave3] Trial 5 finished in 1784.4s, rc=0
[2026-04-14 18:22:44] [Wave3] Parsed: combined=137.5814 mini_monaco=137.5814
[2026-04-14 18:22:44] [Champion] 🏆 NEW BEST! Trial 5: score=137.58 (mini_monaco=137.6) params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 18:22:44] [Wave3] ===== Trial 5 Summary =====
[2026-04-14 18:22:44] GP data points : 5
[2026-04-14 18:22:44] Wave3 Champion: trial=5 score=137.58 params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 18:22:44] Top 5:
[2026-04-14 18:22:44] score=137.58 params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 18:22:44] score=28.09 params={'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 18:22:44] score=27.64 params={'learning_rate': 0.0004302041414294587, 'steps_per_switch': 6993, 'total_timesteps': 31040}
[2026-04-14 18:22:44] score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 18:22:44] score=14.61 params={'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 90000}