donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_phase3_log.txt

[2026-04-14 12:44:38] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 12:44:38] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 12:44:38] [Wave3] Only 0 results — using random proposal.
[2026-04-14 12:44:38] [Champion] 🏆 NEW BEST! Trial 3: combined=1500.00 (mini_monaco=900.0, warren=600.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 12:44:38] [Champion] 🏆 NEW BEST! Trial 1: combined=2000.00 (mini_monaco=1200.0, warren=800.0) params={}
[2026-04-14 12:45:00] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 12:45:00] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 12:45:00] [Wave3] Only 0 results — using random proposal.
[2026-04-14 12:45:00] [Champion] 🏆 NEW BEST! Trial 3: combined=1500.00 (mini_monaco=900.0, warren=600.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 12:45:00] [Champion] 🏆 NEW BEST! Trial 1: combined=2000.00 (mini_monaco=1200.0, warren=800.0) params={}
[2026-04-14 12:45:27] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 12:45:27] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 12:45:27] [Wave3] Only 0 results — using random proposal.
[2026-04-14 12:45:27] [Champion] 🏆 NEW BEST! Trial 3: combined=1500.00 (mini_monaco=900.0, warren=600.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 12:45:27] [Champion] 🏆 NEW BEST! Trial 1: combined=2000.00 (mini_monaco=1200.0, warren=800.0) params={}
[2026-04-14 12:45:39] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 12:45:39] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 12:45:39] [Wave3] Only 0 results — using random proposal.
[2026-04-14 12:45:39] [Champion] 🏆 NEW BEST! Trial 3: combined=1500.00 (mini_monaco=900.0, warren=600.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 12:45:39] [Champion] 🏆 NEW BEST! Trial 1: combined=2000.00 (mini_monaco=1200.0, warren=800.0) params={}
[2026-04-14 12:47:25] =================================================================
[2026-04-14 12:47:25] [Wave3] Multi-Track Autoresearch — GP+UCB Generalization Search
[2026-04-14 12:47:25] [Wave3] Training tracks : generated_road, generated_track, mountain_track
[2026-04-14 12:47:25] [Wave3] Test tracks : mini_monaco, warren (zero-shot)
[2026-04-14 12:47:25] [Wave3] Max trials : 25 | kappa=2.0 | push every 5
[2026-04-14 12:47:25] [Wave3] Results file : /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase3.jsonl
[2026-04-14 12:47:25] [Wave3] Champion dir : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-champion
[2026-04-14 12:47:25] [Wave3] Warm start : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 12:47:25] =================================================================
[2026-04-14 12:47:25] [Wave3] Loaded 0 existing Phase 3 results.
[2026-04-14 12:47:25] [Wave3] No Wave 3 champion yet.
[2026-04-14 12:47:25] [Wave3] Starting from trial 1.
[2026-04-14 12:47:25]
[Wave3] ========== Trial 1/25 ==========
[2026-04-14 12:47:25] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 12:47:25] [Wave3] Proposed params: {'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 150000}
[2026-04-14 12:47:27] [Wave3] Launching trial 1: {'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 150000}
[2026-04-14 12:47:27] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 150000 --steps-per-switch 10000 --learning-rate 0.000225 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0001 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 13:28:47] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 13:28:47] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 13:28:47] [Wave3] Only 0 results — using random proposal.
[2026-04-14 13:28:47] [Champion] 🏆 NEW BEST! Trial 3: combined=1500.00 (mini_monaco=900.0, warren=600.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 13:28:47] [Champion] 🏆 NEW BEST! Trial 1: combined=2000.00 (mini_monaco=1200.0, warren=800.0) params={}
[2026-04-14 13:29:08] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 13:29:08] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 13:29:08] [Wave3] Only 0 results — using random proposal.
[2026-04-14 13:29:08] [Champion] 🏆 NEW BEST! Trial 3: combined=1500.00 (mini_monaco=900.0, warren=600.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 13:29:08] [Champion] 🏆 NEW BEST! Trial 1: combined=2000.00 (mini_monaco=1200.0, warren=800.0) params={}
[2026-04-14 13:29:34] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 13:29:34] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 13:29:34] [Wave3] Only 0 results — using random proposal.
[2026-04-14 13:29:34] [Champion] 🏆 NEW BEST! Trial 3: combined=1500.00 (mini_monaco=900.0, warren=600.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 13:29:34] [Champion] 🏆 NEW BEST! Trial 1: combined=2000.00 (mini_monaco=1200.0, warren=800.0) params={}
[2026-04-14 13:36:58] =================================================================
[2026-04-14 13:36:58] [Wave3] Multi-Track Autoresearch — GP+UCB Generalization Search
[2026-04-14 13:36:58] [Wave3] Training tracks : generated_road, generated_track, mountain_track
[2026-04-14 13:36:58] [Wave3] Test tracks : mini_monaco, warren (zero-shot)
[2026-04-14 13:36:58] [Wave3] Max trials : 25 | kappa=2.0 | push every 5
[2026-04-14 13:36:58] [Wave3] Results file : /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase3.jsonl
[2026-04-14 13:36:58] [Wave3] Champion dir : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-champion
[2026-04-14 13:36:58] [Wave3] Warm start : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 13:36:58] =================================================================
[2026-04-14 13:36:58] [Wave3] Loaded 0 existing Phase 3 results.
[2026-04-14 13:36:58] [Wave3] No Wave 3 champion yet.
[2026-04-14 13:36:58] [Wave3] Starting from trial 1.
[2026-04-14 13:36:58]
[Wave3] ========== Trial 1/25 ==========
[2026-04-14 13:36:58] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 13:36:58] [Wave3] Proposed params: {'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 13:37:00] [Wave3] Launching trial 1: {'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 13:37:00] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 45000 --steps-per-switch 5000 --learning-rate 0.000225 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0001 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 13:47:17] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 13:47:17] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 13:47:17] [Wave3] Only 0 results — using random proposal.
[2026-04-14 13:47:17] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 13:47:17] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-14 13:47:34] =================================================================
[2026-04-14 13:47:34] [Wave3] Multi-Track Autoresearch — GP+UCB Generalization Search
[2026-04-14 13:47:34] [Wave3] Training tracks : generated_road, generated_track, mountain_track
[2026-04-14 13:47:34] [Wave3] Test tracks : mini_monaco only (zero-shot; warren removed — broken done condition)
[2026-04-14 13:47:34] [Wave3] Max trials : 25 | kappa=2.0 | push every 5
[2026-04-14 13:47:34] [Wave3] Results file : /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase3.jsonl
[2026-04-14 13:47:34] [Wave3] Champion dir : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-champion
[2026-04-14 13:47:34] [Wave3] Warm start : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 13:47:34] =================================================================
[2026-04-14 13:47:34] [Wave3] Loaded 0 existing Phase 3 results.
[2026-04-14 13:47:34] [Wave3] No Wave 3 champion yet.
[2026-04-14 13:47:34] [Wave3] Starting from trial 1.
[2026-04-14 13:47:34]
[Wave3] ========== Trial 1/25 ==========
[2026-04-14 13:47:34] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 13:47:34] [Wave3] Proposed params: {'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 13:47:36] [Wave3] Launching trial 1: {'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 13:47:36] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 45000 --steps-per-switch 5000 --learning-rate 0.000225 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0001 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 14:34:25] [Wave3] Trial 1 finished in 2808.7s, rc=0
[2026-04-14 14:34:25] [Wave3] Parsed: combined=24.7695 mini_monaco=24.7695
[2026-04-14 14:34:25] [Champion] 🏆 NEW BEST! Trial 1: score=24.77 (mini_monaco=24.8) params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 14:34:25] [Wave3] ===== Trial 1 Summary =====
[2026-04-14 14:34:25] GP data points : 1
[2026-04-14 14:34:25] Wave3 Champion: trial=1 score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 14:34:25] Top 5:
[2026-04-14 14:34:25] score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 14:34:27]
[Wave3] ========== Trial 2/25 ==========
[2026-04-14 14:34:27] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 14:34:27] [Wave3] Proposed params: {'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 90000}
[2026-04-14 14:34:29] [Wave3] Launching trial 2: {'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 90000}
[2026-04-14 14:34:29] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 90000 --steps-per-switch 10000 --learning-rate 0.000225 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0002 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 15:12:53] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 15:12:53] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 15:12:53] [Wave3] Only 0 results — using random proposal.
[2026-04-14 15:12:53] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 15:12:53] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-14 15:13:16] =================================================================
[2026-04-14 15:13:16] [Wave3] Multi-Track Autoresearch — GP+UCB Generalization Search
[2026-04-14 15:13:16] [Wave3] Training tracks : generated_road, generated_track, mountain_track
[2026-04-14 15:13:16] [Wave3] Test tracks : mini_monaco only (zero-shot; warren removed — broken done condition)
[2026-04-14 15:13:16] [Wave3] Max trials : 25 | kappa=2.0 | push every 5
[2026-04-14 15:13:16] [Wave3] Results file : /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase3.jsonl
[2026-04-14 15:13:16] [Wave3] Champion dir : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-champion
[2026-04-14 15:13:16] [Wave3] Warm start : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 15:13:16] =================================================================
[2026-04-14 15:13:16] [Wave3] Loaded 1 existing Phase 3 results.
[2026-04-14 15:13:16] [Wave3] Wave3 Champion: trial=1 score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 15:13:16] [Wave3] Starting from trial 2.
[2026-04-14 15:13:16]
[Wave3] ========== Trial 2/25 ==========
[2026-04-14 15:13:16] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 15:13:16] [Wave3] Proposed params: {'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 90000}
[2026-04-14 15:13:18] [Wave3] Launching trial 2: {'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 90000}
[2026-04-14 15:13:18] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 90000 --steps-per-switch 10000 --learning-rate 0.000225 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0002 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 16:33:15] [Wave3] Trial 2 finished in 4797.4s, rc=0
[2026-04-14 16:33:15] [Wave3] Parsed: combined=14.61 mini_monaco=14.61
[2026-04-14 16:33:15] [Wave3] ===== Trial 2 Summary =====
[2026-04-14 16:33:15] GP data points : 2
[2026-04-14 16:33:15] Wave3 Champion: trial=1 score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 16:33:15] Top 5:
[2026-04-14 16:33:15] score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 16:33:15] score=14.61 params={'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 90000}
[2026-04-14 16:33:17]
[Wave3] ========== Trial 3/25 ==========
[2026-04-14 16:33:17] [Wave3] Only 2 results — using random proposal.
[2026-04-14 16:33:17] [Wave3] Proposed params: {'learning_rate': 0.0004302041414294587, 'steps_per_switch': 6993, 'total_timesteps': 31040}
[2026-04-14 16:33:19] [Wave3] Launching trial 3: {'learning_rate': 0.0004302041414294587, 'steps_per_switch': 6993, 'total_timesteps': 31040}
[2026-04-14 16:33:19] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 31040 --steps-per-switch 6993 --learning-rate 0.0004302041414294587 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0003 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 17:07:08] [Wave3] Trial 3 finished in 2029.0s, rc=0
[2026-04-14 17:07:08] [Wave3] Parsed: combined=27.6387 mini_monaco=27.6387
[2026-04-14 17:07:09] [Champion] 🏆 NEW BEST! Trial 3: score=27.64 (mini_monaco=27.6) params={'learning_rate': 0.0004302041414294587, 'steps_per_switch': 6993, 'total_timesteps': 31040}
[2026-04-14 17:07:09] [Wave3] ===== Trial 3 Summary =====
[2026-04-14 17:07:09] GP data points : 3
[2026-04-14 17:07:09] Wave3 Champion: trial=3 score=27.64 params={'learning_rate': 0.0004302041414294587, 'steps_per_switch': 6993, 'total_timesteps': 31040}
[2026-04-14 17:07:09] Top 5:
[2026-04-14 17:07:09] score=27.64 params={'learning_rate': 0.0004302041414294587, 'steps_per_switch': 6993, 'total_timesteps': 31040}
[2026-04-14 17:07:09] score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 17:07:09] score=14.61 params={'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 90000}
[2026-04-14 17:07:11]
[Wave3] ========== Trial 4/25 ==========
[2026-04-14 17:07:11] [Wave3] GP UCB top-5 proposals:
[2026-04-14 17:07:11] UCB=2.2675 mu=0.3935 σ=0.9370 params={'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 17:07:11] UCB=2.2527 mu=0.5001 σ=0.8763 params={'learning_rate': 0.0006709448038149618, 'steps_per_switch': 4056, 'total_timesteps': 46285}
[2026-04-14 17:07:11] UCB=2.2459 mu=0.6168 σ=0.8145 params={'learning_rate': 0.0006364798923287126, 'steps_per_switch': 4185, 'total_timesteps': 39120}
[2026-04-14 17:07:11] UCB=2.2458 mu=0.6302 σ=0.8078 params={'learning_rate': 0.0006330433971389486, 'steps_per_switch': 4177, 'total_timesteps': 37673}
[2026-04-14 17:07:11] UCB=2.2435 mu=0.5485 σ=0.8475 params={'learning_rate': 0.0005264768982970893, 'steps_per_switch': 2587, 'total_timesteps': 32853}
[2026-04-14 17:07:11] [Wave3] Proposed params: {'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 17:07:13] [Wave3] Launching trial 4: {'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 17:07:13] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 31110 --steps-per-switch 2400 --learning-rate 0.0006723430224246657 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0004 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 17:52:55] [Wave3] Trial 4 finished in 2742.5s, rc=0
[2026-04-14 17:52:55] [Wave3] Parsed: combined=28.0934 mini_monaco=28.0934
[2026-04-14 17:52:56] [Champion] 🏆 NEW BEST! Trial 4: score=28.09 (mini_monaco=28.1) params={'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 17:52:56] [Wave3] ===== Trial 4 Summary =====
[2026-04-14 17:52:56] GP data points : 4
[2026-04-14 17:52:56] Wave3 Champion: trial=4 score=28.09 params={'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 17:52:56] Top 5:
[2026-04-14 17:52:56] score=28.09 params={'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 17:52:56] score=27.64 params={'learning_rate': 0.0004302041414294587, 'steps_per_switch': 6993, 'total_timesteps': 31040}
[2026-04-14 17:52:56] score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 17:52:56] score=14.61 params={'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 90000}
[2026-04-14 17:52:58]
[Wave3] ========== Trial 5/25 ==========
[2026-04-14 17:52:58] [Wave3] GP UCB top-5 proposals:
[2026-04-14 17:52:58] UCB=2.2084 mu=0.4374 σ=0.8855 params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 17:52:58] UCB=2.1978 mu=0.3485 σ=0.9246 params={'learning_rate': 0.0008923041447868616, 'steps_per_switch': 7241, 'total_timesteps': 43058}
[2026-04-14 17:52:58] UCB=2.1976 mu=0.3420 σ=0.9278 params={'learning_rate': 0.0008833374140842175, 'steps_per_switch': 7589, 'total_timesteps': 42051}
[2026-04-14 17:52:58] UCB=2.1805 mu=0.3449 σ=0.9178 params={'learning_rate': 0.0009537044683873197, 'steps_per_switch': 6193, 'total_timesteps': 33355}
[2026-04-14 17:52:58] UCB=2.1634 mu=0.2762 σ=0.9436 params={'learning_rate': 0.0008911867335423042, 'steps_per_switch': 6839, 'total_timesteps': 57412}
[2026-04-14 17:52:58] [Wave3] Proposed params: {'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 17:53:00] [Wave3] Launching trial 5: {'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 17:53:00] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 31625 --steps-per-switch 7847 --learning-rate 0.0008293130840877947 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0005 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 18:22:44] [Wave3] Trial 5 finished in 1784.4s, rc=0
[2026-04-14 18:22:44] [Wave3] Parsed: combined=137.5814 mini_monaco=137.5814
[2026-04-14 18:22:44] [Champion] 🏆 NEW BEST! Trial 5: score=137.58 (mini_monaco=137.6) params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 18:22:44] [Wave3] ===== Trial 5 Summary =====
[2026-04-14 18:22:44] GP data points : 5
[2026-04-14 18:22:44] Wave3 Champion: trial=5 score=137.58 params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 18:22:44] Top 5:
[2026-04-14 18:22:44] score=137.58 params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 18:22:44] score=28.09 params={'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 18:22:44] score=27.64 params={'learning_rate': 0.0004302041414294587, 'steps_per_switch': 6993, 'total_timesteps': 31040}
[2026-04-14 18:22:44] score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 18:22:44] score=14.61 params={'learning_rate': 0.000225, 'steps_per_switch': 10000, 'total_timesteps': 90000}
[2026-04-14 18:22:46] [Wave3] ✅ Git push complete after trial 5
[2026-04-14 18:22:48]
[Wave3] ========== Trial 6/25 ==========
[2026-04-14 18:22:48] [Wave3] GP UCB top-5 proposals:
[2026-04-14 18:22:48] UCB=3.0500 mu=1.6257 σ=0.7122 params={'learning_rate': 0.0009335007740897103, 'steps_per_switch': 10939, 'total_timesteps': 37842}
[2026-04-14 18:22:48] UCB=3.0271 mu=1.6319 σ=0.6976 params={'learning_rate': 0.0009018551117370809, 'steps_per_switch': 10744, 'total_timesteps': 44030}
[2026-04-14 18:22:48] UCB=3.0176 mu=1.5762 σ=0.7207 params={'learning_rate': 0.000895413146419593, 'steps_per_switch': 10921, 'total_timesteps': 44444}
[2026-04-14 18:22:48] UCB=3.0100 mu=1.7115 σ=0.6493 params={'learning_rate': 0.0009264384396182288, 'steps_per_switch': 9176, 'total_timesteps': 53393}
[2026-04-14 18:22:48] UCB=2.9999 mu=1.4293 σ=0.7853 params={'learning_rate': 0.0009071884060283581, 'steps_per_switch': 11683, 'total_timesteps': 32934}
[2026-04-14 18:22:48] [Wave3] Proposed params: {'learning_rate': 0.0009335007740897103, 'steps_per_switch': 10939, 'total_timesteps': 37842}
[2026-04-14 18:22:50] [Wave3] Launching trial 6: {'learning_rate': 0.0009335007740897103, 'steps_per_switch': 10939, 'total_timesteps': 37842}
[2026-04-14 18:22:50] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 37842 --steps-per-switch 10939 --learning-rate 0.0009335007740897103 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0006 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 18:54:45] [Wave3] Trial 6 finished in 1915.4s, rc=0
[2026-04-14 18:54:45] [Wave3] Parsed: combined=136.0329 mini_monaco=136.0329
[2026-04-14 18:54:45] [Wave3] ===== Trial 6 Summary =====
[2026-04-14 18:54:45] GP data points : 6
[2026-04-14 18:54:45] Wave3 Champion: trial=5 score=137.58 params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 18:54:45] Top 5:
[2026-04-14 18:54:45] score=137.58 params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 18:54:45] score=136.03 params={'learning_rate': 0.0009335007740897103, 'steps_per_switch': 10939, 'total_timesteps': 37842}
[2026-04-14 18:54:45] score=28.09 params={'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 18:54:45] score=27.64 params={'learning_rate': 0.0004302041414294587, 'steps_per_switch': 6993, 'total_timesteps': 31040}
[2026-04-14 18:54:45] score=24.77 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 18:54:47]
[Wave3] ========== Trial 7/25 ==========
[2026-04-14 18:54:47] [Wave3] GP UCB top-5 proposals:
[2026-04-14 18:54:47] UCB=2.5695 mu=1.0434 σ=0.7630 params={'learning_rate': 0.0009571853094473745, 'steps_per_switch': 8279, 'total_timesteps': 65149}
[2026-04-14 18:54:47] UCB=2.4957 mu=0.8626 σ=0.8166 params={'learning_rate': 0.0008847315458885072, 'steps_per_switch': 7739, 'total_timesteps': 69794}
[2026-04-14 18:54:47] UCB=2.4939 mu=0.8390 σ=0.8274 params={'learning_rate': 0.0008771680507406297, 'steps_per_switch': 8334, 'total_timesteps': 71955}
[2026-04-14 18:54:47] UCB=2.4561 mu=0.9048 σ=0.7756 params={'learning_rate': 0.0008638688741688625, 'steps_per_switch': 9406, 'total_timesteps': 69536}
[2026-04-14 18:54:47] UCB=2.4282 mu=0.7057 σ=0.8612 params={'learning_rate': 0.0008410317919632058, 'steps_per_switch': 8157, 'total_timesteps': 74924}
[2026-04-14 18:54:47] [Wave3] Proposed params: {'learning_rate': 0.0009571853094473745, 'steps_per_switch': 8279, 'total_timesteps': 65149}
[2026-04-14 18:54:49] [Wave3] Launching trial 7: {'learning_rate': 0.0009571853094473745, 'steps_per_switch': 8279, 'total_timesteps': 65149}
[2026-04-14 18:54:49] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 65149 --steps-per-switch 8279 --learning-rate 0.0009571853094473745 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0007 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 19:58:11] [Wave3] Trial 7 finished in 3801.5s, rc=0
[2026-04-14 19:58:11] [Wave3] Parsed: combined=92.5138 mini_monaco=92.5138
[2026-04-14 19:58:11] [Wave3] ===== Trial 7 Summary =====
[2026-04-14 19:58:11] GP data points : 7
[2026-04-14 19:58:11] Wave3 Champion: trial=5 score=137.58 params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 19:58:11] Top 5:
[2026-04-14 19:58:11] score=137.58 params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 19:58:11] score=136.03 params={'learning_rate': 0.0009335007740897103, 'steps_per_switch': 10939, 'total_timesteps': 37842}
[2026-04-14 19:58:11] score=92.51 params={'learning_rate': 0.0009571853094473745, 'steps_per_switch': 8279, 'total_timesteps': 65149}
[2026-04-14 19:58:11] score=28.09 params={'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 19:58:11] score=27.64 params={'learning_rate': 0.0004302041414294587, 'steps_per_switch': 6993, 'total_timesteps': 31040}
[2026-04-14 19:58:13]
[Wave3] ========== Trial 8/25 ==========
[2026-04-14 19:58:13] [Wave3] GP UCB top-5 proposals:
[2026-04-14 19:58:13] UCB=2.2244 mu=0.8924 σ=0.6660 params={'learning_rate': 0.0007166192676486139, 'steps_per_switch': 12263, 'total_timesteps': 34585}
[2026-04-14 19:58:13] UCB=2.1788 mu=0.7741 σ=0.7023 params={'learning_rate': 0.00070734628199524, 'steps_per_switch': 12454, 'total_timesteps': 39980}
[2026-04-14 19:58:13] UCB=2.1650 mu=0.7614 σ=0.7018 params={'learning_rate': 0.0007126709588457712, 'steps_per_switch': 12502, 'total_timesteps': 41585}
[2026-04-14 19:58:13] UCB=2.1630 mu=0.7027 σ=0.7301 params={'learning_rate': 0.0006962524520444611, 'steps_per_switch': 12551, 'total_timesteps': 41731}
[2026-04-14 19:58:13] UCB=2.1228 mu=1.4002 σ=0.3613 params={'learning_rate': 0.0009722139517771988, 'steps_per_switch': 7979, 'total_timesteps': 36859}
[2026-04-14 19:58:13] [Wave3] Proposed params: {'learning_rate': 0.0007166192676486139, 'steps_per_switch': 12263, 'total_timesteps': 34585}
[2026-04-14 19:58:15] [Wave3] Launching trial 8: {'learning_rate': 0.0007166192676486139, 'steps_per_switch': 12263, 'total_timesteps': 34585}
[2026-04-14 19:58:15] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 34585 --steps-per-switch 12263 --learning-rate 0.0007166192676486139 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0008 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 20:23:11] [Wave3] Trial 8 finished in 1495.6s, rc=0
[2026-04-14 20:23:11] [Wave3] Parsed: combined=108.3697 mini_monaco=108.3697
[2026-04-14 20:23:11] [Wave3] ===== Trial 8 Summary =====
[2026-04-14 20:23:11] GP data points : 8
[2026-04-14 20:23:11] Wave3 Champion: trial=5 score=137.58 params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 20:23:11] Top 5:
[2026-04-14 20:23:11] score=137.58 params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 20:23:11] score=136.03 params={'learning_rate': 0.0009335007740897103, 'steps_per_switch': 10939, 'total_timesteps': 37842}
[2026-04-14 20:23:11] score=108.37 params={'learning_rate': 0.0007166192676486139, 'steps_per_switch': 12263, 'total_timesteps': 34585}
[2026-04-14 20:23:11] score=92.51 params={'learning_rate': 0.0009571853094473745, 'steps_per_switch': 8279, 'total_timesteps': 65149}
[2026-04-14 20:23:11] score=28.09 params={'learning_rate': 0.0006723430224246657, 'steps_per_switch': 2400, 'total_timesteps': 31110}
[2026-04-14 20:23:13]
[Wave3] ========== Trial 9/25 ==========
[2026-04-14 20:23:13] [Wave3] GP UCB top-5 proposals:
[2026-04-14 20:23:13] UCB=2.1683 mu=1.5130 σ=0.3277 params={'learning_rate': 0.0009737883579372665, 'steps_per_switch': 8510, 'total_timesteps': 30923}
[2026-04-14 20:23:13] UCB=1.9762 mu=-0.0238 σ=1.0000 params={'learning_rate': 0.0009868738288938224, 'steps_per_switch': 13481, 'total_timesteps': 149987}
[2026-04-14 20:23:13] UCB=1.9705 mu=-0.0295 σ=1.0000 params={'learning_rate': 0.0008613697541025559, 'steps_per_switch': 14439, 'total_timesteps': 144933}
[2026-04-14 20:23:13] UCB=1.9690 mu=-0.0310 σ=1.0000 params={'learning_rate': 0.0008421371980014351, 'steps_per_switch': 14662, 'total_timesteps': 143365}
[2026-04-14 20:23:13] UCB=1.9688 mu=-0.0317 σ=1.0003 params={'learning_rate': 0.0006318576844029589, 'steps_per_switch': 2183, 'total_timesteps': 145839}
[2026-04-14 20:23:13] [Wave3] Proposed params: {'learning_rate': 0.0009737883579372665, 'steps_per_switch': 8510, 'total_timesteps': 30923}
[2026-04-14 20:23:15] [Wave3] Launching trial 9: {'learning_rate': 0.0009737883579372665, 'steps_per_switch': 8510, 'total_timesteps': 30923}
[2026-04-14 20:23:15] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 30923 --steps-per-switch 8510 --learning-rate 0.0009737883579372665 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0009 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 20:37:40] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 20:37:40] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 20:37:40] [Wave3] Only 0 results — using random proposal.
[2026-04-14 20:37:40] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 20:37:40] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-14 20:37:55] =================================================================
[2026-04-14 20:37:55] [Wave3] Multi-Track Autoresearch — GP+UCB Generalization Search
[2026-04-14 20:37:55] [Wave3] Training tracks : generated_road, generated_track, mountain_track
[2026-04-14 20:37:55] [Wave3] Test tracks : mini_monaco only (zero-shot; warren removed — broken done condition)
[2026-04-14 20:37:55] [Wave3] Max trials : 25 | kappa=2.0 | push every 5
[2026-04-14 20:37:55] [Wave3] Results file : /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase3.jsonl
[2026-04-14 20:37:55] [Wave3] Champion dir : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-champion
[2026-04-14 20:37:55] [Wave3] Warm start : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 20:37:55] =================================================================
[2026-04-14 20:37:55] [Wave3] Loaded 0 existing Phase 3 results.
[2026-04-14 20:37:55] [Wave3] Wave3 Champion: trial=5 score=137.58 params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 20:37:55] [Wave3] Starting from trial 1.
[2026-04-14 20:37:55] [Wave3] ========== Trial 1/25 ==========
[2026-04-14 20:37:55] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 20:37:55] [Wave3] Proposed params: {'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 20:37:57] [Wave3] Launching trial 1: {'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 20:37:57] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 45000 --steps-per-switch 5000 --learning-rate 0.000225 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0001 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 21:27:21] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 21:27:21] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 21:27:21] [Wave3] Only 0 results — using random proposal.
[2026-04-14 21:27:21] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 21:27:21] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-14 21:27:52] =================================================================
[2026-04-14 21:27:52] [Wave3] Multi-Track Autoresearch — GP+UCB Generalization Search
[2026-04-14 21:27:52] [Wave3] Training tracks : generated_road, generated_track, mountain_track
[2026-04-14 21:27:52] [Wave3] Test tracks : mini_monaco only (zero-shot; warren removed — broken done condition)
[2026-04-14 21:27:52] [Wave3] Max trials : 25 | kappa=2.0 | push every 5
[2026-04-14 21:27:52] [Wave3] Results file : /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase3.jsonl
[2026-04-14 21:27:52] [Wave3] Champion dir : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-champion
[2026-04-14 21:27:52] [Wave3] Warm start : /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 21:27:52] =================================================================
[2026-04-14 21:27:52] [Wave3] Loaded 0 existing Phase 3 results.
[2026-04-14 21:27:52] [Wave3] Wave3 Champion: trial=5 score=137.58 params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 21:27:52] [Wave3] Starting from trial 1.
[2026-04-14 21:27:52] [Wave3] ========== Trial 1/25 ==========
[2026-04-14 21:27:52] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 21:27:52] [Wave3] Proposed params: {'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 21:27:54] [Wave3] Launching trial 1: {'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 21:27:54] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 45000 --steps-per-switch 5000 --learning-rate 0.000225 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0001 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 22:21:10] [Wave3] Trial 1 finished in 3195.8s, rc=0
[2026-04-14 22:21:10] [Wave3] Parsed: combined=24.925 mini_monaco=24.925
[2026-04-14 22:21:10] [Wave3] ===== Trial 1 Summary =====
[2026-04-14 22:21:10] GP data points : 1
[2026-04-14 22:21:10] Wave3 Champion: trial=5 score=137.58 params={'learning_rate': 0.0008293130840877947, 'steps_per_switch': 7847, 'total_timesteps': 31625}
[2026-04-14 22:21:10] Top 5:
[2026-04-14 22:21:10] score=24.93 params={'learning_rate': 0.000225, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 22:21:12] [Wave3] ========== Trial 2/25 ==========
[2026-04-14 22:21:12] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 22:21:12] [Wave3] Proposed params: {'learning_rate': 0.001, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 22:21:14] [Wave3] Launching trial 2: {'learning_rate': 0.001, 'steps_per_switch': 5000, 'total_timesteps': 45000}
[2026-04-14 22:21:14] [Wave3] Command: python3 /home/paulh/projects/donkeycar-rl-autoresearch/agent/multitrack_runner.py --total-timesteps 45000 --steps-per-switch 5000 --learning-rate 0.001 --eval-episodes 3 --save-dir /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/wave3-trial-0002 --warm-start /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[2026-04-14 22:40:23] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 22:40:23] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 22:40:23] [Wave3] Only 0 results — using random proposal.
[2026-04-14 22:40:23] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 22:40:23] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-14 22:44:13] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-14 22:44:13] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-14 22:44:13] [Wave3] Only 0 results — using random proposal.
[2026-04-14 22:44:13] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-14 22:44:13] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-15 09:03:51] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-15 09:03:51] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-15 09:03:51] [Wave3] Only 0 results — using random proposal.
[2026-04-15 09:03:51] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-15 09:03:51] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-15 09:04:44] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-15 09:04:44] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-15 09:04:44] [Wave3] Only 0 results — using random proposal.
[2026-04-15 09:04:44] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-15 09:04:44] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-15 09:06:00] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-15 09:06:00] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-15 09:06:00] [Wave3] Only 0 results — using random proposal.
[2026-04-15 09:06:00] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-15 09:06:00] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-15 09:15:27] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-15 09:15:27] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-15 09:15:27] [Wave3] Only 0 results — using random proposal.
[2026-04-15 09:15:27] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-15 09:15:27] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-15 09:17:10] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-15 09:17:10] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-15 09:17:10] [Wave3] Only 0 results — using random proposal.
[2026-04-15 09:17:10] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-15 09:17:10] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-15 21:54:37] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-15 21:54:37] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-15 21:54:37] [Wave3] Only 0 results — using random proposal.
[2026-04-15 21:54:37] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-15 21:54:37] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-15 22:26:39] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-15 22:26:39] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-15 22:26:39] [Wave3] Only 0 results — using random proposal.
[2026-04-15 22:26:39] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-15 22:26:39] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-15 22:47:16] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-15 22:47:16] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-15 22:47:16] [Wave3] Only 0 results — using random proposal.
[2026-04-15 22:47:16] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-15 22:47:16] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-16 17:29:20] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-16 17:29:20] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-16 17:29:20] [Wave3] Only 0 results — using random proposal.
[2026-04-16 17:29:20] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-16 17:29:20] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-17 13:25:25] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-17 13:25:25] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-17 13:25:25] [Wave3] Only 0 results — using random proposal.
[2026-04-17 13:25:25] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-17 13:25:25] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-17 14:45:26] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-17 14:45:26] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-17 14:45:26] [Wave3] Only 0 results — using random proposal.
[2026-04-17 14:45:26] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-17 14:45:26] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-17 22:10:26] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-17 22:10:26] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-17 22:10:26] [Wave3] Only 0 results — using random proposal.
[2026-04-17 22:10:26] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-17 22:10:26] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-18 10:41:19] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-18 10:41:19] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-18 10:41:19] [Wave3] Only 0 results — using random proposal.
[2026-04-18 10:41:19] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-18 10:41:19] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}
[2026-04-18 10:42:10] [Wave3] Seed trial 1/2: using hardcoded params.
[2026-04-18 10:42:10] [Wave3] Seed trial 2/2: using hardcoded params.
[2026-04-18 10:42:10] [Wave3] Only 0 results — using random proposal.
[2026-04-18 10:42:10] [Champion] 🏆 NEW BEST! Trial 3: score=1500.00 (mini_monaco=1500.0) params={'learning_rate': 0.0002, 'steps_per_switch': 8000, 'total_timesteps': 150000}
[2026-04-18 10:42:10] [Champion] 🏆 NEW BEST! Trial 1: score=2000.00 (mini_monaco=2000.0) params={}