1994 lines
265 KiB
Plaintext
[2026-04-13 13:36:24] ============================================================
[2026-04-13 13:36:24] [AutoResearch] Phase 1 — Real PPO Training + GP+UCB Optimization
[2026-04-13 13:36:24] [AutoResearch] Max trials: 50 | kappa: 2.0 | push every: 10
[2026-04-13 13:36:24] [AutoResearch] Results: /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase1.jsonl
[2026-04-13 13:36:24] [AutoResearch] Champion: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion
[2026-04-13 13:36:24] ============================================================
[2026-04-13 13:36:24] [AutoResearch] Loaded 0 existing Phase 1 results.
[2026-04-13 13:36:24] [AutoResearch] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:36:24]
[AutoResearch] ========== Trial 1/50 ==========
[2026-04-13 13:36:24] [AutoResearch] Only 0 results — using random proposal.
[2026-04-13 13:36:24] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.0006023644308821473, 'timesteps': 4723, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:36:26] [AutoResearch] Launching trial 1: {'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.0006023644308821473, 'timesteps': 4723, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:41:13] [AutoResearch] Trial 1 finished in 286.6s, returncode=0
[2026-04-13 13:41:13] [AutoResearch] Trial 1: mean_reward=14.3331 std_reward=0.7924
[2026-04-13 13:41:13] [AutoResearch] === Trial 1 Summary ===
[2026-04-13 13:41:13] Total Phase 1 runs: 1
[2026-04-13 13:41:13] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:41:13] Top 5:
[2026-04-13 13:41:13] mean_reward=14.3331 params={'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.0006023644308821473, 'timesteps': 4723, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:41:15]
[AutoResearch] ========== Trial 2/50 ==========
[2026-04-13 13:41:15] [AutoResearch] Only 1 results — using random proposal.
[2026-04-13 13:41:15] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0026097080330405096, 'timesteps': 3663, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:41:17] [AutoResearch] Launching trial 2: {'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0026097080330405096, 'timesteps': 3663, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:44:39] [AutoResearch] Trial 2 finished in 201.7s, returncode=0
[2026-04-13 13:44:39] [AutoResearch] Trial 2: mean_reward=14.6781 std_reward=0.0047
[2026-04-13 13:44:39] [AutoResearch] === Trial 2 Summary ===
[2026-04-13 13:44:39] Total Phase 1 runs: 2
[2026-04-13 13:44:39] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:44:39] Top 5:
[2026-04-13 13:44:39] mean_reward=14.6781 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0026097080330405096, 'timesteps': 3663, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:44:39] mean_reward=14.3331 params={'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.0006023644308821473, 'timesteps': 4723, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:44:41]
[AutoResearch] ========== Trial 3/50 ==========
[2026-04-13 13:44:41] [AutoResearch] Only 2 results — using random proposal.
[2026-04-13 13:44:41] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0022425720960039287, 'timesteps': 1878, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:44:43] [AutoResearch] Launching trial 3: {'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0022425720960039287, 'timesteps': 1878, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:46:22] [AutoResearch] Trial 3 finished in 99.2s, returncode=0
[2026-04-13 13:46:22] [AutoResearch] Trial 3: mean_reward=15.0946 std_reward=0.0381
[2026-04-13 13:46:22] [AutoResearch] === Trial 3 Summary ===
[2026-04-13 13:46:22] Total Phase 1 runs: 3
[2026-04-13 13:46:22] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:46:22] Top 5:
[2026-04-13 13:46:22] mean_reward=15.0946 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0022425720960039287, 'timesteps': 1878, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:46:22] mean_reward=14.6781 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0026097080330405096, 'timesteps': 3663, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:46:22] mean_reward=14.3331 params={'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.0006023644308821473, 'timesteps': 4723, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:46:24]
[AutoResearch] ========== Trial 4/50 ==========
[2026-04-13 13:46:24] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 13:46:24] UCB=2.3657 mu=0.6683 sigma=0.8487 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0030375027886947775, 'timesteps': 2497}
[2026-04-13 13:46:24] UCB=2.3642 mu=0.6129 sigma=0.8757 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.003872092322417417, 'timesteps': 1454}
[2026-04-13 13:46:24] UCB=2.3627 mu=0.6363 sigma=0.8632 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.002284233345527573, 'timesteps': 2792}
[2026-04-13 13:46:24] UCB=2.3611 mu=0.6142 sigma=0.8735 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0024387325888159195, 'timesteps': 1898}
[2026-04-13 13:46:24] UCB=2.3610 mu=0.6522 sigma=0.8544 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0007105241846548975, 'timesteps': 1492}
[2026-04-13 13:46:24] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0030375027886947775, 'timesteps': 2497, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:46:26] [AutoResearch] Launching trial 4: {'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0030375027886947775, 'timesteps': 2497, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:51:00] [AutoResearch] Trial 4 finished in 273.6s, returncode=0
[2026-04-13 13:51:00] [AutoResearch] Trial 4: mean_reward=14.6036 std_reward=0.0414
[2026-04-13 13:51:00] [AutoResearch] === Trial 4 Summary ===
[2026-04-13 13:51:00] Total Phase 1 runs: 4
[2026-04-13 13:51:00] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:51:00] Top 5:
[2026-04-13 13:51:00] mean_reward=15.0946 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0022425720960039287, 'timesteps': 1878, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:51:00] mean_reward=14.6781 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0026097080330405096, 'timesteps': 3663, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:51:00] mean_reward=14.6036 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0030375027886947775, 'timesteps': 2497, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:51:00] mean_reward=14.3331 params={'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.0006023644308821473, 'timesteps': 4723, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:51:02]
[AutoResearch] ========== Trial 5/50 ==========
[2026-04-13 13:51:02] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 13:51:02] UCB=2.6511 mu=1.1122 sigma=0.7695 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0011131823295383878, 'timesteps': 1691}
[2026-04-13 13:51:02] UCB=2.6390 mu=1.1590 sigma=0.7400 params={'n_steer': 9, 'n_throttle': 4, 'learning_rate': 0.0023718639468651766, 'timesteps': 1039}
[2026-04-13 13:51:02] UCB=2.6371 mu=0.9686 sigma=0.8342 params={'n_steer': 9, 'n_throttle': 5, 'learning_rate': 0.0024714959214247904, 'timesteps': 1444}
[2026-04-13 13:51:02] UCB=2.6303 mu=1.0400 sigma=0.7951 params={'n_steer': 9, 'n_throttle': 5, 'learning_rate': 0.0028500094580389797, 'timesteps': 1932}
[2026-04-13 13:51:02] UCB=2.6281 mu=0.9823 sigma=0.8229 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0018569421425612218, 'timesteps': 1300}
[2026-04-13 13:51:02] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0011131823295383878, 'timesteps': 1691, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:51:04] [AutoResearch] Launching trial 5: {'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0011131823295383878, 'timesteps': 1691, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:52:58] [AutoResearch] Trial 5 finished in 114.3s, returncode=0
[2026-04-13 13:52:58] [AutoResearch] Trial 5: mean_reward=92.4248 std_reward=0.2184
[2026-04-13 13:52:58] [AutoResearch] === Trial 5 Summary ===
[2026-04-13 13:52:58] Total Phase 1 runs: 5
[2026-04-13 13:52:58] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:52:58] Top 5:
[2026-04-13 13:52:58] mean_reward=92.4248 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0011131823295383878, 'timesteps': 1691, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:52:58] mean_reward=15.0946 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0022425720960039287, 'timesteps': 1878, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:52:58] mean_reward=14.6781 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0026097080330405096, 'timesteps': 3663, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:52:58] mean_reward=14.6036 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0030375027886947775, 'timesteps': 2497, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:52:58] mean_reward=14.3331 params={'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.0006023644308821473, 'timesteps': 4723, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:53:00]
[AutoResearch] ========== Trial 6/50 ==========
[2026-04-13 13:53:00] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 13:53:00] UCB=2.7680 mu=1.8581 sigma=0.4549 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0010109905842864714, 'timesteps': 1150}
[2026-04-13 13:53:00] UCB=2.7571 mu=1.3859 sigma=0.6856 params={'n_steer': 7, 'n_throttle': 5, 'learning_rate': 0.0005903908533825176, 'timesteps': 2343}
[2026-04-13 13:53:00] UCB=2.6722 mu=1.7592 sigma=0.4565 params={'n_steer': 7, 'n_throttle': 5, 'learning_rate': 0.0007155529793779908, 'timesteps': 1801}
[2026-04-13 13:53:00] UCB=2.6514 mu=0.8841 sigma=0.8837 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.00022184611967850532, 'timesteps': 1388}
[2026-04-13 13:53:00] UCB=2.6250 mu=1.2493 sigma=0.6879 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0007146109608951488, 'timesteps': 1192}
[2026-04-13 13:53:00] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0010109905842864714, 'timesteps': 1150, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:53:02] [AutoResearch] Launching trial 6: {'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0010109905842864714, 'timesteps': 1150, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:54:52] [AutoResearch] Trial 6 finished in 109.6s, returncode=0
[2026-04-13 13:54:52] [AutoResearch] Trial 6: mean_reward=74.2498 std_reward=0.327
[2026-04-13 13:54:52] [AutoResearch] === Trial 6 Summary ===
[2026-04-13 13:54:52] Total Phase 1 runs: 6
[2026-04-13 13:54:52] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:54:52] Top 5:
[2026-04-13 13:54:52] mean_reward=92.4248 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0011131823295383878, 'timesteps': 1691, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:54:52] mean_reward=74.2498 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0010109905842864714, 'timesteps': 1150, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:54:52] mean_reward=15.0946 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0022425720960039287, 'timesteps': 1878, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:54:52] mean_reward=14.6781 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0026097080330405096, 'timesteps': 3663, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:54:52] mean_reward=14.6036 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0030375027886947775, 'timesteps': 2497, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:54:54]
[AutoResearch] ========== Trial 7/50 ==========
[2026-04-13 13:54:54] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 13:54:54] UCB=2.5933 mu=0.8355 sigma=0.8789 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0001799978550884136, 'timesteps': 2081}
[2026-04-13 13:54:54] UCB=2.5505 mu=0.7947 sigma=0.8779 params={'n_steer': 9, 'n_throttle': 4, 'learning_rate': 0.00014230944957386966, 'timesteps': 2582}
[2026-04-13 13:54:54] UCB=2.5503 mu=0.8231 sigma=0.8636 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.00023514026888676683, 'timesteps': 3062}
[2026-04-13 13:54:54] UCB=2.5217 mu=1.2654 sigma=0.6281 params={'n_steer': 7, 'n_throttle': 5, 'learning_rate': 0.0006389418712953596, 'timesteps': 1932}
[2026-04-13 13:54:54] UCB=2.3784 mu=0.6315 sigma=0.8734 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.00034403242394723434, 'timesteps': 2732}
[2026-04-13 13:54:54] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0001799978550884136, 'timesteps': 2081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:54:56] [AutoResearch] Launching trial 7: {'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0001799978550884136, 'timesteps': 2081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:58:28] [AutoResearch] Trial 7 finished in 212.1s, returncode=0
[2026-04-13 13:58:28] [AutoResearch] Trial 7: mean_reward=326.6374 std_reward=2.3715
[2026-04-13 13:58:28] [AutoResearch] === Trial 7 Summary ===
[2026-04-13 13:58:28] Total Phase 1 runs: 7
[2026-04-13 13:58:28] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:58:28] Top 5:
[2026-04-13 13:58:28] mean_reward=326.6374 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0001799978550884136, 'timesteps': 2081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:58:28] mean_reward=92.4248 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0011131823295383878, 'timesteps': 1691, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:58:28] mean_reward=74.2498 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0010109905842864714, 'timesteps': 1150, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:58:28] mean_reward=15.0946 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0022425720960039287, 'timesteps': 1878, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:58:28] mean_reward=14.6781 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0026097080330405096, 'timesteps': 3663, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:58:30]
[AutoResearch] ========== Trial 8/50 ==========
[2026-04-13 13:58:30] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 13:58:30] UCB=3.0530 mu=1.8512 sigma=0.6009 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293}
[2026-04-13 13:58:30] UCB=2.9620 mu=1.3333 sigma=0.8144 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0002690144955499583, 'timesteps': 2184}
[2026-04-13 13:58:30] UCB=2.9307 mu=1.5416 sigma=0.6945 params={'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.0007634930463964747, 'timesteps': 2464}
[2026-04-13 13:58:30] UCB=2.8609 mu=1.2576 sigma=0.8017 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.00030073926504989247, 'timesteps': 2405}
[2026-04-13 13:58:30] UCB=2.8085 mu=1.2678 sigma=0.7704 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0002460273175097693, 'timesteps': 2273}
[2026-04-13 13:58:30] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 13:58:32] [AutoResearch] Launching trial 8: {'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:02:13] [AutoResearch] Trial 8 finished in 221.7s, returncode=0
[2026-04-13 14:02:13] [AutoResearch] Trial 8: mean_reward=492.1545 std_reward=20.4057
[2026-04-13 14:02:13] [AutoResearch] === Trial 8 Summary ===
[2026-04-13 14:02:13] Total Phase 1 runs: 8
[2026-04-13 14:02:13] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:02:13] Top 5:
[2026-04-13 14:02:13] mean_reward=492.1545 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:02:13] mean_reward=326.6374 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0001799978550884136, 'timesteps': 2081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:02:13] mean_reward=92.4248 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0011131823295383878, 'timesteps': 1691, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:02:13] mean_reward=74.2498 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0010109905842864714, 'timesteps': 1150, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:02:13] mean_reward=15.0946 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0022425720960039287, 'timesteps': 1878, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:02:15]
[AutoResearch] ========== Trial 9/50 ==========
[2026-04-13 14:02:15] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:02:15] UCB=2.8782 mu=1.2806 sigma=0.7988 params={'n_steer': 4, 'n_throttle': 4, 'learning_rate': 0.0010269810535699494, 'timesteps': 1405}
[2026-04-13 14:02:15] UCB=2.8699 mu=1.2136 sigma=0.8282 params={'n_steer': 3, 'n_throttle': 4, 'learning_rate': 0.0008951125166902782, 'timesteps': 1867}
[2026-04-13 14:02:15] UCB=2.7808 mu=1.0494 sigma=0.8657 params={'n_steer': 3, 'n_throttle': 4, 'learning_rate': 0.0007281373711988931, 'timesteps': 1628}
[2026-04-13 14:02:15] UCB=2.7699 mu=1.4417 sigma=0.6641 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0012071606973414922, 'timesteps': 1881}
[2026-04-13 14:02:15] UCB=2.7343 mu=1.6068 sigma=0.5638 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0008505460420664955, 'timesteps': 2487}
[2026-04-13 14:02:15] [AutoResearch] Proposed: {'n_steer': 4, 'n_throttle': 4, 'learning_rate': 0.0010269810535699494, 'timesteps': 1405, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:02:17] [AutoResearch] Launching trial 9: {'n_steer': 4, 'n_throttle': 4, 'learning_rate': 0.0010269810535699494, 'timesteps': 1405, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:04:07] [AutoResearch] Trial 9 finished in 109.5s, returncode=0
[2026-04-13 14:04:07] [AutoResearch] Trial 9: mean_reward=47.3482 std_reward=0.07
[2026-04-13 14:04:07] [AutoResearch] === Trial 9 Summary ===
[2026-04-13 14:04:07] Total Phase 1 runs: 9
[2026-04-13 14:04:07] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:04:07] Top 5:
[2026-04-13 14:04:07] mean_reward=492.1545 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:04:07] mean_reward=326.6374 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0001799978550884136, 'timesteps': 2081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:04:07] mean_reward=92.4248 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0011131823295383878, 'timesteps': 1691, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:04:07] mean_reward=74.2498 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0010109905842864714, 'timesteps': 1150, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:04:07] mean_reward=47.3482 params={'n_steer': 4, 'n_throttle': 4, 'learning_rate': 0.0010269810535699494, 'timesteps': 1405, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:04:09]
[AutoResearch] ========== Trial 10/50 ==========
[2026-04-13 14:04:09] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:04:09] UCB=3.0861 mu=1.7282 sigma=0.6790 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717}
[2026-04-13 14:04:09] UCB=3.0467 mu=2.0076 sigma=0.5196 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.00021279372557134375, 'timesteps': 2438}
[2026-04-13 14:04:09] UCB=3.0434 mu=2.2259 sigma=0.4088 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.00041913098323291125, 'timesteps': 2524}
[2026-04-13 14:04:09] UCB=2.9788 mu=1.7738 sigma=0.6025 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.00031041165555133226, 'timesteps': 3122}
[2026-04-13 14:04:09] UCB=2.9735 mu=1.5955 sigma=0.6890 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0005830228901312196, 'timesteps': 2255}
[2026-04-13 14:04:09] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:04:11] [AutoResearch] Launching trial 10: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:07:58] [AutoResearch] Trial 10 finished in 226.7s, returncode=0
[2026-04-13 14:07:58] [AutoResearch] Trial 10: mean_reward=1157.047 std_reward=0.7533
[2026-04-13 14:07:58] [AutoResearch] === Trial 10 Summary ===
[2026-04-13 14:07:58] Total Phase 1 runs: 10
[2026-04-13 14:07:58] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:07:58] Top 5:
[2026-04-13 14:07:58] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:07:58] mean_reward=492.1545 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:07:58] mean_reward=326.6374 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0001799978550884136, 'timesteps': 2081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:07:58] mean_reward=92.4248 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0011131823295383878, 'timesteps': 1691, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:07:58] mean_reward=74.2498 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0010109905842864714, 'timesteps': 1150, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:07:59] [AutoResearch] Git push complete after trial 10
[2026-04-13 14:08:01]
[AutoResearch] ========== Trial 11/50 ==========
[2026-04-13 14:08:01] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:08:01] UCB=3.4245 mu=2.1335 sigma=0.6455 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0006487801810851401, 'timesteps': 3497}
[2026-04-13 14:08:01] UCB=3.4026 mu=2.4787 sigma=0.4619 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0005509497648997503, 'timesteps': 3181}
[2026-04-13 14:08:01] UCB=3.3530 mu=1.8438 sigma=0.7546 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00021061724588262036, 'timesteps': 3814}
[2026-04-13 14:08:01] UCB=3.2826 mu=1.7314 sigma=0.7756 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0003018922088252074, 'timesteps': 2774}
[2026-04-13 14:08:01] UCB=3.2756 mu=1.6815 sigma=0.7971 params={'n_steer': 5, 'n_throttle': 2, 'learning_rate': 0.000520158505476463, 'timesteps': 3742}
[2026-04-13 14:08:01] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0006487801810851401, 'timesteps': 3497, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:08:03] [AutoResearch] Launching trial 11: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0006487801810851401, 'timesteps': 3497, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:11:29] [AutoResearch] Trial 11 finished in 205.5s, returncode=0
[2026-04-13 14:11:29] [AutoResearch] Trial 11: mean_reward=295.0942 std_reward=8.7983
[2026-04-13 14:11:29] [AutoResearch] === Trial 11 Summary ===
[2026-04-13 14:11:29] Total Phase 1 runs: 11
[2026-04-13 14:11:29] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:11:29] Top 5:
[2026-04-13 14:11:29] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:11:29] mean_reward=492.1545 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:11:29] mean_reward=326.6374 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0001799978550884136, 'timesteps': 2081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:11:29] mean_reward=295.0942 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0006487801810851401, 'timesteps': 3497, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:11:29] mean_reward=92.4248 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0011131823295383878, 'timesteps': 1691, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:11:31]
[AutoResearch] ========== Trial 12/50 ==========
[2026-04-13 14:11:31] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:11:31] UCB=3.8974 mu=2.3114 sigma=0.7930 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00036708488973799465, 'timesteps': 1708}
[2026-04-13 14:11:31] UCB=3.8307 mu=2.4987 sigma=0.6660 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.00029894659146815144, 'timesteps': 2429}
[2026-04-13 14:11:31] UCB=3.7947 mu=2.3099 sigma=0.7424 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0009005388399341564, 'timesteps': 2218}
[2026-04-13 14:11:31] UCB=3.7156 mu=2.0628 sigma=0.8264 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0007350352970779585, 'timesteps': 1803}
[2026-04-13 14:11:31] UCB=3.6879 mu=2.2851 sigma=0.7014 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.001344651976676164, 'timesteps': 2228}
[2026-04-13 14:11:31] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00036708488973799465, 'timesteps': 1708, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:11:33] [AutoResearch] Launching trial 12: {'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00036708488973799465, 'timesteps': 1708, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:13:24] [AutoResearch] Trial 12 finished in 111.4s, returncode=0
[2026-04-13 14:13:24] [AutoResearch] Trial 12: mean_reward=57.3599 std_reward=0.3574
[2026-04-13 14:13:24] [AutoResearch] === Trial 12 Summary ===
[2026-04-13 14:13:24] Total Phase 1 runs: 12
[2026-04-13 14:13:24] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:13:24] Top 5:
[2026-04-13 14:13:24] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:13:24] mean_reward=492.1545 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:13:24] mean_reward=326.6374 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0001799978550884136, 'timesteps': 2081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:13:24] mean_reward=295.0942 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0006487801810851401, 'timesteps': 3497, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:13:24] mean_reward=92.4248 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0011131823295383878, 'timesteps': 1691, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:13:26]
[AutoResearch] ========== Trial 13/50 ==========
[2026-04-13 14:13:26] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:13:26] UCB=3.8867 mu=2.4252 sigma=0.7308 params={'n_steer': 4, 'n_throttle': 2, 'learning_rate': 0.0007540639059045701, 'timesteps': 2616}
[2026-04-13 14:13:26] UCB=3.4762 mu=1.6947 sigma=0.8907 params={'n_steer': 4, 'n_throttle': 2, 'learning_rate': 0.0006448520560867668, 'timesteps': 1828}
[2026-04-13 14:13:26] UCB=3.4298 mu=1.6469 sigma=0.8914 params={'n_steer': 3, 'n_throttle': 2, 'learning_rate': 6.829619904851873e-05, 'timesteps': 3068}
[2026-04-13 14:13:26] UCB=3.0569 mu=1.6819 sigma=0.6875 params={'n_steer': 3, 'n_throttle': 4, 'learning_rate': 0.0004803523239302434, 'timesteps': 2203}
[2026-04-13 14:13:26] UCB=2.9155 mu=1.2471 sigma=0.8342 params={'n_steer': 4, 'n_throttle': 2, 'learning_rate': 0.0014019934947968622, 'timesteps': 2871}
[2026-04-13 14:13:26] [AutoResearch] Proposed: {'n_steer': 4, 'n_throttle': 2, 'learning_rate': 0.0007540639059045701, 'timesteps': 2616, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:13:28] [AutoResearch] Launching trial 13: {'n_steer': 4, 'n_throttle': 2, 'learning_rate': 0.0007540639059045701, 'timesteps': 2616, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:17:13] [AutoResearch] Trial 13 finished in 225.0s, returncode=0
[2026-04-13 14:17:13] [AutoResearch] Trial 13: mean_reward=33.683 std_reward=0.1015
[2026-04-13 14:17:13] [AutoResearch] === Trial 13 Summary ===
[2026-04-13 14:17:13] Total Phase 1 runs: 13
[2026-04-13 14:17:13] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:17:13] Top 5:
[2026-04-13 14:17:13] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:17:13] mean_reward=492.1545 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:17:13] mean_reward=326.6374 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0001799978550884136, 'timesteps': 2081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:17:13] mean_reward=295.0942 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0006487801810851401, 'timesteps': 3497, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:17:13] mean_reward=92.4248 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0011131823295383878, 'timesteps': 1691, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:17:15]
[AutoResearch] ========== Trial 14/50 ==========
[2026-04-13 14:17:15] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:17:15] UCB=3.7132 mu=2.7971 sigma=0.4580 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.00065796575384948, 'timesteps': 2500}
[2026-04-13 14:17:15] UCB=3.6826 mu=2.7276 sigma=0.4775 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0005935435040011074, 'timesteps': 2295}
[2026-04-13 14:17:15] UCB=3.0544 mu=1.7790 sigma=0.6377 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0013345870493340923, 'timesteps': 2444}
[2026-04-13 14:17:15] UCB=3.0393 mu=1.4606 sigma=0.7893 params={'n_steer': 3, 'n_throttle': 3, 'learning_rate': 0.0010072545445081315, 'timesteps': 2556}
[2026-04-13 14:17:15] UCB=2.9760 mu=1.4846 sigma=0.7457 params={'n_steer': 3, 'n_throttle': 4, 'learning_rate': 0.000290256920789712, 'timesteps': 2875}
[2026-04-13 14:17:15] [AutoResearch] Proposed: {'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.00065796575384948, 'timesteps': 2500, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:17:17] [AutoResearch] Launching trial 14: {'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.00065796575384948, 'timesteps': 2500, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:20:31] [AutoResearch] Trial 14 finished in 193.8s, returncode=0
[2026-04-13 14:20:31] [AutoResearch] Trial 14: mean_reward=28.9888 std_reward=0.0928
[2026-04-13 14:20:31] [AutoResearch] === Trial 14 Summary ===
[2026-04-13 14:20:31] Total Phase 1 runs: 14
[2026-04-13 14:20:31] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:20:31] Top 5:
[2026-04-13 14:20:31] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:20:31] mean_reward=492.1545 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:20:31] mean_reward=326.6374 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0001799978550884136, 'timesteps': 2081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:20:31] mean_reward=295.0942 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0006487801810851401, 'timesteps': 3497, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:20:31] mean_reward=92.4248 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0011131823295383878, 'timesteps': 1691, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:20:33]
[AutoResearch] ========== Trial 15/50 ==========
[2026-04-13 14:20:33] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:20:33] UCB=4.1822 mu=3.8136 sigma=0.1843 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00015967596710454723, 'timesteps': 2500}
[2026-04-13 14:20:33] UCB=3.9849 mu=3.6984 sigma=0.1432 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00020203299697276424, 'timesteps': 2915}
[2026-04-13 14:20:33] UCB=3.1212 mu=1.3239 sigma=0.8986 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 5.1439595083155936e-05, 'timesteps': 3411}
[2026-04-13 14:20:33] UCB=3.0219 mu=1.3426 sigma=0.8396 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005125080920325789, 'timesteps': 3284}
[2026-04-13 14:20:33] UCB=2.8614 mu=2.3902 sigma=0.2356 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00013318306340867736, 'timesteps': 2174}
[2026-04-13 14:20:33] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00015967596710454723, 'timesteps': 2500, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:20:35] [AutoResearch] Launching trial 15: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00015967596710454723, 'timesteps': 2500, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:23:58] [AutoResearch] Trial 15 finished in 203.6s, returncode=0
[2026-04-13 14:23:58] [AutoResearch] Trial 15: mean_reward=296.5245 std_reward=0.8544
[2026-04-13 14:23:58] [AutoResearch] === Trial 15 Summary ===
[2026-04-13 14:23:58] Total Phase 1 runs: 15
[2026-04-13 14:23:58] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:23:58] Top 5:
[2026-04-13 14:23:58] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:23:58] mean_reward=492.1545 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:23:58] mean_reward=326.6374 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0001799978550884136, 'timesteps': 2081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:23:58] mean_reward=296.5245 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00015967596710454723, 'timesteps': 2500, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:23:58] mean_reward=295.0942 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0006487801810851401, 'timesteps': 3497, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:24:00]
[AutoResearch] ========== Trial 16/50 ==========
[2026-04-13 14:24:00] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:24:00] UCB=8.3198 mu=7.2782 sigma=0.5208 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0012899142850653915, 'timesteps': 2451}
[2026-04-13 14:24:00] UCB=8.2069 mu=7.0019 sigma=0.6025 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.001243494300822533, 'timesteps': 2623}
[2026-04-13 14:24:00] UCB=7.7757 mu=6.8085 sigma=0.4836 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0006531958020221599, 'timesteps': 2717}
[2026-04-13 14:24:00] UCB=7.4949 mu=6.2955 sigma=0.5997 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0010665298887244244, 'timesteps': 2471}
[2026-04-13 14:24:00] UCB=7.0435 mu=6.1542 sigma=0.4446 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0012582252159183014, 'timesteps': 2805}
[2026-04-13 14:24:00] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0012899142850653915, 'timesteps': 2451, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:24:02] [AutoResearch] Launching trial 16: {'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0012899142850653915, 'timesteps': 2451, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:27:35] [AutoResearch] Trial 16 finished in 212.5s, returncode=0
[2026-04-13 14:27:35] [AutoResearch] Trial 16: mean_reward=15.5282 std_reward=0.0252
[2026-04-13 14:27:35] [AutoResearch] === Trial 16 Summary ===
[2026-04-13 14:27:35] Total Phase 1 runs: 16
[2026-04-13 14:27:35] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:27:35] Top 5:
[2026-04-13 14:27:35] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:27:35] mean_reward=492.1545 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:27:35] mean_reward=326.6374 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0001799978550884136, 'timesteps': 2081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:27:35] mean_reward=296.5245 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00015967596710454723, 'timesteps': 2500, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:27:35] mean_reward=295.0942 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0006487801810851401, 'timesteps': 3497, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:27:37]
[AutoResearch] ========== Trial 17/50 ==========
[2026-04-13 14:27:37] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:27:37] UCB=8.1359 mu=7.3725 sigma=0.3817 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0012604760157625598, 'timesteps': 2679}
[2026-04-13 14:27:37] UCB=8.0777 mu=6.9461 sigma=0.5658 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0017255017088966425, 'timesteps': 2530}
[2026-04-13 14:27:37] UCB=6.8693 mu=5.2870 sigma=0.7911 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0009439788338821218, 'timesteps': 2931}
[2026-04-13 14:27:37] UCB=6.8669 mu=5.7551 sigma=0.5559 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0016476310168858867, 'timesteps': 2769}
[2026-04-13 14:27:37] UCB=6.6556 mu=5.4318 sigma=0.6119 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0009292913262432035, 'timesteps': 2642}
[2026-04-13 14:27:37] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0012604760157625598, 'timesteps': 2679, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:27:39] [AutoResearch] Launching trial 17: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0012604760157625598, 'timesteps': 2679, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:31:18] [AutoResearch] Trial 17 finished in 218.8s, returncode=0
[2026-04-13 14:31:18] [AutoResearch] Trial 17: mean_reward=25.041 std_reward=0.2538
[2026-04-13 14:31:18] [AutoResearch] === Trial 17 Summary ===
[2026-04-13 14:31:18] Total Phase 1 runs: 17
[2026-04-13 14:31:18] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:31:18] Top 5:
[2026-04-13 14:31:18] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:31:18] mean_reward=492.1545 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:31:18] mean_reward=326.6374 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0001799978550884136, 'timesteps': 2081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:31:18] mean_reward=296.5245 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00015967596710454723, 'timesteps': 2500, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:31:18] mean_reward=295.0942 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0006487801810851401, 'timesteps': 3497, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:31:20]
[AutoResearch] ========== Trial 18/50 ==========
[2026-04-13 14:31:20] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:31:20] UCB=7.0244 mu=5.9003 sigma=0.5621 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472}
[2026-04-13 14:31:20] UCB=6.0955 mu=4.8517 sigma=0.6219 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.00020480510091558957, 'timesteps': 2306}
[2026-04-13 14:31:20] UCB=6.0818 mu=4.6603 sigma=0.7107 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0007009084549222966, 'timesteps': 3007}
[2026-04-13 14:31:20] UCB=5.7812 mu=4.0041 sigma=0.8885 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0007689592895820599, 'timesteps': 2581}
[2026-04-13 14:31:20] UCB=5.6269 mu=4.0968 sigma=0.7651 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0009037770199337085, 'timesteps': 2594}
[2026-04-13 14:31:20] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:31:22] [AutoResearch] Launching trial 18: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:35:35] [AutoResearch] Trial 18 finished in 252.8s, returncode=0
[2026-04-13 14:35:35] [AutoResearch] Trial 18: mean_reward=1389.3806 std_reward=4.4479
[2026-04-13 14:35:35] [AutoResearch] === Trial 18 Summary ===
[2026-04-13 14:35:35] Total Phase 1 runs: 18
[2026-04-13 14:35:35] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:35:35] Top 5:
[2026-04-13 14:35:35] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:35:35] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:35:35] mean_reward=492.1545 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:35:35] mean_reward=326.6374 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0001799978550884136, 'timesteps': 2081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:35:35] mean_reward=296.5245 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00015967596710454723, 'timesteps': 2500, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:35:37]
[AutoResearch] ========== Trial 19/50 ==========
[2026-04-13 14:35:37] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:35:37] UCB=3.7303 mu=3.1686 sigma=0.2808 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914}
[2026-04-13 14:35:37] UCB=3.6627 mu=3.1516 sigma=0.2556 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00030687648195598525, 'timesteps': 2754}
[2026-04-13 14:35:37] UCB=3.3895 mu=2.0524 sigma=0.6685 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0005669280868608377, 'timesteps': 2176}
[2026-04-13 14:35:37] UCB=3.2511 mu=1.6136 sigma=0.8188 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0005635699313680674, 'timesteps': 2489}
[2026-04-13 14:35:37] UCB=3.2373 mu=1.7930 sigma=0.7222 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0005912469317752097, 'timesteps': 1811}
[2026-04-13 14:35:37] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:35:39] [AutoResearch] Launching trial 19: {'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:39:40] [AutoResearch] Trial 19 finished in 241.7s, returncode=0
[2026-04-13 14:39:40] [AutoResearch] Trial 19: mean_reward=1072.7063 std_reward=4.9159
[2026-04-13 14:39:40] [AutoResearch] === Trial 19 Summary ===
[2026-04-13 14:39:40] Total Phase 1 runs: 19
[2026-04-13 14:39:40] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:39:40] Top 5:
[2026-04-13 14:39:40] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:39:40] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:39:40] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:39:40] mean_reward=492.1545 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:39:40] mean_reward=326.6374 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0001799978550884136, 'timesteps': 2081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:39:42]
[AutoResearch] ========== Trial 20/50 ==========
[2026-04-13 14:39:42] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:39:42] UCB=4.1899 mu=2.9389 sigma=0.6255 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382}
[2026-04-13 14:39:42] UCB=4.0210 mu=2.6315 sigma=0.6947 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0008397617990643678, 'timesteps': 2216}
[2026-04-13 14:39:42] UCB=3.9967 mu=2.3696 sigma=0.8136 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0013674551870704959, 'timesteps': 1522}
[2026-04-13 14:39:42] UCB=3.8338 mu=2.1883 sigma=0.8228 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0005541858758139742, 'timesteps': 1280}
[2026-04-13 14:39:42] UCB=3.7481 mu=2.5203 sigma=0.6139 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 7.025023485216959e-05, 'timesteps': 2093}
[2026-04-13 14:39:42] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:39:44] [AutoResearch] Launching trial 20: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:41:55] [AutoResearch] Trial 20 finished in 130.6s, returncode=0
[2026-04-13 14:41:55] [AutoResearch] Trial 20: mean_reward=821.1389 std_reward=234.0365
[2026-04-13 14:41:55] [AutoResearch] === Trial 20 Summary ===
[2026-04-13 14:41:55] Total Phase 1 runs: 20
[2026-04-13 14:41:55] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:41:55] Top 5:
[2026-04-13 14:41:55] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:41:55] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:41:55] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:41:55] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:41:55] mean_reward=492.1545 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:41:56] [AutoResearch] Git push complete after trial 20
[2026-04-13 14:41:58]
[AutoResearch] ========== Trial 21/50 ==========
[2026-04-13 14:41:58] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:41:58] UCB=3.8136 mu=2.8224 sigma=0.4956 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0007169004126896797, 'timesteps': 1991}
[2026-04-13 14:41:58] UCB=3.5260 mu=1.8738 sigma=0.8261 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.00014906253326062445, 'timesteps': 1656}
[2026-04-13 14:41:58] UCB=3.3444 mu=1.6081 sigma=0.8681 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.0007228760687423513, 'timesteps': 2342}
[2026-04-13 14:41:58] UCB=3.2706 mu=1.4597 sigma=0.9054 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.00011495962027542093, 'timesteps': 1045}
[2026-04-13 14:41:58] UCB=2.9919 mu=1.9088 sigma=0.5415 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.00016186063350120652, 'timesteps': 2028}
[2026-04-13 14:41:58] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0007169004126896797, 'timesteps': 1991, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:42:00] [AutoResearch] Launching trial 21: {'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0007169004126896797, 'timesteps': 1991, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:43:56] [AutoResearch] Trial 21 finished in 115.4s, returncode=0
[2026-04-13 14:43:56] [AutoResearch] Trial 21: mean_reward=23.9294 std_reward=0.0242
[2026-04-13 14:43:56] [AutoResearch] === Trial 21 Summary ===
[2026-04-13 14:43:56] Total Phase 1 runs: 21
[2026-04-13 14:43:56] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:43:56] Top 5:
[2026-04-13 14:43:56] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:43:56] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:43:56] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:43:56] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:43:56] mean_reward=492.1545 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003046182905194233, 'timesteps': 2293, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:43:58]
[AutoResearch] ========== Trial 22/50 ==========
[2026-04-13 14:43:58] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:43:58] UCB=3.2203 mu=2.5187 sigma=0.3508 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156}
[2026-04-13 14:43:58] UCB=2.7442 mu=1.8625 sigma=0.4409 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0010289690181806371, 'timesteps': 1685}
[2026-04-13 14:43:58] UCB=2.6790 mu=1.0884 sigma=0.7953 params={'n_steer': 3, 'n_throttle': 5, 'learning_rate': 0.00025603119181826265, 'timesteps': 1153}
[2026-04-13 14:43:58] UCB=2.6591 mu=1.6599 sigma=0.4996 params={'n_steer': 4, 'n_throttle': 4, 'learning_rate': 0.0005948263081101674, 'timesteps': 2945}
[2026-04-13 14:43:58] UCB=2.6466 mu=1.0963 sigma=0.7752 params={'n_steer': 3, 'n_throttle': 5, 'learning_rate': 0.0014869800982187835, 'timesteps': 1534}
[2026-04-13 14:43:58] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:44:00] [AutoResearch] Launching trial 22: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:48:13] [AutoResearch] Trial 22 finished in 252.9s, returncode=0
[2026-04-13 14:48:13] [AutoResearch] Trial 22: mean_reward=1859.847 std_reward=4.6351
[2026-04-13 14:48:13] [AutoResearch] === Trial 22 Summary ===
[2026-04-13 14:48:13] Total Phase 1 runs: 22
[2026-04-13 14:48:13] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:48:13] Top 5:
[2026-04-13 14:48:13] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:48:13] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:48:13] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:48:13] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:48:13] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:48:15]
[AutoResearch] ========== Trial 23/50 ==========
[2026-04-13 14:48:15] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:48:15] UCB=3.4508 mu=2.2736 sigma=0.5886 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0004810438536887613, 'timesteps': 1972}
[2026-04-13 14:48:15] UCB=3.4191 mu=1.9865 sigma=0.7163 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0006782346336585454, 'timesteps': 2316}
[2026-04-13 14:48:15] UCB=3.0160 mu=1.3590 sigma=0.8285 params={'n_steer': 3, 'n_throttle': 4, 'learning_rate': 0.00015648799569206457, 'timesteps': 3627}
[2026-04-13 14:48:15] UCB=2.9983 mu=1.3451 sigma=0.8266 params={'n_steer': 3, 'n_throttle': 4, 'learning_rate': 0.0006745921677002186, 'timesteps': 2926}
[2026-04-13 14:48:15] UCB=2.9502 mu=1.4387 sigma=0.7557 params={'n_steer': 4, 'n_throttle': 4, 'learning_rate': 0.00046495943726949734, 'timesteps': 2735}
[2026-04-13 14:48:15] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0004810438536887613, 'timesteps': 1972, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:48:17] [AutoResearch] Launching trial 23: {'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0004810438536887613, 'timesteps': 1972, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:50:23] [AutoResearch] Trial 23 finished in 126.5s, returncode=0
[2026-04-13 14:50:23] [AutoResearch] Trial 23: mean_reward=211.9381 std_reward=0.5943
[2026-04-13 14:50:23] [AutoResearch] === Trial 23 Summary ===
[2026-04-13 14:50:23] Total Phase 1 runs: 23
[2026-04-13 14:50:23] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:50:23] Top 5:
[2026-04-13 14:50:23] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:50:23] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:50:23] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:50:23] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:50:23] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:50:25]
[AutoResearch] ========== Trial 24/50 ==========
[2026-04-13 14:50:25] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:50:25] UCB=2.9789 mu=1.2777 sigma=0.8506 params={'n_steer': 3, 'n_throttle': 5, 'learning_rate': 0.0007945313135188126, 'timesteps': 2605}
[2026-04-13 14:50:25] UCB=2.9480 mu=2.4478 sigma=0.2501 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0008293839244973556, 'timesteps': 1800}
[2026-04-13 14:50:25] UCB=2.4588 mu=1.2989 sigma=0.5799 params={'n_steer': 4, 'n_throttle': 4, 'learning_rate': 0.0003778611590363342, 'timesteps': 2517}
[2026-04-13 14:50:25] UCB=2.4403 mu=1.0025 sigma=0.7189 params={'n_steer': 3, 'n_throttle': 4, 'learning_rate': 0.0019303183103883325, 'timesteps': 1569}
[2026-04-13 14:50:25] UCB=2.4358 mu=0.6432 sigma=0.8963 params={'n_steer': 3, 'n_throttle': 4, 'learning_rate': 0.0027965188439229814, 'timesteps': 1004}
[2026-04-13 14:50:25] [AutoResearch] Proposed: {'n_steer': 3, 'n_throttle': 5, 'learning_rate': 0.0007945313135188126, 'timesteps': 2605, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:50:27] [AutoResearch] Launching trial 24: {'n_steer': 3, 'n_throttle': 5, 'learning_rate': 0.0007945313135188126, 'timesteps': 2605, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:54:21] [AutoResearch] Trial 24 finished in 233.3s, returncode=0
[2026-04-13 14:54:21] [AutoResearch] Trial 24: mean_reward=22.2095 std_reward=0.0496
[2026-04-13 14:54:21] [AutoResearch] === Trial 24 Summary ===
[2026-04-13 14:54:21] Total Phase 1 runs: 24
[2026-04-13 14:54:21] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:54:21] Top 5:
[2026-04-13 14:54:21] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:54:21] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:54:21] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:54:21] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:54:21] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:54:23]
[AutoResearch] ========== Trial 25/50 ==========
[2026-04-13 14:54:23] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:54:23] UCB=3.1390 mu=2.4425 sigma=0.3483 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 5.954273148103271e-05, 'timesteps': 3313}
[2026-04-13 14:54:23] UCB=2.6722 mu=2.0120 sigma=0.3301 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0001025090708476032, 'timesteps': 3502}
[2026-04-13 14:54:23] UCB=2.4749 mu=1.8672 sigma=0.3039 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012006077737121994, 'timesteps': 2179}
[2026-04-13 14:54:23] UCB=2.4596 mu=0.6881 sigma=0.8858 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.004688975887028294, 'timesteps': 2780}
[2026-04-13 14:54:23] UCB=2.4093 mu=0.6759 sigma=0.8667 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.004722071725864288, 'timesteps': 2896}
[2026-04-13 14:54:23] [AutoResearch] Proposed: {'n_steer': 4, 'n_throttle': 3, 'learning_rate': 5.954273148103271e-05, 'timesteps': 3313, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:54:25] [AutoResearch] Launching trial 25: {'n_steer': 4, 'n_throttle': 3, 'learning_rate': 5.954273148103271e-05, 'timesteps': 3313, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:57:54] [AutoResearch] Trial 25 finished in 209.9s, returncode=0
[2026-04-13 14:57:54] [AutoResearch] Trial 25: mean_reward=237.8844 std_reward=0.2589
[2026-04-13 14:57:54] [AutoResearch] === Trial 25 Summary ===
[2026-04-13 14:57:54] Total Phase 1 runs: 25
[2026-04-13 14:57:54] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:57:54] Top 5:
[2026-04-13 14:57:54] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:57:54] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:57:54] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:57:54] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:57:54] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:57:56]
[AutoResearch] ========== Trial 26/50 ==========
[2026-04-13 14:57:56] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 14:57:56] UCB=2.5652 mu=0.7504 sigma=0.9074 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.004801715827187974, 'timesteps': 2543}
[2026-04-13 14:57:56] UCB=2.4423 mu=0.5942 sigma=0.9241 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.004973066114846697, 'timesteps': 2444}
[2026-04-13 14:57:56] UCB=2.4395 mu=0.7410 sigma=0.8492 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.0045180303501340955, 'timesteps': 3047}
[2026-04-13 14:57:56] UCB=2.4381 mu=0.5876 sigma=0.9252 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.004911706851291294, 'timesteps': 1756}
[2026-04-13 14:57:56] UCB=2.4131 mu=0.5551 sigma=0.9290 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.004522587810616554, 'timesteps': 1800}
[2026-04-13 14:57:56] [AutoResearch] Proposed: {'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.004801715827187974, 'timesteps': 2543, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 14:57:59] [AutoResearch] Launching trial 26: {'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.004801715827187974, 'timesteps': 2543, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:02:34] [AutoResearch] Trial 26 finished in 275.5s, returncode=0
[2026-04-13 15:02:34] [AutoResearch] Trial 26: mean_reward=15.0771 std_reward=0.0213
[2026-04-13 15:02:34] [AutoResearch] === Trial 26 Summary ===
[2026-04-13 15:02:34] Total Phase 1 runs: 26
[2026-04-13 15:02:34] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:02:34] Top 5:
[2026-04-13 15:02:34] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:02:34] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:02:34] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:02:34] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:02:34] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:02:36]
[AutoResearch] ========== Trial 27/50 ==========
[2026-04-13 15:02:36] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:02:36] UCB=3.0288 mu=2.4828 sigma=0.2730 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0007139046027489641, 'timesteps': 2278}
[2026-04-13 15:02:36] UCB=2.9336 mu=2.1217 sigma=0.4059 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0003943076736423479, 'timesteps': 1928}
[2026-04-13 15:02:36] UCB=2.7988 mu=2.2863 sigma=0.2562 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0011223236314925372, 'timesteps': 1871}
[2026-04-13 15:02:36] UCB=2.3913 mu=1.5376 sigma=0.4269 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0005593634601157492, 'timesteps': 1747}
[2026-04-13 15:02:36] UCB=2.2062 mu=0.6541 sigma=0.7761 params={'n_steer': 4, 'n_throttle': 5, 'learning_rate': 8.72586099538617e-05, 'timesteps': 1047}
[2026-04-13 15:02:36] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0007139046027489641, 'timesteps': 2278, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:02:38] [AutoResearch] Launching trial 27: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0007139046027489641, 'timesteps': 2278, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:06:12] [AutoResearch] Trial 27 finished in 214.4s, returncode=0
[2026-04-13 15:06:12] [AutoResearch] Trial 27: mean_reward=435.0689 std_reward=72.8002
[2026-04-13 15:06:12] [AutoResearch] === Trial 27 Summary ===
[2026-04-13 15:06:12] Total Phase 1 runs: 27
[2026-04-13 15:06:12] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:06:12] Top 5:
[2026-04-13 15:06:12] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:06:12] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:06:12] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:06:12] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:06:12] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:06:14]
[AutoResearch] ========== Trial 28/50 ==========
[2026-04-13 15:06:14] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:06:14] UCB=8.1497 mu=7.3892 sigma=0.3802 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 7.57033141698586e-05, 'timesteps': 1795}
[2026-04-13 15:06:14] UCB=6.4243 mu=5.1219 sigma=0.6512 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.00027636763846234557, 'timesteps': 1049}
[2026-04-13 15:06:14] UCB=5.0712 mu=3.2994 sigma=0.8859 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.003481351347296331, 'timesteps': 1124}
[2026-04-13 15:06:14] UCB=4.9277 mu=3.5508 sigma=0.6884 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0028061703799304324, 'timesteps': 1078}
[2026-04-13 15:06:14] UCB=4.7717 mu=3.0410 sigma=0.8654 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.0037370286662698512, 'timesteps': 1034}
[2026-04-13 15:06:14] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 7.57033141698586e-05, 'timesteps': 1795, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:06:16] [AutoResearch] Launching trial 28: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 7.57033141698586e-05, 'timesteps': 1795, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:08:07] [AutoResearch] Trial 28 finished in 110.7s, returncode=0
[2026-04-13 15:08:07] [AutoResearch] Trial 28: mean_reward=82.7727 std_reward=0.8551
[2026-04-13 15:08:07] [AutoResearch] === Trial 28 Summary ===
[2026-04-13 15:08:07] Total Phase 1 runs: 28
[2026-04-13 15:08:07] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:08:07] Top 5:
[2026-04-13 15:08:07] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:08:07] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:08:07] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:08:07] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:08:07] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:08:09]
[AutoResearch] ========== Trial 29/50 ==========
[2026-04-13 15:08:09] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:08:09] UCB=5.2779 mu=3.9398 sigma=0.6691 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0038702768940272764, 'timesteps': 1630}
[2026-04-13 15:08:09] UCB=4.6112 mu=2.8243 sigma=0.8935 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0040377757528288795, 'timesteps': 1025}
[2026-04-13 15:08:09] UCB=4.5876 mu=3.6875 sigma=0.4500 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.003898240324417907, 'timesteps': 2181}
[2026-04-13 15:08:09] UCB=4.3613 mu=3.1401 sigma=0.6106 params={'n_steer': 9, 'n_throttle': 4, 'learning_rate': 0.003711559821615946, 'timesteps': 1966}
[2026-04-13 15:08:09] UCB=4.2189 mu=2.7285 sigma=0.7452 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.003452590260446862, 'timesteps': 1304}
[2026-04-13 15:08:09] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0038702768940272764, 'timesteps': 1630, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:08:11] [AutoResearch] Launching trial 29: {'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0038702768940272764, 'timesteps': 1630, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:09:54] [AutoResearch] Trial 29 finished in 102.6s, returncode=0
[2026-04-13 15:09:54] [AutoResearch] Trial 29: mean_reward=15.5211 std_reward=0.0294
[2026-04-13 15:09:54] [AutoResearch] === Trial 29 Summary ===
[2026-04-13 15:09:54] Total Phase 1 runs: 29
[2026-04-13 15:09:54] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:09:54] Top 5:
[2026-04-13 15:09:54] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:09:54] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:09:54] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:09:54] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:09:54] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:09:56]
[AutoResearch] ========== Trial 30/50 ==========
[2026-04-13 15:09:56] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:09:56] UCB=4.4896 mu=4.1988 sigma=0.1454 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 6.732545053457424e-05, 'timesteps': 2708}
[2026-04-13 15:09:56] UCB=4.2126 mu=3.5464 sigma=0.3331 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0001083987083551113, 'timesteps': 3033}
[2026-04-13 15:09:56] UCB=4.1748 mu=2.4436 sigma=0.8656 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.004047059120159478, 'timesteps': 3889}
[2026-04-13 15:09:56] UCB=4.0196 mu=3.0474 sigma=0.4861 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.003963695470731348, 'timesteps': 2883}
[2026-04-13 15:09:56] UCB=3.9599 mu=2.3481 sigma=0.8059 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.004271745963239979, 'timesteps': 3390}
[2026-04-13 15:09:56] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 3, 'learning_rate': 6.732545053457424e-05, 'timesteps': 2708, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:09:58] [AutoResearch] Launching trial 30: {'n_steer': 6, 'n_throttle': 3, 'learning_rate': 6.732545053457424e-05, 'timesteps': 2708, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:13:21] [AutoResearch] Trial 30 finished in 203.3s, returncode=0
[2026-04-13 15:13:21] [AutoResearch] Trial 30: mean_reward=267.9527 std_reward=7.4167
[2026-04-13 15:13:21] [AutoResearch] === Trial 30 Summary ===
[2026-04-13 15:13:21] Total Phase 1 runs: 30
[2026-04-13 15:13:21] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:13:21] Top 5:
[2026-04-13 15:13:21] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:13:21] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:13:21] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:13:21] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:13:21] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:13:22] [AutoResearch] Git push complete after trial 30
[2026-04-13 15:13:24]
[AutoResearch] ========== Trial 31/50 ==========
[2026-04-13 15:13:24] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:13:24] UCB=5.5651 mu=4.3580 sigma=0.6036 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.004135349936389352, 'timesteps': 3206}
[2026-04-13 15:13:24] UCB=5.5642 mu=4.5363 sigma=0.5139 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0038596790201358227, 'timesteps': 2946}
[2026-04-13 15:13:24] UCB=4.4093 mu=2.5958 sigma=0.9067 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.004156951783506707, 'timesteps': 3826}
[2026-04-13 15:13:24] UCB=4.2822 mu=2.3905 sigma=0.9458 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.003842385514909252, 'timesteps': 4354}
[2026-04-13 15:13:24] UCB=4.2719 mu=2.3951 sigma=0.9384 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.004118857172828388, 'timesteps': 4203}
[2026-04-13 15:13:24] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.004135349936389352, 'timesteps': 3206, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:13:26] [AutoResearch] Launching trial 31: {'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.004135349936389352, 'timesteps': 3206, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:17:17] [AutoResearch] Trial 31 finished in 230.1s, returncode=0
[2026-04-13 15:17:17] [AutoResearch] Trial 31: mean_reward=14.3463 std_reward=0.4624
[2026-04-13 15:17:17] [AutoResearch] === Trial 31 Summary ===
[2026-04-13 15:17:17] Total Phase 1 runs: 31
[2026-04-13 15:17:17] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:17:17] Top 5:
[2026-04-13 15:17:17] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:17:17] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:17:17] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:17:17] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:17:17] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:17:19]
[AutoResearch] ========== Trial 32/50 ==========
[2026-04-13 15:17:19] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:17:19] UCB=3.7776 mu=2.6148 sigma=0.5814 params={'n_steer': 9, 'n_throttle': 4, 'learning_rate': 0.0033897596454698794, 'timesteps': 1993}
[2026-04-13 15:17:19] UCB=3.2663 mu=1.9851 sigma=0.6406 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.003975169633599844, 'timesteps': 2404}
[2026-04-13 15:17:19] UCB=3.1243 mu=2.2294 sigma=0.4474 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.001085936336153102, 'timesteps': 1347}
[2026-04-13 15:17:19] UCB=3.0807 mu=1.6082 sigma=0.7362 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.00021511093369327893, 'timesteps': 3455}
[2026-04-13 15:17:19] UCB=3.0410 mu=1.3292 sigma=0.8559 params={'n_steer': 9, 'n_throttle': 5, 'learning_rate': 0.0033704773533770626, 'timesteps': 2742}
[2026-04-13 15:17:19] [AutoResearch] Proposed: {'n_steer': 9, 'n_throttle': 4, 'learning_rate': 0.0033897596454698794, 'timesteps': 1993, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:17:21] [AutoResearch] Launching trial 32: {'n_steer': 9, 'n_throttle': 4, 'learning_rate': 0.0033897596454698794, 'timesteps': 1993, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:19:14] [AutoResearch] Trial 32 finished in 112.9s, returncode=0
[2026-04-13 15:19:14] [AutoResearch] Trial 32: mean_reward=15.5031 std_reward=0.0044
[2026-04-13 15:19:14] [AutoResearch] === Trial 32 Summary ===
[2026-04-13 15:19:14] Total Phase 1 runs: 32
[2026-04-13 15:19:14] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:19:14] Top 5:
[2026-04-13 15:19:14] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:19:14] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:19:14] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:19:14] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:19:14] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:19:16]
[AutoResearch] ========== Trial 33/50 ==========
[2026-04-13 15:19:16] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:19:16] UCB=3.5850 mu=2.4733 sigma=0.5558 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0014086345870627532, 'timesteps': 1047}
[2026-04-13 15:19:16] UCB=3.4733 mu=2.0484 sigma=0.7124 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.00021886708641480224, 'timesteps': 3412}
[2026-04-13 15:19:16] UCB=3.3215 mu=2.3680 sigma=0.4768 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0011490025067707607, 'timesteps': 1218}
[2026-04-13 15:19:16] UCB=2.9198 mu=1.7734 sigma=0.5732 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.00018304261685525393, 'timesteps': 1998}
[2026-04-13 15:19:16] UCB=2.8965 mu=1.7977 sigma=0.5494 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.00025906248389237, 'timesteps': 2317}
[2026-04-13 15:19:16] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0014086345870627532, 'timesteps': 1047, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:19:18] [AutoResearch] Launching trial 33: {'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0014086345870627532, 'timesteps': 1047, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:27:18] [AutoResearch] Trial 33 TIMED OUT after 480.1s
[2026-04-13 15:27:18] [AutoResearch] Trial 33: mean_reward=None std_reward=None
[2026-04-13 15:27:18] [AutoResearch] === Trial 33 Summary ===
[2026-04-13 15:27:18] Total Phase 1 runs: 32
[2026-04-13 15:27:18] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:27:18] Top 5:
[2026-04-13 15:27:18] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:27:18] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:27:18] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:27:18] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:27:18] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:27:20]
[AutoResearch] ========== Trial 34/50 ==========
[2026-04-13 15:27:20] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:27:20] UCB=3.0602 mu=2.0221 sigma=0.5190 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0005797524390357458, 'timesteps': 1730}
[2026-04-13 15:27:20] UCB=2.9639 mu=2.3355 sigma=0.3142 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.003342940719681374, 'timesteps': 2781}
[2026-04-13 15:27:20] UCB=2.8402 mu=1.0555 sigma=0.8923 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.00011009164810198821, 'timesteps': 3597}
[2026-04-13 15:27:20] UCB=2.7623 mu=1.4832 sigma=0.6396 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.002448672401879597, 'timesteps': 1031}
[2026-04-13 15:27:20] UCB=2.6515 mu=1.5184 sigma=0.5666 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.003370243610453799, 'timesteps': 3185}
[2026-04-13 15:27:20] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0005797524390357458, 'timesteps': 1730, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:27:22] [AutoResearch] Launching trial 34: {'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0005797524390357458, 'timesteps': 1730, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:27:25] [AutoResearch] Trial 34 finished in 3.0s, returncode=100
[2026-04-13 15:27:25] [AutoResearch] Trial 34: mean_reward=None std_reward=None
[2026-04-13 15:27:25] [AutoResearch] === Trial 34 Summary ===
[2026-04-13 15:27:25] Total Phase 1 runs: 32
[2026-04-13 15:27:25] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:27:25] Top 5:
[2026-04-13 15:27:25] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:27:25] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:27:25] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:27:25] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:27:25] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:27:27]
[AutoResearch] ========== Trial 35/50 ==========
[2026-04-13 15:27:27] [AutoResearch] GP UCB top-5 candidates:
|
|
[2026-04-13 15:27:27] UCB=3.8315 mu=3.2173 sigma=0.3071 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 5.189151979066741e-05, 'timesteps': 2800}
|
|
[2026-04-13 15:27:27] UCB=3.5173 mu=1.9176 sigma=0.7999 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.0036174691379323937, 'timesteps': 2688}
|
|
[2026-04-13 15:27:27] UCB=3.4074 mu=1.8779 sigma=0.7648 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.0035558136362367604, 'timesteps': 2380}
|
|
[2026-04-13 15:27:27] UCB=3.1016 mu=2.2958 sigma=0.4029 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0007703363956986522, 'timesteps': 1578}
|
|
[2026-04-13 15:27:27] UCB=3.0396 mu=2.8296 sigma=0.1050 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.00017347635729668062, 'timesteps': 2581}
|
|
[2026-04-13 15:27:27] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 3, 'learning_rate': 5.189151979066741e-05, 'timesteps': 2800, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:29] [AutoResearch] Launching trial 35: {'n_steer': 8, 'n_throttle': 3, 'learning_rate': 5.189151979066741e-05, 'timesteps': 2800, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:31] [AutoResearch] Trial 35 finished in 2.7s, returncode=100
|
|
[2026-04-13 15:27:31] [AutoResearch] Trial 35: mean_reward=None std_reward=None
|
|
[2026-04-13 15:27:31] [AutoResearch] === Trial 35 Summary ===
|
|
[2026-04-13 15:27:31] Total Phase 1 runs: 32
|
|
[2026-04-13 15:27:31] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:31] Top 5:
|
|
[2026-04-13 15:27:31] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:31] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:31] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:31] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:31] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:33]
|
|
[AutoResearch] ========== Trial 36/50 ==========
|
|
[2026-04-13 15:27:33] [AutoResearch] GP UCB top-5 candidates:
|
|
[2026-04-13 15:27:33] UCB=3.1494 mu=1.9716 sigma=0.5889 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0017131753633976303, 'timesteps': 1286}
|
|
[2026-04-13 15:27:33] UCB=3.0055 mu=1.3764 sigma=0.8145 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.004273088506596801, 'timesteps': 2449}
|
|
[2026-04-13 15:27:33] UCB=2.9839 mu=1.9820 sigma=0.5010 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.001505487454944658, 'timesteps': 1324}
|
|
[2026-04-13 15:27:33] UCB=2.9113 mu=1.6855 sigma=0.6129 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.000403724803581608, 'timesteps': 2447}
|
|
[2026-04-13 15:27:33] UCB=2.8107 mu=1.5968 sigma=0.6069 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0017740090102008168, 'timesteps': 1057}
|
|
[2026-04-13 15:27:33] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0017131753633976303, 'timesteps': 1286, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:35] [AutoResearch] Launching trial 36: {'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0017131753633976303, 'timesteps': 1286, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:38] [AutoResearch] Trial 36 finished in 2.7s, returncode=100
|
|
[2026-04-13 15:27:38] [AutoResearch] Trial 36: mean_reward=None std_reward=None
|
|
[2026-04-13 15:27:38] [AutoResearch] === Trial 36 Summary ===
|
|
[2026-04-13 15:27:38] Total Phase 1 runs: 32
|
|
[2026-04-13 15:27:38] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:38] Top 5:
|
|
[2026-04-13 15:27:38] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:38] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:38] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:38] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:38] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:40]
|
|
[AutoResearch] ========== Trial 37/50 ==========
|
|
[2026-04-13 15:27:40] [AutoResearch] GP UCB top-5 candidates:
|
|
[2026-04-13 15:27:40] UCB=3.8327 mu=2.3242 sigma=0.7543 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.00381491453203496, 'timesteps': 2570}
|
|
[2026-04-13 15:27:40] UCB=3.6164 mu=2.0347 sigma=0.7909 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.0036154734154099826, 'timesteps': 2143}
|
|
[2026-04-13 15:27:40] UCB=3.5328 mu=2.3139 sigma=0.6095 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0019717822698033754, 'timesteps': 1026}
|
|
[2026-04-13 15:27:40] UCB=3.3397 mu=1.8546 sigma=0.7426 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.00021835294506320483, 'timesteps': 3476}
|
|
[2026-04-13 15:27:40] UCB=3.1978 mu=1.8508 sigma=0.6735 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0022912708080479603, 'timesteps': 1083}
|
|
[2026-04-13 15:27:40] [AutoResearch] Proposed: {'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.00381491453203496, 'timesteps': 2570, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:42] [AutoResearch] Launching trial 37: {'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.00381491453203496, 'timesteps': 2570, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:45] [AutoResearch] Trial 37 finished in 2.7s, returncode=100
|
|
[2026-04-13 15:27:45] [AutoResearch] Trial 37: mean_reward=None std_reward=None
|
|
[2026-04-13 15:27:45] [AutoResearch] === Trial 37 Summary ===
|
|
[2026-04-13 15:27:45] Total Phase 1 runs: 32
|
|
[2026-04-13 15:27:45] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:45] Top 5:
|
|
[2026-04-13 15:27:45] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:45] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:45] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:45] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:45] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:47]
|
|
[AutoResearch] ========== Trial 38/50 ==========
|
|
[2026-04-13 15:27:47] [AutoResearch] GP UCB top-5 candidates:
|
|
[2026-04-13 15:27:47] UCB=3.3792 mu=2.6015 sigma=0.3889 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.0035795756473208304, 'timesteps': 2168}
|
|
[2026-04-13 15:27:47] UCB=3.2370 mu=2.4236 sigma=0.4067 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.00133548686236209, 'timesteps': 1359}
|
|
[2026-04-13 15:27:47] UCB=3.1648 mu=1.7164 sigma=0.7242 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.00014234046680507736, 'timesteps': 3245}
|
|
[2026-04-13 15:27:47] UCB=3.0756 mu=2.1065 sigma=0.4846 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0006639757379204275, 'timesteps': 1652}
|
|
[2026-04-13 15:27:47] UCB=2.9504 mu=1.5700 sigma=0.6902 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0023215307871791192, 'timesteps': 1076}
|
|
[2026-04-13 15:27:47] [AutoResearch] Proposed: {'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.0035795756473208304, 'timesteps': 2168, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:49] [AutoResearch] Launching trial 38: {'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.0035795756473208304, 'timesteps': 2168, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:52] [AutoResearch] Trial 38 finished in 2.7s, returncode=100
|
|
[2026-04-13 15:27:52] [AutoResearch] Trial 38: mean_reward=None std_reward=None
|
|
[2026-04-13 15:27:52] [AutoResearch] === Trial 38 Summary ===
|
|
[2026-04-13 15:27:52] Total Phase 1 runs: 32
|
|
[2026-04-13 15:27:52] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:52] Top 5:
|
|
[2026-04-13 15:27:52] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:52] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:52] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:52] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:52] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:54]
|
|
[AutoResearch] ========== Trial 39/50 ==========
|
|
[2026-04-13 15:27:54] [AutoResearch] GP UCB top-5 candidates:
|
|
[2026-04-13 15:27:54] UCB=2.9376 mu=1.7644 sigma=0.5866 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0007519115144017143, 'timesteps': 1229}
|
|
[2026-04-13 15:27:54] UCB=2.9206 mu=1.6104 sigma=0.6551 params={'n_steer': 9, 'n_throttle': 4, 'learning_rate': 8.74145742620092e-05, 'timesteps': 2464}
|
|
[2026-04-13 15:27:54] UCB=2.8799 mu=2.4729 sigma=0.2035 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0002020477741474642, 'timesteps': 2206}
|
|
[2026-04-13 15:27:54] UCB=2.8618 mu=1.4087 sigma=0.7265 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.003978320937123767, 'timesteps': 2257}
|
|
[2026-04-13 15:27:54] UCB=2.7374 mu=1.3210 sigma=0.7082 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0004460127152856436, 'timesteps': 2828}
|
|
[2026-04-13 15:27:54] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0007519115144017143, 'timesteps': 1229, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:56] [AutoResearch] Launching trial 39: {'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0007519115144017143, 'timesteps': 1229, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:58] [AutoResearch] Trial 39 finished in 2.7s, returncode=100
|
|
[2026-04-13 15:27:58] [AutoResearch] Trial 39: mean_reward=None std_reward=None
|
|
[2026-04-13 15:27:58] [AutoResearch] === Trial 39 Summary ===
|
|
[2026-04-13 15:27:58] Total Phase 1 runs: 32
|
|
[2026-04-13 15:27:58] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:58] Top 5:
|
|
[2026-04-13 15:27:58] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:58] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:58] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:58] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:27:58] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:00]
|
|
[AutoResearch] ========== Trial 40/50 ==========
|
|
[2026-04-13 15:28:00] [AutoResearch] GP UCB top-5 candidates:
|
|
[2026-04-13 15:28:00] UCB=4.0033 mu=2.6495 sigma=0.6769 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.00011160573690096532, 'timesteps': 3094}
|
|
[2026-04-13 15:28:00] UCB=3.5783 mu=2.7389 sigma=0.4197 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.003355215291754839, 'timesteps': 2513}
|
|
[2026-04-13 15:28:00] UCB=3.4820 mu=1.7536 sigma=0.8642 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 8.1223641552854e-05, 'timesteps': 3316}
|
|
[2026-04-13 15:28:00] UCB=3.4324 mu=1.8215 sigma=0.8055 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.003472795016041811, 'timesteps': 3138}
|
|
[2026-04-13 15:28:00] UCB=3.1700 mu=2.2029 sigma=0.4836 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.000693815329398727, 'timesteps': 1830}
|
|
[2026-04-13 15:28:00] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.00011160573690096532, 'timesteps': 3094, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:02] [AutoResearch] Launching trial 40: {'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.00011160573690096532, 'timesteps': 3094, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:05] [AutoResearch] Trial 40 finished in 2.9s, returncode=100
|
|
[2026-04-13 15:28:05] [AutoResearch] Trial 40: mean_reward=None std_reward=None
|
|
[2026-04-13 15:28:05] [AutoResearch] === Trial 40 Summary ===
|
|
[2026-04-13 15:28:05] Total Phase 1 runs: 32
|
|
[2026-04-13 15:28:05] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:05] Top 5:
|
|
[2026-04-13 15:28:05] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:05] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:05] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:05] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:05] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:07] [AutoResearch] Git push complete after trial 40
|
|
[2026-04-13 15:28:09]
|
|
[AutoResearch] ========== Trial 41/50 ==========
|
|
[2026-04-13 15:28:09] [AutoResearch] GP UCB top-5 candidates:
|
|
[2026-04-13 15:28:09] UCB=3.8633 mu=2.6844 sigma=0.5894 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.003889704330887667, 'timesteps': 2358}
|
|
[2026-04-13 15:28:09] UCB=3.4918 mu=1.9865 sigma=0.7526 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0002237254690090858, 'timesteps': 3281}
|
|
[2026-04-13 15:28:09] UCB=3.3272 mu=1.7103 sigma=0.8085 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.003364105777883071, 'timesteps': 2949}
|
|
[2026-04-13 15:28:09] UCB=3.2008 mu=1.9271 sigma=0.6368 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0019096612672065147, 'timesteps': 1173}
|
|
[2026-04-13 15:28:09] UCB=2.7984 mu=1.4837 sigma=0.6574 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0036338598508786854, 'timesteps': 2044}
|
|
[2026-04-13 15:28:09] [AutoResearch] Proposed: {'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.003889704330887667, 'timesteps': 2358, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:11] [AutoResearch] Launching trial 41: {'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.003889704330887667, 'timesteps': 2358, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:13] [AutoResearch] Trial 41 finished in 2.7s, returncode=100
|
|
[2026-04-13 15:28:13] [AutoResearch] Trial 41: mean_reward=None std_reward=None
|
|
[2026-04-13 15:28:13] [AutoResearch] === Trial 41 Summary ===
|
|
[2026-04-13 15:28:13] Total Phase 1 runs: 32
|
|
[2026-04-13 15:28:13] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:13] Top 5:
|
|
[2026-04-13 15:28:13] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:13] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:13] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:13] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:13] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:15]
|
|
[AutoResearch] ========== Trial 42/50 ==========
|
|
[2026-04-13 15:28:15] [AutoResearch] GP UCB top-5 candidates:
|
|
[2026-04-13 15:28:15] UCB=3.5665 mu=2.4790 sigma=0.5437 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0003019383611398774, 'timesteps': 2781}
|
|
[2026-04-13 15:28:15] UCB=3.3654 mu=1.9637 sigma=0.7009 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.00014844026245908166, 'timesteps': 3459}
|
|
[2026-04-13 15:28:15] UCB=3.3592 mu=2.5184 sigma=0.4204 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0010885983228779504, 'timesteps': 1043}
|
|
[2026-04-13 15:28:15] UCB=3.3498 mu=2.6283 sigma=0.3607 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.0032895057033497216, 'timesteps': 2546}
|
|
[2026-04-13 15:28:15] UCB=3.3203 mu=2.2130 sigma=0.5536 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0015695836762903508, 'timesteps': 1060}
|
|
[2026-04-13 15:28:15] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0003019383611398774, 'timesteps': 2781, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:17] [AutoResearch] Launching trial 42: {'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0003019383611398774, 'timesteps': 2781, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:20] [AutoResearch] Trial 42 finished in 2.7s, returncode=100
|
|
[2026-04-13 15:28:20] [AutoResearch] Trial 42: mean_reward=None std_reward=None
|
|
[2026-04-13 15:28:20] [AutoResearch] === Trial 42 Summary ===
|
|
[2026-04-13 15:28:20] Total Phase 1 runs: 32
|
|
[2026-04-13 15:28:20] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:20] Top 5:
|
|
[2026-04-13 15:28:20] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:20] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:20] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:20] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:20] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:22]
|
|
[AutoResearch] ========== Trial 43/50 ==========
|
|
[2026-04-13 15:28:22] [AutoResearch] GP UCB top-5 candidates:
|
|
[2026-04-13 15:28:22] UCB=3.7562 mu=2.5461 sigma=0.6050 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0011774218521194368, 'timesteps': 1055}
|
|
[2026-04-13 15:28:22] UCB=3.7529 mu=2.8818 sigma=0.4355 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0013193140990772738, 'timesteps': 1000}
|
|
[2026-04-13 15:28:22] UCB=3.2836 mu=1.9527 sigma=0.6654 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0020286332388212746, 'timesteps': 1134}
|
|
[2026-04-13 15:28:22] UCB=3.2030 mu=2.4299 sigma=0.3866 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.0034641823188944136, 'timesteps': 2703}
|
|
[2026-04-13 15:28:22] UCB=2.9956 mu=1.4257 sigma=0.7850 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.00017238085750060874, 'timesteps': 3444}
|
|
[2026-04-13 15:28:22] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0011774218521194368, 'timesteps': 1055, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:24] [AutoResearch] Launching trial 43: {'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0011774218521194368, 'timesteps': 1055, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:27] [AutoResearch] Trial 43 finished in 2.7s, returncode=100
|
|
[2026-04-13 15:28:27] [AutoResearch] Trial 43: mean_reward=None std_reward=None
|
|
[2026-04-13 15:28:27] [AutoResearch] === Trial 43 Summary ===
|
|
[2026-04-13 15:28:27] Total Phase 1 runs: 32
|
|
[2026-04-13 15:28:27] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:27] Top 5:
|
|
[2026-04-13 15:28:27] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:27] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:27] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:27] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:27] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:29]
|
|
[AutoResearch] ========== Trial 44/50 ==========
|
|
[2026-04-13 15:28:29] [AutoResearch] GP UCB top-5 candidates:
|
|
[2026-04-13 15:28:29] UCB=3.6688 mu=2.1358 sigma=0.7665 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.0035907044857161814, 'timesteps': 2398}
|
|
[2026-04-13 15:28:29] UCB=3.6026 mu=2.7375 sigma=0.4325 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0001492703303519168, 'timesteps': 2495}
|
|
[2026-04-13 15:28:29] UCB=3.2482 mu=1.7169 sigma=0.7657 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.00019487299853207732, 'timesteps': 3583}
|
|
[2026-04-13 15:28:29] UCB=3.1606 mu=1.8266 sigma=0.6670 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.004078225263142901, 'timesteps': 2617}
|
|
[2026-04-13 15:28:29] UCB=3.1520 mu=1.5537 sigma=0.7991 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.0001880042865711781, 'timesteps': 3713}
|
|
[2026-04-13 15:28:29] [AutoResearch] Proposed: {'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.0035907044857161814, 'timesteps': 2398, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:31] [AutoResearch] Launching trial 44: {'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.0035907044857161814, 'timesteps': 2398, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:34] [AutoResearch] Trial 44 finished in 2.7s, returncode=100
|
|
[2026-04-13 15:28:34] [AutoResearch] Trial 44: mean_reward=None std_reward=None
|
|
[2026-04-13 15:28:34] [AutoResearch] === Trial 44 Summary ===
|
|
[2026-04-13 15:28:34] Total Phase 1 runs: 32
|
|
[2026-04-13 15:28:34] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:34] Top 5:
|
|
[2026-04-13 15:28:34] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:34] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:34] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:34] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:34] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:28:36]
|
|
[AutoResearch] ========== Trial 45/50 ==========
|
|
[2026-04-13 15:28:36] [AutoResearch] GP UCB top-5 candidates:
|
|
[2026-04-13 15:28:36] UCB=4.4174 mu=3.2704 sigma=0.5735 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.00010809987579177334, 'timesteps': 2828}
|
|
[2026-04-13 15:28:36] UCB=3.4234 mu=2.3887 sigma=0.5174 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.0038836965543492395, 'timesteps': 2908}
|
|
[2026-04-13 15:28:36] UCB=3.2779 mu=2.0187 sigma=0.6296 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.003299624894545091, 'timesteps': 2117}
|
|
[2026-04-13 15:28:36] UCB=3.0053 mu=1.7752 sigma=0.6151 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.004061584169979702, 'timesteps': 2354}
|
|
[2026-04-13 15:28:36] UCB=2.9944 mu=2.3011 sigma=0.3467 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.000638916285305638, 'timesteps': 1840}
|
|
[2026-04-13 15:28:36] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.00010809987579177334, 'timesteps': 2828, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:31:20] ============================================================
|
|
[2026-04-13 15:31:20] [AutoResearch] Phase 1 — Real PPO Training + GP+UCB Optimization
|
|
[2026-04-13 15:31:20] [AutoResearch] Max trials: 50 | kappa: 2.0 | push every: 10
|
|
[2026-04-13 15:31:20] [AutoResearch] Results: /home/paulh/projects/donkeycar-rl-autoresearch/agent/outerloop-results/autoresearch_results_phase1.jsonl
|
|
[2026-04-13 15:31:20] [AutoResearch] Champion: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion
|
|
[2026-04-13 15:31:20] ============================================================
|
|
[2026-04-13 15:31:20] [AutoResearch] Loaded 32 existing Phase 1 results.
|
|
[2026-04-13 15:31:20] [AutoResearch] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
|
|
[2026-04-13 15:31:20]
|
|
[AutoResearch] ========== Trial 1/50 ==========
|
|
[2026-04-13 15:31:20] [AutoResearch] GP UCB top-5 candidates:
|
|
[2026-04-13 15:31:20] UCB=3.7366 mu=2.9210 sigma=0.4078 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.00027249752441387215, 'timesteps': 2768}
|
|
[2026-04-13 15:31:20] UCB=3.5322 mu=2.1413 sigma=0.6954 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 7.200709133754677e-05, 'timesteps': 3189}
|
|
[2026-04-13 15:31:20] UCB=2.9545 mu=1.4144 sigma=0.7700 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.00021644385806279394, 'timesteps': 3757}
|
|
[2026-04-13 15:31:20] UCB=2.7812 mu=1.6942 sigma=0.5435 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0006140384934385441, 'timesteps': 1601}
|
|
[2026-04-13 15:31:20] UCB=2.7496 mu=0.8450 sigma=0.9523 params={'n_steer': 9, 'n_throttle': 4, 'learning_rate': 0.0001968080135620267, 'timesteps': 3906}
[2026-04-13 15:31:20] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.00027249752441387215, 'timesteps': 2768, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:31:22] [AutoResearch] Launching trial 1: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.00027249752441387215, 'timesteps': 2768, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:34:50] [AutoResearch] Trial 1 finished in 208.5s, returncode=0
[2026-04-13 15:34:50] [AutoResearch] Trial 1: mean_reward=619.9873 std_reward=0.5713
[2026-04-13 15:34:50] [AutoResearch] === Trial 1 Summary ===
[2026-04-13 15:34:50] Total Phase 1 runs: 33
[2026-04-13 15:34:50] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:34:50] Top 5:
[2026-04-13 15:34:50] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:34:50] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:34:50] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:34:50] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:34:50] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:34:52]
[AutoResearch] ========== Trial 2/50 ==========
[2026-04-13 15:34:52] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:34:52] UCB=3.0943 mu=1.8643 sigma=0.6150 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0010178217900925352, 'timesteps': 1138}
[2026-04-13 15:34:52] UCB=2.8458 mu=1.7127 sigma=0.5666 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0002392280897745687, 'timesteps': 2148}
[2026-04-13 15:34:52] UCB=2.7366 mu=1.7553 sigma=0.4907 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0011962893358962853, 'timesteps': 1336}
[2026-04-13 15:34:52] UCB=2.6824 mu=1.7018 sigma=0.4903 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.001160325530975692, 'timesteps': 1426}
[2026-04-13 15:34:52] UCB=2.6355 mu=1.6142 sigma=0.5106 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.0031704780737375126, 'timesteps': 2863}
[2026-04-13 15:34:52] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0010178217900925352, 'timesteps': 1138, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:34:54] [AutoResearch] Launching trial 2: {'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0010178217900925352, 'timesteps': 1138, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:36:43] [AutoResearch] Trial 2 finished in 108.8s, returncode=0
[2026-04-13 15:36:43] [AutoResearch] Trial 2: mean_reward=43.2368 std_reward=0.0435
[2026-04-13 15:36:43] [AutoResearch] === Trial 2 Summary ===
[2026-04-13 15:36:43] Total Phase 1 runs: 34
[2026-04-13 15:36:43] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:36:43] Top 5:
[2026-04-13 15:36:43] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:36:43] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:36:43] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:36:43] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:36:43] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:36:45]
[AutoResearch] ========== Trial 3/50 ==========
[2026-04-13 15:36:45] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:36:45] UCB=2.5334 mu=1.4412 sigma=0.5461 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.00017414812807443384, 'timesteps': 2214}
[2026-04-13 15:36:45] UCB=2.3326 mu=2.1642 sigma=0.0842 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006368885696011156, 'timesteps': 1836}
[2026-04-13 15:36:45] UCB=2.2757 mu=1.7536 sigma=0.2611 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0004955871472787, 'timesteps': 2773}
[2026-04-13 15:36:45] UCB=2.1297 mu=0.1789 sigma=0.9754 params={'n_steer': 4, 'n_throttle': 5, 'learning_rate': 0.003247259025720135, 'timesteps': 1013}
[2026-04-13 15:36:45] UCB=2.0617 mu=0.7276 sigma=0.6670 params={'n_steer': 3, 'n_throttle': 4, 'learning_rate': 0.0017195798959150094, 'timesteps': 1010}
[2026-04-13 15:36:45] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.00017414812807443384, 'timesteps': 2214, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:36:47] [AutoResearch] Launching trial 3: {'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.00017414812807443384, 'timesteps': 2214, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:40:05] [AutoResearch] Trial 3 finished in 198.5s, returncode=0
[2026-04-13 15:40:05] [AutoResearch] Trial 3: mean_reward=333.5673 std_reward=0.1409
[2026-04-13 15:40:05] [AutoResearch] === Trial 3 Summary ===
[2026-04-13 15:40:05] Total Phase 1 runs: 35
[2026-04-13 15:40:05] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:40:05] Top 5:
[2026-04-13 15:40:05] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:40:05] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:40:05] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:40:05] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:40:05] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:40:07]
[AutoResearch] ========== Trial 4/50 ==========
[2026-04-13 15:40:07] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:40:07] UCB=3.9731 mu=2.5194 sigma=0.7269 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.0037042058707837115, 'timesteps': 2319}
[2026-04-13 15:40:07] UCB=3.6554 mu=2.2181 sigma=0.7186 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.003708338874971097, 'timesteps': 2440}
[2026-04-13 15:40:07] UCB=2.5706 mu=1.8032 sigma=0.3837 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.003955234037292529, 'timesteps': 2079}
[2026-04-13 15:40:07] UCB=2.5238 mu=2.1163 sigma=0.2038 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006821546025584073, 'timesteps': 2654}
[2026-04-13 15:40:07] UCB=2.0849 mu=1.1662 sigma=0.4594 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.0028891139882576333, 'timesteps': 2774}
[2026-04-13 15:40:07] [AutoResearch] Proposed: {'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.0037042058707837115, 'timesteps': 2319, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:40:09] [AutoResearch] Launching trial 4: {'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.0037042058707837115, 'timesteps': 2319, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:44:34] [AutoResearch] Trial 4 finished in 264.3s, returncode=0
[2026-04-13 15:44:34] [AutoResearch] Trial 4: mean_reward=14.9952 std_reward=0.0012
[2026-04-13 15:44:34] [AutoResearch] === Trial 4 Summary ===
[2026-04-13 15:44:34] Total Phase 1 runs: 36
[2026-04-13 15:44:34] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:44:34] Top 5:
[2026-04-13 15:44:34] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:44:34] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:44:34] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:44:34] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:44:34] mean_reward=821.1389 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0012685117683148405, 'timesteps': 1382, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:44:36]
[AutoResearch] ========== Trial 5/50 ==========
[2026-04-13 15:44:36] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:44:36] UCB=2.3559 mu=1.8208 sigma=0.2676 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849}
[2026-04-13 15:44:36] UCB=2.3145 mu=2.0150 sigma=0.1498 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005938673780871419, 'timesteps': 2123}
[2026-04-13 15:44:36] UCB=2.2943 mu=1.8076 sigma=0.2433 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006906315490947656, 'timesteps': 2765}
[2026-04-13 15:44:36] UCB=2.1792 mu=1.7569 sigma=0.2112 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0005595190308580637, 'timesteps': 2578}
[2026-04-13 15:44:36] UCB=2.1234 mu=1.6990 sigma=0.2122 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.00044225780284592813, 'timesteps': 2282}
[2026-04-13 15:44:36] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:44:38] [AutoResearch] Launching trial 5: {'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:48:04] [AutoResearch] Trial 5 finished in 206.4s, returncode=0
[2026-04-13 15:48:04] [AutoResearch] Trial 5: mean_reward=1032.0966 std_reward=2.1093
[2026-04-13 15:48:04] [AutoResearch] === Trial 5 Summary ===
[2026-04-13 15:48:04] Total Phase 1 runs: 37
[2026-04-13 15:48:04] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:48:04] Top 5:
[2026-04-13 15:48:04] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:48:04] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:48:04] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:48:04] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:48:04] mean_reward=1032.0966 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:48:06]
[AutoResearch] ========== Trial 6/50 ==========
[2026-04-13 15:48:06] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:48:06] UCB=2.3180 mu=1.9957 sigma=0.1611 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0008165309928824906, 'timesteps': 2250}
[2026-04-13 15:48:06] UCB=2.1244 mu=0.2338 sigma=0.9453 params={'n_steer': 3, 'n_throttle': 5, 'learning_rate': 0.002787182019228819, 'timesteps': 1216}
[2026-04-13 15:48:06] UCB=2.0859 mu=0.5910 sigma=0.7474 params={'n_steer': 3, 'n_throttle': 5, 'learning_rate': 0.0007018364858104177, 'timesteps': 1103}
[2026-04-13 15:48:06] UCB=1.8739 mu=1.2565 sigma=0.3087 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.0035236952784743498, 'timesteps': 2547}
[2026-04-13 15:48:06] UCB=1.8564 mu=1.6769 sigma=0.0898 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0008436170399085055, 'timesteps': 2485}
[2026-04-13 15:48:06] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0008165309928824906, 'timesteps': 2250, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:48:08] [AutoResearch] Launching trial 6: {'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0008165309928824906, 'timesteps': 2250, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:51:25] [AutoResearch] Trial 6 finished in 196.6s, returncode=0
[2026-04-13 15:51:25] [AutoResearch] Trial 6: mean_reward=572.0601 std_reward=3.1979
[2026-04-13 15:51:25] [AutoResearch] === Trial 6 Summary ===
[2026-04-13 15:51:25] Total Phase 1 runs: 38
[2026-04-13 15:51:25] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:51:25] Top 5:
[2026-04-13 15:51:25] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:51:25] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:51:25] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:51:25] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:51:25] mean_reward=1032.0966 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:51:27]
[AutoResearch] ========== Trial 7/50 ==========
[2026-04-13 15:51:27] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:51:27] UCB=2.4842 mu=0.9195 sigma=0.7824 params={'n_steer': 3, 'n_throttle': 5, 'learning_rate': 0.0020283343280176574, 'timesteps': 1060}
[2026-04-13 15:51:27] UCB=2.4623 mu=1.5234 sigma=0.4694 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.003246237136438706, 'timesteps': 3130}
[2026-04-13 15:51:27] UCB=2.1984 mu=1.4877 sigma=0.3553 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.003983244774723074, 'timesteps': 2975}
[2026-04-13 15:51:27] UCB=2.1282 mu=0.8256 sigma=0.6513 params={'n_steer': 4, 'n_throttle': 5, 'learning_rate': 0.0013707605977159977, 'timesteps': 1244}
[2026-04-13 15:51:27] UCB=2.0957 mu=1.7888 sigma=0.1534 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0005782749802131942, 'timesteps': 3014}
[2026-04-13 15:51:27] [AutoResearch] Proposed: {'n_steer': 3, 'n_throttle': 5, 'learning_rate': 0.0020283343280176574, 'timesteps': 1060, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:51:29] [AutoResearch] Launching trial 7: {'n_steer': 3, 'n_throttle': 5, 'learning_rate': 0.0020283343280176574, 'timesteps': 1060, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:53:13] [AutoResearch] Trial 7 finished in 104.2s, returncode=0
[2026-04-13 15:53:13] [AutoResearch] Trial 7: mean_reward=31.9123 std_reward=0.1159
[2026-04-13 15:53:13] [AutoResearch] === Trial 7 Summary ===
[2026-04-13 15:53:13] Total Phase 1 runs: 39
[2026-04-13 15:53:13] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:53:13] Top 5:
[2026-04-13 15:53:13] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:53:13] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:53:13] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:53:13] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:53:13] mean_reward=1032.0966 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:53:15]
[AutoResearch] ========== Trial 8/50 ==========
[2026-04-13 15:53:15] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:53:15] UCB=2.3645 mu=1.8877 sigma=0.2384 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.0038245903103972813, 'timesteps': 2401}
[2026-04-13 15:53:15] UCB=2.2307 mu=1.6587 sigma=0.2860 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.0036860171858864176, 'timesteps': 2774}
[2026-04-13 15:53:15] UCB=2.1993 mu=1.1385 sigma=0.5304 params={'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.0035133068828179766, 'timesteps': 3180}
[2026-04-13 15:53:15] UCB=2.0278 mu=1.6637 sigma=0.1821 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00045103564754635163, 'timesteps': 3157}
[2026-04-13 15:53:15] UCB=1.9169 mu=1.3349 sigma=0.2910 params={'n_steer': 6, 'n_throttle': 4, 'learning_rate': 0.0002792226855563916, 'timesteps': 2550}
[2026-04-13 15:53:15] [AutoResearch] Proposed: {'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.0038245903103972813, 'timesteps': 2401, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:53:17] [AutoResearch] Launching trial 8: {'n_steer': 9, 'n_throttle': 3, 'learning_rate': 0.0038245903103972813, 'timesteps': 2401, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:56:47] [AutoResearch] Trial 8 finished in 209.8s, returncode=0
[2026-04-13 15:56:47] [AutoResearch] Trial 8: mean_reward=14.5596 std_reward=0.0173
[2026-04-13 15:56:47] [AutoResearch] === Trial 8 Summary ===
[2026-04-13 15:56:47] Total Phase 1 runs: 40
[2026-04-13 15:56:47] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:56:47] Top 5:
[2026-04-13 15:56:47] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:56:47] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:56:47] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:56:47] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:56:47] mean_reward=1032.0966 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:56:49]
[AutoResearch] ========== Trial 9/50 ==========
[2026-04-13 15:56:49] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:56:49] UCB=2.0502 mu=1.7407 sigma=0.1548 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0003951932722914093, 'timesteps': 1993}
[2026-04-13 15:56:49] UCB=2.0166 mu=1.7821 sigma=0.1173 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0007037030106819803, 'timesteps': 1562}
[2026-04-13 15:56:49] UCB=1.9042 mu=0.2061 sigma=0.8490 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.004093188374182109, 'timesteps': 1907}
[2026-04-13 15:56:49] UCB=1.8829 mu=0.8452 sigma=0.5189 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.003916637540083301, 'timesteps': 2173}
[2026-04-13 15:56:49] UCB=1.8557 mu=-0.1309 sigma=0.9933 params={'n_steer': 3, 'n_throttle': 4, 'learning_rate': 0.0046622999893220305, 'timesteps': 1198}
[2026-04-13 15:56:49] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0003951932722914093, 'timesteps': 1993, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:56:51] [AutoResearch] Launching trial 9: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0003951932722914093, 'timesteps': 1993, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:58:34] [AutoResearch] Trial 9 finished in 103.3s, returncode=0
[2026-04-13 15:58:34] [AutoResearch] Trial 9: mean_reward=44.5266 std_reward=0.1726
[2026-04-13 15:58:34] [AutoResearch] === Trial 9 Summary ===
[2026-04-13 15:58:34] Total Phase 1 runs: 41
[2026-04-13 15:58:34] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:58:34] Top 5:
[2026-04-13 15:58:34] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:58:34] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:58:34] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:58:34] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:58:34] mean_reward=1032.0966 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:58:36]
[AutoResearch] ========== Trial 10/50 ==========
[2026-04-13 15:58:36] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 15:58:36] UCB=3.2274 mu=2.4983 sigma=0.3646 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0017751713134174076, 'timesteps': 1004}
[2026-04-13 15:58:36] UCB=2.2756 mu=1.7072 sigma=0.2842 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0010234264691151955, 'timesteps': 1076}
[2026-04-13 15:58:36] UCB=2.2585 mu=2.1443 sigma=0.0571 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0006964361674886283, 'timesteps': 2693}
[2026-04-13 15:58:36] UCB=2.1748 mu=0.9093 sigma=0.6327 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.002907030660302143, 'timesteps': 1339}
[2026-04-13 15:58:36] UCB=2.1561 mu=1.2532 sigma=0.4515 params={'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0032424956372853735, 'timesteps': 1140}
[2026-04-13 15:58:36] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0017751713134174076, 'timesteps': 1004, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 15:58:38] [AutoResearch] Launching trial 10: {'n_steer': 8, 'n_throttle': 3, 'learning_rate': 0.0017751713134174076, 'timesteps': 1004, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:00:23] [AutoResearch] Trial 10 finished in 105.1s, returncode=0
[2026-04-13 16:00:23] [AutoResearch] Trial 10: mean_reward=26.063 std_reward=0.1375
[2026-04-13 16:00:23] [AutoResearch] === Trial 10 Summary ===
[2026-04-13 16:00:23] Total Phase 1 runs: 42
[2026-04-13 16:00:23] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:00:23] Top 5:
[2026-04-13 16:00:23] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:00:23] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:00:23] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:00:23] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:00:23] mean_reward=1032.0966 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:00:25] [AutoResearch] Git push complete after trial 10
[2026-04-13 16:00:27]
[AutoResearch] ========== Trial 11/50 ==========
[2026-04-13 16:00:27] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 16:00:27] UCB=3.0451 mu=1.7238 sigma=0.6607 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.00261843600079989, 'timesteps': 1081}
[2026-04-13 16:00:27] UCB=2.6034 mu=2.2509 sigma=0.1762 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0005990972802344085, 'timesteps': 2774}
[2026-04-13 16:00:27] UCB=2.3252 mu=2.2068 sigma=0.0592 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.000721376156088404, 'timesteps': 2907}
[2026-04-13 16:00:27] UCB=2.2116 mu=1.4670 sigma=0.3723 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0011709545997933251, 'timesteps': 1321}
[2026-04-13 16:00:27] UCB=2.1998 mu=1.1176 sigma=0.5411 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0026805657038851353, 'timesteps': 1209}
[2026-04-13 16:00:27] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.00261843600079989, 'timesteps': 1081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:00:29] [AutoResearch] Launching trial 11: {'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.00261843600079989, 'timesteps': 1081, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:02:12] [AutoResearch] Trial 11 finished in 102.6s, returncode=0
[2026-04-13 16:02:12] [AutoResearch] Trial 11: mean_reward=15.0063 std_reward=0.0168
[2026-04-13 16:02:12] [AutoResearch] === Trial 11 Summary ===
[2026-04-13 16:02:12] Total Phase 1 runs: 43
[2026-04-13 16:02:12] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:02:12] Top 5:
[2026-04-13 16:02:12] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:02:12] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:02:12] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:02:12] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:02:12] mean_reward=1032.0966 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:02:14]
[AutoResearch] ========== Trial 12/50 ==========
[2026-04-13 16:02:14] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 16:02:14] UCB=2.0321 mu=1.9000 sigma=0.0661 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.000494038295411516, 'timesteps': 2549}
[2026-04-13 16:02:14] UCB=1.9393 mu=1.5188 sigma=0.2103 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0014740735858409435, 'timesteps': 1216}
[2026-04-13 16:02:14] UCB=1.8736 mu=1.4721 sigma=0.2007 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00043141485613891153, 'timesteps': 2832}
[2026-04-13 16:02:14] UCB=1.8682 mu=1.0345 sigma=0.4169 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0006118476939445131, 'timesteps': 3068}
[2026-04-13 16:02:14] UCB=1.8532 mu=-0.1166 sigma=0.9849 params={'n_steer': 9, 'n_throttle': 5, 'learning_rate': 0.0023880940961788946, 'timesteps': 4789}
[2026-04-13 16:02:14] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.000494038295411516, 'timesteps': 2549, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:02:16] [AutoResearch] Launching trial 12: {'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.000494038295411516, 'timesteps': 2549, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:05:50] [AutoResearch] Trial 12 finished in 214.4s, returncode=0
[2026-04-13 16:05:50] [AutoResearch] Trial 12: mean_reward=743.5528 std_reward=3.1861
[2026-04-13 16:05:50] [AutoResearch] === Trial 12 Summary ===
[2026-04-13 16:05:50] Total Phase 1 runs: 44
[2026-04-13 16:05:50] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:05:50] Top 5:
[2026-04-13 16:05:50] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:05:50] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:05:50] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:05:50] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:05:50] mean_reward=1032.0966 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:05:52]
[AutoResearch] ========== Trial 13/50 ==========
[2026-04-13 16:05:52] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 16:05:52] UCB=2.2973 mu=2.1304 sigma=0.0834 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0006080707920424922, 'timesteps': 3082}
[2026-04-13 16:05:52] UCB=1.8440 mu=-0.1447 sigma=0.9943 params={'n_steer': 4, 'n_throttle': 5, 'learning_rate': 0.004963788338941324, 'timesteps': 1577}
[2026-04-13 16:05:52] UCB=1.8349 mu=-0.1535 sigma=0.9942 params={'n_steer': 4, 'n_throttle': 5, 'learning_rate': 0.004895602208151989, 'timesteps': 2098}
[2026-04-13 16:05:52] UCB=1.8171 mu=-0.1611 sigma=0.9891 params={'n_steer': 7, 'n_throttle': 5, 'learning_rate': 0.004911350917760141, 'timesteps': 1288}
[2026-04-13 16:05:52] UCB=1.8039 mu=-0.1505 sigma=0.9772 params={'n_steer': 4, 'n_throttle': 5, 'learning_rate': 0.0048778108965635185, 'timesteps': 3392}
[2026-04-13 16:05:52] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0006080707920424922, 'timesteps': 3082, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:05:54] [AutoResearch] Launching trial 13: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0006080707920424922, 'timesteps': 3082, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:09:20] [AutoResearch] Trial 13 finished in 205.8s, returncode=0
[2026-04-13 16:09:20] [AutoResearch] Trial 13: mean_reward=471.0407 std_reward=7.8649
[2026-04-13 16:09:20] [AutoResearch] === Trial 13 Summary ===
[2026-04-13 16:09:20] Total Phase 1 runs: 45
[2026-04-13 16:09:20] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:09:20] Top 5:
[2026-04-13 16:09:20] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:09:20] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:09:20] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:09:20] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:09:20] mean_reward=1032.0966 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:09:22]
[AutoResearch] ========== Trial 14/50 ==========
[2026-04-13 16:09:22] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 16:09:22] UCB=3.6895 mu=2.3211 sigma=0.6842 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0016216488882950194, 'timesteps': 1028}
[2026-04-13 16:09:22] UCB=3.5839 mu=2.1682 sigma=0.7079 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.002622019549469901, 'timesteps': 1018}
[2026-04-13 16:09:22] UCB=3.1895 mu=2.4656 sigma=0.3619 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0021469951486806586, 'timesteps': 1073}
[2026-04-13 16:09:22] UCB=2.9809 mu=1.1456 sigma=0.9176 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00354615869547646, 'timesteps': 1296}
[2026-04-13 16:09:22] UCB=2.9648 mu=2.3325 sigma=0.3161 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003130709905464761, 'timesteps': 3373}
[2026-04-13 16:09:22] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0016216488882950194, 'timesteps': 1028, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:09:24] [AutoResearch] Launching trial 14: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0016216488882950194, 'timesteps': 1028, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:11:08] [AutoResearch] Trial 14 finished in 104.0s, returncode=0
[2026-04-13 16:11:08] [AutoResearch] Trial 14: mean_reward=15.1063 std_reward=0.0076
[2026-04-13 16:11:08] [AutoResearch] === Trial 14 Summary ===
[2026-04-13 16:11:08] Total Phase 1 runs: 46
[2026-04-13 16:11:08] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:11:08] Top 5:
[2026-04-13 16:11:08] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:11:08] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:11:08] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:11:08] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:11:08] mean_reward=1032.0966 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:11:10]
[AutoResearch] ========== Trial 15/50 ==========
[2026-04-13 16:11:10] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 16:11:10] UCB=2.3880 mu=1.3579 sigma=0.5150 params={'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.00031142012560796015, 'timesteps': 3450}
[2026-04-13 16:11:10] UCB=2.3861 mu=2.1519 sigma=0.1171 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.00043425304956394085, 'timesteps': 2955}
[2026-04-13 16:11:10] UCB=2.2442 mu=0.6357 sigma=0.8042 params={'n_steer': 4, 'n_throttle': 4, 'learning_rate': 9.434990518937583e-05, 'timesteps': 4980}
[2026-04-13 16:11:10] UCB=2.1419 mu=1.1831 sigma=0.4794 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.0006021392192267739, 'timesteps': 3258}
[2026-04-13 16:11:10] UCB=2.1294 mu=0.1453 sigma=0.9921 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.004817619902542952, 'timesteps': 2247}
[2026-04-13 16:11:10] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.00031142012560796015, 'timesteps': 3450, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:11:12] [AutoResearch] Launching trial 15: {'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.00031142012560796015, 'timesteps': 3450, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:14:16] [AutoResearch] Trial 15 finished in 184.3s, returncode=0
[2026-04-13 16:14:16] [AutoResearch] Trial 15: mean_reward=21.5609 std_reward=0.0472
[2026-04-13 16:14:16] [AutoResearch] === Trial 15 Summary ===
[2026-04-13 16:14:16] Total Phase 1 runs: 47
[2026-04-13 16:14:16] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:14:16] Top 5:
[2026-04-13 16:14:16] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:14:16] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:14:16] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:14:16] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:14:16] mean_reward=1032.0966 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:14:18]
[AutoResearch] ========== Trial 16/50 ==========
[2026-04-13 16:14:18] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 16:14:18] UCB=2.8069 mu=2.0215 sigma=0.3927 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003241673094165768, 'timesteps': 3632}
[2026-04-13 16:14:18] UCB=2.3263 mu=1.8104 sigma=0.2580 params={'n_steer': 4, 'n_throttle': 4, 'learning_rate': 0.00033084211302334245, 'timesteps': 2916}
[2026-04-13 16:14:18] UCB=2.2712 mu=1.3861 sigma=0.4426 params={'n_steer': 8, 'n_throttle': 4, 'learning_rate': 0.0006784421359377939, 'timesteps': 3060}
[2026-04-13 16:14:18] UCB=2.2307 mu=1.6077 sigma=0.3115 params={'n_steer': 4, 'n_throttle': 4, 'learning_rate': 0.0003912882062863388, 'timesteps': 3421}
[2026-04-13 16:14:18] UCB=2.1864 mu=0.2144 sigma=0.9860 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.004468930310099049, 'timesteps': 2507}
[2026-04-13 16:14:18] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003241673094165768, 'timesteps': 3632, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:14:20] [AutoResearch] Launching trial 16: {'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0003241673094165768, 'timesteps': 3632, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:17:44] [AutoResearch] Trial 16 finished in 203.5s, returncode=0
[2026-04-13 16:17:44] [AutoResearch] Trial 16: mean_reward=501.8865 std_reward=1.0671
[2026-04-13 16:17:44] [AutoResearch] === Trial 16 Summary ===
[2026-04-13 16:17:44] Total Phase 1 runs: 48
[2026-04-13 16:17:44] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:17:44] Top 5:
[2026-04-13 16:17:44] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:17:44] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:17:44] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:17:44] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:17:44] mean_reward=1032.0966 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:17:46]
[AutoResearch] ========== Trial 17/50 ==========
[2026-04-13 16:17:46] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 16:17:46] UCB=2.1466 mu=0.1634 sigma=0.9916 params={'n_steer': 3, 'n_throttle': 3, 'learning_rate': 0.004551208391752519, 'timesteps': 2467}
[2026-04-13 16:17:46] UCB=2.1356 mu=1.7494 sigma=0.1931 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0005224077374076833, 'timesteps': 2982}
[2026-04-13 16:17:46] UCB=2.1196 mu=0.1238 sigma=0.9979 params={'n_steer': 3, 'n_throttle': 4, 'learning_rate': 0.00487944066987397, 'timesteps': 2500}
[2026-04-13 16:17:46] UCB=2.0933 mu=0.1087 sigma=0.9923 params={'n_steer': 4, 'n_throttle': 4, 'learning_rate': 0.0046177294398087885, 'timesteps': 2764}
[2026-04-13 16:17:46] UCB=2.0828 mu=0.0933 sigma=0.9947 params={'n_steer': 4, 'n_throttle': 4, 'learning_rate': 0.004644956123956311, 'timesteps': 2890}
[2026-04-13 16:17:46] [AutoResearch] Proposed: {'n_steer': 3, 'n_throttle': 3, 'learning_rate': 0.004551208391752519, 'timesteps': 2467, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:17:48] [AutoResearch] Launching trial 17: {'n_steer': 3, 'n_throttle': 3, 'learning_rate': 0.004551208391752519, 'timesteps': 2467, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:21:19] [AutoResearch] Trial 17 finished in 210.9s, returncode=0
[2026-04-13 16:21:19] [AutoResearch] Trial 17: mean_reward=204.9237 std_reward=0.8594
[2026-04-13 16:21:19] [AutoResearch] === Trial 17 Summary ===
[2026-04-13 16:21:19] Total Phase 1 runs: 49
[2026-04-13 16:21:19] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:21:19] Top 5:
[2026-04-13 16:21:19] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:21:19] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:21:19] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:21:19] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:21:19] mean_reward=1032.0966 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:21:21]
[AutoResearch] ========== Trial 18/50 ==========
[2026-04-13 16:21:21] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 16:21:21] UCB=1.9736 mu=-0.0151 sigma=0.9943 params={'n_steer': 3, 'n_throttle': 5, 'learning_rate': 0.004635271824040741, 'timesteps': 4579}
[2026-04-13 16:21:21] UCB=1.9719 mu=-0.0100 sigma=0.9910 params={'n_steer': 4, 'n_throttle': 5, 'learning_rate': 0.004702506196963715, 'timesteps': 4515}
[2026-04-13 16:21:21] UCB=1.9680 mu=0.3559 sigma=0.8060 params={'n_steer': 5, 'n_throttle': 2, 'learning_rate': 0.00039611211457936087, 'timesteps': 4776}
[2026-04-13 16:21:21] UCB=1.9488 mu=-0.0208 sigma=0.9848 params={'n_steer': 4, 'n_throttle': 5, 'learning_rate': 0.004976994354630289, 'timesteps': 4295}
[2026-04-13 16:21:21] UCB=1.9237 mu=0.8638 sigma=0.5299 params={'n_steer': 5, 'n_throttle': 2, 'learning_rate': 5.318719404529583e-05, 'timesteps': 3383}
[2026-04-13 16:21:21] [AutoResearch] Proposed: {'n_steer': 3, 'n_throttle': 5, 'learning_rate': 0.004635271824040741, 'timesteps': 4579, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:21:23] [AutoResearch] Launching trial 18: {'n_steer': 3, 'n_throttle': 5, 'learning_rate': 0.004635271824040741, 'timesteps': 4579, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:27:21] [AutoResearch] Trial 18 finished in 357.9s, returncode=0
[2026-04-13 16:27:21] [AutoResearch] Trial 18: mean_reward=15.1327 std_reward=0.0091
[2026-04-13 16:27:21] [AutoResearch] === Trial 18 Summary ===
[2026-04-13 16:27:21] Total Phase 1 runs: 50
[2026-04-13 16:27:21] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:27:21] Top 5:
[2026-04-13 16:27:21] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:27:21] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:27:21] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:27:21] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:27:21] mean_reward=1032.0966 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0006774569893590574, 'timesteps': 2849, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:27:23]
[AutoResearch] ========== Trial 19/50 ==========
[2026-04-13 16:27:23] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 16:27:23] UCB=2.4117 mu=0.7315 sigma=0.8401 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954}
[2026-04-13 16:27:23] UCB=2.1418 mu=1.4716 sigma=0.3351 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0010210660206428537, 'timesteps': 2865}
[2026-04-13 16:27:23] UCB=2.0522 mu=1.8672 sigma=0.0925 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0005484988468040054, 'timesteps': 2664}
[2026-04-13 16:27:23] UCB=1.9170 mu=0.5165 sigma=0.7002 params={'n_steer': 5, 'n_throttle': 2, 'learning_rate': 0.0003221051258700323, 'timesteps': 4009}
[2026-04-13 16:27:23] UCB=1.8812 mu=0.6397 sigma=0.6208 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005319521788798213, 'timesteps': 3471}
[2026-04-13 16:27:23] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:27:25] [AutoResearch] Launching trial 19: {'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:32:33] [AutoResearch] Trial 19 finished in 307.9s, returncode=0
[2026-04-13 16:32:33] [AutoResearch] Trial 19: mean_reward=2237.9305 std_reward=4.2059
[2026-04-13 16:32:33] [AutoResearch] === Trial 19 Summary ===
[2026-04-13 16:32:33] Total Phase 1 runs: 51
[2026-04-13 16:32:33] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:32:33] Top 5:
[2026-04-13 16:32:33] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:32:33] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:32:33] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:32:33] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:32:33] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:32:35]
[AutoResearch] ========== Trial 20/50 ==========
[2026-04-13 16:32:35] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 16:32:35] UCB=3.4259 mu=1.8243 sigma=0.8008 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.002069915888771807, 'timesteps': 4960}
[2026-04-13 16:32:35] UCB=3.0743 mu=1.9887 sigma=0.5428 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0006141122334708564, 'timesteps': 4798}
[2026-04-13 16:32:35] UCB=2.9580 mu=1.7284 sigma=0.6148 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 8.750174245784358e-05, 'timesteps': 3752}
[2026-04-13 16:32:35] UCB=2.9385 mu=1.9153 sigma=0.5116 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0003012438414853693, 'timesteps': 4570}
[2026-04-13 16:32:35] UCB=2.8840 mu=1.3911 sigma=0.7465 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0009444412843513588, 'timesteps': 4795}
[2026-04-13 16:32:35] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.002069915888771807, 'timesteps': 4960, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:32:37] [AutoResearch] Launching trial 20: {'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.002069915888771807, 'timesteps': 4960, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:38:17] [AutoResearch] Trial 20 finished in 340.5s, returncode=0
[2026-04-13 16:38:17] [AutoResearch] Trial 20: mean_reward=15.1108 std_reward=0.0091
[2026-04-13 16:38:17] [AutoResearch] === Trial 20 Summary ===
[2026-04-13 16:38:17] Total Phase 1 runs: 52
[2026-04-13 16:38:17] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:38:17] Top 5:
[2026-04-13 16:38:17] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:38:17] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:38:17] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:38:17] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:38:17] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:38:18] [AutoResearch] Git push complete after trial 20
[2026-04-13 16:38:20]
[AutoResearch] ========== Trial 21/50 ==========
[2026-04-13 16:38:20] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 16:38:20] UCB=3.7556 mu=2.6923 sigma=0.5317 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.00016732476969720305, 'timesteps': 4545}
[2026-04-13 16:38:20] UCB=3.6554 mu=2.6237 sigma=0.5159 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.00027361330320505457, 'timesteps': 4426}
[2026-04-13 16:38:20] UCB=2.9292 mu=1.5395 sigma=0.6949 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0008890476874051958, 'timesteps': 4901}
[2026-04-13 16:38:20] UCB=2.7843 mu=1.6784 sigma=0.5529 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0009111341085199367, 'timesteps': 4229}
[2026-04-13 16:38:20] UCB=2.0854 mu=1.5744 sigma=0.2555 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.000791530228950698, 'timesteps': 2671}
[2026-04-13 16:38:20] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.00016732476969720305, 'timesteps': 4545, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:38:22] [AutoResearch] Launching trial 21: {'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.00016732476969720305, 'timesteps': 4545, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:43:14] [AutoResearch] Trial 21 finished in 291.9s, returncode=0
[2026-04-13 16:43:14] [AutoResearch] Trial 21: mean_reward=712.0042 std_reward=90.943
[2026-04-13 16:43:14] [AutoResearch] === Trial 21 Summary ===
[2026-04-13 16:43:14] Total Phase 1 runs: 53
[2026-04-13 16:43:14] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:43:14] Top 5:
[2026-04-13 16:43:14] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:43:14] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:43:14] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:43:14] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:43:14] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:43:16]
[AutoResearch] ========== Trial 22/50 ==========
[2026-04-13 16:43:16] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 16:43:16] UCB=3.3258 mu=3.0998 sigma=0.1130 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.00047237107706507965, 'timesteps': 4731}
[2026-04-13 16:43:16] UCB=2.0722 mu=1.3063 sigma=0.3830 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0011620798877489517, 'timesteps': 4700}
[2026-04-13 16:43:16] UCB=1.7872 mu=-0.1514 sigma=0.9693 params={'n_steer': 9, 'n_throttle': 5, 'learning_rate': 0.0034249503127642264, 'timesteps': 4372}
[2026-04-13 16:43:16] UCB=1.7749 mu=-0.1723 sigma=0.9736 params={'n_steer': 9, 'n_throttle': 5, 'learning_rate': 0.003194575957101677, 'timesteps': 4362}
[2026-04-13 16:43:16] UCB=1.7471 mu=-0.2101 sigma=0.9786 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.004701365865030676, 'timesteps': 1665}
[2026-04-13 16:43:16] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.00047237107706507965, 'timesteps': 4731, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:43:18] [AutoResearch] Launching trial 22: {'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.00047237107706507965, 'timesteps': 4731, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:48:14] [AutoResearch] Trial 22 finished in 295.8s, returncode=0
[2026-04-13 16:48:14] [AutoResearch] Trial 22: mean_reward=562.9132 std_reward=10.4819
[2026-04-13 16:48:14] [AutoResearch] === Trial 22 Summary ===
[2026-04-13 16:48:14] Total Phase 1 runs: 54
[2026-04-13 16:48:14] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:48:14] Top 5:
[2026-04-13 16:48:14] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:48:14] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:48:14] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:48:14] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:48:14] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:48:16]
[AutoResearch] ========== Trial 23/50 ==========
[2026-04-13 16:48:16] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 16:48:16] UCB=5.7123 mu=4.0715 sigma=0.8204 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.000872444658512201, 'timesteps': 4959}
[2026-04-13 16:48:16] UCB=4.6632 mu=2.9971 sigma=0.8330 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0013578723525765764, 'timesteps': 4857}
[2026-04-13 16:48:16] UCB=3.8838 mu=2.3869 sigma=0.7485 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0009439538581721204, 'timesteps': 4720}
[2026-04-13 16:48:16] UCB=3.8689 mu=2.3330 sigma=0.7680 params={'n_steer': 7, 'n_throttle': 4, 'learning_rate': 0.0006494808486629405, 'timesteps': 4787}
[2026-04-13 16:48:16] UCB=3.6625 mu=3.3554 sigma=0.1536 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0004171158677596043, 'timesteps': 4760}
[2026-04-13 16:48:16] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.000872444658512201, 'timesteps': 4959, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:48:18] [AutoResearch] Launching trial 23: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.000872444658512201, 'timesteps': 4959, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:53:04] [AutoResearch] Trial 23 finished in 285.7s, returncode=0
[2026-04-13 16:53:04] [AutoResearch] Trial 23: mean_reward=14.9815 std_reward=0.0093
[2026-04-13 16:53:04] [AutoResearch] === Trial 23 Summary ===
[2026-04-13 16:53:04] Total Phase 1 runs: 55
[2026-04-13 16:53:04] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:53:04] Top 5:
[2026-04-13 16:53:04] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:53:04] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:53:04] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:53:04] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:53:04] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:53:06]
[AutoResearch] ========== Trial 24/50 ==========
[2026-04-13 16:53:06] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 16:53:06] UCB=3.4649 mu=3.1911 sigma=0.1369 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0001292488328476618, 'timesteps': 4749}
[2026-04-13 16:53:06] UCB=2.3675 mu=2.0286 sigma=0.1694 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.00035609774689880495, 'timesteps': 2952}
[2026-04-13 16:53:06] UCB=2.2707 mu=1.9059 sigma=0.1824 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0004136019141565726, 'timesteps': 2931}
[2026-04-13 16:53:06] UCB=2.1283 mu=1.6145 sigma=0.2569 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0019365337394847539, 'timesteps': 1025}
[2026-04-13 16:53:06] UCB=1.7996 mu=-0.1794 sigma=0.9895 params={'n_steer': 9, 'n_throttle': 5, 'learning_rate': 0.004779978427179205, 'timesteps': 4249}
[2026-04-13 16:53:06] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0001292488328476618, 'timesteps': 4749, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:53:08] [AutoResearch] Launching trial 24: {'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0001292488328476618, 'timesteps': 4749, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:58:24] [AutoResearch] Trial 24 finished in 316.5s, returncode=0
[2026-04-13 16:58:24] [AutoResearch] Trial 24: mean_reward=657.5063 std_reward=2.2574
[2026-04-13 16:58:24] [AutoResearch] === Trial 24 Summary ===
[2026-04-13 16:58:24] Total Phase 1 runs: 56
[2026-04-13 16:58:24] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:58:24] Top 5:
[2026-04-13 16:58:24] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:58:24] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:58:24] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:58:24] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:58:24] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:58:26]
[AutoResearch] ========== Trial 25/50 ==========
[2026-04-13 16:58:26] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 16:58:26] UCB=4.2698 mu=3.4448 sigma=0.4125 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 6.342460411299125e-05, 'timesteps': 4090}
[2026-04-13 16:58:26] UCB=3.7333 mu=2.7882 sigma=0.4726 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0011263751132521438, 'timesteps': 4576}
[2026-04-13 16:58:26] UCB=3.4290 mu=2.8470 sigma=0.2910 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.00047455842448918, 'timesteps': 4720}
[2026-04-13 16:58:26] UCB=3.3911 mu=2.4851 sigma=0.4530 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0003300910432647304, 'timesteps': 2676}
[2026-04-13 16:58:26] UCB=3.3573 mu=1.8872 sigma=0.7351 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0015667124250089057, 'timesteps': 3815}
[2026-04-13 16:58:26] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 2, 'learning_rate': 6.342460411299125e-05, 'timesteps': 4090, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 16:58:28] [AutoResearch] Launching trial 25: {'n_steer': 8, 'n_throttle': 2, 'learning_rate': 6.342460411299125e-05, 'timesteps': 4090, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:02:19] [AutoResearch] Trial 25 finished in 230.3s, returncode=0
[2026-04-13 17:02:19] [AutoResearch] Trial 25: mean_reward=279.0168 std_reward=6.9418
[2026-04-13 17:02:19] [AutoResearch] === Trial 25 Summary ===
[2026-04-13 17:02:19] Total Phase 1 runs: 57
[2026-04-13 17:02:19] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:02:19] Top 5:
[2026-04-13 17:02:19] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:02:19] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:02:19] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:02:19] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:02:19] mean_reward=1072.7063 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00038717401417690916, 'timesteps': 2914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:02:21]
[AutoResearch] ========== Trial 26/50 ==========
[2026-04-13 17:02:21] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 17:02:21] UCB=3.3528 mu=2.3124 sigma=0.5202 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898}
[2026-04-13 17:02:21] UCB=3.1485 mu=2.0065 sigma=0.5710 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003938802691041149, 'timesteps': 4868}
[2026-04-13 17:02:21] UCB=2.4341 mu=1.1732 sigma=0.6304 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0006992915187758438, 'timesteps': 4823}
[2026-04-13 17:02:21] UCB=2.2359 mu=1.3933 sigma=0.4213 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0009559040057827017, 'timesteps': 2924}
[2026-04-13 17:02:21] UCB=2.1994 mu=1.9371 sigma=0.1312 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0007187013270107417, 'timesteps': 2786}
[2026-04-13 17:02:21] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:02:23] [AutoResearch] Launching trial 26: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:07:50] [AutoResearch] Trial 26 finished in 327.3s, returncode=0
[2026-04-13 17:07:50] [AutoResearch] Trial 26: mean_reward=2306.761 std_reward=6.7895
[2026-04-13 17:07:50] [AutoResearch] === Trial 26 Summary ===
[2026-04-13 17:07:50] Total Phase 1 runs: 58
[2026-04-13 17:07:50] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:07:50] Top 5:
[2026-04-13 17:07:50] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:07:50] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:07:50] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:07:50] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:07:50] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:07:52]
[AutoResearch] ========== Trial 27/50 ==========
[2026-04-13 17:07:52] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 17:07:52] UCB=2.3063 mu=1.3113 sigma=0.4975 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0010435891469892742, 'timesteps': 4484}
[2026-04-13 17:07:52] UCB=2.1799 mu=0.7760 sigma=0.7020 params={'n_steer': 3, 'n_throttle': 4, 'learning_rate': 0.0007345461142898144, 'timesteps': 4774}
[2026-04-13 17:07:52] UCB=2.0153 mu=0.4851 sigma=0.7651 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0018910995990971135, 'timesteps': 4919}
[2026-04-13 17:07:52] UCB=2.0019 mu=0.8923 sigma=0.5548 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0013556755233696148, 'timesteps': 4750}
[2026-04-13 17:07:52] UCB=1.8940 mu=1.3114 sigma=0.2913 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0002403184577692745, 'timesteps': 4370}
[2026-04-13 17:07:52] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0010435891469892742, 'timesteps': 4484, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:07:54] [AutoResearch] Launching trial 27: {'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0010435891469892742, 'timesteps': 4484, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:13:17] [AutoResearch] Trial 27 finished in 322.7s, returncode=0
[2026-04-13 17:13:17] [AutoResearch] Trial 27: mean_reward=332.1491 std_reward=0.1125
[2026-04-13 17:13:17] [AutoResearch] === Trial 27 Summary ===
[2026-04-13 17:13:17] Total Phase 1 runs: 59
[2026-04-13 17:13:17] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:13:17] Top 5:
[2026-04-13 17:13:17] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:13:17] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:13:17] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:13:17] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:13:17] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:13:19]
[AutoResearch] ========== Trial 28/50 ==========
[2026-04-13 17:13:19] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 17:13:19] UCB=3.2262 mu=2.8774 sigma=0.1744 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0007283347926280964, 'timesteps': 4804}
[2026-04-13 17:13:19] UCB=2.5026 mu=2.0716 sigma=0.2155 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.000678144351086595, 'timesteps': 4957}
[2026-04-13 17:13:19] UCB=2.1611 mu=0.6361 sigma=0.7625 params={'n_steer': 3, 'n_throttle': 4, 'learning_rate': 0.00034031839912747184, 'timesteps': 4885}
[2026-04-13 17:13:19] UCB=1.9295 mu=1.6080 sigma=0.1608 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0005111405199774711, 'timesteps': 2981}
[2026-04-13 17:13:19] UCB=1.8912 mu=1.4029 sigma=0.2442 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006829056707945425, 'timesteps': 2729}
[2026-04-13 17:13:19] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0007283347926280964, 'timesteps': 4804, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:13:21] [AutoResearch] Launching trial 28: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0007283347926280964, 'timesteps': 4804, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:18:46] [AutoResearch] Trial 28 finished in 325.3s, returncode=0
[2026-04-13 17:18:46] [AutoResearch] Trial 28: mean_reward=1125.892 std_reward=9.7091
[2026-04-13 17:18:46] [AutoResearch] === Trial 28 Summary ===
[2026-04-13 17:18:46] Total Phase 1 runs: 60
[2026-04-13 17:18:46] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:18:46] Top 5:
[2026-04-13 17:18:46] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:18:46] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:18:46] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:18:46] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:18:46] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:18:48]
[AutoResearch] ========== Trial 29/50 ==========
[2026-04-13 17:18:48] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 17:18:48] UCB=1.9565 mu=1.0416 sigma=0.4574 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0006213302398696545, 'timesteps': 3064}
[2026-04-13 17:18:48] UCB=1.7895 mu=-0.1956 sigma=0.9925 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.00496713790460354, 'timesteps': 4807}
[2026-04-13 17:18:48] UCB=1.7703 mu=-0.2014 sigma=0.9858 params={'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.004942965581265541, 'timesteps': 2075}
[2026-04-13 17:18:48] UCB=1.7475 mu=-0.1993 sigma=0.9734 params={'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.004884055358157497, 'timesteps': 2512}
[2026-04-13 17:18:48] UCB=1.7288 mu=-0.2113 sigma=0.9701 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.004639975825026214, 'timesteps': 2393}
[2026-04-13 17:18:48] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0006213302398696545, 'timesteps': 3064, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:18:50] [AutoResearch] Launching trial 29: {'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0006213302398696545, 'timesteps': 3064, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:22:42] [AutoResearch] Trial 29 finished in 231.5s, returncode=0
[2026-04-13 17:22:42] [AutoResearch] Trial 29: mean_reward=35.5084 std_reward=0.0228
[2026-04-13 17:22:42] [AutoResearch] === Trial 29 Summary ===
[2026-04-13 17:22:42] Total Phase 1 runs: 61
[2026-04-13 17:22:42] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:22:42] Top 5:
[2026-04-13 17:22:42] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:22:42] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:22:42] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:22:42] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:22:42] mean_reward=1157.0470 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00037737321665256695, 'timesteps': 2717, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:22:44]
[AutoResearch] ========== Trial 30/50 ==========
[2026-04-13 17:22:44] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 17:22:44] UCB=4.8341 mu=4.5783 sigma=0.1279 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977}
[2026-04-13 17:22:44] UCB=2.4343 mu=2.1541 sigma=0.1401 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.00036335401272427295, 'timesteps': 4842}
[2026-04-13 17:22:44] UCB=1.9834 mu=1.5341 sigma=0.2247 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0004804115725768152, 'timesteps': 4817}
[2026-04-13 17:22:44] UCB=1.8265 mu=0.4840 sigma=0.6712 params={'n_steer': 5, 'n_throttle': 2, 'learning_rate': 0.0011405885967124738, 'timesteps': 1031}
[2026-04-13 17:22:44] UCB=1.7885 mu=-0.1747 sigma=0.9816 params={'n_steer': 9, 'n_throttle': 5, 'learning_rate': 0.004833150908658096, 'timesteps': 3968}
[2026-04-13 17:22:44] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:22:46] [AutoResearch] Launching trial 30: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:28:19] [AutoResearch] Trial 30 finished in 332.9s, returncode=0
[2026-04-13 17:28:19] [AutoResearch] Trial 30: mean_reward=2286.9085 std_reward=1.508
[2026-04-13 17:28:19] [AutoResearch] === Trial 30 Summary ===
[2026-04-13 17:28:19] Total Phase 1 runs: 62
[2026-04-13 17:28:19] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:28:19] Top 5:
[2026-04-13 17:28:19] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:28:19] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:28:19] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:28:19] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:28:19] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:28:20] [AutoResearch] Git push complete after trial 30
[2026-04-13 17:28:22]
[AutoResearch] ========== Trial 31/50 ==========
[2026-04-13 17:28:22] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 17:28:22] UCB=1.8293 mu=1.4150 sigma=0.2071 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00026823443520900794, 'timesteps': 3642}
[2026-04-13 17:28:22] UCB=1.7919 mu=-0.1513 sigma=0.9716 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.004668146289604198, 'timesteps': 4165}
[2026-04-13 17:28:22] UCB=1.7859 mu=-0.1629 sigma=0.9744 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.004976003269391529, 'timesteps': 3465}
[2026-04-13 17:28:22] UCB=1.7747 mu=-0.1971 sigma=0.9859 params={'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.004814301490182538, 'timesteps': 1962}
[2026-04-13 17:28:22] UCB=1.7493 mu=-0.2268 sigma=0.9880 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.004644700370535985, 'timesteps': 1204}
[2026-04-13 17:28:22] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00026823443520900794, 'timesteps': 3642, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:28:24] [AutoResearch] Launching trial 31: {'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00026823443520900794, 'timesteps': 3642, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:31:58] [AutoResearch] Trial 31 finished in 214.0s, returncode=0
[2026-04-13 17:31:58] [AutoResearch] Trial 31: mean_reward=437.2376 std_reward=0.7096
[2026-04-13 17:31:58] [AutoResearch] === Trial 31 Summary ===
[2026-04-13 17:31:58] Total Phase 1 runs: 63
[2026-04-13 17:31:58] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:31:58] Top 5:
[2026-04-13 17:31:58] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:31:58] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:31:58] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:31:58] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:31:58] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:32:00]
[AutoResearch] ========== Trial 32/50 ==========
[2026-04-13 17:32:00] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 17:32:00] UCB=1.9266 mu=1.5375 sigma=0.1945 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 9.211025615061542e-05, 'timesteps': 3666}
[2026-04-13 17:32:00] UCB=1.8664 mu=1.2996 sigma=0.2834 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0012635534641106208, 'timesteps': 1084}
[2026-04-13 17:32:00] UCB=1.7961 mu=-0.1766 sigma=0.9864 params={'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.004925273491069971, 'timesteps': 1904}
[2026-04-13 17:32:00] UCB=1.7562 mu=-0.2134 sigma=0.9848 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.004794684466003254, 'timesteps': 1740}
[2026-04-13 17:32:00] UCB=1.7532 mu=-0.2189 sigma=0.9861 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.00445038188039129, 'timesteps': 1658}
[2026-04-13 17:32:00] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 9.211025615061542e-05, 'timesteps': 3666, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:32:02] [AutoResearch] Launching trial 32: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 9.211025615061542e-05, 'timesteps': 3666, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:35:28] [AutoResearch] Trial 32 finished in 206.1s, returncode=0
[2026-04-13 17:35:28] [AutoResearch] Trial 32: mean_reward=436.5947 std_reward=0.5625
[2026-04-13 17:35:28] [AutoResearch] === Trial 32 Summary ===
[2026-04-13 17:35:28] Total Phase 1 runs: 64
[2026-04-13 17:35:28] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:35:28] Top 5:
[2026-04-13 17:35:28] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:35:28] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:35:28] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:35:28] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:35:28] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:35:30]
[AutoResearch] ========== Trial 33/50 ==========
[2026-04-13 17:35:30] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 17:35:30] UCB=1.8230 mu=-0.1558 sigma=0.9894 params={'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.004904321552257595, 'timesteps': 2059}
[2026-04-13 17:35:30] UCB=1.8185 mu=-0.1516 sigma=0.9851 params={'n_steer': 7, 'n_throttle': 5, 'learning_rate': 0.004976831565587714, 'timesteps': 1229}
[2026-04-13 17:35:30] UCB=1.8177 mu=-0.1547 sigma=0.9862 params={'n_steer': 9, 'n_throttle': 5, 'learning_rate': 0.003732008547933269, 'timesteps': 4958}
[2026-04-13 17:35:30] UCB=1.7862 mu=-0.1917 sigma=0.9889 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.004729802758232317, 'timesteps': 1370}
[2026-04-13 17:35:30] UCB=1.7841 mu=-0.1792 sigma=0.9816 params={'n_steer': 6, 'n_throttle': 5, 'learning_rate': 0.004976832119872584, 'timesteps': 2341}
[2026-04-13 17:35:30] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.004904321552257595, 'timesteps': 2059, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:35:32] [AutoResearch] Launching trial 33: {'n_steer': 5, 'n_throttle': 5, 'learning_rate': 0.004904321552257595, 'timesteps': 2059, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:40:09] [AutoResearch] Trial 33 finished in 277.2s, returncode=0
[2026-04-13 17:40:09] [AutoResearch] Trial 33: mean_reward=15.6793 std_reward=0.0505
[2026-04-13 17:40:09] [AutoResearch] === Trial 33 Summary ===
[2026-04-13 17:40:09] Total Phase 1 runs: 65
[2026-04-13 17:40:09] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:40:09] Top 5:
[2026-04-13 17:40:09] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:40:09] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:40:09] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:40:09] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:40:09] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:40:11]
[AutoResearch] ========== Trial 34/50 ==========
[2026-04-13 17:40:11] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 17:40:11] UCB=1.8067 mu=1.4764 sigma=0.1652 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00038053863101293335, 'timesteps': 3088}
[2026-04-13 17:40:11] UCB=1.8047 mu=-0.1491 sigma=0.9769 params={'n_steer': 7, 'n_throttle': 5, 'learning_rate': 0.004636844298406863, 'timesteps': 4808}
[2026-04-13 17:40:11] UCB=1.6949 mu=-0.2530 sigma=0.9739 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.004917100163005054, 'timesteps': 4026}
[2026-04-13 17:40:11] UCB=1.6353 mu=-0.2991 sigma=0.9672 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0046635403755889955, 'timesteps': 4363}
[2026-04-13 17:40:11] UCB=1.6196 mu=-0.2378 sigma=0.9287 params={'n_steer': 4, 'n_throttle': 2, 'learning_rate': 0.004469470790956243, 'timesteps': 1524}
[2026-04-13 17:40:11] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00038053863101293335, 'timesteps': 3088, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:40:13] [AutoResearch] Launching trial 34: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00038053863101293335, 'timesteps': 3088, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:43:51] [AutoResearch] Trial 34 finished in 217.4s, returncode=0
[2026-04-13 17:43:51] [AutoResearch] Trial 34: mean_reward=638.2092 std_reward=0.5092
[2026-04-13 17:43:51] [AutoResearch] === Trial 34 Summary ===
[2026-04-13 17:43:51] Total Phase 1 runs: 66
[2026-04-13 17:43:51] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:43:51] Top 5:
[2026-04-13 17:43:51] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:43:51] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:43:51] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:43:51] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:43:51] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:43:53]
[AutoResearch] ========== Trial 35/50 ==========
[2026-04-13 17:43:53] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 17:43:53] UCB=2.9343 mu=2.4626 sigma=0.2358 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0002146725458057153, 'timesteps': 4944}
[2026-04-13 17:43:53] UCB=2.5541 mu=2.1447 sigma=0.2047 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0006747069310446365, 'timesteps': 4902}
[2026-04-13 17:43:53] UCB=2.4444 mu=2.0202 sigma=0.2121 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.00030671251398934967, 'timesteps': 4520}
[2026-04-13 17:43:53] UCB=1.8159 mu=-0.1667 sigma=0.9913 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.004716216831648299, 'timesteps': 4776}
[2026-04-13 17:43:53] UCB=1.7143 mu=0.9845 sigma=0.3649 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0022546821210683806, 'timesteps': 1041}
[2026-04-13 17:43:53] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0002146725458057153, 'timesteps': 4944, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:43:55] [AutoResearch] Launching trial 35: {'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0002146725458057153, 'timesteps': 4944, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:49:20] [AutoResearch] Trial 35 finished in 325.0s, returncode=0
[2026-04-13 17:49:20] [AutoResearch] Trial 35: mean_reward=540.5951 std_reward=38.2538
[2026-04-13 17:49:20] [AutoResearch] === Trial 35 Summary ===
[2026-04-13 17:49:20] Total Phase 1 runs: 67
[2026-04-13 17:49:20] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:49:20] Top 5:
[2026-04-13 17:49:20] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:49:20] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:49:20] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:49:20] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:49:20] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:49:22]
[AutoResearch] ========== Trial 36/50 ==========
[2026-04-13 17:49:22] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 17:49:22] UCB=3.4845 mu=3.1499 sigma=0.1673 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00036413507909569123, 'timesteps': 4891}
[2026-04-13 17:49:22] UCB=3.0729 mu=1.8399 sigma=0.6165 params={'n_steer': 3, 'n_throttle': 3, 'learning_rate': 0.0005607137759058633, 'timesteps': 4948}
[2026-04-13 17:49:22] UCB=2.1034 mu=1.1264 sigma=0.4885 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 9.682211641519502e-05, 'timesteps': 4801}
[2026-04-13 17:49:22] UCB=1.9962 mu=1.5360 sigma=0.2301 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.00044878665030086243, 'timesteps': 4962}
[2026-04-13 17:49:22] UCB=1.7950 mu=-0.1681 sigma=0.9815 params={'n_steer': 9, 'n_throttle': 5, 'learning_rate': 0.004119006807166534, 'timesteps': 4402}
[2026-04-13 17:49:22] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00036413507909569123, 'timesteps': 4891, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:49:24] [AutoResearch] Launching trial 36: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00036413507909569123, 'timesteps': 4891, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:55:26] [AutoResearch] Trial 36 finished in 362.4s, returncode=0
[2026-04-13 17:55:26] [AutoResearch] Trial 36: mean_reward=1101.0573 std_reward=2.347
[2026-04-13 17:55:26] [AutoResearch] === Trial 36 Summary ===
[2026-04-13 17:55:26] Total Phase 1 runs: 68
[2026-04-13 17:55:26] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:55:26] Top 5:
[2026-04-13 17:55:26] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:55:26] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:55:26] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:55:26] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:55:26] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:55:28]
[AutoResearch] ========== Trial 37/50 ==========
[2026-04-13 17:55:28] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 17:55:28] UCB=2.5270 mu=2.0453 sigma=0.2408 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0011149861891360494, 'timesteps': 4845}
[2026-04-13 17:55:28] UCB=1.9170 mu=-0.0784 sigma=0.9977 params={'n_steer': 9, 'n_throttle': 5, 'learning_rate': 0.004779159072219799, 'timesteps': 4905}
[2026-04-13 17:55:28] UCB=1.8806 mu=-0.1103 sigma=0.9954 params={'n_steer': 9, 'n_throttle': 5, 'learning_rate': 0.004921533890627206, 'timesteps': 4615}
[2026-04-13 17:55:28] UCB=1.8401 mu=-0.1328 sigma=0.9865 params={'n_steer': 9, 'n_throttle': 5, 'learning_rate': 0.004334444786923186, 'timesteps': 4250}
[2026-04-13 17:55:28] UCB=1.6242 mu=-0.2667 sigma=0.9455 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.0048913353300923485, 'timesteps': 3328}
[2026-04-13 17:55:28] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0011149861891360494, 'timesteps': 4845, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 17:55:30] [AutoResearch] Launching trial 37: {'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.0011149861891360494, 'timesteps': 4845, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:00:33] [AutoResearch] Trial 37 finished in 302.3s, returncode=0
[2026-04-13 18:00:33] [AutoResearch] Trial 37: mean_reward=15.189 std_reward=0.0224
[2026-04-13 18:00:33] [AutoResearch] === Trial 37 Summary ===
[2026-04-13 18:00:33] Total Phase 1 runs: 69
[2026-04-13 18:00:33] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:00:33] Top 5:
[2026-04-13 18:00:33] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:00:33] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:00:33] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:00:33] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:00:33] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:00:35]
[AutoResearch] ========== Trial 38/50 ==========
[2026-04-13 18:00:35] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 18:00:35] UCB=2.5712 mu=2.2343 sigma=0.1684 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0007984038886242428, 'timesteps': 4697}
[2026-04-13 18:00:35] UCB=2.1377 mu=1.0094 sigma=0.5642 params={'n_steer': 3, 'n_throttle': 3, 'learning_rate': 0.0005556402930741495, 'timesteps': 4665}
[2026-04-13 18:00:35] UCB=2.0180 mu=1.7138 sigma=0.1521 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0005979497433581058, 'timesteps': 4838}
[2026-04-13 18:00:35] UCB=1.7800 mu=-0.1920 sigma=0.9860 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.00497349496844323, 'timesteps': 4446}
[2026-04-13 18:00:35] UCB=1.6935 mu=-0.2726 sigma=0.9831 params={'n_steer': 9, 'n_throttle': 4, 'learning_rate': 0.004892390050969622, 'timesteps': 4620}
[2026-04-13 18:00:35] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0007984038886242428, 'timesteps': 4697, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:00:37] [AutoResearch] Launching trial 38: {'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0007984038886242428, 'timesteps': 4697, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:05:51] [AutoResearch] Trial 38 finished in 314.1s, returncode=0
[2026-04-13 18:05:51] [AutoResearch] Trial 38: mean_reward=634.4026 std_reward=27.7421
[2026-04-13 18:05:51] [AutoResearch] === Trial 38 Summary ===
[2026-04-13 18:05:51] Total Phase 1 runs: 70
[2026-04-13 18:05:51] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:05:51] Top 5:
[2026-04-13 18:05:51] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:05:51] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:05:51] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:05:51] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:05:51] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:05:53]
[AutoResearch] ========== Trial 39/50 ==========
[2026-04-13 18:05:53] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 18:05:53] UCB=4.5962 mu=4.2417 sigma=0.1772 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0011274566858403105, 'timesteps': 4920}
[2026-04-13 18:05:53] UCB=4.3882 mu=3.3729 sigma=0.5076 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0011908186822811213, 'timesteps': 4883}
[2026-04-13 18:05:53] UCB=3.9910 mu=3.4870 sigma=0.2520 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0006332928008448982, 'timesteps': 4906}
[2026-04-13 18:05:53] UCB=1.7785 mu=-0.1716 sigma=0.9750 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.003871616386476668, 'timesteps': 4924}
[2026-04-13 18:05:53] UCB=1.7617 mu=-0.2085 sigma=0.9851 params={'n_steer': 8, 'n_throttle': 5, 'learning_rate': 0.004940867117565567, 'timesteps': 4330}
[2026-04-13 18:05:53] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0011274566858403105, 'timesteps': 4920, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:05:55] [AutoResearch] Launching trial 39: {'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0011274566858403105, 'timesteps': 4920, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:10:40] [AutoResearch] Trial 39 finished in 285.6s, returncode=0
[2026-04-13 18:10:40] [AutoResearch] Trial 39: mean_reward=59.3316 std_reward=0.52
[2026-04-13 18:10:40] [AutoResearch] === Trial 39 Summary ===
[2026-04-13 18:10:40] Total Phase 1 runs: 71
[2026-04-13 18:10:40] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:10:40] Top 5:
[2026-04-13 18:10:40] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:10:40] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:10:40] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:10:40] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:10:40] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:10:42]
[AutoResearch] ========== Trial 40/50 ==========
[2026-04-13 18:10:42] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 18:10:42] UCB=8.0147 mu=6.2460 sigma=0.8844 params={'n_steer': 4, 'n_throttle': 2, 'learning_rate': 0.003545717728698868, 'timesteps': 4844}
[2026-04-13 18:10:42] UCB=7.6008 mu=5.7060 sigma=0.9474 params={'n_steer': 4, 'n_throttle': 2, 'learning_rate': 0.0041166941707886885, 'timesteps': 4609}
[2026-04-13 18:10:42] UCB=7.4375 mu=5.7581 sigma=0.8397 params={'n_steer': 5, 'n_throttle': 2, 'learning_rate': 0.003813319367886245, 'timesteps': 4028}
[2026-04-13 18:10:42] UCB=7.1609 mu=5.6563 sigma=0.7523 params={'n_steer': 5, 'n_throttle': 2, 'learning_rate': 0.003093130771847368, 'timesteps': 4995}
[2026-04-13 18:10:42] UCB=6.9083 mu=5.0808 sigma=0.9138 params={'n_steer': 5, 'n_throttle': 2, 'learning_rate': 0.00407600621129732, 'timesteps': 4000}
[2026-04-13 18:10:42] [AutoResearch] Proposed: {'n_steer': 4, 'n_throttle': 2, 'learning_rate': 0.003545717728698868, 'timesteps': 4844, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:10:44] [AutoResearch] Launching trial 40: {'n_steer': 4, 'n_throttle': 2, 'learning_rate': 0.003545717728698868, 'timesteps': 4844, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:15:31] [AutoResearch] Trial 40 finished in 286.2s, returncode=0
[2026-04-13 18:15:31] [AutoResearch] Trial 40: mean_reward=15.0795 std_reward=0.0038
[2026-04-13 18:15:31] [AutoResearch] === Trial 40 Summary ===
[2026-04-13 18:15:31] Total Phase 1 runs: 72
[2026-04-13 18:15:31] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:15:31] Top 5:
[2026-04-13 18:15:31] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:15:31] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:15:31] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:15:31] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:15:31] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:15:32] [AutoResearch] Git push complete after trial 40
[2026-04-13 18:15:34]
[AutoResearch] ========== Trial 41/50 ==========
[2026-04-13 18:15:34] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 18:15:34] UCB=7.3140 mu=5.9529 sigma=0.6805 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0032819869113354994, 'timesteps': 4854}
[2026-04-13 18:15:34] UCB=6.7856 mu=5.4612 sigma=0.6622 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0029238891580978905, 'timesteps': 4975}
[2026-04-13 18:15:34] UCB=6.7722 mu=4.9655 sigma=0.9033 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.004491394112651072, 'timesteps': 4986}
[2026-04-13 18:15:34] UCB=6.4527 mu=4.8075 sigma=0.8226 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.003979484400133024, 'timesteps': 4359}
[2026-04-13 18:15:34] UCB=5.5984 mu=4.0364 sigma=0.7810 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0032037950825406776, 'timesteps': 4728}
[2026-04-13 18:15:34] [AutoResearch] Proposed: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0032819869113354994, 'timesteps': 4854, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:15:36] [AutoResearch] Launching trial 41: {'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0032819869113354994, 'timesteps': 4854, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:23:10] [AutoResearch] Trial 41 finished in 454.0s, returncode=0
[2026-04-13 18:23:10] [AutoResearch] Trial 41: mean_reward=15.6571 std_reward=0.0079
[2026-04-13 18:23:10] [AutoResearch] === Trial 41 Summary ===
[2026-04-13 18:23:10] Total Phase 1 runs: 73
[2026-04-13 18:23:10] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:23:10] Top 5:
[2026-04-13 18:23:10] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:23:10] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:23:10] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:23:10] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:23:10] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:23:12]
[AutoResearch] ========== Trial 42/50 ==========
[2026-04-13 18:23:12] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 18:23:12] UCB=7.4697 mu=5.6946 sigma=0.8875 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.004494661479461838, 'timesteps': 4811}
[2026-04-13 18:23:12] UCB=7.3759 mu=5.5905 sigma=0.8927 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.004611300185903181, 'timesteps': 4938}
[2026-04-13 18:23:12] UCB=5.4529 mu=3.6798 sigma=0.8865 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.004782538904571954, 'timesteps': 4160}
[2026-04-13 18:23:12] UCB=5.3153 mu=4.0688 sigma=0.6233 params={'n_steer': 5, 'n_throttle': 2, 'learning_rate': 0.004053286435875833, 'timesteps': 4622}
[2026-04-13 18:23:12] UCB=3.7995 mu=2.3517 sigma=0.7239 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0036533517137863577, 'timesteps': 3930}
[2026-04-13 18:23:12] [AutoResearch] Proposed: {'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.004494661479461838, 'timesteps': 4811, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:23:14] [AutoResearch] Launching trial 42: {'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.004494661479461838, 'timesteps': 4811, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:30:58] [AutoResearch] Trial 42 finished in 463.9s, returncode=0
[2026-04-13 18:30:58] [AutoResearch] Trial 42: mean_reward=980.0742 std_reward=27.1137
[2026-04-13 18:30:58] [AutoResearch] === Trial 42 Summary ===
[2026-04-13 18:30:58] Total Phase 1 runs: 74
[2026-04-13 18:30:58] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:30:58] Top 5:
[2026-04-13 18:30:58] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:30:58] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:30:58] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:30:58] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:30:58] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:31:00]
[AutoResearch] ========== Trial 43/50 ==========
[2026-04-13 18:31:00] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 18:31:00] UCB=6.9048 mu=5.7145 sigma=0.5951 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.004086162667261962, 'timesteps': 4914}
[2026-04-13 18:31:00] UCB=4.8000 mu=3.4867 sigma=0.6566 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.004220414914967129, 'timesteps': 4781}
[2026-04-13 18:31:00] UCB=4.7529 mu=3.9881 sigma=0.3824 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0033340801031464364, 'timesteps': 4807}
[2026-04-13 18:31:00] UCB=3.8268 mu=2.8515 sigma=0.4877 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0031689897677083267, 'timesteps': 4975}
[2026-04-13 18:31:00] UCB=3.8163 mu=2.2537 sigma=0.7813 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.004053620338510361, 'timesteps': 3939}
[2026-04-13 18:31:00] [AutoResearch] Proposed: {'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.004086162667261962, 'timesteps': 4914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:31:02] [AutoResearch] Launching trial 43: {'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.004086162667261962, 'timesteps': 4914, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:38:03] [AutoResearch] Trial 43 finished in 420.3s, returncode=0
[2026-04-13 18:38:03] [AutoResearch] Trial 43: mean_reward=15.822 std_reward=0.0155
[2026-04-13 18:38:03] [AutoResearch] === Trial 43 Summary ===
[2026-04-13 18:38:03] Total Phase 1 runs: 75
[2026-04-13 18:38:03] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:38:03] Top 5:
[2026-04-13 18:38:03] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:38:03] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:38:03] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:38:03] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:38:03] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:38:05]
[AutoResearch] ========== Trial 44/50 ==========
[2026-04-13 18:38:05] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 18:38:05] UCB=3.5615 mu=2.1320 sigma=0.7148 params={'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.0029018837242704995, 'timesteps': 4982}
[2026-04-13 18:38:05] UCB=2.8781 mu=2.1898 sigma=0.3442 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0041151904231291935, 'timesteps': 4953}
[2026-04-13 18:38:05] UCB=2.6776 mu=1.6178 sigma=0.5299 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0027281830436155017, 'timesteps': 4721}
[2026-04-13 18:38:05] UCB=2.3508 mu=1.1391 sigma=0.6058 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.003576630797117473, 'timesteps': 3955}
[2026-04-13 18:38:05] UCB=2.3297 mu=1.9936 sigma=0.1680 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.00105351764257976, 'timesteps': 4976}
[2026-04-13 18:38:05] [AutoResearch] Proposed: {'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.0029018837242704995, 'timesteps': 4982, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:38:07] [AutoResearch] Launching trial 44: {'n_steer': 9, 'n_throttle': 2, 'learning_rate': 0.0029018837242704995, 'timesteps': 4982, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:45:15] [AutoResearch] Trial 44 finished in 428.2s, returncode=0
[2026-04-13 18:45:15] [AutoResearch] Trial 44: mean_reward=14.9716 std_reward=0.0169
[2026-04-13 18:45:15] [AutoResearch] === Trial 44 Summary ===
[2026-04-13 18:45:15] Total Phase 1 runs: 76
[2026-04-13 18:45:15] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:45:15] Top 5:
[2026-04-13 18:45:15] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:45:15] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:45:15] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:45:15] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:45:15] mean_reward=1389.3806 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005504110507719487, 'timesteps': 2472, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:45:17]
[AutoResearch] ========== Trial 45/50 ==========
[2026-04-13 18:45:17] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 18:45:17] UCB=3.7477 mu=3.0769 sigma=0.3354 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0007707060431765714, 'timesteps': 4976}
[2026-04-13 18:45:17] UCB=2.5970 mu=1.6330 sigma=0.4820 params={'n_steer': 4, 'n_throttle': 4, 'learning_rate': 0.00020050085345383605, 'timesteps': 4929}
[2026-04-13 18:45:17] UCB=2.5924 mu=1.4768 sigma=0.5578 params={'n_steer': 3, 'n_throttle': 3, 'learning_rate': 0.0004133771186667665, 'timesteps': 4856}
[2026-04-13 18:45:17] UCB=2.2774 mu=0.9909 sigma=0.6433 params={'n_steer': 3, 'n_throttle': 3, 'learning_rate': 0.000258239957476511, 'timesteps': 4860}
[2026-04-13 18:45:17] UCB=2.1102 mu=1.4400 sigma=0.3351 params={'n_steer': 7, 'n_throttle': 2, 'learning_rate': 0.0032581964017107715, 'timesteps': 4659}
[2026-04-13 18:45:17] [AutoResearch] Proposed: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0007707060431765714, 'timesteps': 4976, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:45:19] [AutoResearch] Launching trial 45: {'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0007707060431765714, 'timesteps': 4976, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:51:12] [AutoResearch] Trial 45 finished in 353.6s, returncode=0
[2026-04-13 18:51:12] [AutoResearch] Trial 45: mean_reward=4314.8893 std_reward=709.8281
[2026-04-13 18:51:12] [AutoResearch] === Trial 45 Summary ===
[2026-04-13 18:51:12] Total Phase 1 runs: 77
[2026-04-13 18:51:12] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:51:12] Top 5:
[2026-04-13 18:51:12] mean_reward=4314.8893 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0007707060431765714, 'timesteps': 4976, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:51:12] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:51:12] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:51:12] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:51:12] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:51:14]
[AutoResearch] ========== Trial 46/50 ==========
[2026-04-13 18:51:14] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 18:51:14] UCB=6.4999 mu=5.0495 sigma=0.7252 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0023937750629647986, 'timesteps': 4836}
[2026-04-13 18:51:14] UCB=5.7084 mu=4.9106 sigma=0.3989 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0008748090132533172, 'timesteps': 4866}
[2026-04-13 18:51:14] UCB=5.5209 mu=3.8795 sigma=0.8207 params={'n_steer': 3, 'n_throttle': 4, 'learning_rate': 0.001388455889997006, 'timesteps': 4861}
[2026-04-13 18:51:14] UCB=5.3207 mu=3.4850 sigma=0.9178 params={'n_steer': 3, 'n_throttle': 4, 'learning_rate': 0.002305837623532898, 'timesteps': 4993}
[2026-04-13 18:51:14] UCB=4.4431 mu=2.7688 sigma=0.8372 params={'n_steer': 3, 'n_throttle': 3, 'learning_rate': 0.0023504082814860415, 'timesteps': 4636}
[2026-04-13 18:51:14] [AutoResearch] Proposed: {'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0023937750629647986, 'timesteps': 4836, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:51:16] [AutoResearch] Launching trial 46: {'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0023937750629647986, 'timesteps': 4836, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:57:23] [AutoResearch] Trial 46 finished in 366.5s, returncode=0
[2026-04-13 18:57:23] [AutoResearch] Trial 46: mean_reward=14.9853 std_reward=0.0252
[2026-04-13 18:57:23] [AutoResearch] === Trial 46 Summary ===
[2026-04-13 18:57:23] Total Phase 1 runs: 78
[2026-04-13 18:57:23] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:57:23] Top 5:
[2026-04-13 18:57:23] mean_reward=4314.8893 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0007707060431765714, 'timesteps': 4976, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:57:23] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:57:23] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:57:23] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:57:23] mean_reward=1859.8470 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0005669006119489946, 'timesteps': 2156, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:57:25]
[AutoResearch] ========== Trial 47/50 ==========
[2026-04-13 18:57:25] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 18:57:25] UCB=4.1357 mu=3.0492 sigma=0.5433 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0004222996001111442, 'timesteps': 4942}
[2026-04-13 18:57:25] UCB=2.2577 mu=1.0331 sigma=0.6123 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.001404145260025, 'timesteps': 4843}
[2026-04-13 18:57:25] UCB=1.8012 mu=0.3762 sigma=0.7125 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0019228183558907053, 'timesteps': 4832}
[2026-04-13 18:57:25] UCB=1.7332 mu=1.2215 sigma=0.2559 params={'n_steer': 8, 'n_throttle': 2, 'learning_rate': 0.0040978681388418505, 'timesteps': 4846}
[2026-04-13 18:57:25] UCB=1.7225 mu=-0.1961 sigma=0.9593 params={'n_steer': 9, 'n_throttle': 5, 'learning_rate': 0.0034754845633088315, 'timesteps': 4434}
[2026-04-13 18:57:25] [AutoResearch] Proposed: {'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0004222996001111442, 'timesteps': 4942, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 18:57:27] [AutoResearch] Launching trial 47: {'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0004222996001111442, 'timesteps': 4942, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:03:22] [AutoResearch] Trial 47 finished in 355.3s, returncode=0
[2026-04-13 19:03:22] [AutoResearch] Trial 47: mean_reward=4462.293 std_reward=2.1401
[2026-04-13 19:03:22] [AutoResearch] === Trial 47 Summary ===
[2026-04-13 19:03:22] Total Phase 1 runs: 79
[2026-04-13 19:03:22] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:03:22] Top 5:
[2026-04-13 19:03:22] mean_reward=4462.2930 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0004222996001111442, 'timesteps': 4942, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:03:22] mean_reward=4314.8893 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0007707060431765714, 'timesteps': 4976, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:03:22] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:03:22] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:03:22] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:03:24]
[AutoResearch] ========== Trial 48/50 ==========
[2026-04-13 19:03:24] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 19:03:24] UCB=4.1327 mu=2.7945 sigma=0.6691 params={'n_steer': 3, 'n_throttle': 2, 'learning_rate': 0.0011350422703903862, 'timesteps': 4898}
[2026-04-13 19:03:24] UCB=3.1163 mu=2.2489 sigma=0.4337 params={'n_steer': 3, 'n_throttle': 3, 'learning_rate': 0.00121598827031687, 'timesteps': 4696}
[2026-04-13 19:03:24] UCB=2.5824 mu=1.2026 sigma=0.6899 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0016751005980373689, 'timesteps': 4825}
[2026-04-13 19:03:24] UCB=2.2379 mu=1.9070 sigma=0.1655 params={'n_steer': 6, 'n_throttle': 3, 'learning_rate': 0.001006125943831317, 'timesteps': 4941}
[2026-04-13 19:03:24] UCB=2.1810 mu=1.1490 sigma=0.5160 params={'n_steer': 4, 'n_throttle': 2, 'learning_rate': 0.0005172426207023312, 'timesteps': 4618}
[2026-04-13 19:03:24] [AutoResearch] Proposed: {'n_steer': 3, 'n_throttle': 2, 'learning_rate': 0.0011350422703903862, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:03:26] [AutoResearch] Launching trial 48: {'n_steer': 3, 'n_throttle': 2, 'learning_rate': 0.0011350422703903862, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:08:23] [AutoResearch] Trial 48 finished in 296.5s, returncode=0
[2026-04-13 19:08:23] [AutoResearch] Trial 48: mean_reward=16.8539 std_reward=0.0201
[2026-04-13 19:08:23] [AutoResearch] === Trial 48 Summary ===
[2026-04-13 19:08:23] Total Phase 1 runs: 80
[2026-04-13 19:08:23] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:08:23] Top 5:
[2026-04-13 19:08:23] mean_reward=4462.2930 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0004222996001111442, 'timesteps': 4942, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:08:23] mean_reward=4314.8893 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0007707060431765714, 'timesteps': 4976, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:08:23] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:08:23] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:08:23] mean_reward=2237.9305 params={'n_steer': 6, 'n_throttle': 2, 'learning_rate': 0.0005660634897015402, 'timesteps': 4954, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:08:25]
[AutoResearch] ========== Trial 49/50 ==========
[2026-04-13 19:08:25] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 19:08:25] UCB=6.7812 mu=6.4357 sigma=0.1727 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0010146909128518657, 'timesteps': 4979}
[2026-04-13 19:08:25] UCB=3.9810 mu=3.6607 sigma=0.1602 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0019488453328446698, 'timesteps': 4998}
[2026-04-13 19:08:25] UCB=2.8979 mu=1.9399 sigma=0.4790 params={'n_steer': 4, 'n_throttle': 4, 'learning_rate': 0.0001502626554457094, 'timesteps': 4829}
[2026-04-13 19:08:25] UCB=2.4072 mu=0.8987 sigma=0.7542 params={'n_steer': 4, 'n_throttle': 5, 'learning_rate': 0.0015006496751962756, 'timesteps': 4966}
[2026-04-13 19:08:25] UCB=2.3469 mu=0.6192 sigma=0.8639 params={'n_steer': 3, 'n_throttle': 4, 'learning_rate': 0.001942190017735736, 'timesteps': 4781}
[2026-04-13 19:08:25] [AutoResearch] Proposed: {'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0010146909128518657, 'timesteps': 4979, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:08:27] [AutoResearch] Launching trial 49: {'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0010146909128518657, 'timesteps': 4979, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:13:25] [AutoResearch] Trial 49 finished in 298.3s, returncode=0
[2026-04-13 19:13:25] [AutoResearch] Trial 49: mean_reward=3332.0024 std_reward=5.8587
[2026-04-13 19:13:25] [AutoResearch] === Trial 49 Summary ===
[2026-04-13 19:13:25] Total Phase 1 runs: 81
[2026-04-13 19:13:25] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:13:25] Top 5:
[2026-04-13 19:13:25] mean_reward=4462.2930 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0004222996001111442, 'timesteps': 4942, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:13:25] mean_reward=4314.8893 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0007707060431765714, 'timesteps': 4976, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:13:25] mean_reward=3332.0024 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0010146909128518657, 'timesteps': 4979, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:13:25] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:13:25] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:13:27]
[AutoResearch] ========== Trial 50/50 ==========
[2026-04-13 19:13:27] [AutoResearch] GP UCB top-5 candidates:
[2026-04-13 19:13:27] UCB=4.4481 mu=4.0168 sigma=0.2157 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0007722480975921185, 'timesteps': 4972}
[2026-04-13 19:13:27] UCB=3.7707 mu=3.0773 sigma=0.3467 params={'n_steer': 4, 'n_throttle': 2, 'learning_rate': 0.00025395141290818565, 'timesteps': 4833}
[2026-04-13 19:13:27] UCB=3.5632 mu=2.6471 sigma=0.4581 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.003481880426416106, 'timesteps': 4938}
[2026-04-13 19:13:27] UCB=3.4746 mu=1.8819 sigma=0.7964 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.004090738433673601, 'timesteps': 4988}
[2026-04-13 19:13:27] UCB=2.9865 mu=1.3160 sigma=0.8353 params={'n_steer': 5, 'n_throttle': 4, 'learning_rate': 0.0030171536490468946, 'timesteps': 4972}
[2026-04-13 19:13:27] [AutoResearch] Proposed: {'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0007722480975921185, 'timesteps': 4972, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:13:29] [AutoResearch] Launching trial 50: {'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0007722480975921185, 'timesteps': 4972, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:17:56] [AutoResearch] Trial 50 finished in 267.0s, returncode=0
[2026-04-13 19:17:56] [AutoResearch] Trial 50: mean_reward=20.8886 std_reward=0.0537
[2026-04-13 19:17:56] [AutoResearch] === Trial 50 Summary ===
[2026-04-13 19:17:56] Total Phase 1 runs: 82
[2026-04-13 19:17:56] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:17:56] Top 5:
[2026-04-13 19:17:56] mean_reward=4462.2930 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0004222996001111442, 'timesteps': 4942, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:17:56] mean_reward=4314.8893 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0007707060431765714, 'timesteps': 4976, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:17:56] mean_reward=3332.0024 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0010146909128518657, 'timesteps': 4979, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:17:56] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:17:56] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:17:58] [AutoResearch] Git push complete after trial 50
[2026-04-13 19:18:00] [AutoResearch] All trials complete!
[2026-04-13 19:18:00] [AutoResearch] === Trial 50 Summary ===
[2026-04-13 19:18:00] Total Phase 1 runs: 82
[2026-04-13 19:18:00] Champion: trial=5 mean_reward=4582.7984 params={'n_steer': 7, 'n_throttle': 3, 'learning_rate': 0.0006801262090358742, 'timesteps': 4787, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:18:00] Top 5:
[2026-04-13 19:18:00] mean_reward=4462.2930 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0004222996001111442, 'timesteps': 4942, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:18:00] mean_reward=4314.8893 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0007707060431765714, 'timesteps': 4976, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:18:00] mean_reward=3332.0024 params={'n_steer': 4, 'n_throttle': 3, 'learning_rate': 0.0010146909128518657, 'timesteps': 4979, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:18:00] mean_reward=2306.7610 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0004488352572615814, 'timesteps': 4898, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}
[2026-04-13 19:18:00] mean_reward=2286.9085 params={'n_steer': 5, 'n_throttle': 3, 'learning_rate': 0.0003386484278685721, 'timesteps': 4977, 'agent': 'ppo', 'eval_episodes': 3, 'reward_shaping': True}