[21:23:45] ============================================================
[21:23:45] Exp 22: generated_road + generated_track, warm-started, v6 reward
[21:23:45] Warm start: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[21:23:45] Sim 1: localhost:9091 -> generated_road
[21:23:45] Sim 2: localhost:9093 -> generated_track
[21:23:45] throttle_min=0.2, lr=0.000225, total=150,000
[21:23:45] Reward: v6 (speed x CTE with progress/efficiency exploit termination)
[21:23:45] Stuck timeout: 8.0s, hard cap: 25.0s
[21:23:45] Progress patience: 100 steps
[21:23:45] Checkpoints: every 10,000 steps
[21:23:45] ============================================================
[21:23:45] Creating DummyVecEnv with the two road tracks...
starting DonkeyGym env
Setting default: start_delay 5.0
Setting default: max_cte 8.0
Setting default: frame_skip 1
Setting default: cam_resolution (120, 160, 3)
Setting default: log_level 20
Setting default: steer_limit 1.0
Setting default: throttle_min 0.0
Setting default: throttle_max 1.0
loading scene generated_road
starting DonkeyGym env
Setting default: start_delay 5.0
Setting default: max_cte 8.0
Setting default: frame_skip 1
Setting default: cam_resolution (120, 160, 3)
Setting default: log_level 20
Setting default: steer_limit 1.0
Setting default: throttle_min 0.0
Setting default: throttle_max 1.0
loading scene generated_track
[21:23:47] VecEnv num_envs=2, obs=(3, 120, 160)
[21:23:50] Warm-start model attached. Starting training...
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 118      |
|    ep_rew_mean     | 102      |
| time/              |          |
|    fps             | 29       |
|    iterations      | 1        |
|    time_elapsed    | 139      |
|    total_timesteps | 18432    |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 118         |
|    ep_rew_mean          | 102         |
| time/                   |             |
|    fps                  | 21          |
|    iterations           | 2           |
|    time_elapsed         | 378         |
|    total_timesteps      | 22528       |
| train/                  |             |
|    approx_kl            | 0.024176385 |
|    clip_fraction        | 0.244       |
|    clip_range           | 0.2         |
|    entropy_loss         | -2.79       |
|    explained_variance   | -1.36       |
|    learning_rate        | 0.000225    |
|    loss                 | 12.5        |
|    n_updates            | 80          |
|    policy_gradient_loss | 0.0113      |
|    std                  | 0.976       |
|    value_loss           | 41.2        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 118         |
|    ep_rew_mean          | 102         |
| time/                   |             |
|    fps                  | 19          |
|    iterations           | 3           |
|    time_elapsed         | 616         |
|    total_timesteps      | 26624       |
| train/                  |             |
|    approx_kl            | 0.021042215 |
|    clip_fraction        | 0.227       |
|    clip_range           | 0.2         |
|    entropy_loss         | -2.77       |
|    explained_variance   | 0.519       |
|    learning_rate        | 0.000225    |
|    loss                 | 2.82        |
|    n_updates            | 90          |
|    policy_gradient_loss | 0.00236     |
|    std                  | 0.959       |
|    value_loss           | 9.14        |
-----------------------------------------
[21:35:50] [10,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0010000.zip
[21:35:56] Eval: gen_road=3.0r/64s ❌@64 gen_track=1.1r/63s ❌@63
[21:35:56] NEW BEST: combined steps=127 reward=4.1
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 118      |
|    ep_rew_mean     | 102      |
| time/              |          |
|    fps             | 31       |
|    iterations      | 1        |
|    time_elapsed    | 129      |
|    total_timesteps | 30720    |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 118         |
|    ep_rew_mean          | 102         |
| time/                   |             |
|    fps                  | 22          |
|    iterations           | 2           |
|    time_elapsed         | 357         |
|    total_timesteps      | 34816       |
| train/                  |             |
|    approx_kl            | 0.027895104 |
|    clip_fraction        | 0.222       |
|    clip_range           | 0.2         |
|    entropy_loss         | -2.67       |
|    explained_variance   | 0.27        |
|    learning_rate        | 0.000225    |
|    loss                 | 0.0657      |
|    n_updates            | 110         |
|    policy_gradient_loss | -0.0236     |
|    std                  | 0.907       |
|    value_loss           | 0.549       |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 118         |
|    ep_rew_mean          | 102         |
| time/                   |             |
|    fps                  | 20          |
|    iterations           | 3           |
|    time_elapsed         | 587         |
|    total_timesteps      | 38912       |
| train/                  |             |
|    approx_kl            | 0.038819656 |
|    clip_fraction        | 0.24        |
|    clip_range           | 0.2         |
|    entropy_loss         | -2.63       |
|    explained_variance   | 0.346       |
|    learning_rate        | 0.000225    |
|    loss                 | -0.0014     |
|    n_updates            | 120         |
|    policy_gradient_loss | -0.0293     |
|    std                  | 0.893       |
|    value_loss           | 0.157       |
-----------------------------------------
[21:47:36] [20,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0020000.zip
[21:47:42] Eval: gen_road=2.9r/64s ❌@64 gen_track=1.1r/63s ❌@63
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 118      |
|    ep_rew_mean     | 102      |
| time/              |          |
|    fps             | 33       |
|    iterations      | 1        |
|    time_elapsed    | 122      |
|    total_timesteps | 43008    |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 118         |
|    ep_rew_mean          | 102         |
| time/                   |             |
|    fps                  | 23          |
|    iterations           | 2           |
|    time_elapsed         | 351         |
|    total_timesteps      | 47104       |
| train/                  |             |
|    approx_kl            | 0.060704876 |
|    clip_fraction        | 0.327       |
|    clip_range           | 0.2         |
|    entropy_loss         | -2.53       |
|    explained_variance   | 0.877       |
|    learning_rate        | 0.000225    |
|    loss                 | -0.0427     |
|    n_updates            | 140         |
|    policy_gradient_loss | -0.045      |
|    std                  | 0.847       |
|    value_loss           | 0.0676      |
-----------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 118        |
|    ep_rew_mean          | 102        |
| time/                   |            |
|    fps                  | 21         |
|    iterations           | 3          |
|    time_elapsed         | 571        |
|    total_timesteps      | 51200      |
| train/                  |            |
|    approx_kl            | 0.06585144 |
|    clip_fraction        | 0.35       |
|    clip_range           | 0.2        |
|    entropy_loss         | -2.49      |
|    explained_variance   | 0.883      |
|    learning_rate        | 0.000225   |
|    loss                 | -0.0429    |
|    n_updates            | 150        |
|    policy_gradient_loss | -0.0419    |
|    std                  | 0.833      |
|    value_loss           | 0.0814     |
----------------------------------------
[21:58:56] [30,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0030000.zip
[21:59:02] Eval: gen_road=2.7r/63s ❌@63 gen_track=1.1r/62s ❌@62
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 118      |
|    ep_rew_mean     | 102      |
| time/              |          |
|    fps             | 33       |
|    iterations      | 1        |
|    time_elapsed    | 121      |
|    total_timesteps | 55296    |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 118         |
|    ep_rew_mean          | 102         |
| time/                   |             |
|    fps                  | 23          |
|    iterations           | 2           |
|    time_elapsed         | 343         |
|    total_timesteps      | 59392       |
| train/                  |             |
|    approx_kl            | 0.096836925 |
|    clip_fraction        | 0.422       |
|    clip_range           | 0.2         |
|    entropy_loss         | -2.42       |
|    explained_variance   | 0.85        |
|    learning_rate        | 0.000225    |
|    loss                 | -0.0767     |
|    n_updates            | 170         |
|    policy_gradient_loss | -0.053      |
|    std                  | 0.8         |
|    value_loss           | 0.0973      |
-----------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 118        |
|    ep_rew_mean          | 102        |
| time/                   |            |
|    fps                  | 22         |
|    iterations           | 3          |
|    time_elapsed         | 556        |
|    total_timesteps      | 63488      |
| train/                  |            |
|    approx_kl            | 0.16407205 |
|    clip_fraction        | 0.461      |
|    clip_range           | 0.2        |
|    entropy_loss         | -2.35      |
|    explained_variance   | 0.9        |
|    learning_rate        | 0.000225   |
|    loss                 | -0.0875    |
|    n_updates            | 180        |
|    policy_gradient_loss | -0.0633    |
|    std                  | 0.758      |
|    value_loss           | 0.035      |
----------------------------------------
[22:10:03] [40,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0040000.zip
[22:10:09] Eval: gen_road=3.1r/59s ❌@59 gen_track=1.3r/58s ❌@58
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 118      |
|    ep_rew_mean     | 102      |
| time/              |          |
|    fps             | 36       |
|    iterations      | 1        |
|    time_elapsed    | 113      |
|    total_timesteps | 67584    |
---------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 118        |
|    ep_rew_mean          | 102        |
| time/                   |            |
|    fps                  | 24         |
|    iterations           | 2          |
|    time_elapsed         | 329        |
|    total_timesteps      | 71680      |
| train/                  |            |
|    approx_kl            | 0.17689857 |
|    clip_fraction        | 0.489      |
|    clip_range           | 0.2        |
|    entropy_loss         | -2.18      |
|    explained_variance   | 0.917      |
|    learning_rate        | 0.000225   |
|    loss                 | -0.0885    |
|    n_updates            | 200        |
|    policy_gradient_loss | -0.0635    |
|    std                  | 0.698      |
|    value_loss           | 0.054      |
----------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 118       |
|    ep_rew_mean          | 102       |
| time/                   |           |
|    fps                  | 22        |
|    iterations           | 3         |
|    time_elapsed         | 548       |
|    total_timesteps      | 75776     |
| train/                  |           |
|    approx_kl            | 0.1996874 |
|    clip_fraction        | 0.506     |
|    clip_range           | 0.2       |
|    entropy_loss         | -2.08     |
|    explained_variance   | 0.933     |
|    learning_rate        | 0.000225  |
|    loss                 | -0.0906   |
|    n_updates            | 210       |
|    policy_gradient_loss | -0.0629   |
|    std                  | 0.666     |
|    value_loss           | 0.043     |
---------------------------------------
[22:20:58] [50,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0050000.zip
[22:21:04] Eval: gen_road=5.9r/67s ❌@67 gen_track=1.6r/66s ❌@66
[22:21:04] NEW BEST: combined steps=133 reward=7.6
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 118      |
|    ep_rew_mean     | 102      |
| time/              |          |
|    fps             | 35       |
|    iterations      | 1        |
|    time_elapsed    | 115      |
|    total_timesteps | 79872    |
---------------------------------
--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 118      |
|    ep_rew_mean          | 102      |
| time/                   |          |
|    fps                  | 25       |
|    iterations           | 2        |
|    time_elapsed         | 326      |
|    total_timesteps      | 83968    |
| train/                  |          |
|    approx_kl            | 0.254287 |
|    clip_fraction        | 0.543    |
|    clip_range           | 0.2      |
|    entropy_loss         | -1.94    |
|    explained_variance   | 0.89     |
|    learning_rate        | 0.000225 |
|    loss                 | -0.102   |
|    n_updates            | 230      |
|    policy_gradient_loss | -0.0707  |
|    std                  | 0.62     |
|    value_loss           | 0.0646   |
--------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 118        |
|    ep_rew_mean          | 102        |
| time/                   |            |
|    fps                  | 22         |
|    iterations           | 3          |
|    time_elapsed         | 551        |
|    total_timesteps      | 88064      |
| train/                  |            |
|    approx_kl            | 0.32521772 |
|    clip_fraction        | 0.604      |
|    clip_range           | 0.2        |
|    entropy_loss         | -1.85      |
|    explained_variance   | 0.803      |
|    learning_rate        | 0.000225   |
|    loss                 | -0.0781    |
|    n_updates            | 240        |
|    policy_gradient_loss | -0.0776    |
|    std                  | 0.594      |
|    value_loss           | 0.102      |
----------------------------------------
[22:32:03] [60,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0060000.zip
[22:32:09] Eval: gen_road=7.7r/93s ❌@93 gen_track=3.7r/92s ❌@92
[22:32:09] NEW BEST: combined steps=185 reward=11.4
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 118      |
|    ep_rew_mean     | 102      |
| time/              |          |
|    fps             | 43       |
|    iterations      | 1        |
|    time_elapsed    | 94       |
|    total_timesteps | 92160    |
---------------------------------