donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/run_2026-04-28_2132_openfix...

[21:23:45] ============================================================
[21:23:45] Exp 22: generated_road + generated_track, warm-started, v6 reward
[21:23:45] Warm start: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[21:23:45] Sim 1: localhost:9091 -> generated_road
[21:23:45] Sim 2: localhost:9093 -> generated_track
[21:23:45] throttle_min=0.2, lr=0.000225, total=150,000
[21:23:45] Reward: v6 (speed x CTE with progress/efficiency exploit termination)
[21:23:45] Stuck timeout: 8.0s, hard cap: 25.0s
[21:23:45] Progress patience: 100 steps
[21:23:45] Checkpoints: every 10,000 steps
[21:23:45] ============================================================
[21:23:45] Creating DummyVecEnv with the two road tracks...
starting DonkeyGym env
Setting default: start_delay 5.0
Setting default: max_cte 8.0
Setting default: frame_skip 1
Setting default: cam_resolution (120, 160, 3)
Setting default: log_level 20
Setting default: steer_limit 1.0
Setting default: throttle_min 0.0
Setting default: throttle_max 1.0
loading scene generated_road
starting DonkeyGym env
Setting default: start_delay 5.0
Setting default: max_cte 8.0
Setting default: frame_skip 1
Setting default: cam_resolution (120, 160, 3)
Setting default: log_level 20
Setting default: steer_limit 1.0
Setting default: throttle_min 0.0
Setting default: throttle_max 1.0
loading scene generated_track
[21:23:47] VecEnv num_envs=2, obs=(3, 120, 160)
[21:23:50] Warm-start model attached. Starting training...
---------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 29 |
| iterations | 1 |
| time_elapsed | 139 |
| total_timesteps | 18432 |
---------------------------------
-----------------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 21 |
| iterations | 2 |
| time_elapsed | 378 |
| total_timesteps | 22528 |
| train/ | |
| approx_kl | 0.024176385 |
| clip_fraction | 0.244 |
| clip_range | 0.2 |
| entropy_loss | -2.79 |
| explained_variance | -1.36 |
| learning_rate | 0.000225 |
| loss | 12.5 |
| n_updates | 80 |
| policy_gradient_loss | 0.0113 |
| std | 0.976 |
| value_loss | 41.2 |
-----------------------------------------
-----------------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 19 |
| iterations | 3 |
| time_elapsed | 616 |
| total_timesteps | 26624 |
| train/ | |
| approx_kl | 0.021042215 |
| clip_fraction | 0.227 |
| clip_range | 0.2 |
| entropy_loss | -2.77 |
| explained_variance | 0.519 |
| learning_rate | 0.000225 |
| loss | 2.82 |
| n_updates | 90 |
| policy_gradient_loss | 0.00236 |
| std | 0.959 |
| value_loss | 9.14 |
-----------------------------------------
[21:35:50] [10,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0010000.zip
[21:35:56] Eval: gen_road=3.0r/64s ❌@64 gen_track=1.1r/63s ❌@63
[21:35:56] NEW BEST: combined steps=127 reward=4.1
---------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 31 |
| iterations | 1 |
| time_elapsed | 129 |
| total_timesteps | 30720 |
---------------------------------
-----------------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 22 |
| iterations | 2 |
| time_elapsed | 357 |
| total_timesteps | 34816 |
| train/ | |
| approx_kl | 0.027895104 |
| clip_fraction | 0.222 |
| clip_range | 0.2 |
| entropy_loss | -2.67 |
| explained_variance | 0.27 |
| learning_rate | 0.000225 |
| loss | 0.0657 |
| n_updates | 110 |
| policy_gradient_loss | -0.0236 |
| std | 0.907 |
| value_loss | 0.549 |
-----------------------------------------
-----------------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 20 |
| iterations | 3 |
| time_elapsed | 587 |
| total_timesteps | 38912 |
| train/ | |
| approx_kl | 0.038819656 |
| clip_fraction | 0.24 |
| clip_range | 0.2 |
| entropy_loss | -2.63 |
| explained_variance | 0.346 |
| learning_rate | 0.000225 |
| loss | -0.0014 |
| n_updates | 120 |
| policy_gradient_loss | -0.0293 |
| std | 0.893 |
| value_loss | 0.157 |
-----------------------------------------
[21:47:36] [20,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0020000.zip
[21:47:42] Eval: gen_road=2.9r/64s ❌@64 gen_track=1.1r/63s ❌@63
---------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 33 |
| iterations | 1 |
| time_elapsed | 122 |
| total_timesteps | 43008 |
---------------------------------
-----------------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 23 |
| iterations | 2 |
| time_elapsed | 351 |
| total_timesteps | 47104 |
| train/ | |
| approx_kl | 0.060704876 |
| clip_fraction | 0.327 |
| clip_range | 0.2 |
| entropy_loss | -2.53 |
| explained_variance | 0.877 |
| learning_rate | 0.000225 |
| loss | -0.0427 |
| n_updates | 140 |
| policy_gradient_loss | -0.045 |
| std | 0.847 |
| value_loss | 0.0676 |
-----------------------------------------
----------------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 21 |
| iterations | 3 |
| time_elapsed | 571 |
| total_timesteps | 51200 |
| train/ | |
| approx_kl | 0.06585144 |
| clip_fraction | 0.35 |
| clip_range | 0.2 |
| entropy_loss | -2.49 |
| explained_variance | 0.883 |
| learning_rate | 0.000225 |
| loss | -0.0429 |
| n_updates | 150 |
| policy_gradient_loss | -0.0419 |
| std | 0.833 |
| value_loss | 0.0814 |
----------------------------------------
[21:58:56] [30,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0030000.zip
[21:59:02] Eval: gen_road=2.7r/63s ❌@63 gen_track=1.1r/62s ❌@62
---------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 33 |
| iterations | 1 |
| time_elapsed | 121 |
| total_timesteps | 55296 |
---------------------------------
-----------------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 23 |
| iterations | 2 |
| time_elapsed | 343 |
| total_timesteps | 59392 |
| train/ | |
| approx_kl | 0.096836925 |
| clip_fraction | 0.422 |
| clip_range | 0.2 |
| entropy_loss | -2.42 |
| explained_variance | 0.85 |
| learning_rate | 0.000225 |
| loss | -0.0767 |
| n_updates | 170 |
| policy_gradient_loss | -0.053 |
| std | 0.8 |
| value_loss | 0.0973 |
-----------------------------------------
----------------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 22 |
| iterations | 3 |
| time_elapsed | 556 |
| total_timesteps | 63488 |
| train/ | |
| approx_kl | 0.16407205 |
| clip_fraction | 0.461 |
| clip_range | 0.2 |
| entropy_loss | -2.35 |
| explained_variance | 0.9 |
| learning_rate | 0.000225 |
| loss | -0.0875 |
| n_updates | 180 |
| policy_gradient_loss | -0.0633 |
| std | 0.758 |
| value_loss | 0.035 |
----------------------------------------
[22:10:03] [40,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0040000.zip
[22:10:09] Eval: gen_road=3.1r/59s ❌@59 gen_track=1.3r/58s ❌@58
---------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 36 |
| iterations | 1 |
| time_elapsed | 113 |
| total_timesteps | 67584 |
---------------------------------
----------------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 24 |
| iterations | 2 |
| time_elapsed | 329 |
| total_timesteps | 71680 |
| train/ | |
| approx_kl | 0.17689857 |
| clip_fraction | 0.489 |
| clip_range | 0.2 |
| entropy_loss | -2.18 |
| explained_variance | 0.917 |
| learning_rate | 0.000225 |
| loss | -0.0885 |
| n_updates | 200 |
| policy_gradient_loss | -0.0635 |
| std | 0.698 |
| value_loss | 0.054 |
----------------------------------------
---------------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 22 |
| iterations | 3 |
| time_elapsed | 548 |
| total_timesteps | 75776 |
| train/ | |
| approx_kl | 0.1996874 |
| clip_fraction | 0.506 |
| clip_range | 0.2 |
| entropy_loss | -2.08 |
| explained_variance | 0.933 |
| learning_rate | 0.000225 |
| loss | -0.0906 |
| n_updates | 210 |
| policy_gradient_loss | -0.0629 |
| std | 0.666 |
| value_loss | 0.043 |
---------------------------------------
[22:20:58] [50,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0050000.zip
[22:21:04] Eval: gen_road=5.9r/67s ❌@67 gen_track=1.6r/66s ❌@66
[22:21:04] NEW BEST: combined steps=133 reward=7.6
---------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 35 |
| iterations | 1 |
| time_elapsed | 115 |
| total_timesteps | 79872 |
---------------------------------
--------------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 25 |
| iterations | 2 |
| time_elapsed | 326 |
| total_timesteps | 83968 |
| train/ | |
| approx_kl | 0.254287 |
| clip_fraction | 0.543 |
| clip_range | 0.2 |
| entropy_loss | -1.94 |
| explained_variance | 0.89 |
| learning_rate | 0.000225 |
| loss | -0.102 |
| n_updates | 230 |
| policy_gradient_loss | -0.0707 |
| std | 0.62 |
| value_loss | 0.0646 |
--------------------------------------
----------------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 22 |
| iterations | 3 |
| time_elapsed | 551 |
| total_timesteps | 88064 |
| train/ | |
| approx_kl | 0.32521772 |
| clip_fraction | 0.604 |
| clip_range | 0.2 |
| entropy_loss | -1.85 |
| explained_variance | 0.803 |
| learning_rate | 0.000225 |
| loss | -0.0781 |
| n_updates | 240 |
| policy_gradient_loss | -0.0776 |
| std | 0.594 |
| value_loss | 0.102 |
----------------------------------------
[22:32:03] [60,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0060000.zip
[22:32:09] Eval: gen_road=7.7r/93s ❌@93 gen_track=3.7r/92s ❌@92
[22:32:09] NEW BEST: combined steps=185 reward=11.4
---------------------------------
| rollout/ | |
| ep_len_mean | 118 |
| ep_rew_mean | 102 |
| time/ | |
| fps | 43 |
| iterations | 1 |
| time_elapsed | 94 |
| total_timesteps | 92160 |
---------------------------------