[21:23:45] ============================================================
[21:23:45] Exp 22: generated_road + generated_track, warm-started, v6 reward
[21:23:45] Warm start: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/champion/model.zip
[21:23:45] Sim 1: localhost:9091 -> generated_road
[21:23:45] Sim 2: localhost:9093 -> generated_track
[21:23:45] throttle_min=0.2, lr=0.000225, total=150,000
[21:23:45] Reward: v6 (speed x CTE with progress/efficiency exploit termination)
[21:23:45] Stuck timeout: 8.0s, hard cap: 25.0s
[21:23:45] Progress patience: 100 steps
[21:23:45] Checkpoints: every 10,000 steps
[21:23:45] ============================================================
[21:23:45] Creating DummyVecEnv with the two road tracks...
starting DonkeyGym env
Setting default: start_delay 5.0
Setting default: max_cte 8.0
Setting default: frame_skip 1
Setting default: cam_resolution (120, 160, 3)
Setting default: log_level 20
Setting default: steer_limit 1.0
Setting default: throttle_min 0.0
Setting default: throttle_max 1.0
loading scene generated_road
starting DonkeyGym env
Setting default: start_delay 5.0
Setting default: max_cte 8.0
Setting default: frame_skip 1
Setting default: cam_resolution (120, 160, 3)
Setting default: log_level 20
Setting default: steer_limit 1.0
Setting default: throttle_min 0.0
Setting default: throttle_max 1.0
loading scene generated_track
[21:23:47] VecEnv num_envs=2, obs=(3, 120, 160)
[21:23:50] Warm-start model attached. Starting training...
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 118      |
|    ep_rew_mean     | 102      |
| time/              |          |
|    fps             | 29       |
|    iterations      | 1        |
|    time_elapsed    | 139      |
|    total_timesteps | 18432    |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 118         |
|    ep_rew_mean          | 102         |
| time/                   |             |
|    fps                  | 21          |
|    iterations           | 2           |
|    time_elapsed         | 378         |
|    total_timesteps      | 22528       |
| train/                  |             |
|    approx_kl            | 0.024176385 |
|    clip_fraction        | 0.244       |
|    clip_range           | 0.2         |
|    entropy_loss         | -2.79       |
|    explained_variance   | -1.36       |
|    learning_rate        | 0.000225    |
|    loss                 | 12.5        |
|    n_updates            | 80          |
|    policy_gradient_loss | 0.0113      |
|    std                  | 0.976       |
|    value_loss           | 41.2        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 118         |
|    ep_rew_mean          | 102         |
| time/                   |             |
|    fps                  | 19          |
|    iterations           | 3           |
|    time_elapsed         | 616         |
|    total_timesteps      | 26624       |
| train/                  |             |
|    approx_kl            | 0.021042215 |
|    clip_fraction        | 0.227       |
|    clip_range           | 0.2         |
|    entropy_loss         | -2.77       |
|    explained_variance   | 0.519       |
|    learning_rate        | 0.000225    |
|    loss                 | 2.82        |
|    n_updates            | 90          |
|    policy_gradient_loss | 0.00236     |
|    std                  | 0.959       |
|    value_loss           | 9.14        |
-----------------------------------------
[21:35:50] [10,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0010000.zip
[21:35:56] Eval: gen_road=3.0r/64s ❌@64 gen_track=1.1r/63s ❌@63
[21:35:56] NEW BEST: combined steps=127 reward=4.1
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 118      |
|    ep_rew_mean     | 102      |
| time/              |          |
|    fps             | 31       |
|    iterations      | 1        |
|    time_elapsed    | 129      |
|    total_timesteps | 30720    |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 118         |
|    ep_rew_mean          | 102         |
| time/                   |             |
|    fps                  | 22          |
|    iterations           | 2           |
|    time_elapsed         | 357         |
|    total_timesteps      | 34816       |
| train/                  |             |
|    approx_kl            | 0.027895104 |
|    clip_fraction        | 0.222       |
|    clip_range           | 0.2         |
|    entropy_loss         | -2.67       |
|    explained_variance   | 0.27        |
|    learning_rate        | 0.000225    |
|    loss                 | 0.0657      |
|    n_updates            | 110         |
|    policy_gradient_loss | -0.0236     |
|    std                  | 0.907       |
|    value_loss           | 0.549       |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 118         |
|    ep_rew_mean          | 102         |
| time/                   |             |
|    fps                  | 20          |
|    iterations           | 3           |
|    time_elapsed         | 587         |
|    total_timesteps      | 38912       |
| train/                  |             |
|    approx_kl            | 0.038819656 |
|    clip_fraction        | 0.24        |
|    clip_range           | 0.2         |
|    entropy_loss         | -2.63       |
|    explained_variance   | 0.346       |
|    learning_rate        | 0.000225    |
|    loss                 | -0.0014     |
|    n_updates            | 120         |
|    policy_gradient_loss | -0.0293     |
|    std                  | 0.893       |
|    value_loss           | 0.157       |
-----------------------------------------
[21:47:36] [20,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0020000.zip
[21:47:42] Eval: gen_road=2.9r/64s ❌@64 gen_track=1.1r/63s ❌@63
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 118      |
|    ep_rew_mean     | 102      |
| time/              |          |
|    fps             | 33       |
|    iterations      | 1        |
|    time_elapsed    | 122      |
|    total_timesteps | 43008    |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 118         |
|    ep_rew_mean          | 102         |
| time/                   |             |
|    fps                  | 23          |
|    iterations           | 2           |
|    time_elapsed         | 351         |
|    total_timesteps      | 47104       |
| train/                  |             |
|    approx_kl            | 0.060704876 |
|    clip_fraction        | 0.327       |
|    clip_range           | 0.2         |
|    entropy_loss         | -2.53       |
|    explained_variance   | 0.877       |
|    learning_rate        | 0.000225    |
|    loss                 | -0.0427     |
|    n_updates            | 140         |
|    policy_gradient_loss | -0.045      |
|    std                  | 0.847       |
|    value_loss           | 0.0676      |
-----------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 118        |
|    ep_rew_mean          | 102        |
| time/                   |            |
|    fps                  | 21         |
|    iterations           | 3          |
|    time_elapsed         | 571        |
|    total_timesteps      | 51200      |
| train/                  |            |
|    approx_kl            | 0.06585144 |
|    clip_fraction        | 0.35       |
|    clip_range           | 0.2        |
|    entropy_loss         | -2.49      |
|    explained_variance   | 0.883      |
|    learning_rate        | 0.000225   |
|    loss                 | -0.0429    |
|    n_updates            | 150        |
|    policy_gradient_loss | -0.0419    |
|    std                  | 0.833      |
|    value_loss           | 0.0814     |
----------------------------------------
[21:58:56] [30,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0030000.zip
[21:59:02] Eval: gen_road=2.7r/63s ❌@63 gen_track=1.1r/62s ❌@62
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 118      |
|    ep_rew_mean     | 102      |
| time/              |          |
|    fps             | 33       |
|    iterations      | 1        |
|    time_elapsed    | 121      |
|    total_timesteps | 55296    |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 118         |
|    ep_rew_mean          | 102         |
| time/                   |             |
|    fps                  | 23          |
|    iterations           | 2           |
|    time_elapsed         | 343         |
|    total_timesteps      | 59392       |
| train/                  |             |
|    approx_kl            | 0.096836925 |
|    clip_fraction        | 0.422       |
|    clip_range           | 0.2         |
|    entropy_loss         | -2.42       |
|    explained_variance   | 0.85        |
|    learning_rate        | 0.000225    |
|    loss                 | -0.0767     |
|    n_updates            | 170         |
|    policy_gradient_loss | -0.053      |
|    std                  | 0.8         |
|    value_loss           | 0.0973      |
-----------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 118        |
|    ep_rew_mean          | 102        |
| time/                   |            |
|    fps                  | 22         |
|    iterations           | 3          |
|    time_elapsed         | 556        |
|    total_timesteps      | 63488      |
| train/                  |            |
|    approx_kl            | 0.16407205 |
|    clip_fraction        | 0.461      |
|    clip_range           | 0.2        |
|    entropy_loss         | -2.35      |
|    explained_variance   | 0.9        |
|    learning_rate        | 0.000225   |
|    loss                 | -0.0875    |
|    n_updates            | 180        |
|    policy_gradient_loss | -0.0633    |
|    std                  | 0.758      |
|    value_loss           | 0.035      |
----------------------------------------
[22:10:03] [40,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0040000.zip
[22:10:09] Eval: gen_road=3.1r/59s ❌@59 gen_track=1.3r/58s ❌@58
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 118      |
|    ep_rew_mean     | 102      |
| time/              |          |
|    fps             | 36       |
|    iterations      | 1        |
|    time_elapsed    | 113      |
|    total_timesteps | 67584    |
---------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 118        |
|    ep_rew_mean          | 102        |
| time/                   |            |
|    fps                  | 24         |
|    iterations           | 2          |
|    time_elapsed         | 329        |
|    total_timesteps      | 71680      |
| train/                  |            |
|    approx_kl            | 0.17689857 |
|    clip_fraction        | 0.489      |
|    clip_range           | 0.2        |
|    entropy_loss         | -2.18      |
|    explained_variance   | 0.917      |
|    learning_rate        | 0.000225   |
|    loss                 | -0.0885    |
|    n_updates            | 200        |
|    policy_gradient_loss | -0.0635    |
|    std                  | 0.698      |
|    value_loss           | 0.054      |
----------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 118       |
|    ep_rew_mean          | 102       |
| time/                   |           |
|    fps                  | 22        |
|    iterations           | 3         |
|    time_elapsed         | 548       |
|    total_timesteps      | 75776     |
| train/                  |           |
|    approx_kl            | 0.1996874 |
|    clip_fraction        | 0.506     |
|    clip_range           | 0.2       |
|    entropy_loss         | -2.08     |
|    explained_variance   | 0.933     |
|    learning_rate        | 0.000225  |
|    loss                 | -0.0906   |
|    n_updates            | 210       |
|    policy_gradient_loss | -0.0629   |
|    std                  | 0.666     |
|    value_loss           | 0.043     |
---------------------------------------
[22:20:58] [50,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0050000.zip
[22:21:04] Eval: gen_road=5.9r/67s ❌@67 gen_track=1.6r/66s ❌@66
[22:21:04] NEW BEST: combined steps=133 reward=7.6
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 118      |
|    ep_rew_mean     | 102      |
| time/              |          |
|    fps             | 35       |
|    iterations      | 1        |
|    time_elapsed    | 115      |
|    total_timesteps | 79872    |
---------------------------------
--------------------------------------
| rollout/                |          |
|    ep_len_mean          | 118      |
|    ep_rew_mean          | 102      |
| time/                   |          |
|    fps                  | 25       |
|    iterations           | 2        |
|    time_elapsed         | 326      |
|    total_timesteps      | 83968    |
| train/                  |          |
|    approx_kl            | 0.254287 |
|    clip_fraction        | 0.543    |
|    clip_range           | 0.2      |
|    entropy_loss         | -1.94    |
|    explained_variance   | 0.89     |
|    learning_rate        | 0.000225 |
|    loss                 | -0.102   |
|    n_updates            | 230      |
|    policy_gradient_loss | -0.0707  |
|    std                  | 0.62     |
|    value_loss           | 0.0646   |
--------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 118        |
|    ep_rew_mean          | 102        |
| time/                   |            |
|    fps                  | 22         |
|    iterations           | 3          |
|    time_elapsed         | 551        |
|    total_timesteps      | 88064      |
| train/                  |            |
|    approx_kl            | 0.32521772 |
|    clip_fraction        | 0.604      |
|    clip_range           | 0.2        |
|    entropy_loss         | -1.85      |
|    explained_variance   | 0.803      |
|    learning_rate        | 0.000225   |
|    loss                 | -0.0781    |
|    n_updates            | 240        |
|    policy_gradient_loss | -0.0776    |
|    std                  | 0.594      |
|    value_loss           | 0.102      |
----------------------------------------
[22:32:03] [60,000/150,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp22-generated-pair-warm-v6/checkpoint_0060000.zip
[22:32:09] Eval: gen_road=7.7r/93s ❌@93 gen_track=3.7r/92s ❌@92
[22:32:09] NEW BEST: combined steps=185 reward=11.4
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 118      |
|    ep_rew_mean     | 102      |
| time/              |          |
|    fps             | 43       |
|    iterations      | 1        |
|    time_elapsed    | 94       |
|    total_timesteps | 92160    |
---------------------------------