donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/run_2026-04-28_163923.log

[16:39:32] ============================================================
[16:39:32] Exp 20: Parallel DummyVecEnv — 450k steps (sim rebuild + progress fix)
[16:39:32] Sim 1: localhost:9091 → generated_track
[16:39:32] Sim 2: localhost:9093 → mountain_track
[16:39:32] throttle_min=0.2, lr=0.000725, total=450,000
[16:39:32] Reward: v6 + exploit fix (window=200, min_lap=12.0s)
[16:39:32] Stuck termination: 40 steps (~2s), hard cap 30s
[16:39:32] Progress patience: 150 steps (~7.5s at 20fps)
[16:39:32] Checkpoints: every 20,000 steps
[16:39:32] ============================================================
[16:39:32] Creating DummyVecEnv with two tracks...
starting DonkeyGym env
Setting default: start_delay 5.0
Setting default: max_cte 8.0
Setting default: frame_skip 1
Setting default: cam_resolution (120, 160, 3)
Setting default: log_level 20
Setting default: steer_limit 1.0
Setting default: throttle_min 0.0
Setting default: throttle_max 1.0
loading scene generated_track
starting DonkeyGym env
Setting default: start_delay 5.0
Setting default: max_cte 8.0
Setting default: frame_skip 1
Setting default: cam_resolution (120, 160, 3)
Setting default: log_level 20
Setting default: steer_limit 1.0
Setting default: throttle_min 0.0
Setting default: throttle_max 1.0
loading scene mountain_track
[16:39:34] VecEnv num_envs=2, obs=(3, 120, 160)
Using cpu device
[16:39:38] PPO created. Starting training...
-----------------------------
| time/ | |
| fps | 28 |
| iterations | 1 |
| time_elapsed | 142 |
| total_timesteps | 4096 |
-----------------------------
-----------------------------------------
| time/ | |
| fps | 19 |
| iterations | 2 |
| time_elapsed | 428 |
| total_timesteps | 8192 |
| train/ | |
| approx_kl | 0.106491014 |
| clip_fraction | 0.241 |
| clip_range | 0.2 |
| entropy_loss | -2.81 |
| explained_variance | 0.0458 |
| learning_rate | 0.000725 |
| loss | -0.0816 |
| n_updates | 10 |
| policy_gradient_loss | -0.0458 |
| std | 0.953 |
| value_loss | 1.39 |
-----------------------------------------
----------------------------------------
| time/ | |
| fps | 17 |
| iterations | 3 |
| time_elapsed | 703 |
| total_timesteps | 12288 |
| train/ | |
| approx_kl | 0.15251182 |
| clip_fraction | 0.428 |
| clip_range | 0.2 |
| entropy_loss | -2.74 |
| explained_variance | 0.0179 |
| learning_rate | 0.000725 |
| loss | 0.0731 |
| n_updates | 20 |
| policy_gradient_loss | -0.0196 |
| std | 0.946 |
| value_loss | 18.6 |
----------------------------------------
----------------------------------------
| time/ | |
| fps | 17 |
| iterations | 4 |
| time_elapsed | 955 |
| total_timesteps | 16384 |
| train/ | |
| approx_kl | 0.22289596 |
| clip_fraction | 0.505 |
| clip_range | 0.2 |
| entropy_loss | -2.68 |
| explained_variance | -0.418 |
| learning_rate | 0.000725 |
| loss | -0.0581 |
| n_updates | 30 |
| policy_gradient_loss | -0.0692 |
| std | 0.882 |
| value_loss | 0.303 |
----------------------------------------
----------------------------------------
| time/ | |
| fps | 17 |
| iterations | 5 |
| time_elapsed | 1143 |
| total_timesteps | 20480 |
| train/ | |
| approx_kl | 0.50940317 |
| clip_fraction | 0.619 |
| clip_range | 0.2 |
| entropy_loss | -2.51 |
| explained_variance | 0.405 |
| learning_rate | 0.000725 |
| loss | -0.117 |
| n_updates | 40 |
| policy_gradient_loss | -0.0885 |
| std | 0.807 |
| value_loss | 0.196 |
----------------------------------------
[17:00:28] [20,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0020000.zip
[17:00:34] Eval: gen_track=6.7r/67s ❌@67 mountain=5.0r/67s ❌@67
[17:00:34] NEW BEST: 11.7 combined reward
------------------------------
| time/ | |
| fps | 44 |
| iterations | 1 |
| time_elapsed | 91 |
| total_timesteps | 24576 |
------------------------------
---------------------------------------
| time/ | |
| fps | 31 |
| iterations | 2 |
| time_elapsed | 260 |
| total_timesteps | 28672 |
| train/ | |
| approx_kl | 0.4478566 |
| clip_fraction | 0.658 |
| clip_range | 0.2 |
| entropy_loss | -2.14 |
| explained_variance | 0.597 |
| learning_rate | 0.000725 |
| loss | -0.0885 |
| n_updates | 60 |
| policy_gradient_loss | -0.0843 |
| std | 0.67 |
| value_loss | 0.212 |
---------------------------------------
---------------------------------------
| time/ | |
| fps | 29 |
| iterations | 3 |
| time_elapsed | 420 |
| total_timesteps | 32768 |
| train/ | |
| approx_kl | 0.9769267 |
| clip_fraction | 0.658 |
| clip_range | 0.2 |
| entropy_loss | -1.96 |
| explained_variance | 0.399 |
| learning_rate | 0.000725 |
| loss | -0.107 |
| n_updates | 70 |
| policy_gradient_loss | -0.0829 |
| std | 0.61 |
| value_loss | 0.403 |
---------------------------------------
---------------------------------------
| time/ | |
| fps | 28 |
| iterations | 4 |
| time_elapsed | 576 |
| total_timesteps | 36864 |
| train/ | |
| approx_kl | 0.5094907 |
| clip_fraction | 0.682 |
| clip_range | 0.2 |
| entropy_loss | -1.81 |
| explained_variance | 0.45 |
| learning_rate | 0.000725 |
| loss | -0.0843 |
| n_updates | 80 |
| policy_gradient_loss | -0.076 |
| std | 0.562 |
| value_loss | 0.396 |
---------------------------------------
----------------------------------------
| time/ | |
| fps | 27 |
| iterations | 5 |
| time_elapsed | 731 |
| total_timesteps | 40960 |
| train/ | |
| approx_kl | 0.70720506 |
| clip_fraction | 0.703 |
| clip_range | 0.2 |
| entropy_loss | -1.61 |
| explained_variance | 0.466 |
| learning_rate | 0.000725 |
| loss | -0.0856 |
| n_updates | 90 |
| policy_gradient_loss | -0.0797 |
| std | 0.513 |
| value_loss | 0.468 |
----------------------------------------
[17:14:13] [40,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0040000.zip
[17:14:19] Eval: gen_track=21.1r/148s ❌@148 mountain=15.7r/147s ❌@147
[17:14:20] NEW BEST: 36.8 combined reward
------------------------------
| time/ | |
| fps | 64 |
| iterations | 1 |
| time_elapsed | 63 |
| total_timesteps | 45056 |
------------------------------
----------------------------------------
| time/ | |
| fps | 37 |
| iterations | 2 |
| time_elapsed | 218 |
| total_timesteps | 49152 |
| train/ | |
| approx_kl | 0.74158007 |
| clip_fraction | 0.711 |
| clip_range | 0.2 |
| entropy_loss | -1.21 |
| explained_variance | 0.591 |
| learning_rate | 0.000725 |
| loss | -0.0973 |
| n_updates | 110 |
| policy_gradient_loss | -0.0698 |
| std | 0.416 |
| value_loss | 0.462 |
----------------------------------------
---------------------------------------
| time/ | |
| fps | 33 |
| iterations | 3 |
| time_elapsed | 368 |
| total_timesteps | 53248 |
| train/ | |
| approx_kl | 0.8501822 |
| clip_fraction | 0.727 |
| clip_range | 0.2 |
| entropy_loss | -0.999 |
| explained_variance | 0.727 |
| learning_rate | 0.000725 |
| loss | -0.048 |
| n_updates | 120 |
| policy_gradient_loss | -0.0694 |
| std | 0.373 |
| value_loss | 0.399 |
---------------------------------------
----------------------------------------
| time/ | |
| fps | 31 |
| iterations | 4 |
| time_elapsed | 523 |
| total_timesteps | 57344 |
| train/ | |
| approx_kl | 0.99027634 |
| clip_fraction | 0.729 |
| clip_range | 0.2 |
| entropy_loss | -0.791 |
| explained_variance | 0.804 |
| learning_rate | 0.000725 |
| loss | -0.0752 |
| n_updates | 130 |
| policy_gradient_loss | -0.0647 |
| std | 0.335 |
| value_loss | 0.333 |
----------------------------------------
---------------------------------------
| time/ | |
| fps | 30 |
| iterations | 5 |
| time_elapsed | 669 |
| total_timesteps | 61440 |
| train/ | |
| approx_kl | 1.0802925 |
| clip_fraction | 0.727 |
| clip_range | 0.2 |
| entropy_loss | -0.588 |
| explained_variance | 0.81 |
| learning_rate | 0.000725 |
| loss | -0.0711 |
| n_updates | 140 |
| policy_gradient_loss | -0.0571 |
| std | 0.303 |
| value_loss | 0.459 |
---------------------------------------
[17:27:06] [60,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0060000.zip
[17:27:13] Eval: gen_track=6.6r/65s ❌@65 mountain=8.8r/112s ❌@112
------------------------------
| time/ | |
| fps | 66 |
| iterations | 1 |
| time_elapsed | 61 |
| total_timesteps | 65536 |
------------------------------
---------------------------------------
| time/ | |
| fps | 39 |
| iterations | 2 |
| time_elapsed | 208 |
| total_timesteps | 69632 |
| train/ | |
| approx_kl | 1.1047575 |
| clip_fraction | 0.752 |
| clip_range | 0.2 |
| entropy_loss | -0.185 |
| explained_variance | 0.811 |
| learning_rate | 0.000725 |
| loss | -0.0207 |
| n_updates | 160 |
| policy_gradient_loss | -0.0497 |
| std | 0.247 |
| value_loss | 0.504 |
---------------------------------------
---------------------------------------
| time/ | |
| fps | 34 |
| iterations | 3 |
| time_elapsed | 355 |
| total_timesteps | 73728 |
| train/ | |
| approx_kl | 1.0727906 |
| clip_fraction | 0.757 |
| clip_range | 0.2 |
| entropy_loss | 0.0196 |
| explained_variance | 0.82 |
| learning_rate | 0.000725 |
| loss | -0.0671 |
| n_updates | 170 |
| policy_gradient_loss | -0.0498 |
| std | 0.223 |
| value_loss | 0.409 |
---------------------------------------
----------------------------------------
| time/ | |
| fps | 32 |
| iterations | 4 |
| time_elapsed | 504 |
| total_timesteps | 77824 |
| train/ | |
| approx_kl | 0.91813445 |
| clip_fraction | 0.738 |
| clip_range | 0.2 |
| entropy_loss | 0.23 |
| explained_variance | 0.889 |
| learning_rate | 0.000725 |
| loss | -0.0627 |
| n_updates | 180 |
| policy_gradient_loss | -0.0518 |
| std | 0.201 |
| value_loss | 0.324 |
----------------------------------------
---------------------------------------
| time/ | |
| fps | 30 |
| iterations | 5 |
| time_elapsed | 670 |
| total_timesteps | 81920 |
| train/ | |
| approx_kl | 1.0347024 |
| clip_fraction | 0.742 |
| clip_range | 0.2 |
| entropy_loss | 0.431 |
| explained_variance | 0.841 |
| learning_rate | 0.000725 |
| loss | -0.043 |
| n_updates | 190 |
| policy_gradient_loss | -0.0455 |
| std | 0.182 |
| value_loss | 0.473 |
---------------------------------------
[17:39:48] [80,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0080000.zip
[17:39:55] Eval: gen_track=37.2r/209s ❌@209 mountain=28.0r/208s ❌@208
[17:39:55] NEW BEST: 65.2 combined reward
------------------------------
| time/ | |
| fps | 70 |
| iterations | 1 |
| time_elapsed | 57 |
| total_timesteps | 86016 |
------------------------------
---------------------------------------
| time/ | |
| fps | 41 |
| iterations | 2 |
| time_elapsed | 197 |
| total_timesteps | 90112 |
| train/ | |
| approx_kl | 1.3096654 |
| clip_fraction | 0.777 |
| clip_range | 0.2 |
| entropy_loss | 0.824 |
| explained_variance | 0.676 |
| learning_rate | 0.000725 |
| loss | -0.0305 |
| n_updates | 210 |
| policy_gradient_loss | -0.0378 |
| std | 0.149 |
| value_loss | 0.656 |
---------------------------------------
---------------------------------------
| time/ | |
| fps | 36 |
| iterations | 3 |
| time_elapsed | 335 |
| total_timesteps | 94208 |
| train/ | |
| approx_kl | 1.4713185 |
| clip_fraction | 0.773 |
| clip_range | 0.2 |
| entropy_loss | 1.02 |
| explained_variance | 0.733 |
| learning_rate | 0.000725 |
| loss | -0.0886 |
| n_updates | 220 |
| policy_gradient_loss | -0.0435 |
| std | 0.135 |
| value_loss | 0.46 |
---------------------------------------
---------------------------------------
| time/ | |
| fps | 33 |
| iterations | 4 |
| time_elapsed | 482 |
| total_timesteps | 98304 |
| train/ | |
| approx_kl | 1.4152278 |
| clip_fraction | 0.768 |
| clip_range | 0.2 |
| entropy_loss | 1.22 |
| explained_variance | 0.816 |
| learning_rate | 0.000725 |
| loss | -0.045 |
| n_updates | 230 |
| policy_gradient_loss | -0.0434 |
| std | 0.122 |
| value_loss | 0.423 |
---------------------------------------
---------------------------------------
| time/ | |
| fps | 32 |
| iterations | 5 |
| time_elapsed | 627 |
| total_timesteps | 102400 |
| train/ | |
| approx_kl | 1.4566097 |
| clip_fraction | 0.774 |
| clip_range | 0.2 |
| entropy_loss | 1.4 |
| explained_variance | 0.833 |
| learning_rate | 0.000725 |
| loss | -0.0272 |
| n_updates | 240 |
| policy_gradient_loss | -0.0242 |
| std | 0.111 |
| value_loss | 0.603 |
---------------------------------------
[17:51:51] [100,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0100000.zip
[17:51:58] Eval: gen_track=28.5r/175s ❌@175 mountain=27.3r/175s ❌@175
-------------------------------
| time/ | |
| fps | 80 |
| iterations | 1 |
| time_elapsed | 50 |
| total_timesteps | 106496 |
-------------------------------
---------------------------------------
| time/ | |
| fps | 42 |
| iterations | 2 |
| time_elapsed | 194 |
| total_timesteps | 110592 |
| train/ | |
| approx_kl | 1.3409375 |
| clip_fraction | 0.781 |
| clip_range | 0.2 |
| entropy_loss | 1.79 |
| explained_variance | 0.893 |
| learning_rate | 0.000725 |
| loss | -0.0767 |
| n_updates | 260 |
| policy_gradient_loss | -0.0304 |
| std | 0.092 |
| value_loss | 0.312 |
---------------------------------------
---------------------------------------
| time/ | |
| fps | 36 |
| iterations | 3 |
| time_elapsed | 340 |
| total_timesteps | 114688 |
| train/ | |
| approx_kl | 1.1949866 |
| clip_fraction | 0.775 |
| clip_range | 0.2 |
| entropy_loss | 1.95 |
| explained_variance | 0.851 |
| learning_rate | 0.000725 |
| loss | -0.0391 |
| n_updates | 270 |
| policy_gradient_loss | -0.0179 |
| std | 0.0843 |
| value_loss | 0.466 |
---------------------------------------
---------------------------------------
| time/ | |
| fps | 34 |
| iterations | 4 |
| time_elapsed | 479 |
| total_timesteps | 118784 |
| train/ | |
| approx_kl | 1.3683267 |
| clip_fraction | 0.795 |
| clip_range | 0.2 |
| entropy_loss | 2.11 |
| explained_variance | 0.847 |
| learning_rate | 0.000725 |
| loss | -0.04 |
| n_updates | 280 |
| policy_gradient_loss | 0.0118 |
| std | 0.0781 |
| value_loss | 0.518 |
---------------------------------------
---------------------------------------
| time/ | |
| fps | 32 |
| iterations | 5 |
| time_elapsed | 635 |
| total_timesteps | 122880 |
| train/ | |
| approx_kl | 1.6239161 |
| clip_fraction | 0.793 |
| clip_range | 0.2 |
| entropy_loss | 2.25 |
| explained_variance | 0.839 |
| learning_rate | 0.000725 |
| loss | -0.0657 |
| n_updates | 290 |
| policy_gradient_loss | 0.007 |
| std | 0.0734 |
| value_loss | 0.621 |
---------------------------------------
[18:04:01] [120,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0120000.zip
[18:04:08] Eval: gen_track=27.7r/180s ❌@180 mountain=25.7r/180s ❌@180
-------------------------------
| time/ | |
| fps | 56 |
| iterations | 1 |
| time_elapsed | 73 |
| total_timesteps | 126976 |
-------------------------------
---------------------------------------
| time/ | |
| fps | 27 |
| iterations | 2 |
| time_elapsed | 292 |
| total_timesteps | 131072 |
| train/ | |
| approx_kl | 1.5144145 |
| clip_fraction | 0.816 |
| clip_range | 0.2 |
| entropy_loss | 2.45 |
| explained_variance | 0.721 |
| learning_rate | 0.000725 |
| loss | 0.0266 |
| n_updates | 310 |
| policy_gradient_loss | 0.0224 |
| std | 0.0652 |
| value_loss | 0.869 |
---------------------------------------
---------------------------------------
| time/ | |
| fps | 24 |
| iterations | 3 |
| time_elapsed | 502 |
| total_timesteps | 135168 |
| train/ | |
| approx_kl | 1.7737575 |
| clip_fraction | 0.817 |
| clip_range | 0.2 |
| entropy_loss | 2.59 |
| explained_variance | 0.751 |
| learning_rate | 0.000725 |
| loss | -0.0113 |
| n_updates | 320 |
| policy_gradient_loss | 0.0307 |
| std | 0.061 |
| value_loss | 0.81 |
---------------------------------------
---------------------------------------
| time/ | |
| fps | 22 |
| iterations | 4 |
| time_elapsed | 725 |
| total_timesteps | 139264 |
| train/ | |
| approx_kl | 2.3732958 |
| clip_fraction | 0.831 |
| clip_range | 0.2 |
| entropy_loss | 2.73 |
| explained_variance | 0.711 |
| learning_rate | 0.000725 |
| loss | 0.0475 |
| n_updates | 330 |
| policy_gradient_loss | 0.0432 |
| std | 0.0575 |
| value_loss | 0.965 |
---------------------------------------
--------------------------------------
| time/ | |
| fps | 21 |
| iterations | 5 |
| time_elapsed | 946 |
| total_timesteps | 143360 |
| train/ | |
| approx_kl | 3.434102 |
| clip_fraction | 0.86 |
| clip_range | 0.2 |
| entropy_loss | 2.83 |
| explained_variance | 0.574 |
| learning_rate | 0.000725 |
| loss | 0.119 |
| n_updates | 340 |
| policy_gradient_loss | 0.0767 |
| std | 0.0555 |
| value_loss | 1.35 |
--------------------------------------
[18:22:13] [140,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0140000.zip
[18:22:19] Eval: gen_track=8.2r/79s ❌@79 mountain=8.9r/79s ❌@79
-------------------------------
| time/ | |
| fps | 46 |
| iterations | 1 |
| time_elapsed | 88 |
| total_timesteps | 147456 |
-------------------------------
---------------------------------------
| time/ | |
| fps | 22 |
| iterations | 2 |
| time_elapsed | 357 |
| total_timesteps | 151552 |
| train/ | |
| approx_kl | 2.1389558 |
| clip_fraction | 0.838 |
| clip_range | 0.2 |
| entropy_loss | 2.94 |
| explained_variance | 0.596 |
| learning_rate | 0.000725 |
| loss | 0.156 |
| n_updates | 360 |
| policy_gradient_loss | 0.0675 |
| std | 0.0521 |
| value_loss | 1.21 |
---------------------------------------
--------------------------------------
| time/ | |
| fps | 19 |
| iterations | 3 |
| time_elapsed | 629 |
| total_timesteps | 155648 |
| train/ | |
| approx_kl | 4.391129 |
| clip_fraction | 0.863 |
| clip_range | 0.2 |
| entropy_loss | 3.04 |
| explained_variance | 0.656 |
| learning_rate | 0.000725 |
| loss | 0.0562 |
| n_updates | 370 |
| policy_gradient_loss | 0.0623 |
| std | 0.0491 |
| value_loss | 0.813 |
--------------------------------------
---------------------------------------
| time/ | |
| fps | 18 |
| iterations | 4 |
| time_elapsed | 890 |
| total_timesteps | 159744 |
| train/ | |
| approx_kl | 2.7348022 |
| clip_fraction | 0.855 |
| clip_range | 0.2 |
| entropy_loss | 3.12 |
| explained_variance | 0.698 |
| learning_rate | 0.000725 |
| loss | 0.0226 |
| n_updates | 380 |
| policy_gradient_loss | 0.038 |
| std | 0.047 |
| value_loss | 0.562 |
---------------------------------------
--------------------------------------
| time/ | |
| fps | 18 |
| iterations | 5 |
| time_elapsed | 1133 |
| total_timesteps | 163840 |
| train/ | |
| approx_kl | 2.956015 |
| clip_fraction | 0.826 |
| clip_range | 0.2 |
| entropy_loss | 3.25 |
| explained_variance | 0.718 |
| learning_rate | 0.000725 |
| loss | -0.0102 |
| n_updates | 390 |
| policy_gradient_loss | 0.0226 |
| std | 0.0448 |
| value_loss | 0.436 |
--------------------------------------
[18:43:43] [160,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0160000.zip
[18:43:49] Eval: gen_track=5.5r/69s ❌@69 mountain=7.4r/97s ❌@97
-------------------------------
| time/ | |
| fps | 41 |
| iterations | 1 |
| time_elapsed | 98 |
| total_timesteps | 167936 |
-------------------------------
---------------------------------------
| time/ | |
| fps | 23 |
| iterations | 2 |
| time_elapsed | 350 |
| total_timesteps | 172032 |
| train/ | |
| approx_kl | 2.9826787 |
| clip_fraction | 0.813 |
| clip_range | 0.2 |
| entropy_loss | 3.5 |
| explained_variance | 0.778 |
| learning_rate | 0.000725 |
| loss | -0.0133 |
| n_updates | 410 |
| policy_gradient_loss | 0.000808 |
| std | 0.0392 |
| value_loss | 0.266 |
---------------------------------------
---------------------------------------
| time/ | |
| fps | 20 |
| iterations | 3 |
| time_elapsed | 601 |
| total_timesteps | 176128 |
| train/ | |
| approx_kl | 3.4679208 |
| clip_fraction | 0.801 |
| clip_range | 0.2 |
| entropy_loss | 3.68 |
| explained_variance | 0.786 |
| learning_rate | 0.000725 |
| loss | -0.0178 |
| n_updates | 420 |
| policy_gradient_loss | -0.0115 |
| std | 0.0359 |
| value_loss | 0.278 |
---------------------------------------
--------------------------------------
| time/ | |
| fps | 19 |
| iterations | 4 |
| time_elapsed | 842 |
| total_timesteps | 180224 |
| train/ | |
| approx_kl | 3.401013 |
| clip_fraction | 0.81 |
| clip_range | 0.2 |
| entropy_loss | 3.84 |
| explained_variance | 0.778 |
| learning_rate | 0.000725 |
| loss | -0.0331 |
| n_updates | 430 |
| policy_gradient_loss | -0.00564 |
| std | 0.0333 |
| value_loss | 0.299 |
--------------------------------------
---------------------------------------
| time/ | |
| fps | 19 |
| iterations | 5 |
| time_elapsed | 1033 |
| total_timesteps | 184320 |
| train/ | |
| approx_kl | 2.4161968 |
| clip_fraction | 0.822 |
| clip_range | 0.2 |
| entropy_loss | 3.98 |
| explained_variance | 0.806 |
| learning_rate | 0.000725 |
| loss | -0.0604 |
| n_updates | 440 |
| policy_gradient_loss | 0.00771 |
| std | 0.031 |
| value_loss | 0.283 |
---------------------------------------
[19:02:39] [180,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0180000.zip
[19:02:45] Eval: gen_track=6.7r/80s ❌@80 mountain=8.2r/97s ❌@97
-------------------------------
| time/ | |
| fps | 45 |
| iterations | 1 |
| time_elapsed | 90 |
| total_timesteps | 188416 |
-------------------------------
--------------------------------------
| time/ | |
| fps | 30 |
| iterations | 2 |
| time_elapsed | 270 |
| total_timesteps | 192512 |
| train/ | |
| approx_kl | 4.115147 |
| clip_fraction | 0.836 |
| clip_range | 0.2 |
| entropy_loss | 4.22 |
| explained_variance | 0.78 |
| learning_rate | 0.000725 |
| loss | 0.0814 |
| n_updates | 460 |
| policy_gradient_loss | 0.0216 |
| std | 0.0278 |
| value_loss | 0.338 |
--------------------------------------
--------------------------------------
| time/ | |
| fps | 26 |
| iterations | 3 |
| time_elapsed | 464 |
| total_timesteps | 196608 |
| train/ | |
| approx_kl | 4.002183 |
| clip_fraction | 0.841 |
| clip_range | 0.2 |
| entropy_loss | 4.31 |
| explained_variance | 0.78 |
| learning_rate | 0.000725 |
| loss | -0.0155 |
| n_updates | 470 |
| policy_gradient_loss | 0.0366 |
| std | 0.0268 |
| value_loss | 0.306 |
--------------------------------------
--------------------------------------
| time/ | |
| fps | 25 |
| iterations | 4 |
| time_elapsed | 652 |
| total_timesteps | 200704 |
| train/ | |
| approx_kl | 3.378588 |
| clip_fraction | 0.822 |
| clip_range | 0.2 |
| entropy_loss | 4.41 |
| explained_variance | 0.807 |
| learning_rate | 0.000725 |
| loss | -0.0678 |
| n_updates | 480 |
| policy_gradient_loss | 0.0144 |
| std | 0.0254 |
| value_loss | 0.339 |
--------------------------------------
---------------------------------------
| time/ | |
| fps | 24 |
| iterations | 5 |
| time_elapsed | 840 |
| total_timesteps | 204800 |
| train/ | |
| approx_kl | 3.6989145 |
| clip_fraction | 0.809 |
| clip_range | 0.2 |
| entropy_loss | 4.5 |
| explained_variance | 0.776 |
| learning_rate | 0.000725 |
| loss | 0.0385 |
| n_updates | 490 |
| policy_gradient_loss | 0.0184 |
| std | 0.0242 |
| value_loss | 0.352 |
---------------------------------------
[19:18:17] [200,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0200000.zip
[19:18:23] Eval: gen_track=5.7r/69s ❌@69 mountain=6.7r/69s ❌@69
-------------------------------
| time/ | |
| fps | 43 |
| iterations | 1 |
| time_elapsed | 94 |
| total_timesteps | 208896 |
-------------------------------
---------------------------------------
| time/ | |
| fps | 28 |
| iterations | 2 |
| time_elapsed | 287 |
| total_timesteps | 212992 |
| train/ | |
| approx_kl | 3.2493048 |
| clip_fraction | 0.827 |
| clip_range | 0.2 |
| entropy_loss | 4.71 |
| explained_variance | 0.795 |
| learning_rate | 0.000725 |
| loss | 0.0462 |
| n_updates | 510 |
| policy_gradient_loss | 0.03 |
| std | 0.0219 |
| value_loss | 0.373 |
---------------------------------------
---------------------------------------
| time/ | |
| fps | 25 |
| iterations | 3 |
| time_elapsed | 477 |
| total_timesteps | 217088 |
| train/ | |
| approx_kl | 4.8058205 |
| clip_fraction | 0.862 |
| clip_range | 0.2 |
| entropy_loss | 4.78 |
| explained_variance | 0.826 |
| learning_rate | 0.000725 |
| loss | 0.0104 |
| n_updates | 520 |
| policy_gradient_loss | 0.0622 |
| std | 0.0211 |
| value_loss | 0.393 |
---------------------------------------
--------------------------------------
| time/ | |
| fps | 24 |
| iterations | 4 |
| time_elapsed | 664 |
| total_timesteps | 221184 |
| train/ | |
| approx_kl | 4.024245 |
| clip_fraction | 0.842 |
| clip_range | 0.2 |
| entropy_loss | 4.87 |
| explained_variance | 0.796 |
| learning_rate | 0.000725 |
| loss | 0.126 |
| n_updates | 530 |
| policy_gradient_loss | 0.0408 |
| std | 0.0205 |
| value_loss | 0.394 |
--------------------------------------
---------------------------------------
| time/ | |
| fps | 24 |
| iterations | 5 |
| time_elapsed | 852 |
| total_timesteps | 225280 |
| train/ | |
| approx_kl | 5.5941677 |
| clip_fraction | 0.824 |
| clip_range | 0.2 |
| entropy_loss | 4.95 |
| explained_variance | 0.799 |
| learning_rate | 0.000725 |
| loss | 0.0566 |
| n_updates | 540 |
| policy_gradient_loss | 0.13 |
| std | 0.0194 |
| value_loss | 0.351 |
---------------------------------------
[19:34:05] [220,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0220000.zip
[19:34:10] Eval: gen_track=6.9r/82s ❌@82 mountain=7.9r/82s ❌@82
-------------------------------
| time/ | |
| fps | 46 |
| iterations | 1 |
| time_elapsed | 88 |
| total_timesteps | 229376 |
-------------------------------
---------------------------------------
| time/ | |
| fps | 30 |
| iterations | 2 |
| time_elapsed | 266 |
| total_timesteps | 233472 |
| train/ | |
| approx_kl | 3.5075796 |
| clip_fraction | 0.857 |
| clip_range | 0.2 |
| entropy_loss | 5.08 |
| explained_variance | 0.834 |
| learning_rate | 0.000725 |
| loss | 0.0347 |
| n_updates | 560 |
| policy_gradient_loss | 0.0538 |
| std | 0.0184 |
| value_loss | 0.273 |
---------------------------------------
---------------------------------------
| time/ | |
| fps | 27 |
| iterations | 3 |
| time_elapsed | 454 |
| total_timesteps | 237568 |
| train/ | |
| approx_kl | 3.8775434 |
| clip_fraction | 0.863 |
| clip_range | 0.2 |
| entropy_loss | 5.15 |
| explained_variance | 0.875 |
| learning_rate | 0.000725 |
| loss | -0.0319 |
| n_updates | 570 |
| policy_gradient_loss | 0.0654 |
| std | 0.0179 |
| value_loss | 0.223 |
---------------------------------------
--------------------------------------
| time/ | |
| fps | 25 |
| iterations | 4 |
| time_elapsed | 633 |
| total_timesteps | 241664 |
| train/ | |
| approx_kl | 9.942043 |
| clip_fraction | 0.846 |
| clip_range | 0.2 |
| entropy_loss | 5.23 |
| explained_variance | 0.846 |
| learning_rate | 0.000725 |
| loss | 0.0481 |
| n_updates | 580 |
| policy_gradient_loss | 0.0485 |
| std | 0.0172 |
| value_loss | 0.342 |
--------------------------------------
--------------------------------------
| time/ | |
| fps | 25 |
| iterations | 5 |
| time_elapsed | 812 |
| total_timesteps | 245760 |
| train/ | |
| approx_kl | 6.776287 |
| clip_fraction | 0.889 |
| clip_range | 0.2 |
| entropy_loss | 5.23 |
| explained_variance | 0.857 |
| learning_rate | 0.000725 |
| loss | 0.118 |
| n_updates | 590 |
| policy_gradient_loss | 0.0907 |
| std | 0.0175 |
| value_loss | 0.325 |
--------------------------------------
[19:49:11] [240,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0240000.zip
[19:49:17] Eval: gen_track=6.5r/80s ❌@80 mountain=8.2r/80s ❌@80
-------------------------------
| time/ | |
| fps | 44 |
| iterations | 1 |
| time_elapsed | 91 |
| total_timesteps | 249856 |
-------------------------------
--------------------------------------
| time/ | |
| fps | 30 |
| iterations | 2 |
| time_elapsed | 271 |
| total_timesteps | 253952 |
| train/ | |
| approx_kl | 8.686437 |
| clip_fraction | 0.846 |
| clip_range | 0.2 |
| entropy_loss | 5.32 |
| explained_variance | 0.732 |
| learning_rate | 0.000725 |
| loss | 0.169 |
| n_updates | 610 |
| policy_gradient_loss | 0.0525 |
| std | 0.016 |
| value_loss | 0.457 |
--------------------------------------
--------------------------------------
| time/ | |
| fps | 27 |
| iterations | 3 |
| time_elapsed | 449 |
| total_timesteps | 258048 |
| train/ | |
| approx_kl | 8.093328 |
| clip_fraction | 0.853 |
| clip_range | 0.2 |
| entropy_loss | 5.4 |
| explained_variance | 0.811 |
| learning_rate | 0.000725 |
| loss | 0.109 |
| n_updates | 620 |
| policy_gradient_loss | 0.0624 |
| std | 0.0155 |
| value_loss | 0.405 |
--------------------------------------
--------------------------------------
| time/ | |
| fps | 25 |
| iterations | 4 |
| time_elapsed | 634 |
| total_timesteps | 262144 |
| train/ | |
| approx_kl | 8.560518 |
| clip_fraction | 0.825 |
| clip_range | 0.2 |
| entropy_loss | 5.47 |
| explained_variance | 0.801 |
| learning_rate | 0.000725 |
| loss | 0.00362 |
| n_updates | 630 |
| policy_gradient_loss | 0.0468 |
| std | 0.0149 |
| value_loss | 0.417 |
--------------------------------------
--------------------------------------
| time/ | |
| fps | 23 |
| iterations | 5 |
| time_elapsed | 890 |
| total_timesteps | 266240 |
| train/ | |
| approx_kl | 6.063898 |
| clip_fraction | 0.86 |
| clip_range | 0.2 |
| entropy_loss | 5.56 |
| explained_variance | 0.768 |
| learning_rate | 0.000725 |
| loss | 0.00797 |
| n_updates | 640 |
| policy_gradient_loss | 0.0776 |
| std | 0.0146 |
| value_loss | 0.486 |
--------------------------------------
[20:06:13] [260,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0260000.zip
[20:06:19] Eval: gen_track=3.6r/59s ❌@59 mountain=6.4r/99s ❌@99
-------------------------------
| time/ | |
| fps | 37 |
| iterations | 1 |
| time_elapsed | 110 |
| total_timesteps | 270336 |
-------------------------------
---------------------------------------
| time/ | |
| fps | 23 |
| iterations | 2 |
| time_elapsed | 343 |
| total_timesteps | 274432 |
| train/ | |
| approx_kl | 11.335977 |
| clip_fraction | 0.891 |
| clip_range | 0.2 |
| entropy_loss | 5.69 |
| explained_variance | 0.825 |
| learning_rate | 0.000725 |
| loss | 0.0424 |
| n_updates | 660 |
| policy_gradient_loss | 0.0787 |
| std | 0.0136 |
| value_loss | 0.223 |
---------------------------------------
---------------------------------------
| time/ | |
| fps | 20 |
| iterations | 3 |
| time_elapsed | 594 |
| total_timesteps | 278528 |
| train/ | |
| approx_kl | 11.321661 |
| clip_fraction | 0.852 |
| clip_range | 0.2 |
| entropy_loss | 5.78 |
| explained_variance | 0.808 |
| learning_rate | 0.000725 |
| loss | 0.104 |
| n_updates | 670 |
| policy_gradient_loss | 0.0575 |
| std | 0.0128 |
| value_loss | 0.213 |
---------------------------------------
---------------------------------------
| time/ | |
| fps | 19 |
| iterations | 4 |
| time_elapsed | 860 |
| total_timesteps | 282624 |
| train/ | |
| approx_kl | 18.683723 |
| clip_fraction | 0.862 |
| clip_range | 0.2 |
| entropy_loss | 5.93 |
| explained_variance | 0.844 |
| learning_rate | 0.000725 |
| loss | 0.0861 |
| n_updates | 680 |
| policy_gradient_loss | 0.076 |
| std | 0.0119 |
| value_loss | 0.183 |
---------------------------------------