[16:39:32] ============================================================
[16:39:32] Exp 20: Parallel DummyVecEnv — 450k steps (sim rebuild + progress fix)
[16:39:32] Sim 1: localhost:9091 → generated_track
[16:39:32] Sim 2: localhost:9093 → mountain_track
[16:39:32] throttle_min=0.2, lr=0.000725, total=450,000
[16:39:32] Reward: v6 + exploit fix (window=200, min_lap=12.0s)
[16:39:32] Stuck termination: 40 steps (~2s), hard cap 30s
[16:39:32] Progress patience: 150 steps (~7.5s at 20fps)
[16:39:32] Checkpoints: every 20,000 steps
[16:39:32] ============================================================
[16:39:32] Creating DummyVecEnv with two tracks...
starting DonkeyGym env
Setting default: start_delay 5.0
Setting default: max_cte 8.0
Setting default: frame_skip 1
Setting default: cam_resolution (120, 160, 3)
Setting default: log_level 20
Setting default: steer_limit 1.0
Setting default: throttle_min 0.0
Setting default: throttle_max 1.0
loading scene generated_track
starting DonkeyGym env
Setting default: start_delay 5.0
Setting default: max_cte 8.0
Setting default: frame_skip 1
Setting default: cam_resolution (120, 160, 3)
Setting default: log_level 20
Setting default: steer_limit 1.0
Setting default: throttle_min 0.0
Setting default: throttle_max 1.0
loading scene mountain_track
[16:39:34] VecEnv num_envs=2, obs=(3, 120, 160)
Using cpu device
[16:39:38] PPO created. Starting training...
-----------------------------
| time/              |      |
|    fps             | 28   |
|    iterations      | 1    |
|    time_elapsed    | 142  |
|    total_timesteps | 4096 |
-----------------------------
-----------------------------------------
| time/                   |             |
|    fps                  | 19          |
|    iterations           | 2           |
|    time_elapsed         | 428         |
|    total_timesteps      | 8192        |
| train/                  |             |
|    approx_kl            | 0.106491014 |
|    clip_fraction        | 0.241       |
|    clip_range           | 0.2         |
|    entropy_loss         | -2.81       |
|    explained_variance   | 0.0458      |
|    learning_rate        | 0.000725    |
|    loss                 | -0.0816     |
|    n_updates            | 10          |
|    policy_gradient_loss | -0.0458     |
|    std                  | 0.953       |
|    value_loss           | 1.39        |
-----------------------------------------
----------------------------------------
| time/                   |            |
|    fps                  | 17         |
|    iterations           | 3          |
|    time_elapsed         | 703        |
|    total_timesteps      | 12288      |
| train/                  |            |
|    approx_kl            | 0.15251182 |
|    clip_fraction        | 0.428      |
|    clip_range           | 0.2        |
|    entropy_loss         | -2.74      |
|    explained_variance   | 0.0179     |
|    learning_rate        | 0.000725   |
|    loss                 | 0.0731     |
|    n_updates            | 20         |
|    policy_gradient_loss | -0.0196    |
|    std                  | 0.946      |
|    value_loss           | 18.6       |
----------------------------------------
----------------------------------------
| time/                   |            |
|    fps                  | 17         |
|    iterations           | 4          |
|    time_elapsed         | 955        |
|    total_timesteps      | 16384      |
| train/                  |            |
|    approx_kl            | 0.22289596 |
|    clip_fraction        | 0.505      |
|    clip_range           | 0.2        |
|    entropy_loss         | -2.68      |
|    explained_variance   | -0.418     |
|    learning_rate        | 0.000725   |
|    loss                 | -0.0581    |
|    n_updates            | 30         |
|    policy_gradient_loss | -0.0692    |
|    std                  | 0.882      |
|    value_loss           | 0.303      |
----------------------------------------
----------------------------------------
| time/                   |            |
|    fps                  | 17         |
|    iterations           | 5          |
|    time_elapsed         | 1143       |
|    total_timesteps      | 20480      |
| train/                  |            |
|    approx_kl            | 0.50940317 |
|    clip_fraction        | 0.619      |
|    clip_range           | 0.2        |
|    entropy_loss         | -2.51      |
|    explained_variance   | 0.405      |
|    learning_rate        | 0.000725   |
|    loss                 | -0.117     |
|    n_updates            | 40         |
|    policy_gradient_loss | -0.0885    |
|    std                  | 0.807      |
|    value_loss           | 0.196      |
----------------------------------------
[17:00:28] [20,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0020000.zip [17:00:34] Eval: gen_track=6.7r/67s ❌@67 mountain=5.0r/67s ❌@67 [17:00:34] NEW BEST: 11.7 combined reward ------------------------------ | time/ | | | fps | 44 | | iterations | 1 | | time_elapsed | 91 | | total_timesteps | 24576 | ------------------------------ --------------------------------------- | time/ | | | fps | 31 | | iterations | 2 | | time_elapsed | 260 | | total_timesteps | 28672 | | train/ | | | approx_kl | 0.4478566 | | clip_fraction | 0.658 | | clip_range | 0.2 | | entropy_loss | -2.14 | | explained_variance | 0.597 | | learning_rate | 0.000725 | | loss | -0.0885 | | n_updates | 60 | | policy_gradient_loss | -0.0843 | | std | 0.67 | | value_loss | 0.212 | --------------------------------------- --------------------------------------- | time/ | | | fps | 29 | | iterations | 3 | | time_elapsed | 420 | | total_timesteps | 32768 | | train/ | | | approx_kl | 0.9769267 | | clip_fraction | 0.658 | | clip_range | 0.2 | | entropy_loss | -1.96 | | explained_variance | 0.399 | | learning_rate | 0.000725 | | loss | -0.107 | | n_updates | 70 | | policy_gradient_loss | -0.0829 | | std | 0.61 | | value_loss | 0.403 | --------------------------------------- --------------------------------------- | time/ | | | fps | 28 | | iterations | 4 | | time_elapsed | 576 | | total_timesteps | 36864 | | train/ | | | approx_kl | 0.5094907 | | clip_fraction | 0.682 | | clip_range | 0.2 | | entropy_loss | -1.81 | | explained_variance | 0.45 | | learning_rate | 0.000725 | | loss | -0.0843 | | n_updates | 80 | | policy_gradient_loss | -0.076 | | std | 0.562 | | value_loss | 0.396 | --------------------------------------- ---------------------------------------- | time/ | | | fps | 27 | | iterations | 5 | | time_elapsed | 731 | | total_timesteps | 40960 | | train/ | | | approx_kl | 0.70720506 | | clip_fraction | 0.703 | | clip_range | 
0.2 | | entropy_loss | -1.61 | | explained_variance | 0.466 | | learning_rate | 0.000725 | | loss | -0.0856 | | n_updates | 90 | | policy_gradient_loss | -0.0797 | | std | 0.513 | | value_loss | 0.468 | ---------------------------------------- [17:14:13] [40,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0040000.zip [17:14:19] Eval: gen_track=21.1r/148s ❌@148 mountain=15.7r/147s ❌@147 [17:14:20] NEW BEST: 36.8 combined reward ------------------------------ | time/ | | | fps | 64 | | iterations | 1 | | time_elapsed | 63 | | total_timesteps | 45056 | ------------------------------ ---------------------------------------- | time/ | | | fps | 37 | | iterations | 2 | | time_elapsed | 218 | | total_timesteps | 49152 | | train/ | | | approx_kl | 0.74158007 | | clip_fraction | 0.711 | | clip_range | 0.2 | | entropy_loss | -1.21 | | explained_variance | 0.591 | | learning_rate | 0.000725 | | loss | -0.0973 | | n_updates | 110 | | policy_gradient_loss | -0.0698 | | std | 0.416 | | value_loss | 0.462 | ---------------------------------------- --------------------------------------- | time/ | | | fps | 33 | | iterations | 3 | | time_elapsed | 368 | | total_timesteps | 53248 | | train/ | | | approx_kl | 0.8501822 | | clip_fraction | 0.727 | | clip_range | 0.2 | | entropy_loss | -0.999 | | explained_variance | 0.727 | | learning_rate | 0.000725 | | loss | -0.048 | | n_updates | 120 | | policy_gradient_loss | -0.0694 | | std | 0.373 | | value_loss | 0.399 | --------------------------------------- ---------------------------------------- | time/ | | | fps | 31 | | iterations | 4 | | time_elapsed | 523 | | total_timesteps | 57344 | | train/ | | | approx_kl | 0.99027634 | | clip_fraction | 0.729 | | clip_range | 0.2 | | entropy_loss | -0.791 | | explained_variance | 0.804 | | learning_rate | 0.000725 | | loss | -0.0752 | | n_updates | 130 | | policy_gradient_loss | -0.0647 | | std | 0.335 | | value_loss | 
0.333 | ---------------------------------------- --------------------------------------- | time/ | | | fps | 30 | | iterations | 5 | | time_elapsed | 669 | | total_timesteps | 61440 | | train/ | | | approx_kl | 1.0802925 | | clip_fraction | 0.727 | | clip_range | 0.2 | | entropy_loss | -0.588 | | explained_variance | 0.81 | | learning_rate | 0.000725 | | loss | -0.0711 | | n_updates | 140 | | policy_gradient_loss | -0.0571 | | std | 0.303 | | value_loss | 0.459 | --------------------------------------- [17:27:06] [60,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0060000.zip [17:27:13] Eval: gen_track=6.6r/65s ❌@65 mountain=8.8r/112s ❌@112 ------------------------------ | time/ | | | fps | 66 | | iterations | 1 | | time_elapsed | 61 | | total_timesteps | 65536 | ------------------------------ --------------------------------------- | time/ | | | fps | 39 | | iterations | 2 | | time_elapsed | 208 | | total_timesteps | 69632 | | train/ | | | approx_kl | 1.1047575 | | clip_fraction | 0.752 | | clip_range | 0.2 | | entropy_loss | -0.185 | | explained_variance | 0.811 | | learning_rate | 0.000725 | | loss | -0.0207 | | n_updates | 160 | | policy_gradient_loss | -0.0497 | | std | 0.247 | | value_loss | 0.504 | --------------------------------------- --------------------------------------- | time/ | | | fps | 34 | | iterations | 3 | | time_elapsed | 355 | | total_timesteps | 73728 | | train/ | | | approx_kl | 1.0727906 | | clip_fraction | 0.757 | | clip_range | 0.2 | | entropy_loss | 0.0196 | | explained_variance | 0.82 | | learning_rate | 0.000725 | | loss | -0.0671 | | n_updates | 170 | | policy_gradient_loss | -0.0498 | | std | 0.223 | | value_loss | 0.409 | --------------------------------------- ---------------------------------------- | time/ | | | fps | 32 | | iterations | 4 | | time_elapsed | 504 | | total_timesteps | 77824 | | train/ | | | approx_kl | 0.91813445 | | clip_fraction | 
0.738 | | clip_range | 0.2 | | entropy_loss | 0.23 | | explained_variance | 0.889 | | learning_rate | 0.000725 | | loss | -0.0627 | | n_updates | 180 | | policy_gradient_loss | -0.0518 | | std | 0.201 | | value_loss | 0.324 | ---------------------------------------- --------------------------------------- | time/ | | | fps | 30 | | iterations | 5 | | time_elapsed | 670 | | total_timesteps | 81920 | | train/ | | | approx_kl | 1.0347024 | | clip_fraction | 0.742 | | clip_range | 0.2 | | entropy_loss | 0.431 | | explained_variance | 0.841 | | learning_rate | 0.000725 | | loss | -0.043 | | n_updates | 190 | | policy_gradient_loss | -0.0455 | | std | 0.182 | | value_loss | 0.473 | --------------------------------------- [17:39:48] [80,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0080000.zip [17:39:55] Eval: gen_track=37.2r/209s ❌@209 mountain=28.0r/208s ❌@208 [17:39:55] NEW BEST: 65.2 combined reward ------------------------------ | time/ | | | fps | 70 | | iterations | 1 | | time_elapsed | 57 | | total_timesteps | 86016 | ------------------------------ --------------------------------------- | time/ | | | fps | 41 | | iterations | 2 | | time_elapsed | 197 | | total_timesteps | 90112 | | train/ | | | approx_kl | 1.3096654 | | clip_fraction | 0.777 | | clip_range | 0.2 | | entropy_loss | 0.824 | | explained_variance | 0.676 | | learning_rate | 0.000725 | | loss | -0.0305 | | n_updates | 210 | | policy_gradient_loss | -0.0378 | | std | 0.149 | | value_loss | 0.656 | --------------------------------------- --------------------------------------- | time/ | | | fps | 36 | | iterations | 3 | | time_elapsed | 335 | | total_timesteps | 94208 | | train/ | | | approx_kl | 1.4713185 | | clip_fraction | 0.773 | | clip_range | 0.2 | | entropy_loss | 1.02 | | explained_variance | 0.733 | | learning_rate | 0.000725 | | loss | -0.0886 | | n_updates | 220 | | policy_gradient_loss | -0.0435 | | std | 0.135 | 
| value_loss | 0.46 | --------------------------------------- --------------------------------------- | time/ | | | fps | 33 | | iterations | 4 | | time_elapsed | 482 | | total_timesteps | 98304 | | train/ | | | approx_kl | 1.4152278 | | clip_fraction | 0.768 | | clip_range | 0.2 | | entropy_loss | 1.22 | | explained_variance | 0.816 | | learning_rate | 0.000725 | | loss | -0.045 | | n_updates | 230 | | policy_gradient_loss | -0.0434 | | std | 0.122 | | value_loss | 0.423 | --------------------------------------- --------------------------------------- | time/ | | | fps | 32 | | iterations | 5 | | time_elapsed | 627 | | total_timesteps | 102400 | | train/ | | | approx_kl | 1.4566097 | | clip_fraction | 0.774 | | clip_range | 0.2 | | entropy_loss | 1.4 | | explained_variance | 0.833 | | learning_rate | 0.000725 | | loss | -0.0272 | | n_updates | 240 | | policy_gradient_loss | -0.0242 | | std | 0.111 | | value_loss | 0.603 | --------------------------------------- [17:51:51] [100,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0100000.zip [17:51:58] Eval: gen_track=28.5r/175s ❌@175 mountain=27.3r/175s ❌@175 ------------------------------- | time/ | | | fps | 80 | | iterations | 1 | | time_elapsed | 50 | | total_timesteps | 106496 | ------------------------------- --------------------------------------- | time/ | | | fps | 42 | | iterations | 2 | | time_elapsed | 194 | | total_timesteps | 110592 | | train/ | | | approx_kl | 1.3409375 | | clip_fraction | 0.781 | | clip_range | 0.2 | | entropy_loss | 1.79 | | explained_variance | 0.893 | | learning_rate | 0.000725 | | loss | -0.0767 | | n_updates | 260 | | policy_gradient_loss | -0.0304 | | std | 0.092 | | value_loss | 0.312 | --------------------------------------- --------------------------------------- | time/ | | | fps | 36 | | iterations | 3 | | time_elapsed | 340 | | total_timesteps | 114688 | | train/ | | | approx_kl | 1.1949866 | | 
clip_fraction | 0.775 | | clip_range | 0.2 | | entropy_loss | 1.95 | | explained_variance | 0.851 | | learning_rate | 0.000725 | | loss | -0.0391 | | n_updates | 270 | | policy_gradient_loss | -0.0179 | | std | 0.0843 | | value_loss | 0.466 | --------------------------------------- --------------------------------------- | time/ | | | fps | 34 | | iterations | 4 | | time_elapsed | 479 | | total_timesteps | 118784 | | train/ | | | approx_kl | 1.3683267 | | clip_fraction | 0.795 | | clip_range | 0.2 | | entropy_loss | 2.11 | | explained_variance | 0.847 | | learning_rate | 0.000725 | | loss | -0.04 | | n_updates | 280 | | policy_gradient_loss | 0.0118 | | std | 0.0781 | | value_loss | 0.518 | --------------------------------------- --------------------------------------- | time/ | | | fps | 32 | | iterations | 5 | | time_elapsed | 635 | | total_timesteps | 122880 | | train/ | | | approx_kl | 1.6239161 | | clip_fraction | 0.793 | | clip_range | 0.2 | | entropy_loss | 2.25 | | explained_variance | 0.839 | | learning_rate | 0.000725 | | loss | -0.0657 | | n_updates | 290 | | policy_gradient_loss | 0.007 | | std | 0.0734 | | value_loss | 0.621 | --------------------------------------- [18:04:01] [120,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0120000.zip [18:04:08] Eval: gen_track=27.7r/180s ❌@180 mountain=25.7r/180s ❌@180 ------------------------------- | time/ | | | fps | 56 | | iterations | 1 | | time_elapsed | 73 | | total_timesteps | 126976 | ------------------------------- --------------------------------------- | time/ | | | fps | 27 | | iterations | 2 | | time_elapsed | 292 | | total_timesteps | 131072 | | train/ | | | approx_kl | 1.5144145 | | clip_fraction | 0.816 | | clip_range | 0.2 | | entropy_loss | 2.45 | | explained_variance | 0.721 | | learning_rate | 0.000725 | | loss | 0.0266 | | n_updates | 310 | | policy_gradient_loss | 0.0224 | | std | 0.0652 | | value_loss | 0.869 | 
--------------------------------------- --------------------------------------- | time/ | | | fps | 24 | | iterations | 3 | | time_elapsed | 502 | | total_timesteps | 135168 | | train/ | | | approx_kl | 1.7737575 | | clip_fraction | 0.817 | | clip_range | 0.2 | | entropy_loss | 2.59 | | explained_variance | 0.751 | | learning_rate | 0.000725 | | loss | -0.0113 | | n_updates | 320 | | policy_gradient_loss | 0.0307 | | std | 0.061 | | value_loss | 0.81 | --------------------------------------- --------------------------------------- | time/ | | | fps | 22 | | iterations | 4 | | time_elapsed | 725 | | total_timesteps | 139264 | | train/ | | | approx_kl | 2.3732958 | | clip_fraction | 0.831 | | clip_range | 0.2 | | entropy_loss | 2.73 | | explained_variance | 0.711 | | learning_rate | 0.000725 | | loss | 0.0475 | | n_updates | 330 | | policy_gradient_loss | 0.0432 | | std | 0.0575 | | value_loss | 0.965 | --------------------------------------- -------------------------------------- | time/ | | | fps | 21 | | iterations | 5 | | time_elapsed | 946 | | total_timesteps | 143360 | | train/ | | | approx_kl | 3.434102 | | clip_fraction | 0.86 | | clip_range | 0.2 | | entropy_loss | 2.83 | | explained_variance | 0.574 | | learning_rate | 0.000725 | | loss | 0.119 | | n_updates | 340 | | policy_gradient_loss | 0.0767 | | std | 0.0555 | | value_loss | 1.35 | -------------------------------------- [18:22:13] [140,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0140000.zip [18:22:19] Eval: gen_track=8.2r/79s ❌@79 mountain=8.9r/79s ❌@79 ------------------------------- | time/ | | | fps | 46 | | iterations | 1 | | time_elapsed | 88 | | total_timesteps | 147456 | ------------------------------- --------------------------------------- | time/ | | | fps | 22 | | iterations | 2 | | time_elapsed | 357 | | total_timesteps | 151552 | | train/ | | | approx_kl | 2.1389558 | | clip_fraction | 0.838 | | clip_range | 
0.2 | | entropy_loss | 2.94 | | explained_variance | 0.596 | | learning_rate | 0.000725 | | loss | 0.156 | | n_updates | 360 | | policy_gradient_loss | 0.0675 | | std | 0.0521 | | value_loss | 1.21 | --------------------------------------- -------------------------------------- | time/ | | | fps | 19 | | iterations | 3 | | time_elapsed | 629 | | total_timesteps | 155648 | | train/ | | | approx_kl | 4.391129 | | clip_fraction | 0.863 | | clip_range | 0.2 | | entropy_loss | 3.04 | | explained_variance | 0.656 | | learning_rate | 0.000725 | | loss | 0.0562 | | n_updates | 370 | | policy_gradient_loss | 0.0623 | | std | 0.0491 | | value_loss | 0.813 | -------------------------------------- --------------------------------------- | time/ | | | fps | 18 | | iterations | 4 | | time_elapsed | 890 | | total_timesteps | 159744 | | train/ | | | approx_kl | 2.7348022 | | clip_fraction | 0.855 | | clip_range | 0.2 | | entropy_loss | 3.12 | | explained_variance | 0.698 | | learning_rate | 0.000725 | | loss | 0.0226 | | n_updates | 380 | | policy_gradient_loss | 0.038 | | std | 0.047 | | value_loss | 0.562 | --------------------------------------- -------------------------------------- | time/ | | | fps | 18 | | iterations | 5 | | time_elapsed | 1133 | | total_timesteps | 163840 | | train/ | | | approx_kl | 2.956015 | | clip_fraction | 0.826 | | clip_range | 0.2 | | entropy_loss | 3.25 | | explained_variance | 0.718 | | learning_rate | 0.000725 | | loss | -0.0102 | | n_updates | 390 | | policy_gradient_loss | 0.0226 | | std | 0.0448 | | value_loss | 0.436 | -------------------------------------- [18:43:43] [160,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0160000.zip [18:43:49] Eval: gen_track=5.5r/69s ❌@69 mountain=7.4r/97s ❌@97 ------------------------------- | time/ | | | fps | 41 | | iterations | 1 | | time_elapsed | 98 | | total_timesteps | 167936 | ------------------------------- 
---------------------------------------
| time/                   |           |
|    fps                  | 23        |
|    iterations           | 2         |
|    time_elapsed         | 350       |
|    total_timesteps      | 172032    |
| train/                  |           |
|    approx_kl            | 2.9826787 |
|    clip_fraction        | 0.813     |
|    clip_range           | 0.2       |
|    entropy_loss         | 3.5       |
|    explained_variance   | 0.778     |
|    learning_rate        | 0.000725  |
|    loss                 | -0.0133   |
|    n_updates            | 410       |
|    policy_gradient_loss | 0.000808  |
|    std                  | 0.0392    |
|    value_loss           | 0.266     |
---------------------------------------
---------------------------------------
| time/                   |           |
|    fps                  | 20        |
|    iterations           | 3         |
|    time_elapsed         | 601       |
|    total_timesteps      | 176128    |
| train/                  |           |
|    approx_kl            | 3.4679208 |
|    clip_fraction        | 0.801     |
|    clip_range           | 0.2       |
|    entropy_loss         | 3.68      |
|    explained_variance   | 0.786     |
|    learning_rate        | 0.000725  |
|    loss                 | -0.0178   |
|    n_updates            | 420       |
|    policy_gradient_loss | -0.0115   |
|    std                  | 0.0359    |
|    value_loss           | 0.278     |
---------------------------------------
--------------------------------------
| time/                   |          |
|    fps                  | 19       |
|    iterations           | 4        |
|    time_elapsed         | 842      |
|    total_timesteps      | 180224   |
| train/                  |          |
|    approx_kl            | 3.401013 |
|    clip_fraction        | 0.81     |
|    clip_range           | 0.2      |
|    entropy_loss         | 3.84     |
|    explained_variance   | 0.778    |
|    learning_rate        | 0.000725 |
|    loss                 | -0.0331  |
|    n_updates            | 430      |
|    policy_gradient_loss | -0.00564 |
|    std                  | 0.0333   |
|    value_loss           | 0.299    |
--------------------------------------
---------------------------------------
| time/                   |           |
|    fps                  | 19        |
|    iterations           | 5         |
|    time_elapsed         | 1033      |
|    total_timesteps      | 184320    |
| train/                  |           |
|    approx_kl            | 2.4161968 |
|    clip_fraction        | 0.822     |
|    clip_range           | 0.2       |
|    entropy_loss         | 3.98      |
|    explained_variance   | 0.806     |
|    learning_rate        | 0.000725  |
|    loss                 | -0.0604   |
|    n_updates            | 440       |
|    policy_gradient_loss | 0.00771   |
|    std                  | 0.031     |
|    value_loss           | 0.283     |
---------------------------------------
[19:02:39] [180,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0180000.zip
[19:02:45] Eval: gen_track=6.7r/80s ❌@80 mountain=8.2r/97s ❌@97 ------------------------------- | time/ | | | fps | 45 | | iterations | 1 | | time_elapsed | 90 | | total_timesteps | 188416 | ------------------------------- -------------------------------------- | time/ | | | fps | 30 | | iterations | 2 | | time_elapsed | 270 | | total_timesteps | 192512 | | train/ | | | approx_kl | 4.115147 | | clip_fraction | 0.836 | | clip_range | 0.2 | | entropy_loss | 4.22 | | explained_variance | 0.78 | | learning_rate | 0.000725 | | loss | 0.0814 | | n_updates | 460 | | policy_gradient_loss | 0.0216 | | std | 0.0278 | | value_loss | 0.338 | -------------------------------------- -------------------------------------- | time/ | | | fps | 26 | | iterations | 3 | | time_elapsed | 464 | | total_timesteps | 196608 | | train/ | | | approx_kl | 4.002183 | | clip_fraction | 0.841 | | clip_range | 0.2 | | entropy_loss | 4.31 | | explained_variance | 0.78 | | learning_rate | 0.000725 | | loss | -0.0155 | | n_updates | 470 | | policy_gradient_loss | 0.0366 | | std | 0.0268 | | value_loss | 0.306 | -------------------------------------- -------------------------------------- | time/ | | | fps | 25 | | iterations | 4 | | time_elapsed | 652 | | total_timesteps | 200704 | | train/ | | | approx_kl | 3.378588 | | clip_fraction | 0.822 | | clip_range | 0.2 | | entropy_loss | 4.41 | | explained_variance | 0.807 | | learning_rate | 0.000725 | | loss | -0.0678 | | n_updates | 480 | | policy_gradient_loss | 0.0144 | | std | 0.0254 | | value_loss | 0.339 | -------------------------------------- --------------------------------------- | time/ | | | fps | 24 | | iterations | 5 | | time_elapsed | 840 | | total_timesteps | 204800 | | train/ | | | approx_kl | 3.6989145 | | clip_fraction | 0.809 | | clip_range | 0.2 | | entropy_loss | 4.5 | | explained_variance | 0.776 | | learning_rate | 0.000725 | | loss | 0.0385 | | n_updates | 490 | | policy_gradient_loss | 0.0184 | | std | 0.0242 | | value_loss | 
0.352 | --------------------------------------- [19:18:17] [200,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0200000.zip [19:18:23] Eval: gen_track=5.7r/69s ❌@69 mountain=6.7r/69s ❌@69 ------------------------------- | time/ | | | fps | 43 | | iterations | 1 | | time_elapsed | 94 | | total_timesteps | 208896 | ------------------------------- --------------------------------------- | time/ | | | fps | 28 | | iterations | 2 | | time_elapsed | 287 | | total_timesteps | 212992 | | train/ | | | approx_kl | 3.2493048 | | clip_fraction | 0.827 | | clip_range | 0.2 | | entropy_loss | 4.71 | | explained_variance | 0.795 | | learning_rate | 0.000725 | | loss | 0.0462 | | n_updates | 510 | | policy_gradient_loss | 0.03 | | std | 0.0219 | | value_loss | 0.373 | --------------------------------------- --------------------------------------- | time/ | | | fps | 25 | | iterations | 3 | | time_elapsed | 477 | | total_timesteps | 217088 | | train/ | | | approx_kl | 4.8058205 | | clip_fraction | 0.862 | | clip_range | 0.2 | | entropy_loss | 4.78 | | explained_variance | 0.826 | | learning_rate | 0.000725 | | loss | 0.0104 | | n_updates | 520 | | policy_gradient_loss | 0.0622 | | std | 0.0211 | | value_loss | 0.393 | --------------------------------------- -------------------------------------- | time/ | | | fps | 24 | | iterations | 4 | | time_elapsed | 664 | | total_timesteps | 221184 | | train/ | | | approx_kl | 4.024245 | | clip_fraction | 0.842 | | clip_range | 0.2 | | entropy_loss | 4.87 | | explained_variance | 0.796 | | learning_rate | 0.000725 | | loss | 0.126 | | n_updates | 530 | | policy_gradient_loss | 0.0408 | | std | 0.0205 | | value_loss | 0.394 | -------------------------------------- --------------------------------------- | time/ | | | fps | 24 | | iterations | 5 | | time_elapsed | 852 | | total_timesteps | 225280 | | train/ | | | approx_kl | 5.5941677 | | clip_fraction | 0.824 | | 
clip_range | 0.2 | | entropy_loss | 4.95 | | explained_variance | 0.799 | | learning_rate | 0.000725 | | loss | 0.0566 | | n_updates | 540 | | policy_gradient_loss | 0.13 | | std | 0.0194 | | value_loss | 0.351 | --------------------------------------- [19:34:05] [220,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0220000.zip [19:34:10] Eval: gen_track=6.9r/82s ❌@82 mountain=7.9r/82s ❌@82 ------------------------------- | time/ | | | fps | 46 | | iterations | 1 | | time_elapsed | 88 | | total_timesteps | 229376 | ------------------------------- --------------------------------------- | time/ | | | fps | 30 | | iterations | 2 | | time_elapsed | 266 | | total_timesteps | 233472 | | train/ | | | approx_kl | 3.5075796 | | clip_fraction | 0.857 | | clip_range | 0.2 | | entropy_loss | 5.08 | | explained_variance | 0.834 | | learning_rate | 0.000725 | | loss | 0.0347 | | n_updates | 560 | | policy_gradient_loss | 0.0538 | | std | 0.0184 | | value_loss | 0.273 | --------------------------------------- --------------------------------------- | time/ | | | fps | 27 | | iterations | 3 | | time_elapsed | 454 | | total_timesteps | 237568 | | train/ | | | approx_kl | 3.8775434 | | clip_fraction | 0.863 | | clip_range | 0.2 | | entropy_loss | 5.15 | | explained_variance | 0.875 | | learning_rate | 0.000725 | | loss | -0.0319 | | n_updates | 570 | | policy_gradient_loss | 0.0654 | | std | 0.0179 | | value_loss | 0.223 | --------------------------------------- -------------------------------------- | time/ | | | fps | 25 | | iterations | 4 | | time_elapsed | 633 | | total_timesteps | 241664 | | train/ | | | approx_kl | 9.942043 | | clip_fraction | 0.846 | | clip_range | 0.2 | | entropy_loss | 5.23 | | explained_variance | 0.846 | | learning_rate | 0.000725 | | loss | 0.0481 | | n_updates | 580 | | policy_gradient_loss | 0.0485 | | std | 0.0172 | | value_loss | 0.342 | 
-------------------------------------- -------------------------------------- | time/ | | | fps | 25 | | iterations | 5 | | time_elapsed | 812 | | total_timesteps | 245760 | | train/ | | | approx_kl | 6.776287 | | clip_fraction | 0.889 | | clip_range | 0.2 | | entropy_loss | 5.23 | | explained_variance | 0.857 | | learning_rate | 0.000725 | | loss | 0.118 | | n_updates | 590 | | policy_gradient_loss | 0.0907 | | std | 0.0175 | | value_loss | 0.325 | -------------------------------------- [19:49:11] [240,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0240000.zip [19:49:17] Eval: gen_track=6.5r/80s ❌@80 mountain=8.2r/80s ❌@80 ------------------------------- | time/ | | | fps | 44 | | iterations | 1 | | time_elapsed | 91 | | total_timesteps | 249856 | ------------------------------- -------------------------------------- | time/ | | | fps | 30 | | iterations | 2 | | time_elapsed | 271 | | total_timesteps | 253952 | | train/ | | | approx_kl | 8.686437 | | clip_fraction | 0.846 | | clip_range | 0.2 | | entropy_loss | 5.32 | | explained_variance | 0.732 | | learning_rate | 0.000725 | | loss | 0.169 | | n_updates | 610 | | policy_gradient_loss | 0.0525 | | std | 0.016 | | value_loss | 0.457 | -------------------------------------- -------------------------------------- | time/ | | | fps | 27 | | iterations | 3 | | time_elapsed | 449 | | total_timesteps | 258048 | | train/ | | | approx_kl | 8.093328 | | clip_fraction | 0.853 | | clip_range | 0.2 | | entropy_loss | 5.4 | | explained_variance | 0.811 | | learning_rate | 0.000725 | | loss | 0.109 | | n_updates | 620 | | policy_gradient_loss | 0.0624 | | std | 0.0155 | | value_loss | 0.405 | -------------------------------------- -------------------------------------- | time/ | | | fps | 25 | | iterations | 4 | | time_elapsed | 634 | | total_timesteps | 262144 | | train/ | | | approx_kl | 8.560518 | | clip_fraction | 0.825 | | clip_range | 0.2 | | 
entropy_loss | 5.47 | | explained_variance | 0.801 | | learning_rate | 0.000725 | | loss | 0.00362 | | n_updates | 630 | | policy_gradient_loss | 0.0468 | | std | 0.0149 | | value_loss | 0.417 | -------------------------------------- -------------------------------------- | time/ | | | fps | 23 | | iterations | 5 | | time_elapsed | 890 | | total_timesteps | 266240 | | train/ | | | approx_kl | 6.063898 | | clip_fraction | 0.86 | | clip_range | 0.2 | | entropy_loss | 5.56 | | explained_variance | 0.768 | | learning_rate | 0.000725 | | loss | 0.00797 | | n_updates | 640 | | policy_gradient_loss | 0.0776 | | std | 0.0146 | | value_loss | 0.486 | -------------------------------------- [20:06:13] [260,000/450,000] Checkpoint saved: /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp20-parallel-450k-v5/checkpoint_0260000.zip [20:06:19] Eval: gen_track=3.6r/59s ❌@59 mountain=6.4r/99s ❌@99 ------------------------------- | time/ | | | fps | 37 | | iterations | 1 | | time_elapsed | 110 | | total_timesteps | 270336 | ------------------------------- --------------------------------------- | time/ | | | fps | 23 | | iterations | 2 | | time_elapsed | 343 | | total_timesteps | 274432 | | train/ | | | approx_kl | 11.335977 | | clip_fraction | 0.891 | | clip_range | 0.2 | | entropy_loss | 5.69 | | explained_variance | 0.825 | | learning_rate | 0.000725 | | loss | 0.0424 | | n_updates | 660 | | policy_gradient_loss | 0.0787 | | std | 0.0136 | | value_loss | 0.223 | --------------------------------------- --------------------------------------- | time/ | | | fps | 20 | | iterations | 3 | | time_elapsed | 594 | | total_timesteps | 278528 | | train/ | | | approx_kl | 11.321661 | | clip_fraction | 0.852 | | clip_range | 0.2 | | entropy_loss | 5.78 | | explained_variance | 0.808 | | learning_rate | 0.000725 | | loss | 0.104 | | n_updates | 670 | | policy_gradient_loss | 0.0575 | | std | 0.0128 | | value_loss | 0.213 | --------------------------------------- 
---------------------------------------
| time/                   |           |
|    fps                  | 19        |
|    iterations           | 4         |
|    time_elapsed         | 860       |
|    total_timesteps      | 282624    |
| train/                  |           |
|    approx_kl            | 18.683723 |
|    clip_fraction        | 0.862     |
|    clip_range           | 0.2       |
|    entropy_loss         | 5.93      |
|    explained_variance   | 0.844     |
|    learning_rate        | 0.000725  |
|    loss                 | 0.0861    |
|    n_updates            | 680       |
|    policy_gradient_loss | 0.076     |
|    std                  | 0.0119    |
|    value_loss           | 0.183     |
---------------------------------------