donkeycar-rl-autoresearch

Commit Graph

Author	SHA1	Message	Date
Paul Huliganga	b504b89b2a	feat: add exp17 parallel DummyVecEnv 450k training + strategy docs. exp17_parallel_450k.py: parallel two-track training (generated_track:9091, mountain_track:9093), 450k steps, v6 reward, HOST=localhost; DECISIONS.md: ADR-025 (parallel strategy) and ADR-026 (mountain friction fix); docs/STATE.md: updated to April 2026 state with current champions and strategy; docs/TEST_HISTORY.md: mountain friction fix notes + Exp 17 full design; outerloop-results: exp14 finetune logs and robust mountain eval results. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 02:42:20 -04:00
Paul Huliganga	6e2427571a	docs: record failed cross-track warm-start transfer experiments exp15 and exp16	2026-04-20 20:18:08 -04:00
Paul Huliganga	0da04327ef	docs: capture robust mountain finetune winner at 36k and preserve eval comparison	2026-04-20 00:43:27 -04:00
Paul Huliganga	0993d4f1e7	docs: Exp 11 + 11b results — parallel envs work, v6 prevents circles, but plateaus at ~194 steps. Exp 11 (v5 reward): aborted at 66k — circular driving returned without efficiency term. Exp 11b (v6 reward): completed 90k — no circles but plateaus at 170-195 steps. All 4 tracks eval: remarkably consistent ~194 steps (including zero-shot). Parallel DummyVecEnv infrastructure proven stable. Next: increase training budget (90k may be insufficient for 2 parallel envs).	2026-04-19 13:26:29 -04:00
Paul Huliganga	db1274174f	docs: Exp10 vs Exp9 vs Wave4 Trial 9 root cause analysis — random seed lottery	2026-04-19 10:29:16 -04:00
Paul Huliganga	3d04b53a86	docs: Exp10 eval results — total failure, crashes on all tracks (massive regression from Exp9/W4T9)	2026-04-19 10:19:16 -04:00
Paul Huliganga	fecba1dd35	docs: TEST_HISTORY Exp10 plan added. Exp10: generated_track + mountain_track, v5 reward, throttle_min=0.2. Same as Exp9 but with visual diversity from second track. Agent: pi	2026-04-18 17:59:07 -04:00
Paul Huliganga	b19dcc8b80	feat: run_eval.py — standard eval runner with persistent logging. Every test run now saves to agent/test-results/YYYY-MM-DD_HH-MM_<model>.log so results are never lost. Also added 3-set Exp9 eval results to TEST_HISTORY. Usage: python3 agent/run_eval.py --model models/exp9-.../best_model.zip --sets 3. Agent: pi; Tests: 102 passed; Tests-Added: 0; TypeScript: N/A	2026-04-18 15:32:36 -04:00
Paul Huliganga	eb4fd39056	docs: TEST_HISTORY updated with Exp8 results and Exp9 plan. Exp8 results: 567 reward peak at step 60k, policy diverged after. Best_model correctly saved. mini_monaco crashed at 91 steps (mean) at same corner every time — throttle min=0.5 baked into action space. Exp9 plan: throttle_min=0.2, v5 reward unchanged. Tests hypothesis that v5 gradient is sufficient for hill without forced 0.5 minimum. Agent: pi; Tests: 102 passed; Tests-Added: 0; TypeScript: N/A	2026-04-18 13:40:45 -04:00
Paul Huliganga	041481916d	docs: TEST_HISTORY.md — comprehensive record of all experiments. Every mountain track experiment (Exp1-8) and Wave 4 trials documented: what was changed from previous test; key observation from simulator; root cause of failure; what was learned. Also documents: what we keep, open problems, next steps. Exp 8 currently running (PID 2941877). Agent: pi; Tests: 102 passed; Tests-Added: 0; TypeScript: N/A	2026-04-18 11:18:53 -04:00

10 Commits
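
Commit b19dcc8b80 describes run_eval.py writing each eval run to agent/test-results/YYYY-MM-DD_HH-MM_<model>.log. The following is only a minimal sketch of how such a path could be built, not the repository's actual code. The function name eval_log_path is hypothetical, the example model path is made up, and deriving <model> from the directory that holds the checkpoint is an assumption.

```python
from datetime import datetime
from pathlib import Path


def eval_log_path(model_path: str, now: datetime,
                  root: str = "agent/test-results") -> Path:
    """Return <root>/YYYY-MM-DD_HH-MM_<model>.log for one eval run."""
    # Assumption: <model> is the name of the directory containing the
    # checkpoint file, e.g. "example-run" for example-run/best_model.zip.
    model_name = Path(model_path).parent.name
    # Timestamp pattern taken from the commit message: YYYY-MM-DD_HH-MM.
    stamp = now.strftime("%Y-%m-%d_%H-%M")
    return Path(root) / f"{stamp}_{model_name}.log"


if __name__ == "__main__":
    # Hypothetical model path, used only to illustrate the naming scheme.
    path = eval_log_path("models/example-run/best_model.zip", datetime.now())
    print(path.as_posix())
```

Passing the timestamp in as an argument, instead of reading the clock inside the function, keeps the naming logic easy to check in isolation.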