donkeycar-rl-autoresearch/docs
Paul Huliganga eb4fd39056 docs: TEST_HISTORY updated with Exp8 results and Exp9 plan
Exp8 results: reward peaked at 567 at step 60k; the policy diverged afterwards.
Best_model was saved correctly. mini_monaco crashed after 91 steps (mean),
at the same corner every time: throttle min=0.5 is baked into the action space.

Exp9 plan: throttle_min=0.2, v5 reward unchanged. Tests the hypothesis
that the v5 gradient is sufficient for the hill without a forced 0.5 minimum.
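
To illustrate what a throttle floor "baked into the action space" can mean, here is a minimal sketch. It assumes the floor is applied by clipping the policy's raw throttle output; the function name `clamp_throttle` and the `throttle_max` default of 1.0 are hypothetical and not taken from this repository, which may enforce the bound differently (for example through the bounds of the environment's action space).

```python
def clamp_throttle(raw_throttle, throttle_min=0.2, throttle_max=1.0):
    """Clip a raw throttle command into [throttle_min, throttle_max].

    Hypothetical helper: the names and the default upper bound are
    assumptions for illustration only.
    """
    if throttle_min > throttle_max:
        raise ValueError("throttle_min must not exceed throttle_max")
    return max(throttle_min, min(throttle_max, raw_throttle))


# With an Exp8-style floor, a low raw command is forced up to 0.5,
# so the policy cannot slow down below that speed.
exp8_throttle = clamp_throttle(0.1, throttle_min=0.5)

# With the Exp9-style floor, the same raw command is only raised to 0.2.
exp9_throttle = clamp_throttle(0.1, throttle_min=0.2)
```

Under this assumption, lowering the floor widens the range of speeds the policy can choose from, which is the freedom Exp9 is meant to test.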

Agent: pi
Tests: 102 passed
Tests-Added: 0
TypeScript: N/A
2026-04-18 13:40:45 -04:00
track-screenshots wave3: add multi-track autoresearch system (83 tests passing) 2026-04-14 12:47:12 -04:00
ARCHITECTURE.md docs: ARCHITECTURE.md — complete system architecture guide 2026-04-17 14:06:38 -04:00
RESEARCH_LOG.md wave3: add multi-track autoresearch system (83 tests passing) 2026-04-14 12:47:12 -04:00
STATE.md docs: STATE.md updated with April 16 test results 2026-04-16 20:45:45 -04:00
TEST_HISTORY.md docs: TEST_HISTORY updated with Exp8 results and Exp9 plan 2026-04-18 13:40:45 -04:00