Research-grade DonkeyCar RL autoresearch and sweep system.

Go to file

Paul Huliganga 138c65270f feat(exp22): add solid-hit/wedge/high-CTE exploit fixes and generated-pair warm experiments - reward_wrapper: detect barrier/wall/tree solid hits, terminate on head-on impact or 4 sustained solid-hit frames; prevents car wedging against invisible barriers - reward_wrapper: add low-speed/wedge termination — kills episode when car is pinned motionless (below threshold, no displacement) after grace period - reward_wrapper: high-CTE exploit fix — return -0.25 immediately when CTE > max_cte_terminate (not after patience), so PPO cannot collect positive speed rewards while driving the large outside-road circle - tests: 23 passing unit tests covering all new termination paths - exp20/21/22: add parallel DummyVecEnv experiments on generated_road+generated_track with warm-start from champion model; exp22 is current active run - SESSION_HANDOFF.md: live handoff doc for next session continuity Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>		2026-05-05 14:46:13 -04:00
.harness	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
agent	feat(exp22): add solid-hit/wedge/high-CTE exploit fixes and generated-pair warm experiments	2026-05-05 14:46:13 -04:00
docs	feat: add exp17 parallel DummyVecEnv 450k training + strategy docs	2026-04-28 02:42:20 -04:00
tests	feat(exp22): add solid-hit/wedge/high-CTE exploit fixes and generated-pair warm experiments	2026-05-05 14:46:13 -04:00
.gitignore	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
AGENT.md	feat(exp22): add solid-hit/wedge/high-CTE exploit fixes and generated-pair warm experiments	2026-05-05 14:46:13 -04:00
DECISIONS.md	feat: add exp17 parallel DummyVecEnv 450k training + strategy docs	2026-04-28 02:42:20 -04:00
IMPLEMENTATION_PLAN.md	feat: Phase 3 — behavioral control, enhanced evaluator, 53 tests	2026-04-14 09:28:43 -04:00
PROJECT-KICKOFF.md	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
PROJECT-SPEC.md	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
README.md	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
create_gitea_repo.py	Initial commit	2026-04-12 23:44:36 -04:00
monitor_training.sh	fix: exp19 — hard episode time limit to stop minutes-long stuck cars	2026-04-28 09:18:04 -04:00
ralph-loop.sh	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00

README.md

donkeycar-rl-autoresearch

Purpose

Status

Scaffolded with the agent harness
Spec not filled yet

Runbook

Fill PROJECT-SPEC.md
Create IMPLEMENTATION_PLAN.md from the spec
Start the implementation loop