Research-grade DonkeyCar RL autoresearch and sweep system.

Go to file

Paul Huliganga 0993d4f1e7 docs: Exp 11 + 11b results — parallel envs work, v6 prevents circles, but plateaus at ~194 steps Exp 11 (v5 reward): aborted at 66k — circular driving returned without efficiency term Exp 11b (v6 reward): completed 90k — no circles but plateaus at 170-195 steps All 4 tracks eval: remarkably consistent ~194 steps (including zero-shot) Parallel DummyVecEnv infrastructure proven stable. Next: increase training budget (90k may be insufficient for 2 parallel envs).		2026-04-19 13:26:29 -04:00
.harness	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
agent	feat: Exp 11b — parallel DummyVecEnv + v6 reward (anti-circle gate) + built-in eval	2026-04-19 12:03:46 -04:00
docs	docs: Exp 11 + 11b results — parallel envs work, v6 prevents circles, but plateaus at ~194 steps	2026-04-19 13:26:29 -04:00
tests	fix: reward v6 — efficiency gate prevents circular driving, stuck_steps 80→40	2026-04-19 12:02:55 -04:00
.gitignore	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
AGENT.md	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
DECISIONS.md	docs: session log + ADR-019 — parallel DummyVecEnv for multi-track training	2026-04-19 10:50:11 -04:00
IMPLEMENTATION_PLAN.md	feat: Phase 3 — behavioral control, enhanced evaluator, 53 tests	2026-04-14 09:28:43 -04:00
PROJECT-KICKOFF.md	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
PROJECT-SPEC.md	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
README.md	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
create_gitea_repo.py	Initial commit	2026-04-12 23:44:36 -04:00
ralph-loop.sh	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00

README.md

donkeycar-rl-autoresearch

Purpose

Status

Scaffolded with the agent harness
Spec not filled yet

Runbook

Fill PROJECT-SPEC.md
Create IMPLEMENTATION_PLAN.md from the spec
Start the implementation loop