Research-grade DonkeyCar RL autoresearch and sweep system.
Paul Huliganga 7b8830f0cb milestone: Phase 1 complete — genuine driving confirmed; launch Phase 2 corner learning
PHASE 1 MILESTONE:
- Champion model drives the track for 599 steps (mean_reward=1022.78, std=0.45)
- Path efficiency 96-100% throughout — genuine forward motion confirmed
- Navigates first right-hand curve successfully
- Fails at S-curve (right->left) at step ~560: speed too high for tight corners
- Root cause: only 4787 training timesteps — model never sees S-curve enough to learn it
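The path-efficiency figure above can be illustrated with a minimal sketch. The commit does not define the metric, so the definition here is an assumption: net displacement divided by distance travelled over a window of recent positions. This is one common way to tell forward motion from circling. The function name `path_efficiency` and the `(x, y)` position format are hypothetical, not taken from the repository.

```python
import math


def path_efficiency(positions):
    """Net displacement / distance travelled for a list of (x, y) points.

    ASSUMPTION: this is an illustrative definition, not the project's
    actual reward code. Returns a value in [0, 1]; values near 1 mean
    mostly straight forward motion, values near 0 mean looping in place.
    """
    if len(positions) < 2:
        return 0.0
    # Sum the length of every step along the path.
    travelled = sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))
    if travelled == 0.0:
        return 0.0
    # Compare with the straight-line distance from start to end.
    return math.dist(positions[0], positions[-1]) / travelled
```

Under this definition a car driving in a closed circle scores 0 however far it travels, which is the kind of exploit a path-efficiency term is meant to penalise.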

PHASE 2 CONFIG (corner learning):
- timesteps: 10,000-50,000 (roughly 2-10x Phase 1's 4787, so the model experiences the S-curve many times)
- learning_rate: 0.00005-0.002 (tightened around Phase 1 winning region)
- eval_episodes: 5 (more reliable corner stats)
- JOB_TIMEOUT: 3600s (50k steps on CPU needs time)
- Results: autoresearch_results_phase2.jsonl (clean separation from Phase 1)
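The Phase 2 ranges above could be written down as a search space along these lines. This is a sketch only: the dictionary layout, the `sample_config` helper, and the log-uniform sampling of the learning rate are assumptions. The plain random sampling shown here is a stand-in for the project's GP+UCB proposal step, which this commit does not show.

```python
import math
import random

# Values copied from the Phase 2 config; the structure is hypothetical.
PHASE2_SPACE = {
    "timesteps": (10_000, 50_000),
    "learning_rate": (0.00005, 0.002),
}
EVAL_EPISODES = 5
JOB_TIMEOUT_S = 3600
RESULTS_PATH = "autoresearch_results_phase2.jsonl"


def sample_config(rng):
    """Draw one candidate config (random search, NOT the repo's GP+UCB)."""
    lo_t, hi_t = PHASE2_SPACE["timesteps"]
    lo_lr, hi_lr = PHASE2_SPACE["learning_rate"]
    # Log-uniform draw: the range spans ~1.6 decades, so sample in log space.
    lr = math.exp(rng.uniform(math.log(lo_lr), math.log(hi_lr)))
    return {
        "timesteps": rng.randint(lo_t, hi_t),
        "learning_rate": lr,
        "eval_episodes": EVAL_EPISODES,
    }
```

A call such as `sample_config(random.Random(0))` returns one candidate. Seeding the generator keeps a sweep reproducible.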

Research documentation:
- Phase 1 milestone added to docs/RESEARCH_LOG.md
- Full trajectory analysis: start -> first corner -> S-curve crash position logged
- Reward shaping v3 path efficiency victory documented
- evaluate_champion.py added for visual + diagnostic evaluation

Agent: pi/claude-sonnet
Tests: 40/40 passing
Tests-Added: 0
TypeScript: N/A
2026-04-13 19:33:06 -04:00
.harness feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
agent milestone: Phase 1 complete — genuine driving confirmed; launch Phase 2 corner learning 2026-04-13 19:33:06 -04:00
docs milestone: Phase 1 complete — genuine driving confirmed; launch Phase 2 corner learning 2026-04-13 19:33:06 -04:00
tests fix: path-efficiency reward (v3) defeats circular driving exploit 2026-04-13 13:36:17 -04:00
.gitignore feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
AGENT.md feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
DECISIONS.md feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
IMPLEMENTATION_PLAN.md feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
PROJECT-KICKOFF.md feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
PROJECT-SPEC.md feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
README.md feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
create_gitea_repo.py Initial commit 2026-04-12 23:44:36 -04:00
ralph-loop.sh feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00

README.md

donkeycar-rl-autoresearch

Purpose

Status

  • Scaffolded with the agent harness
  • Spec not filled yet

Runbook

  • Fill PROJECT-SPEC.md
  • Create IMPLEMENTATION_PLAN.md from the spec
  • Start the implementation loop