Research-grade DonkeyCar RL autoresearch and sweep system.

Paul Huliganga 26251c7d0c	2026-04-14 11:31:08 -04:00
results: complete multi-track generalization baseline (1/10 tracks drivable pre-Wave3)

RESULTS:
  T20 (champion): ✅ Generated Road only (1/10 tracks)
  T08: ✅ Generated Road only (1/10 tracks)
  T18: ❌ All tracks crash (0/10), even the new Generated Road layout!
  Robo Racing League: best unseen result (116 steps); visual similarity to generated_road?
  Thunderhill: not available in this simulator version

KEY FINDING: Models are visually overfit to generated_road CNN features. All unseen tracks crash within 40-116 steps (vs 2200+ on the trained track). This is the expected Phase 2→3 transition point.

WAVE 3 STRATEGY (documented in RESEARCH_LOG.md):
  Stage 1: generated_road ↔ generated_track (same geometry, different visuals)
  Stage 2: + mountain_track (different geometry)
  Stage 3: all tracks rotation (true generalization)

Also fixed: multitrack_eval.py updated with only valid scene names (thunderhill removed; not in this simulator version)

Agent: pi/claude-sonnet
Tests: 53/53 passing
TypeScript: N/A
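
The three Wave 3 stages named in the commit message can be sketched as a small track schedule. This is only an illustration, not code from the repository: the names `STAGES`, `tracks_for_stage`, and `rotation_schedule` are hypothetical, and round-robin is an assumed rotation policy. Only the three scene names come from the commit message, so Stage 3 takes its full track list as an argument.

```python
from itertools import cycle

# Stage definitions as described in the commit message.
# Stage 2 is read as "Stage 1 plus mountain_track" (an interpretation).
STAGES = {
    1: ["generated_road", "generated_track"],
    2: ["generated_road", "generated_track", "mountain_track"],
}


def tracks_for_stage(stage, all_tracks=None):
    """Return the list of scene names to train on for a given stage."""
    if stage in STAGES:
        return list(STAGES[stage])
    if stage == 3:
        # The full list of valid scene names is not given in the source,
        # so the caller must supply it.
        if not all_tracks:
            raise ValueError("Stage 3 needs the full list of valid scene names")
        return list(all_tracks)
    raise ValueError(f"unknown stage: {stage}")


def rotation_schedule(stage, episodes, all_tracks=None):
    """Assign one track per episode by round-robin rotation (assumed policy)."""
    tracks = cycle(tracks_for_stage(stage, all_tracks))
    return [next(tracks) for _ in range(episodes)]


if __name__ == "__main__":
    print(rotation_schedule(1, 4))
```

Keeping the stages as data rather than branching logic would make it easy to log which track each episode ran on, which matters when comparing per-track step counts like those in the results above.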
.harness	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
agent	results: complete multi-track generalization baseline — 1/10 tracks drivable pre-Wave3	2026-04-14 11:31:08 -04:00
docs	results: complete multi-track generalization baseline — 1/10 tracks drivable pre-Wave3	2026-04-14 11:31:08 -04:00
tests	feat: Phase 3 — behavioral control, enhanced evaluator, 53 tests	2026-04-14 09:28:43 -04:00
.gitignore	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
AGENT.md	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
DECISIONS.md	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
IMPLEMENTATION_PLAN.md	feat: Phase 3 — behavioral control, enhanced evaluator, 53 tests	2026-04-14 09:28:43 -04:00
PROJECT-KICKOFF.md	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
PROJECT-SPEC.md	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
README.md	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00
create_gitea_repo.py	Initial commit	2026-04-12 23:44:36 -04:00
ralph-loop.sh	feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing	2026-04-13 10:03:15 -04:00

README.md

donkeycar-rl-autoresearch

Purpose

Status

Scaffolded with the agent harness
Spec not filled yet

Runbook

Fill PROJECT-SPEC.md
Create IMPLEMENTATION_PLAN.md from the spec
Start the implementation loop