Research-grade DonkeyCar RL autoresearch and sweep system.
Paul Huliganga 26251c7d0c results: complete multi-track generalization baseline — 1/10 tracks drivable pre-Wave3
RESULTS:
  T20 (champion):  Generated Road only (1/10 tracks)
  T08:             Generated Road only (1/10 tracks)
  T18:             All tracks crash (0/10) — even new Generated Road layout!

  Robo Racing League: best unseen result (116 steps) — visual similarity to generated_road?
  Thunderhill: not available in this simulator version

KEY FINDING: The models' CNN features are overfit to the visuals of generated_road.
On every unseen track the car crashes within 40-116 steps (vs 2200+ steps on the trained track).
This is the expected Phase 2→3 transition point.

WAVE 3 STRATEGY (documented in RESEARCH_LOG.md):
  Stage 1: generated_road ↔ generated_track (same geometry, different visuals)
  Stage 2: + mountain_track (different geometry)
  Stage 3: all tracks rotation (true generalization)
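
The staged rotation above could be sketched roughly as follows. This is a minimal illustration, not the project's code: the helper names `wave3_stage_tracks` and `track_schedule` are hypothetical, the per-episode round-robin is an assumption about what "rotation" means, and the stage-3 track pool is passed in by the caller because the full track list is not given here.

```python
from itertools import cycle, islice


def wave3_stage_tracks(stage, all_tracks=()):
    """Return the tracks used in a Wave 3 stage (hypothetical helper).

    Stage 1 and 2 scene names are taken from the strategy text above;
    the stage 3 pool is supplied by the caller.
    """
    if stage == 1:
        return ["generated_road", "generated_track"]
    if stage == 2:
        return ["generated_road", "generated_track", "mountain_track"]
    if stage == 3:
        return list(all_tracks)
    raise ValueError(f"unknown stage: {stage}")


def track_schedule(stage, n_episodes, all_tracks=()):
    """Assign one track per training episode, round-robin (an assumption)."""
    tracks = wave3_stage_tracks(stage, all_tracks)
    if not tracks:
        raise ValueError("stage has no tracks to rotate through")
    # cycle() repeats the track list; islice() cuts it to n_episodes entries.
    return list(islice(cycle(tracks), n_episodes))


if __name__ == "__main__":
    # Stage 1 alternates between the two visually different tracks.
    print(track_schedule(1, 4))
```

A round-robin keeps every track equally represented within a stage; a random or performance-weighted sampler would be an equally plausible choice.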

Also fixed: multitrack_eval.py now lists only valid scene names
(thunderhill removed, since it is not in this simulator version)

Agent: pi/claude-sonnet
Tests: 53/53 passing
TypeScript: N/A
2026-04-14 11:31:08 -04:00
.harness feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
agent results: complete multi-track generalization baseline — 1/10 tracks drivable pre-Wave3 2026-04-14 11:31:08 -04:00
docs results: complete multi-track generalization baseline — 1/10 tracks drivable pre-Wave3 2026-04-14 11:31:08 -04:00
tests feat: Phase 3 — behavioral control, enhanced evaluator, 53 tests 2026-04-14 09:28:43 -04:00
.gitignore feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
AGENT.md feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
DECISIONS.md feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
IMPLEMENTATION_PLAN.md feat: Phase 3 — behavioral control, enhanced evaluator, 53 tests 2026-04-14 09:28:43 -04:00
PROJECT-KICKOFF.md feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
PROJECT-SPEC.md feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
README.md feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
create_gitea_repo.py Initial commit 2026-04-12 23:44:36 -04:00
ralph-loop.sh feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00

README.md

donkeycar-rl-autoresearch

Purpose

Status

  • Scaffolded with the agent harness
  • Spec not filled in yet

Runbook

  • Fill in PROJECT-SPEC.md
  • Create IMPLEMENTATION_PLAN.md from the spec
  • Start the implementation loop