Research-grade DonkeyCar RL autoresearch and sweep system.
Paul Huliganga 650f893d2d fix: complete LR override — must patch lr_schedule, not just param_groups
PPO.load() bakes lr_schedule=FloatSchedule(saved_lr) into the model.
train() calls _update_learning_rate() which reads lr_schedule, not
model.learning_rate.  So even with param_groups patched, the first
gradient step reverts the optimizer to the saved LR.

Complete 3-part fix in create_or_load_model():
  model.learning_rate = lr          # attribute
  model.lr_schedule = get_schedule_fn(lr)  # prevents train() reverting
  for pg in optimizer.param_groups: pg['lr'] = lr  # immediate effect
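The three patch points above can be sketched end-to-end. This is a minimal stand-in model, not real stable-baselines3 code: `ToyModel`, `constant_schedule`, and `override_lr` are hypothetical names that only mimic how PPO.load() bakes the saved LR into `lr_schedule` and how train()'s `_update_learning_rate()` reads it back.

```python
def constant_schedule(lr):
    """Mimics SB3's get_schedule_fn(lr) for a float: ignores progress."""
    return lambda progress_remaining: lr

class ToyModel:
    """Stand-in for a loaded PPO model (hypothetical, for illustration)."""
    def __init__(self, saved_lr):
        # PPO.load() bakes the *saved* LR into lr_schedule.
        self.learning_rate = saved_lr
        self.lr_schedule = constant_schedule(saved_lr)
        self.optimizer = type("Opt", (), {"param_groups": [{"lr": saved_lr}]})()

    def _update_learning_rate(self):
        # train() reads lr_schedule -- not learning_rate, not param_groups.
        new_lr = self.lr_schedule(1.0)
        for pg in self.optimizer.param_groups:
            pg["lr"] = new_lr

def override_lr(model, lr):
    """The complete 3-part fix."""
    model.learning_rate = lr                   # attribute
    model.lr_schedule = constant_schedule(lr)  # prevents train() reverting
    for pg in model.optimizer.param_groups:    # immediate effect
        pg["lr"] = lr

model = ToyModel(saved_lr=0.000225)

# Half-fix: patching param_groups alone -- the first update reverts it.
model.optimizer.param_groups[0]["lr"] = 0.001
model._update_learning_rate()
print(model.optimizer.param_groups[0]["lr"])  # 0.000225 (reverted)

# Complete fix: all three places patched, so the update keeps the new LR.
override_lr(model, 0.001)
model._update_learning_rate()
print(model.optimizer.param_groups[0]["lr"])  # 0.001 (sticks)
```

The real fix uses `stable_baselines3.common.utils.get_schedule_fn` in place of `constant_schedule`; the failure mode and the ordering of the three assignments are the same.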

Also:
- SEED_PARAMS: second seed now uses LR=0.001 (was 0.000225) so the GP
  starts with real LR diversity instead of two identical seeds
- tests/test_end_to_end.py: 13 new tests covering the full LR override
  path including a live learn() call; would have caught both bugs
- Phase 3 results re-cleared (seed trial 1 ran with half-fix)
- 96 tests total, all passing
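The SEED_PARAMS change matters because a Gaussian process fitted on identical LR values learns nothing along the LR axis. A hypothetical sketch of the seed list (the key names and the second hyperparameter are assumptions, not taken from the repo):

```python
# Hypothetical shape of SEED_PARAMS; only the two LR values come from the
# commit message above, the dict keys are illustrative.
SEED_PARAMS = [
    {"learning_rate": 0.000225},
    {"learning_rate": 0.001},  # was 0.000225 -> two identical seeds
]

# Distinct seed LRs give the GP a signal to exploit under UCB; identical
# ones leave that axis flat.
lrs = [p["learning_rate"] for p in SEED_PARAMS]
assert len(set(lrs)) == len(lrs), "seed LRs must be distinct"
```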

Agent: pi
Tests: 96 passed
Tests-Added: 13
TypeScript: N/A
2026-04-14 21:27:43 -04:00
| Path | Last commit | Date |
|------|-------------|------|
| .harness | feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing | 2026-04-13 10:03:15 -04:00 |
| agent | fix: complete LR override — must patch lr_schedule, not just param_groups | 2026-04-14 21:27:43 -04:00 |
| docs | wave3: add multi-track autoresearch system (83 tests passing) | 2026-04-14 12:47:12 -04:00 |
| tests | fix: complete LR override — must patch lr_schedule, not just param_groups | 2026-04-14 21:27:43 -04:00 |
| .gitignore | feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing | 2026-04-13 10:03:15 -04:00 |
| AGENT.md | feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing | 2026-04-13 10:03:15 -04:00 |
| DECISIONS.md | wave3: add multi-track autoresearch system (83 tests passing) | 2026-04-14 12:47:12 -04:00 |
| IMPLEMENTATION_PLAN.md | feat: Phase 3 — behavioral control, enhanced evaluator, 53 tests | 2026-04-14 09:28:43 -04:00 |
| PROJECT-KICKOFF.md | feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing | 2026-04-13 10:03:15 -04:00 |
| PROJECT-SPEC.md | feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing | 2026-04-13 10:03:15 -04:00 |
| README.md | feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing | 2026-04-13 10:03:15 -04:00 |
| create_gitea_repo.py | Initial commit | 2026-04-12 23:44:36 -04:00 |
| ralph-loop.sh | feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing | 2026-04-13 10:03:15 -04:00 |

README.md

donkeycar-rl-autoresearch

Purpose

Status

  • Scaffolded with the agent harness
  • Spec not filled yet

Runbook

  • Fill PROJECT-SPEC.md
  • Create IMPLEMENTATION_PLAN.md from the spec
  • Start the implementation loop