donkeycar-rl-autoresearch/agent
Paul Huliganga 4ca5304a71 wave3: add multi-track autoresearch system (83 tests passing)
New files:
- agent/multitrack_runner.py: trains PPO round-robin across generated_road,
  generated_track, mountain_track; zero-shot evaluates on mini_monaco + warren
- agent/wave3_controller.py: GP+UCB outer loop optimising combined test score
- tests/test_wave3.py: 30 new tests (83 total)

Track classification (from visual analysis of all 10 screenshots):
  Training  : generated_road, generated_track, mountain_track
  Test (ZSL): mini_monaco, warren (pseudo-outdoor — proper road markings)
  Skip      : warehouse, robo_racing_league, waveshare, circuit_launch (indoor floor)
              avc_sparkfun (orange markings — different visual domain)

Key design decisions:
  ADR-010: Warren = pseudo-outdoor track (proper road lines, not floor marks)
  ADR-011: Test tracks NEVER used in training; GP optimises test score only
  ADR-012: All trials warm-start from Phase 2 champion model
  Switching: env.close() + send_exit_scene_raw() + 4s wait + gym.make()

Pre-Wave-3 baseline: 1/10 tracks drivable (0/2 held-out test tracks)
Wave 3 goal: 2/2 test tracks drivable (mini_monaco + warren)

Agent: pi
Tests: 83 passed
Tests-Added: 30
TypeScript: N/A
2026-04-14 12:47:12 -04:00
__pycache__ Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
bin Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
models feat: Phase 3 — behavioral control, enhanced evaluator, 53 tests 2026-04-14 09:28:43 -04:00
outerloop-results wave3: add multi-track autoresearch system (83 tests passing) 2026-04-14 12:47:12 -04:00
sb3-models Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
sessions Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
AUTORESEARCH_README.txt AUTORESEARCH: Full Karpathy-style GP+UCB meta-controller, clean base data, fixed all paths, ready to run 2026-04-13 00:52:00 -04:00
README.md Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
SETUP_QUICKSTART.md Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
TROUBLESHOOT.md Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
analysis_circular_driving.py fix: path-efficiency reward (v3) defeats circular driving exploit 2026-04-13 13:36:17 -04:00
auth.json Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
autoresearch Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
autoresearch_controller.py milestone: Phase 1 complete — genuine driving confirmed; launch Phase 2 corner learning 2026-04-13 19:33:06 -04:00
behavioral_wrappers.py feat: Phase 3 — behavioral control, enhanced evaluator, 53 tests 2026-04-14 09:28:43 -04:00
champion_tracker.py feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
check_envs.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
choose_and_run_track.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
discretize_action.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
donkeycar_autoresearch.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
donkeycar_outer_loop.py AUTORESEARCH: Full Karpathy-style GP+UCB meta-controller, clean base data, fixed all paths, ready to run 2026-04-13 00:52:00 -04:00
donkeycar_sb3_runner.py fix: reduce timesteps to 1k-5k for Phase 1 CPU training; add sim health/stuck detection; fix PPO throttle clamp 2026-04-13 11:17:08 -04:00
donkeycar_sb3_runner.py.README.txt Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
evaluate_champion.py fix: track switching via unwrapped viewer.exit_scene() — automatic scene changes work 2026-04-14 10:04:15 -04:00
list_tracks.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual-multiepisode-2.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual-multiepisode-3.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual-multiepisode-batch.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual-multiepisode.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual_multiepisode-batch-run.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual_multiepisode_batch.sh Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
multitrack_eval.py feat: comprehensive multi-track evaluation script + research log updates 2026-04-14 10:11:47 -04:00
multitrack_runner.py wave3: add multi-track autoresearch system (83 tests passing) 2026-04-14 12:47:12 -04:00
park_at_start.py wave3: add multi-track autoresearch system (83 tests passing) 2026-04-14 12:47:12 -04:00
reward_wrapper.py fix: reward v4 — full sim bypass kills circular driving at root 2026-04-13 20:56:32 -04:00
run_all_known_tracks.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
run_circuit_launch.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
run_donkeycar_test.sh Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
settings.json Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
test_donkeycar.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
test_donkeycar_gymnasium.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
test_sdsandbox.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
track_switcher.py fix: track switching via unwrapped viewer.exit_scene() — automatic scene changes work 2026-04-14 10:04:15 -04:00
wave3_controller.py wave3: add multi-track autoresearch system (83 tests passing) 2026-04-14 12:47:12 -04:00

README.md

DonkeyCar RL Batch Automation and Setup Guide

System Requirements

  • Windows 10/11 machine (with DonkeyCar Unity Simulator installed)
  • WSL2 with Ubuntu (Python 3.x installed)
  • DonkeyCar Unity Simulator running in Windows, Remote Control enabled

1. WSL (Linux) Setup

A. Install Python, pip, and core packages

sudo apt update && sudo apt install -y python3-pip git
pip3 install --upgrade pip
pip3 install stable-baselines3 gymnasium gym-donkeycar numpy matplotlib
# If pip install gym-donkeycar fails, use:
pip3 install git+https://github.com/tawnkramer/gym-donkeycar.git

B. Place scripts

Copy these into /home/paulh/.pi/agent/:

  • donkeycar_sb3_runner.py
  • manual_multiepisode_batch.sh (and make executable)
  • Any grid/outer loop script (donkeycar_outer_loop.py, as needed)

2. Unity DonkeyCar Simulator Setup (Windows)

  • Download and install the Unity DonkeyCar simulator (from the DonkeyCar/tawnkramer GitHub repository or its releases page)
  • Open simulator, select "Donkey Generated Track" and enable Remote (SocketAPI) mode
  • Ensure port 9091 is listening (default); leave sim running and visible
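Before starting a batch, it can help to confirm from the WSL side that the simulator port is reachable. The following is a minimal sketch, not part of the repository: the function name sim_port_open and the default host 127.0.0.1 are assumptions. Under WSL2 the Windows host may not be reachable at 127.0.0.1, depending on how networking is configured, so pass the Windows host address if needed.

```python
import socket


def sim_port_open(host: str = "127.0.0.1", port: int = 9091,
                  timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    host is an assumption; 9091 is the default port named in this guide.
    """
    try:
        # create_connection raises OSError on refusal, timeout, or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A False result only means the TCP connection failed; it does not say whether the simulator is stopped, the port is blocked, or the host address is wrong.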

3. Running Robust Batches (Best Practice)

Clean multi-episode approach:

Do NOT rapidly kill/restart agents. Use scripts that:

  • Open one connection, run multiple episodes using env.reset() in a loop
  • Call env.close() cleanly before exiting
  • Are launched one at a time by a batch script that waits a couple of seconds between runs
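The pattern in the list above can be sketched as follows. This is an illustration under stated assumptions, not the code in donkeycar_sb3_runner.py: the function name run_episodes is hypothetical, and the environment is assumed to follow the Gymnasium API (reset() returns (obs, info), step() returns a 5-tuple). Older gym-style environments return different tuples and would need adapting.

```python
from typing import Any, Callable, List


def run_episodes(env: Any, policy: Callable[[Any], Any],
                 n_episodes: int = 5, max_steps: int = 1000) -> List[float]:
    """Run several episodes over one open connection, then close it once."""
    returns: List[float] = []
    try:
        for _ in range(n_episodes):
            # reset() starts a new episode on the same connection.
            obs, _info = env.reset()
            total = 0.0
            for _ in range(max_steps):
                obs, reward, terminated, truncated, _info = env.step(policy(obs))
                total += float(reward)
                if terminated or truncated:
                    break
            returns.append(total)
    finally:
        # Runs even if a step raises, so the connection is always released.
        env.close()
    return returns
```

Putting env.close() in a finally block means the connection is closed even when an episode fails partway through.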

Sample batch script (manual_multiepisode_batch.sh):

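#!/bin/bash
# Run the RL runner 20 times in sequence, appending every run to one log file.
LOG=manual-multiepisode-batch.log
for i in {1..20}; do
  echo "===== RUN $i ===== $(date)" | tee -a "$LOG"
  # Append the runner's stdout and stderr to the log.
  python3 donkeycar_sb3_runner.py >> "$LOG" 2>&1
  echo "===== END RUN $i ===== $(date)" | tee -a "$LOG"
  sleep 2  # pause before launching the next run
done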

Make executable:

chmod +x manual_multiepisode_batch.sh

Run in background:

nohup ./manual_multiepisode_batch.sh &

4. Debugging/Recovery

  • If sim blanks/hangs, first check for stuck agent processes: ps aux | grep donkeycar_sb3_runner
  • Always allow scripts to call env.close()
  • If hang persists: restart Unity DonkeyCar Sim in Windows
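The process check in the first bullet can also be scripted. The sketch below is illustrative only: the function name matching_pids is hypothetical, and it assumes the usual 11-column layout of ps aux (USER, PID, ..., COMMAND). It parses ps output that is passed in as text and skips the grep process itself.

```python
from typing import List


def matching_pids(ps_output: str,
                  needle: str = "donkeycar_sb3_runner") -> List[int]:
    """Return PIDs from `ps aux` output whose command contains needle."""
    pids: List[int] = []
    for line in ps_output.splitlines()[1:]:   # skip the header row
        fields = line.split(None, 10)         # 11th field is the full command
        if len(fields) < 11:
            continue
        command = fields[10]
        first_word = command.split()[0]
        # Ignore the grep process that appears when piping ps into grep.
        if needle in command and not first_word.endswith("grep"):
            pids.append(int(fields[1]))
    return pids
```

To use it on a live system, pass in the text from subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout. An empty list means no runner process is left over.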

5. Automation Flow Summary

Step       Where      Details
Install    WSL/Linux  pip install ...
Prepare    WSL/Linux  Place scripts, ensure Python dependencies
Sim start  Windows    Start DonkeyCar Unity Sim
Run batch  WSL/Linux  nohup ./manual_multiepisode_batch.sh ...
Monitor    Both       Car resets, batch log grows, check hangs

See the included Python scripts for the reusable RL runner and the grid-search outer-loop logic.