donkeycar-rl-autoresearch/agent
Paul Huliganga 9ffe1c5d40 fix: efficiency gate now TERMINATES after 20 low-efficiency steps (was zero-reward only)
Previously, circling episodes ran 20+ seconds because the efficiency gate only returned
0 reward without terminating. After 20 consecutive steps with efficiency < 0.15
(~0.7 seconds at 27 steps/sec), the episode now terminates with a reward of -1.0.

Also confirmed from telemetry diagnostic: CTE does report correctly when
car goes off-track (rises steadily to 6.2m before tree collision).
The grass exploit runs long only when the open grass area has no obstacles.
Efficiency gate termination is the most reliable catch for both circles
and open-grass driving (straight-line grass = high efficiency, but
active_node progress terminator catches that case).
2026-04-19 17:26:38 -04:00
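
The gate behaviour described in the commit message above could be sketched roughly as follows. This is an illustration only, not the code in reward_wrapper.py: the class and method names (`EfficiencyGate`, `apply`) are hypothetical, the per-step efficiency value is assumed to be computed elsewhere, and termination is assumed to fire on the 20th consecutive low step. Only the 0.15 threshold, the 20-step limit, and the -1.0 terminal reward come from the message.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class EfficiencyGate:
    """Hypothetical helper: zero the reward on low-efficiency steps and
    terminate once too many of them occur in a row."""

    threshold: float = 0.15        # efficiency below this counts as "low"
    max_low_steps: int = 20        # consecutive low steps before termination
    terminal_reward: float = -1.0  # reward returned on the terminating step
    low_steps: int = 0             # running count of consecutive low steps

    def reset(self) -> None:
        """Call at the start of every episode."""
        self.low_steps = 0

    def apply(self, reward: float, efficiency: float) -> Tuple[float, bool]:
        """Return (reward, terminated) for one environment step."""
        if efficiency >= self.threshold:
            self.low_steps = 0          # any efficient step resets the streak
            return reward, False
        self.low_steps += 1
        if self.low_steps >= self.max_low_steps:
            return self.terminal_reward, True
        return 0.0, False               # gated: zero reward, episode continues
```

With the defaults, feeding 20 steps of efficiency 0.05 yields zero reward for the first 19 steps and `(-1.0, True)` on the 20th.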
__pycache__ Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
bin Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
experiments fix: reward v6.1 — active_node progress terminator kills circle/stuck exploits 2026-04-19 17:01:41 -04:00
models wave3: autoresearch trial 5 results 2026-04-15 07:15:57 -04:00
outerloop-results fix: multitrack_runner must use VecTransposeImage(DummyVecEnv) not plain wrap_env 2026-04-18 18:33:40 -04:00
sb3-models Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
sessions Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
test-results docs: Exp10 eval results — total failure, crashes on all tracks (massive regression from Exp9/W4T9) 2026-04-19 10:19:16 -04:00
AUTORESEARCH_README.txt AUTORESEARCH: Full Karpathy-style GP+UCB meta-controller, clean base data, fixed all paths, ready to run 2026-04-13 00:52:00 -04:00
README.md Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
SETUP_QUICKSTART.md Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
TROUBLESHOOT.md Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
analysis_circular_driving.py fix: path-efficiency reward (v3) defeats circular driving exploit 2026-04-13 13:36:17 -04:00
auth.json Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
autoresearch Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
autoresearch_controller.py milestone: Phase 1 complete — genuine driving confirmed; launch Phase 2 corner learning 2026-04-13 19:33:06 -04:00
behavioral_wrappers.py feat: Phase 3 — behavioral control, enhanced evaluator, 53 tests 2026-04-14 09:28:43 -04:00
champion_tracker.py feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
check_envs.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
choose_and_run_track.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
discretize_action.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
donkeycar_autoresearch.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
donkeycar_outer_loop.py AUTORESEARCH: Full Karpathy-style GP+UCB meta-controller, clean base data, fixed all paths, ready to run 2026-04-13 00:52:00 -04:00
donkeycar_sb3_runner.py fix: reduce timesteps to 1k-5k for Phase 1 CPU training; add sim health/stuck detection; fix PPO throttle clamp 2026-04-13 11:17:08 -04:00
donkeycar_sb3_runner.py.README.txt Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
eval_on_track.py feat: v5 reward — speed × CTE-quality, drop efficiency term 2026-04-17 13:25:38 -04:00
evaluate_champion.py fix: track switching via unwrapped viewer.exit_scene() — automatic scene changes work 2026-04-14 10:04:15 -04:00
list_tracks.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual-multiepisode-2.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual-multiepisode-3.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual-multiepisode-batch.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual-multiepisode.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual_multiepisode-batch-run.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual_multiepisode_batch.sh Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
multitrack_eval.py feat: comprehensive multi-track evaluation script + research log updates 2026-04-14 10:11:47 -04:00
multitrack_runner.py fix: StuckTerminationWrapper — wall-clock timeout (12s) prevents 1min+ stuck episodes 2026-04-19 16:30:50 -04:00
park_at_start.py wave3: add multi-track autoresearch system (83 tests passing) 2026-04-14 12:47:12 -04:00
reward_wrapper.py fix: efficiency gate now TERMINATES after 20 low-efficiency steps (was zero-reward only) 2026-04-19 17:26:38 -04:00
run_all_known_tracks.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
run_circuit_launch.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
run_donkeycar_test.sh Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
run_eval.py fix: reward v6 — efficiency gate prevents circular driving, stuck_steps 80→40 2026-04-19 12:02:55 -04:00
settings.json Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
test_donkeycar.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
test_donkeycar_gymnasium.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
test_sdsandbox.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
track_switcher.py fix: track switching via unwrapped viewer.exit_scene() — automatic scene changes work 2026-04-14 10:04:15 -04:00
wave3_controller.py fix: complete LR override — must patch lr_schedule, not just param_groups 2026-04-14 21:27:43 -04:00
wave4_controller.py fix: prevent trial timeouts losing all data 2026-04-15 21:54:50 -04:00

README.md

DonkeyCar RL Batch Automation and Setup Guide

System Requirements

  • Windows 10/11 machine (with DonkeyCar Unity Simulator installed)
  • WSL2 with Ubuntu (Python 3.x installed)
  • DonkeyCar Unity Simulator running in Windows, Remote Control enabled

1. WSL (Linux) Setup

A. Install Python, pip, and core packages

sudo apt update && sudo apt install -y python3-pip git
pip3 install --upgrade pip
pip3 install stable-baselines3 gymnasium gym-donkeycar numpy matplotlib
# If pip install gym-donkeycar fails, use:
pip3 install git+https://github.com/tawnkramer/gym-donkeycar.git

B. Place scripts

Copy these into /home/paulh/.pi/agent/:

  • donkeycar_sb3_runner.py
  • manual_multiepisode_batch.sh (and make it executable)
  • Any grid/outer loop script (donkeycar_outer_loop.py, as needed)

2. Unity DonkeyCar Simulator Setup (Windows)

  • Download and install the Unity DonkeyCar sim (from the DonkeyCar/tawnkramer GitHub or its releases page)
  • Open simulator, select "Donkey Generated Track" and enable Remote (SocketAPI) mode
  • Ensure port 9091 is listening (default); leave sim running and visible
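
One way to confirm the port is reachable before starting a batch is a plain TCP connection test. The sketch below is an assumption-laden illustration: the helper name `port_is_open` is hypothetical, and the host to use depends on your setup (under WSL2 the simulator may be reachable at the Windows host's address rather than 127.0.0.1).

```python
import socket


def port_is_open(host: str, port: int = 9091, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing is accepting connections on that port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_is_open("127.0.0.1")` checks the default port on the local machine; substitute whichever host address the runner script actually connects to.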

3. Running Robust Batches (Best Practice)

Clean multi-episode approach:

Do NOT rapidly kill/restart agents. Use scripts that:

  • Open one connection, run multiple episodes using env.reset() in a loop
  • Cleanly call env.close()
  • Are launched by a batch script that starts one process, waits a couple of seconds, then repeats
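
The single-connection pattern above might look like the following sketch. It assumes a Gymnasium-style environment (`reset()` returning `(obs, info)` and `step()` returning a 5-tuple); older gym-donkeycar versions return a 4-tuple from `step()` and would need a small adaptation. The function name `run_episodes` and the `policy` callable are hypothetical and are not taken from donkeycar_sb3_runner.py.

```python
from typing import Any, Callable, List


def run_episodes(env: Any, policy: Callable[[Any], Any], n_episodes: int) -> List[float]:
    """Run n_episodes on one already-open env; return each episode's total reward."""
    returns: List[float] = []
    try:
        for _ in range(n_episodes):
            obs, _info = env.reset()   # reuse the same connection for every episode
            done, total = False, 0.0
            while not done:
                obs, reward, terminated, truncated, _info = env.step(policy(obs))
                total += float(reward)
                done = terminated or truncated
            returns.append(total)
    finally:
        env.close()                    # always release the simulator connection
    return returns
```

Putting `env.close()` in a `finally` block means the connection is released even if an episode raises, which is the failure mode most likely to leave the simulator hung.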

Sample batch script (manual_multiepisode_batch.sh):

#!/bin/bash
for i in {1..20}; do
  echo "===== RUN $i ===== $(date)" | tee -a manual-multiepisode-batch.log
  python3 donkeycar_sb3_runner.py >> manual-multiepisode-batch.log 2>&1
  echo "===== END RUN $i ===== $(date)" | tee -a manual-multiepisode-batch.log
  sleep 2
done

Make executable:

chmod +x manual_multiepisode_batch.sh

Run in background:

nohup ./manual_multiepisode_batch.sh &

4. Debugging/Recovery

  • If sim blanks/hangs, first check for stuck agent processes: ps aux | grep donkeycar_sb3_runner
  • Always allow scripts to call env.close()
  • If hang persists: restart Unity DonkeyCar Sim in Windows

5. Automation Flow Summary

Step      | Where     | Details
Install   | WSL/Linux | pip install ...
Prepare   | WSL/Linux | Place scripts, ensure Python dependencies
Sim start | Windows   | Start DonkeyCar Unity Sim
Run batch | WSL/Linux | nohup ./manual_multiepisode_batch.sh ...
Monitor   | Both      | Car resets, batch log grows, check hangs
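
For the Monitor step, one rough way to notice a hang is to check whether the batch log has stopped growing. The sketch below is only an illustration: the helper name `log_is_stale` and the 120-second default are assumptions, not values from this repository.

```python
import os
import time
from typing import Optional


def log_is_stale(path: str, max_age_s: float = 120.0, now: Optional[float] = None) -> bool:
    """Return True if the log is missing or was last modified more than max_age_s ago."""
    now = time.time() if now is None else now
    try:
        return (now - os.path.getmtime(path)) > max_age_s
    except OSError:
        # A missing or unreadable log is treated as stale.
        return True
```

Calling `log_is_stale("manual-multiepisode-batch.log")` periodically gives a simple signal that it may be time to check for stuck agent processes or restart the simulator.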

See included Python scripts for reusable RL runner and grid search outer loop logic.