donkeycar-rl-autoresearch/agent
Paul Huliganga b8a13dea81 feat: v5 reward — speed × CTE-quality, drop efficiency term
Problem with v4 on mountain_track: CTE × efficiency × speed all collapse
to zero simultaneously when the car slows on the hill, giving no gradient
signal for 'apply more throttle'.

v5: reward = (speed / 10) × (1 - |CTE| / max_cte)
- Directly rewards going fast while staying centred
- Hill: car slows → reward drops → clear gradient toward more throttle
- Circling protection now entirely handled by lap-time penalty +
  StuckTerminationWrapper (not by the reward formula)

Tests updated to reflect v5 semantics (102 passing).

Agent: pi
Tests: 102 passed
Tests-Added: 0
TypeScript: N/A
2026-04-17 13:25:38 -04:00
..
__pycache__ Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
bin Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
models wave3: autoresearch trial 5 results 2026-04-15 07:15:57 -04:00
outerloop-results feat: v5 reward — speed × CTE-quality, drop efficiency term 2026-04-17 13:25:38 -04:00
sb3-models Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
sessions Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
AUTORESEARCH_README.txt AUTORESEARCH: Full Karpathy-style GP+UCB meta-controller, clean base data, fixed all paths, ready to run 2026-04-13 00:52:00 -04:00
README.md Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
SETUP_QUICKSTART.md Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
TROUBLESHOOT.md Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
analysis_circular_driving.py fix: path-efficiency reward (v3) defeats circular driving exploit 2026-04-13 13:36:17 -04:00
auth.json Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
autoresearch Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
autoresearch_controller.py milestone: Phase 1 complete — genuine driving confirmed; launch Phase 2 corner learning 2026-04-13 19:33:06 -04:00
behavioral_wrappers.py feat: Phase 3 — behavioral control, enhanced evaluator, 53 tests 2026-04-14 09:28:43 -04:00
champion_tracker.py feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
check_envs.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
choose_and_run_track.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
discretize_action.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
donkeycar_autoresearch.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
donkeycar_outer_loop.py AUTORESEARCH: Full Karpathy-style GP+UCB meta-controller, clean base data, fixed all paths, ready to run 2026-04-13 00:52:00 -04:00
donkeycar_sb3_runner.py fix: reduce timesteps to 1k-5k for Phase 1 CPU training; add sim health/stuck detection; fix PPO throttle clamp 2026-04-13 11:17:08 -04:00
donkeycar_sb3_runner.py.README.txt Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
eval_on_track.py feat: v5 reward — speed × CTE-quality, drop efficiency term 2026-04-17 13:25:38 -04:00
evaluate_champion.py fix: track switching via unwrapped viewer.exit_scene() — automatic scene changes work 2026-04-14 10:04:15 -04:00
list_tracks.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual-multiepisode-2.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual-multiepisode-3.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual-multiepisode-batch.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual-multiepisode.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual_multiepisode-batch-run.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual_multiepisode_batch.sh Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
multitrack_eval.py feat: comprehensive multi-track evaluation script + research log updates 2026-04-14 10:11:47 -04:00
multitrack_runner.py feat: shuttle-exploit detection in mini_monaco eval 2026-04-16 17:29:30 -04:00
park_at_start.py wave3: add multi-track autoresearch system (83 tests passing) 2026-04-14 12:47:12 -04:00
reward_wrapper.py feat: v5 reward — speed × CTE-quality, drop efficiency term 2026-04-17 13:25:38 -04:00
run_all_known_tracks.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
run_circuit_launch.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
run_donkeycar_test.sh Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
settings.json Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
test_donkeycar.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
test_donkeycar_gymnasium.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
test_sdsandbox.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
track_switcher.py fix: track switching via unwrapped viewer.exit_scene() — automatic scene changes work 2026-04-14 10:04:15 -04:00
wave3_controller.py fix: complete LR override — must patch lr_schedule, not just param_groups 2026-04-14 21:27:43 -04:00
wave4_controller.py fix: prevent trial timeouts losing all data 2026-04-15 21:54:50 -04:00

README.md

DonkeyCar RL Batch Automation and Setup Guide

System Requirements

  • Windows 10/11 machine (with DonkeyCar Unity Simulator installed)
  • WSL2 with Ubuntu (Python 3.x installed)
  • DonkeyCar Unity Simulator running in Windows, Remote Control enabled

1. WSL (Linux) Setup

A. Install Python, pip, and core packages

sudo apt update && sudo apt install -y python3-pip git
pip3 install --upgrade pip
pip3 install stable-baselines3 gymnasium gym-donkeycar numpy matplotlib
# If pip install gym-donkeycar fails, use:
pip3 install git+https://github.com/tawnkramer/gym-donkeycar.git

B. Place scripts

Copy these into /home/paulh/.pi/agent/:

  • donkeycar_sb3_runner.py
  • manual_multiepisode_batch.sh (and make executable)
  • Any grid/outer loop script (donkeycar_outer_loop.py, as needed)

2. Unity DonkeyCar Simulator Setup (Windows)

  • Download/install Unity DonkeyCar sim (from DonkeyCar/tawnkramer Github or releases page)
  • Open simulator, select "Donkey Generated Track" and enable Remote (SocketAPI) mode
  • Ensure port 9091 is listening (default); leave sim running and visible

3. Running Robust Batches (Best Practice)

Clean multi-episode approach:

Do NOT rapidly kill/restart agents. Use scripts that:

  • Open one connection, run multiple episodes using env.reset() in a loop
  • Cleanly call env.close()
  • Batch script launches process, waits a couple seconds, repeats

Sample batch script (manual_multiepisode_batch.sh):

#!/bin/bash
for i in {1..20}; do
  echo "===== RUN $i ===== $(date)" | tee -a manual-multiepisode-batch.log
  python3 donkeycar_sb3_runner.py >> manual-multiepisode-batch.log 2>&1
  echo "===== END RUN $i ===== $(date)" | tee -a manual-multiepisode-batch.log
  sleep 2
done

Make executable:

chmod +x manual_multiepisode_batch.sh

Run in background:

nohup ./manual_multiepisode_batch.sh &

4. Debugging/Recovery

  • If sim blanks/hangs, first check for stuck agent processes: ps aux | grep donkeycar_sb3_runner
  • Always allow scripts to call env.close()
  • If hang persists: restart Unity DonkeyCar Sim in Windows

5. Automation Flow Summary

Step Where Details
Install WSL/Linux pip install ...
Prepare WSL/Linux Place scripts, ensure Python dependencies
Sim start Windows Start DonkeyCar Unity Sim
Run batch WSL/Linux nohup ./manual_multiepisode_batch.sh ...
Monitor Both Car resets, batch log grows, check hangs

See included Python scripts for reusable RL runner and grid search outer loop logic.