donkeycar-rl-autoresearch/agent
Paul Huliganga 7b8830f0cb milestone: Phase 1 complete — genuine driving confirmed; launch Phase 2 corner learning
PHASE 1 MILESTONE:
- Champion model drives the track for 599 steps (mean_reward=1022.78, std=0.45)
- Path efficiency 96-100% throughout — genuine forward motion confirmed
- Navigates first right-hand curve successfully
- Fails at S-curve (right->left) at step ~560: speed too high for tight corners
- Root cause: only 4787 training timesteps — model never sees S-curve enough to learn it

PHASE 2 CONFIG (corner learning):
- timesteps: 10,000-50,000 (10x more — model must experience S-curve many times)
- learning_rate: 0.00005-0.002 (tightened around Phase 1 winning region)
- eval_episodes: 5 (more reliable corner stats)
- JOB_TIMEOUT: 3600s (50k steps on CPU needs time)
- Results: autoresearch_results_phase2.jsonl (clean separation from Phase 1)

Research documentation:
- Phase 1 milestone added to docs/RESEARCH_LOG.md
- Full trajectory analysis: start -> first corner -> S-curve crash position logged
- Reward shaping v3 path efficiency victory documented
- evaluate_champion.py added for visual + diagnostic evaluation

Agent: pi/claude-sonnet
Tests: 40/40 passing
Tests-Added: 0
TypeScript: N/A
2026-04-13 19:33:06 -04:00
__pycache__ Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
bin Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
models autoresearch: phase1 trial 10 results 2026-04-13 13:11:06 -04:00
outerloop-results milestone: Phase 1 complete — genuine driving confirmed; launch Phase 2 corner learning 2026-04-13 19:33:06 -04:00
sb3-models Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
sessions Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
AUTORESEARCH_README.txt AUTORESEARCH: Full Karpathy-style GP+UCB meta-controller, clean base data, fixed all paths, ready to run 2026-04-13 00:52:00 -04:00
README.md Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
SETUP_QUICKSTART.md Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
TROUBLESHOOT.md Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
analysis_circular_driving.py fix: path-efficiency reward (v3) defeats circular driving exploit 2026-04-13 13:36:17 -04:00
auth.json Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
autoresearch Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
autoresearch_controller.py milestone: Phase 1 complete — genuine driving confirmed; launch Phase 2 corner learning 2026-04-13 19:33:06 -04:00
champion_tracker.py feat: Wave 1 complete — real PPO training, model save, GP+UCB autoresearch, 37 tests passing 2026-04-13 10:03:15 -04:00
check_envs.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
choose_and_run_track.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
discretize_action.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
donkeycar_autoresearch.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
donkeycar_outer_loop.py AUTORESEARCH: Full Karpathy-style GP+UCB meta-controller, clean base data, fixed all paths, ready to run 2026-04-13 00:52:00 -04:00
donkeycar_sb3_runner.py fix: reduce timesteps to 1k-5k for Phase 1 CPU training; add sim health/stuck detection; fix PPO throttle clamp 2026-04-13 11:17:08 -04:00
donkeycar_sb3_runner.py.README.txt Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
evaluate_champion.py milestone: Phase 1 complete — genuine driving confirmed; launch Phase 2 corner learning 2026-04-13 19:33:06 -04:00
list_tracks.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual-multiepisode-2.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual-multiepisode-3.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual-multiepisode-batch.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual-multiepisode.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual_multiepisode-batch-run.log Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
manual_multiepisode_batch.sh Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
reward_wrapper.py fix: path-efficiency reward (v3) defeats circular driving exploit 2026-04-13 13:36:17 -04:00
run_all_known_tracks.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
run_circuit_launch.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
run_donkeycar_test.sh Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
settings.json Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
test_donkeycar.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
test_donkeycar_gymnasium.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00
test_sdsandbox.py Initial commit: stable RL sweep runner, legacy and new scripts, full docs included 2026-04-12 22:57:50 -04:00

README.md

DonkeyCar RL Batch Automation and Setup Guide

System Requirements

  • Windows 10/11 machine (with DonkeyCar Unity Simulator installed)
  • WSL2 with Ubuntu (Python 3.x installed)
  • DonkeyCar Unity Simulator running in Windows, Remote Control enabled

1. WSL (Linux) Setup

A. Install Python, pip, and core packages

sudo apt update && sudo apt install -y python3-pip git
pip3 install --upgrade pip
pip3 install stable-baselines3 gymnasium gym-donkeycar numpy matplotlib
# If pip install gym-donkeycar fails, use:
pip3 install git+https://github.com/tawnkramer/gym-donkeycar.git

B. Place scripts

Copy these into /home/paulh/.pi/agent/:

  • donkeycar_sb3_runner.py
  • manual_multiepisode_batch.sh (and make executable)
  • Any grid/outer loop script (donkeycar_outer_loop.py, as needed)

2. Unity DonkeyCar Simulator Setup (Windows)

  • Download and install the Unity DonkeyCar simulator (from the DonkeyCar/tawnkramer GitHub repository or its releases page)
  • Open the simulator, select "Donkey Generated Track", and enable Remote (SocketAPI) mode
  • Ensure the simulator is listening on port 9091 (the default); leave it running and visible
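
One way to confirm from the WSL side that the simulator port is reachable is a plain TCP connection attempt. The sketch below is a minimal check under stated assumptions: the helper name is_port_open is hypothetical, and the host 127.0.0.1 is only a placeholder, since under WSL2 the Windows host may need to be addressed by a different IP.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# 127.0.0.1 is a placeholder host (assumption); 9091 is the default port named above.
print(is_port_open("127.0.0.1", 9091))
```

A False result only says that nothing accepted the connection at that address; it does not distinguish a stopped simulator from a wrong host or a firewall rule.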

3. Running Robust Batches (Best Practice)

Clean multi-episode approach:

Do NOT rapidly kill/restart agents. Use scripts that:

  • Open one connection and run multiple episodes by calling env.reset() in a loop
  • Call env.close() cleanly when finished
  • Are driven by a batch script that launches the process, waits a couple of seconds, and repeats
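
The pattern described above (one connection, many episodes, a guaranteed env.close()) can be sketched as follows. This is not the code in donkeycar_sb3_runner.py. DummyEnv and run_episodes are hypothetical names, and DummyEnv stands in for the simulator so the sketch runs offline. The loop accepts both the classic Gym 4-tuple and the Gymnasium 5-tuple step results, because the API used by the installed gym-donkeycar version is not stated here.

```python
from typing import Any, Callable, List


class DummyEnv:
    """Hypothetical stand-in for the simulator env so the sketch runs offline."""

    def __init__(self, episode_len: int = 5) -> None:
        self.episode_len = episode_len
        self.steps = 0
        self.closed = False

    def reset(self) -> Any:
        self.steps = 0
        return 0.0  # placeholder observation

    def step(self, action: Any):
        self.steps += 1
        done = self.steps >= self.episode_len
        return 0.0, 1.0, done, {}  # obs, reward, done, info (classic Gym shape)

    def close(self) -> None:
        self.closed = True


def run_episodes(env: Any, policy: Callable[[Any], Any],
                 n_episodes: int, max_steps: int = 1000) -> List[float]:
    """Run several episodes over one connection and always close the env."""
    returns: List[float] = []
    try:
        for _ in range(n_episodes):
            out = env.reset()
            # Gymnasium-style reset returns (obs, info); classic Gym returns obs.
            is_pair = isinstance(out, tuple) and len(out) == 2 and isinstance(out[1], dict)
            obs = out[0] if is_pair else out
            total = 0.0
            for _ in range(max_steps):
                result = env.step(policy(obs))
                if len(result) == 5:  # Gymnasium: terminated and truncated flags
                    obs, reward, terminated, truncated, _info = result
                    done = terminated or truncated
                else:  # classic Gym: single done flag
                    obs, reward, done, _info = result
                total += float(reward)
                if done:
                    break
            returns.append(total)
    finally:
        env.close()  # runs even if an episode raises
    return returns


env = DummyEnv(episode_len=5)
episode_returns = run_episodes(env, policy=lambda obs: 0, n_episodes=3)
print(episode_returns, env.closed)  # [5.0, 5.0, 5.0] True
```

The try/finally is the important part: the connection is released once per process, whether the episodes finish or an exception interrupts them.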

Sample batch script (manual_multiepisode_batch.sh):

#!/bin/bash
for i in {1..20}; do
  echo "===== RUN $i ===== $(date)" | tee -a manual-multiepisode-batch.log
  python3 donkeycar_sb3_runner.py >> manual-multiepisode-batch.log 2>&1
  echo "===== END RUN $i ===== $(date)" | tee -a manual-multiepisode-batch.log
  sleep 2
done

Make executable:

chmod +x manual_multiepisode_batch.sh

Run in background:

nohup ./manual_multiepisode_batch.sh &

4. Debugging/Recovery

  • If the sim blanks or hangs, first check for stuck agent processes: ps aux | grep donkeycar_sb3_runner
  • Always allow scripts to call env.close()
  • If the hang persists, restart the Unity DonkeyCar Sim in Windows
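
The process check can also be scripted. The sketch below parses text in the `ps aux` column layout and returns the PIDs whose command mentions the runner script, skipping the grep process that the manual pipeline would match as well. The function name find_runner_pids is hypothetical, and SAMPLE is made-up illustrative text, not output captured from a real run.

```python
from typing import List


def find_runner_pids(ps_output: str, needle: str = "donkeycar_sb3_runner") -> List[int]:
    """Return PIDs of `ps aux` rows whose command mentions `needle`."""
    pids: List[int] = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 10)  # the 11th field is the full command
        if len(fields) < 11:
            continue
        command = fields[10]
        # Skip grep itself, which matches its own command line.
        if needle in command and not command.startswith("grep"):
            pids.append(int(fields[1]))
    return pids


# Made-up sample in `ps aux` layout, for illustration only.
SAMPLE = """USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
user 4321 12.0 3.1 900000 250000 pts/0 Sl 10:01 0:42 python3 donkeycar_sb3_runner.py
user 4400 0.0 0.0 8000 700 pts/1 S+ 10:05 0:00 grep donkeycar_sb3_runner
user 4500 0.1 0.2 20000 5000 pts/2 Ss 09:00 0:01 bash
"""
print(find_runner_pids(SAMPLE))  # [4321]
```

To use it on a live system, the text could come from `subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout`.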

5. Automation Flow Summary

| Step | Where | Details |
|---|---|---|
| Install | WSL/Linux | pip install ... |
| Prepare | WSL/Linux | Place scripts, ensure Python dependencies |
| Sim start | Windows | Start DonkeyCar Unity Sim |
| Run batch | WSL/Linux | nohup ./manual_multiepisode_batch.sh ... |
| Monitor | Both | Car resets, batch log grows, check hangs |
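
For the monitoring step, one simple signal that a batch has hung is a log file that has stopped growing. The sketch below flags a log as stale when its modification time is older than a chosen threshold. The helper name log_is_stale and the 300-second threshold are assumptions for illustration. The log filename is the one written by the sample batch script.

```python
import os
import time
from typing import Optional


def log_is_stale(path: str, max_idle_s: float, now: Optional[float] = None) -> bool:
    """Return True if `path` is missing or was last modified more than max_idle_s ago."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return True  # a missing or unreadable log is treated as stale
    return (now - mtime) > max_idle_s


# 300 seconds is an arbitrary example threshold (assumption).
print(log_is_stale("manual-multiepisode-batch.log", max_idle_s=300))
```

A stale log is only a hint: a long training run that writes nothing for a while would trigger it too, so the threshold should sit above the longest expected quiet period.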

See included Python scripts for reusable RL runner and grid search outer loop logic.