ROOT CAUSE: donkey_sim.py calc_reward() uses forward_vel = dot(heading, velocity). A spinning car ALWAYS has forward_vel > 0 (always moving 'forward' relative to its own heading), so it earned positive reward indefinitely while circling.

v3 WAS INSUFFICIENT: v3 applied efficiency only to the speed BONUS:
    original × (1 + speed×eff×scale)
But 'original' from sim was still exploitable:
    CTE≈0 while spinning → original=1.0/step
Efficiency killed the speed bonus but not the base reward.
    47k-step run: spinning = 1.0/step × 47k = 47k reward (never crashes in circle)

v4 FIX (base × efficiency × speed):
    reward = (1 - abs(cte)/max_cte) × efficiency × (1 + speed_scale × speed)
Completely ignores sim's bogus forward_vel reward.
Spinning (eff≈0): reward ≈ 0 regardless of CTE or speed.
ALL three terms must be high to earn reward; cannot be gamed.

Key new test: test_circling_at_zero_cte_gives_near_zero_reward
    Worst-case exploit (CTE=0 spinning) → avg reward < 0.15 (was 1.0 in v3)
    forward_beats_circling_by_3x confirmed.

Also: update Phase 2 autoresearch timesteps test, research log updated.

Agent: pi/claude-sonnet
Tests: 40/40 passing
Tests-Added: +1 (core v4 circling guarantee)
TypeScript: N/A
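The v4 reward formula from the commit message at the top of this page can be sketched in a few lines of Python. This is an illustrative sketch, not the code in reward_wrapper.py: the commit message does not say how efficiency is computed, so `path_efficiency` below (net displacement divided by distance travelled over a window of positions) is an assumed definition, and the default values of `max_cte` and `speed_scale` are placeholders.

```python
import math


def path_efficiency(positions):
    """Net displacement / distance travelled over a window of (x, y) points.

    Assumed definition (not taken from the repo): about 1.0 for straight
    driving, about 0.0 when the car ends up back where it started.
    """
    if len(positions) < 2:
        return 0.0
    path_len = sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))
    if path_len == 0.0:
        return 0.0
    return math.dist(positions[0], positions[-1]) / path_len


def v4_reward(cte, efficiency, speed, max_cte=8.0, speed_scale=0.1):
    """reward = (1 - abs(cte)/max_cte) * efficiency * (1 + speed_scale * speed).

    max_cte and speed_scale defaults are hypothetical placeholder values.
    """
    base = 1.0 - abs(cte) / max_cte
    return base * efficiency * (1.0 + speed_scale * speed)


# A full circle of radius 2 (start and end coincide) versus a straight line.
circle = [(2 * math.cos(2 * math.pi * i / 36), 2 * math.sin(2 * math.pi * i / 36))
          for i in range(37)]
straight = [(float(i), 0.0) for i in range(37)]

circling_reward = v4_reward(cte=0.0, efficiency=path_efficiency(circle), speed=5.0)
forward_reward = v4_reward(cte=0.0, efficiency=path_efficiency(straight), speed=5.0)
print(f"circling: {circling_reward:.6f}  forward: {forward_reward:.6f}")
```

Because the three factors are multiplied, a car circling at CTE=0 scores close to zero under this sketch even at high speed, while the same speed on a straight path earns the full reward.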
README.md
DonkeyCar RL Batch Automation and Setup Guide
System Requirements
- Windows 10/11 machine (with DonkeyCar Unity Simulator installed)
- WSL2 with Ubuntu (Python 3.x installed)
- DonkeyCar Unity Simulator running in Windows, Remote Control enabled
1. WSL (Linux) Setup
A. Install Python, pip, and core packages
sudo apt update && sudo apt install -y python3-pip git
pip3 install --upgrade pip
pip3 install stable-baselines3 gymnasium gym-donkeycar numpy matplotlib
# If pip install gym-donkeycar fails, use:
pip3 install git+https://github.com/tawnkramer/gym-donkeycar.git
B. Place scripts
Copy these into /home/paulh/.pi/agent/:
- donkeycar_sb3_runner.py
- manual_multiepisode_batch.sh (and make executable)
- Any grid/outer loop script (donkeycar_outer_loop.py, as needed)
2. Unity DonkeyCar Simulator Setup (Windows)
- Download and install the Unity DonkeyCar sim (from the DonkeyCar/tawnkramer GitHub repository or releases page)
- Open simulator, select "Donkey Generated Track" and enable Remote (SocketAPI) mode
- Ensure port 9091 is listening (default); leave sim running and visible
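One way to confirm that the simulator port is reachable before starting a batch is a plain TCP connection attempt. The sketch below uses only the Python standard library; the function name `sim_port_open` is made up for illustration, and the default host is an assumption (from WSL2 the Windows host may need its own IP address rather than 127.0.0.1, depending on the networking mode).

```python
import socket


def sim_port_open(host="127.0.0.1", port=9091, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    host is an assumption; 9091 is the default simulator port named above.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```

Calling `sim_port_open()` from the WSL shell before launching the batch script gives a quick yes/no answer without opening a full simulator session.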
3. Running Robust Batches (Best Practice)
Clean multi-episode approach:
Do NOT rapidly kill/restart agents. Use scripts that:
- Open one connection and run multiple episodes using env.reset() in a loop
- Cleanly call env.close()
- Batch script launches the process, waits a couple of seconds, repeats
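The single-connection pattern described above might look roughly like the following. This is a sketch, not the contents of donkeycar_sb3_runner.py: it assumes a Gymnasium-style API (`reset()` returning `(obs, info)` and `step()` returning a 5-tuple), and `StubEnv` and `run_episodes` are hypothetical names used so the example runs without the simulator.

```python
class StubEnv:
    """Stand-in with a Gymnasium-style API so the sketch runs without the sim."""

    def __init__(self, episode_len=5):
        self.episode_len = episode_len
        self.t = 0
        self.closed = False

    def reset(self, seed=None):
        self.t = 0
        return 0.0, {}

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.episode_len
        return float(self.t), 1.0, terminated, False, {}

    def close(self):
        self.closed = True


def run_episodes(env, n_episodes, policy, max_steps=1000):
    """Run several episodes over one env connection; always close at the end."""
    returns = []
    try:
        for _ in range(n_episodes):
            obs, info = env.reset()  # new episode, same connection
            total = 0.0
            for _ in range(max_steps):
                obs, reward, terminated, truncated, info = env.step(policy(obs))
                total += reward
                if terminated or truncated:
                    break
            returns.append(total)
    finally:
        env.close()  # runs even if an episode raises
    return returns


env = StubEnv(episode_len=5)
episode_returns = run_episodes(env, n_episodes=3, policy=lambda obs: 0.0)
print(episode_returns)  # [5.0, 5.0, 5.0]
```

The `try`/`finally` is the important part: `env.close()` is reached even when an episode fails, which is what keeps the simulator from being left with a half-open connection.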
Sample batch script (manual_multiepisode_batch.sh):
#!/bin/bash
for i in {1..20}; do
echo "===== RUN $i ===== $(date)" | tee -a manual-multiepisode-batch.log
python3 donkeycar_sb3_runner.py >> manual-multiepisode-batch.log 2>&1
echo "===== END RUN $i ===== $(date)" | tee -a manual-multiepisode-batch.log
sleep 2
done
Make executable:
chmod +x manual_multiepisode_batch.sh
Run in background:
nohup ./manual_multiepisode_batch.sh &
4. Debugging/Recovery
- If sim blanks/hangs, first check for stuck agent processes: ps aux | grep donkeycar_sb3_runner
- Always allow scripts to call env.close()
- If hang persists: restart Unity DonkeyCar Sim in Windows
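The stuck-process check can also be scripted. The sketch below parses `ps aux` output in Python and returns the PID and command line of anything mentioning the runner script; `find_runner_processes` and `list_stuck_runners` are hypothetical helper names, and the parsing assumes the usual 11-column `ps aux` layout found on Ubuntu.

```python
import subprocess


def find_runner_processes(ps_output, needle="donkeycar_sb3_runner"):
    """Return (pid, command) pairs from `ps aux` text whose command has needle."""
    matches = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 10)  # 11 columns; the last one is COMMAND
        if len(fields) == 11 and needle in fields[10]:
            matches.append((int(fields[1]), fields[10]))
    return matches


def list_stuck_runners():
    """Run `ps aux` and report any runner processes that are still alive."""
    out = subprocess.run(
        ["ps", "aux"], capture_output=True, text=True, check=True
    ).stdout
    return find_runner_processes(out)
```

Unlike a bare `grep`, this does not report the search command itself, so an empty list means no runner process is alive.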
5. Automation Flow Summary
| Step | Where | Details |
|---|---|---|
| Install | WSL/Linux | pip install ... |
| Prepare | WSL/Linux | Place scripts, ensure Python dependencies |
| Sim start | Windows | Start DonkeyCar Unity Sim |
| Run batch | WSL/Linux | nohup ./manual_multiepisode_batch.sh ... |
| Monitor | Both | Car resets, batch log grows, check for hangs |
See included Python scripts for reusable RL runner and grid search outer loop logic.