CRITICAL BUG FIX: Circular Driving

- v2 reward still hackable: car circles at starting line with low CTE + positive speed
- Confirmed in data: trial 5 mean_reward=4582, cv=0.0% (physically impossible for genuine driving)
- Statistical signature: cv <1% with high reward = consistent exploit, not genuine driving

ROOT CAUSE:
Neither CTE nor raw speed can distinguish forward vs circular motion.
Both have: low CTE (on centerline) + positive speed (moving) = same reward.
Missing dimension: TRACK PROGRESS (net advance along track)

FIX: Path Efficiency Reward (v3):
efficiency = net_displacement / total_path_length (sliding window of 30 steps)
shaped = original × (1 + speed_scale × speed × efficiency)
- Forward driving: efficiency ≈ 1.0 → full speed bonus
- Circular driving: efficiency ≈ 0.0 → speed bonus disappears
- Cannot be hacked: circling means returning to same positions (low net_displacement)

Tests:
- test_efficiency_near_zero_for_circular_driving: confirmed <0.2 efficiency for circles
- test_efficiency_near_one_for_straight_driving: confirmed >0.90 for straight line
- test_straight_driving_gets_higher_reward_than_circular: KEY guarantee
- test_speed_bonus_disappears_when_circling: bonus suppressed after window fills

Research documentation:
- Full analysis with data table added to docs/RESEARCH_LOG.md
- cv% identified as reward hacking indicator
- Archived circular data + models

Clean start: new autoresearch_results_phase1.jsonl, new champion dir

Agent: pi/claude-sonnet
Tests: 40/40 passing
Tests-Added: +6 (path efficiency, anti-circular)
TypeScript: N/A
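The path-efficiency formula in the commit message above can be sketched roughly as follows. This is an illustration only, not the code in reward_wrapper.py: the class name `PathEfficiencyShaper`, the `speed_scale=0.1` default, the 2D `(x, y)` position format, and the choice to return zero efficiency before two positions are recorded are all assumptions. Only the 30-step window and the two formulas come from the commit message.

```python
import math
from collections import deque


class PathEfficiencyShaper:
    """Hypothetical sketch of the v3 path-efficiency reward shaping."""

    def __init__(self, window=30, speed_scale=0.1):
        # window=30 is from the commit message; speed_scale=0.1 is an assumed value.
        # A window of N steps needs N + 1 recorded positions.
        self.positions = deque(maxlen=window + 1)
        self.speed_scale = speed_scale

    def efficiency(self):
        """net_displacement / total_path_length over the sliding window."""
        pts = list(self.positions)
        if len(pts) < 2:
            return 0.0  # assumption: no bonus until there is a path to measure
        path_length = sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))
        if path_length == 0.0:
            return 0.0  # the car has not moved
        return math.dist(pts[0], pts[-1]) / path_length

    def shape(self, original, speed, position):
        """shaped = original * (1 + speed_scale * speed * efficiency)."""
        self.positions.append(position)
        return original * (1.0 + self.speed_scale * speed * self.efficiency())


if __name__ == "__main__":
    straight = PathEfficiencyShaper()
    circle = PathEfficiencyShaper()
    for k in range(31):
        angle = 2.0 * math.pi * k / 30.0  # one full lap over the window
        r_straight = straight.shape(1.0, 2.0, (float(k), 0.0))
        r_circle = circle.shape(1.0, 2.0, (math.cos(angle), math.sin(angle)))
    print("straight:", straight.efficiency(), r_straight)
    print("circle:  ", circle.efficiency(), r_circle)
```

In this sketch a car that completes a loop within the window ends where it started, so its net displacement (and with it the speed bonus) collapses toward zero, while a car driving straight keeps the full bonus.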
||
|---|---|---|
| __pycache__ | ||
| bin | ||
| models | ||
| outerloop-results | ||
| sb3-models | ||
| sessions | ||
| AUTORESEARCH_README.txt | ||
| README.md | ||
| SETUP_QUICKSTART.md | ||
| TROUBLESHOOT.md | ||
| analysis_circular_driving.py | ||
| auth.json | ||
| autoresearch | ||
| autoresearch_controller.py | ||
| champion_tracker.py | ||
| check_envs.py | ||
| choose_and_run_track.py | ||
| discretize_action.py | ||
| donkeycar_autoresearch.py | ||
| donkeycar_outer_loop.py | ||
| donkeycar_sb3_runner.py | ||
| donkeycar_sb3_runner.py.README.txt | ||
| list_tracks.py | ||
| manual-multiepisode-2.log | ||
| manual-multiepisode-3.log | ||
| manual-multiepisode-batch.log | ||
| manual-multiepisode.log | ||
| manual_multiepisode-batch-run.log | ||
| manual_multiepisode_batch.sh | ||
| reward_wrapper.py | ||
| run_all_known_tracks.py | ||
| run_circuit_launch.py | ||
| run_donkeycar_test.sh | ||
| settings.json | ||
| test_donkeycar.py | ||
| test_donkeycar_gymnasium.py | ||
| test_sdsandbox.py | ||
README.md
DonkeyCar RL Batch Automation and Setup Guide
System Requirements
- Windows 10/11 machine (with DonkeyCar Unity Simulator installed)
- WSL2 with Ubuntu (Python 3.x installed)
- DonkeyCar Unity Simulator running in Windows, Remote Control enabled
1. WSL (Linux) Setup
A. Install Python, pip, and core packages
sudo apt update && sudo apt install -y python3-pip git
pip3 install --upgrade pip
pip3 install stable-baselines3 gymnasium gym-donkeycar numpy matplotlib
# If pip install gym-donkeycar fails, use:
pip3 install git+https://github.com/tawnkramer/gym-donkeycar.git
B. Place scripts
Copy these into /home/paulh/.pi/agent/:
- donkeycar_sb3_runner.py
- manual_multiepisode_batch.sh (and make executable)
- Any grid/outer loop script (donkeycar_outer_loop.py, as needed)
2. Unity DonkeyCar Simulator Setup (Windows)
- Download and install the Unity DonkeyCar sim (from the DonkeyCar/tawnkramer GitHub or releases page)
- Open simulator, select "Donkey Generated Track" and enable Remote (SocketAPI) mode
- Ensure port 9091 is listening (default); leave sim running and visible
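Before launching a batch, a quick reachability check on the simulator port can save a failed run. The following is a minimal sketch using only the standard library; the function name `sim_port_open` and the default host `127.0.0.1` are assumptions. Depending on how WSL2 networking is configured, the Windows host may need to be reached at a different address than `127.0.0.1`.

```python
import socket


def sim_port_open(host="127.0.0.1", port=9091, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Host is an assumption; substitute the address of the Windows machine if needed.
    if sim_port_open():
        print("Simulator port 9091 is reachable")
    else:
        print("Simulator port 9091 is NOT reachable; is the sim running?")
```

A successful connection only shows that something is listening on the port; it does not confirm that the simulator is in remote-control mode.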
3. Running Robust Batches (Best Practice)
Clean multi-episode approach:
Do NOT rapidly kill/restart agents. Use scripts that:
- Open one connection, run multiple episodes using env.reset() in a loop
- Cleanly call env.close()
- Batch script launches process, waits a couple seconds, repeats
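The clean multi-episode pattern above might look like the sketch below. It is not the contents of donkeycar_sb3_runner.py: `run_episodes` and `StubEnv` are hypothetical names, and the stub stands in for the simulator so the example runs without Unity. The sketch assumes the Gymnasium-style API, where `reset()` returns `(obs, info)` and `step()` returns five values; an environment using the older gym API (four return values) would need the unpacking adjusted.

```python
class StubEnv:
    """Hypothetical stand-in for the simulator env so the sketch runs without Unity."""

    def __init__(self, episode_len=5):
        self.episode_len = episode_len
        self.steps = 0
        self.closed = False

    def reset(self):
        self.steps = 0
        return 0.0, {}  # (observation, info)

    def step(self, action):
        self.steps += 1
        terminated = self.steps >= self.episode_len
        return 0.0, 1.0, terminated, False, {}  # obs, reward, terminated, truncated, info

    def close(self):
        self.closed = True


def run_episodes(env, policy, n_episodes=3, max_steps=1000):
    """Run several episodes over one connection and always close the env."""
    returns = []
    try:
        for _ in range(n_episodes):
            obs, _info = env.reset()  # new episode, same connection
            total = 0.0
            for _ in range(max_steps):
                obs, reward, terminated, truncated, _info = env.step(policy(obs))
                total += reward
                if terminated or truncated:
                    break
            returns.append(total)
    finally:
        env.close()  # runs even if an episode raises
    return returns


if __name__ == "__main__":
    print(run_episodes(StubEnv(), policy=lambda obs: 0))
```

Putting `env.close()` in a `finally` block means the connection is released even when an episode fails partway, which is the situation most likely to leave the simulator hung.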
Sample batch script (manual_multiepisode_batch.sh):
#!/bin/bash
for i in {1..20}; do
echo "===== RUN $i ===== $(date)" | tee -a manual-multiepisode-batch.log
python3 donkeycar_sb3_runner.py >> manual-multiepisode-batch.log 2>&1
echo "===== END RUN $i ===== $(date)" | tee -a manual-multiepisode-batch.log
sleep 2
done
Make executable:
chmod +x manual_multiepisode_batch.sh
Run in background:
nohup ./manual_multiepisode_batch.sh &
4. Debugging/Recovery
- If sim blanks/hangs, first check for stuck agent processes: ps aux | grep donkeycar_sb3_runner
- Always allow scripts to call env.close()
- If hang persists: restart Unity DonkeyCar Sim in Windows
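The stuck-process check above can also be scripted, for example from an outer loop before it starts a new run. This is a rough sketch with hypothetical function names; it assumes a Linux `ps` that accepts `-eo pid,args`, as on Ubuntu under WSL. Parsing is kept in a separate function so it can be exercised without running `ps`.

```python
import subprocess


def find_agent_processes(ps_output, pattern="donkeycar_sb3_runner"):
    """Return (pid, command) pairs from `ps -eo pid,args` output matching pattern."""
    matches = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.strip().split(None, 1)  # split into pid and full command line
        if len(parts) == 2 and parts[0].isdigit() and pattern in parts[1]:
            matches.append((int(parts[0]), parts[1]))
    return matches


def list_agent_processes(pattern="donkeycar_sb3_runner"):
    """Run ps and return the processes whose command line contains pattern."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    )
    return find_agent_processes(result.stdout, pattern)


if __name__ == "__main__":
    for pid, command in list_agent_processes():
        print(pid, command)
```

The sketch only reports matching processes; whether to terminate them is left to the operator.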
5. Automation Flow Summary
| Step | Where | Details |
|---|---|---|
| Install | WSL/Linux | pip install ... |
| Prepare | WSL/Linux | Place scripts, ensure Python dependencies |
| Sim start | Windows | Start DonkeyCar Unity Sim |
| Run batch | WSL/Linux | nohup ./manual_multiepisode_batch.sh ... |
| Monitor | Both | Confirm the car resets and the batch log grows; check for hangs |
See included Python scripts for reusable RL runner and grid search outer loop logic.