donkeycar-rl-autoresearch/agent/SESSION_HANDOFF.md


RL Donkeycar Session Handoff

Last updated: 2026-05-05 America/Toronto (updated during exp23)

Autonomy Instruction

Use this as the standing instruction for follow-on sessions:

Continue the Donkeycar RL/sim work autonomously. Rebuild, sync, relaunch, run diagnostics, patch code, and restart experiments as needed. Keep going until you either have a verified fix and a running experiment, or a concrete blocker that truly requires the user. Do not stop just to ask for permission on ordinary reversible steps. Only pause for real risk of data loss, destructive actions, missing credentials/access, or major strategy tradeoffs that require a user decision.

If the user says only continue, interpret it using the instruction above.

Current Goal

Run exp24 on generated_road with:

  • Discrete steering (7 bins) — smoother, less oscillation than continuous
  • Speed-based stuck detection — catches car pressed nose-first against barrier
  • Road regeneration — sim reconnects between segments so each eval is a fresh road
  • Unity raycast fix — Car.cs detects nose-first barrier contact via forward raycast

Important Paths

Project:

  • /home/paulh/projects/donkeycar-rl-autoresearch

Unity source project:

  • /mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim

Unity build output:

  • /mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Builds/DonkeySimWin

Current runtime simulator folders in use:

  • /mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin
  • /mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin - Copy

Unity build log:

  • C:\Users\Paul\AppData\Local\Temp\unity_rebuild.log

What Was Fixed This Session

Barrier physics (previous session, still in effect)

  • RoadBuilder.cs: BoxCollider per segment with overlap to close corner gaps
  • Car.cs: CCD mode prevents tunneling through thin geometry
  • showBarrierMeshes=false in scene YAML files (barriers invisible to RL camera)

Nose-first stuck detection root cause identified and fixed

Why collision detection never fires when perpendicular to barrier:

Unity WheelColliders (used for suspension/steering) don't fire OnCollisionEnter or OnCollisionStay on the car's MonoBehaviour. When the car is nose-first into a barrier, only the front WHEELS make physical contact. The car BODY collider (which Car.cs's callbacks are attached to) is slightly behind the wheels and never touches. At an angle, the car body eventually makes contact — which is why collision detection works for side-contact but NOT for perpendicular contact.

Car.cs: forward raycast fix

/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scripts/Car.cs

  • Added short forward raycast in FixedUpdate():
    • When requestTorque > 0.05f (throttle applied)
    • Casts from car center + 0.3m up, 0.8m forward
    • Calls RegisterCollision() if anything hit
    • This fires whether wheels or body made contact
  • Requires a Unity rebuild + sync before exp24

StuckTerminationWrapper: speed-based check

agent/multitrack_runner.py — new params low_speed_threshold, max_low_speed_seconds:

  • If info['speed'] < 0.5 for 2 wall-clock seconds → terminate
  • Catches car pressed against barrier regardless of lateral sliding
  • info['speed'] = rb.velocity.magnitude / 8.0 in Unity → stuck car = ~0
  • Smoke-tested and committed
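The speed-based check above can be sketched as a small timer, separate from any gym wrapper. This is a minimal illustration, not the code in agent/multitrack_runner.py: the class name `LowSpeedTimer` and the injectable `clock` argument are hypothetical; only the two parameter names and their defaults come from this handoff.

```python
import time


class LowSpeedTimer:
    """Hypothetical helper: tracks how long speed has stayed below a threshold."""

    def __init__(self, low_speed_threshold=0.5, max_low_speed_seconds=2.0,
                 clock=time.monotonic):
        self.low_speed_threshold = low_speed_threshold
        self.max_low_speed_seconds = max_low_speed_seconds
        self._clock = clock       # injectable so the timer can be tested without sleeping
        self._low_since = None    # wall-clock time when speed first dropped below threshold

    def reset(self):
        """Call on every episode reset so a stale timer cannot end a new episode."""
        self._low_since = None

    def update(self, speed):
        """Feed info['speed'] once per step; returns True when the episode should end."""
        if speed >= self.low_speed_threshold:
            self._low_since = None    # car is moving again, clear the timer
            return False
        now = self._clock()
        if self._low_since is None:
            self._low_since = now     # first low-speed sample starts the timer
        return (now - self._low_since) >= self.max_low_speed_seconds
```

A wrapper's `step()` would call `update(info['speed'])` and mark the episode as terminated when it returns True. Using wall-clock time rather than a step count keeps the 2-second limit independent of the sim's frame rate.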

Exp24: discrete steering + road regeneration

agent/experiments/exp24_generated_road_discrete.py:

  • Action space: Discrete(7) via DiscretizedActionWrapper(n_steer=7, n_throttle=1)
  • Steering bins: -1, -0.67, -0.33, 0, 0.33, 0.67, 1 (throttle fixed at 0.2)
  • Road regeneration: after each 10k-step segment, env.close() + reconnect
    • Reconnecting reloads the scene → sdsandbox generates new random road
    • Each eval runs on a freshly generated road (proper generalization test)
    • ~5s overhead per checkpoint = ~100s total for 200k run
  • Speed-based stuck: LOW_SPEED_THRESHOLD=0.5, MAX_LOW_SPEED_SECONDS=2.0
  • MAX_EPISODE_SECONDS=30 (reduced from 120s — speed check handles barrier stuck)
  • LR=0.0003, 200k total steps
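The seven steering bins listed above are evenly spaced over [-1, 1]. The sketch below reproduces them and shows one plausible way a discrete action index could map to a (steering, throttle) pair. The function names `steering_bins` and `decode_action` are hypothetical, and the index-to-bin ordering is an assumption; the real DiscretizedActionWrapper may differ.

```python
def steering_bins(n_steer=7):
    """Evenly spaced steering values in [-1, 1], rounded to 2 decimals for display."""
    return [round(-1.0 + 2.0 * i / (n_steer - 1), 2) for i in range(n_steer)]


def decode_action(index, n_steer=7, throttle=0.2):
    """Assumed mapping: index 0 is full left, index n_steer - 1 is full right."""
    return (steering_bins(n_steer)[index], throttle)


if __name__ == "__main__":
    print(steering_bins())
    print(decode_action(3))
```

With n_throttle=1 the throttle is a constant, so the policy only ever chooses among the seven steering values.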

Road generation clarification

The road is NOT regenerated on each episode reset. generated_road creates a fixed random layout when the scene LOADS (i.e., when you call gym.make()). Within a session, all episodes use the same road. The exp23 eval variance (107 to 1951 steps) was due to Unity physics non-determinism, NOT road variety.
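Because a new layout only appears on scene load, getting a fresh road means closing the environment and reconnecting. A minimal sketch of that segment loop follows. `run_segments`, `make_env`, and `train_segment` are hypothetical names for illustration, not functions from exp24_generated_road_discrete.py.

```python
def run_segments(make_env, train_segment, total_steps=200_000, segment_steps=10_000):
    """Train in fixed-size segments, reconnecting between them for a fresh road.

    make_env:      hypothetical factory that connects to the sim (scene load = new road)
    train_segment: hypothetical callable(env, n_steps) that trains and evaluates
    """
    n_segments = total_steps // segment_steps
    for _ in range(n_segments):
        env = make_env()              # reconnect: the sim reloads the scene
        try:
            train_segment(env, segment_steps)
        finally:
            env.close()               # always drop the connection, even on error
    return n_segments
```

With the defaults this gives 20 segments, so a reconnect cost of about 5 s each adds about 100 s to a 200k-step run.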

Reward wrapper

agent/reward_wrapper.py v7 — unchanged, still in effect:

  • Reward: speed_norm × CTE_quality gated by efficiency
  • No Python-side exploit bandaids (physics enforces containment)

Current State

Exp 23 status

  • Running (PID 649531), at ~140k/200k steps as of last check
  • Will finish on its own — DO NOT kill it
  • Log: agent/models/exp23-generated-road-clean/run_2026-05-05_160718_clean.log

Unity build status

  • Needs rebuild — Car.cs raycast fix not yet compiled
  • Car.cs was modified at: /mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scripts/Car.cs

To launch exp24

  1. Wait for exp23 to finish (or confirm it has)
  2. Rebuild Unity (Car.cs raycast fix)
  3. Stop sim on port 9091
  4. Rsync build to runtime folders (both, or just the one on 9091)
  5. Restart sim on port 9091
  6. Launch exp24:
    cd /home/paulh/projects/donkeycar-rl-autoresearch/agent
    nohup python3 experiments/exp24_generated_road_discrete.py > /tmp/exp24.out 2>&1 &
    tail -f /home/paulh/projects/donkeycar-rl-autoresearch/agent/models/exp24-discrete/run_*_discrete.log
    

Useful Commands

Check build log

tail -20 /mnt/c/Users/Paul/AppData/Local/Temp/unity_rebuild.log
grep "Exiting batchmode\|Build failed\|error\|Error" /mnt/c/Users/Paul/AppData/Local/Temp/unity_rebuild.log | tail -5

Monitor exp23 (while still running)

tail -f agent/models/exp23-generated-road-clean/run_2026-05-05_160718_clean.log

Monitor exp24

tail -f agent/models/exp24-discrete/run_*_discrete.log

Verify port 9091

python3 - <<'PY'
import socket
for p in (9091,):
    s = socket.socket(); s.settimeout(3)
    try: s.connect(('127.0.0.1', p)); print(f'PORT {p}: OK')
    except Exception as e: print(f'PORT {p}: FAIL {e}')
    finally: s.close()
PY

Check exp23 progress

grep "Eval\|BEST" agent/models/exp23-generated-road-clean/run_2026-05-05_160718_clean.log | tail -20

Success Criteria (exp24)

  • Steering is visually smoother (7 discrete bins vs continuous Gaussian)
  • Car stuck against barrier terminates within 2-3 seconds (speed check)
  • Eval scores are more meaningful — each checkpoint tests a DIFFERENT road layout
  • ep_len_mean should continue increasing from the baseline exp23 established

Notes for Next Session

  • Unity rebuild is required before exp24 — Car.cs raycast fix won't be in effect until the build is done and synced.
  • The second sim (port 9093) is not needed — only port 9091.
  • Do NOT kill exp23 — let it run to completion.
  • Exp24's road regeneration adds ~5s per checkpoint = ~100s extra total. This is by design. The "Reconnecting for fresh road" log lines confirm it's working.
  • info['speed'] from telemetry = rb.velocity.magnitude / 8.0. The LOW_SPEED_THRESHOLD=0.5 corresponds to 4 Unity m/s, which is slow but not zero. A truly stuck car reads ~0.0. Tight corners might temporarily be 0.1 to 0.3. The 2-second timer provides enough grace for normal slow driving.
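The threshold arithmetic in the last note can be checked with a one-line conversion. The scale factor 8.0 comes from this handoff; the names `SPEED_SCALE` and `to_unity_speed` are hypothetical.

```python
SPEED_SCALE = 8.0  # from this handoff: info['speed'] = rb.velocity.magnitude / 8.0


def to_unity_speed(info_speed):
    """Convert the normalized telemetry speed back to Unity m/s."""
    return info_speed * SPEED_SCALE


if __name__ == "__main__":
    print(to_unity_speed(0.5))  # LOW_SPEED_THRESHOLD expressed in Unity m/s
```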