donkeycar-rl-autoresearch/agent/SESSION_HANDOFF.md

RL Donkeycar Session Handoff

Last updated: 2026-05-05 20:09 America/Toronto (updated during exp24 launch)

Autonomy Instruction

Use this as the standing instruction for follow-on sessions:

Continue the Donkeycar RL/sim work autonomously. Rebuild, sync, relaunch, run diagnostics, patch code, and restart experiments as needed. Keep going until you either have a verified fix and a running experiment, or a concrete blocker that truly requires the user. Do not stop just to ask for permission on ordinary reversible steps. Only pause for real risk of data loss, destructive actions, missing credentials/access, or major strategy tradeoffs that require a user decision.

If the user says only continue, interpret it using the instruction above.

Current Goal

Monitor exp24 on generated_road. All setup complete and running.

  • Exp24 is RUNNING (PID 733053) since 20:09 on 2026-05-05
  • Log: agent/models/exp24-discrete/run_2026-05-05_200903_discrete.log (currently 0 bytes; live output is in /tmp/exp24.out)
  • All fixes are in effect: discrete steering (7 bins), speed-based stuck detection, road regeneration, Unity raycast fix compiled into Assembly-CSharp.dll (20:05)

Important Paths

Project:

  • /home/paulh/projects/donkeycar-rl-autoresearch

Unity source project:

  • /mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim

Unity build output:

  • /mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Builds/DonkeySimWin

Current runtime simulator folders in use:

  • /mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin
  • /mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin - Copy

Unity build log:

  • C:\Users\Paul\AppData\Local\Temp\unity_rebuild.log

Car.cs Collision Fix (this session)

Per-wheel OverlapSphere (exp25+)

/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scripts/Car.cs

Root cause of remaining stuck episodes: The forward raycast added in the previous session only covered nose-first contact when requestTorque > 0.05. Side contact, diagonal contact, rear contact, and low-throttle nose contact were all missed.

Fix: Added per-wheel OverlapSphere check in FixedUpdate():

  • Calls wc.GetWorldPose() to get each wheel's world position
  • Runs Physics.OverlapSphereNonAlloc(wheelCenter, wc.radius + 0.05f, _wheelOverlapBuf)
  • Filters hits by name containing "barrier" (matches left_barrier_seg*, right_barrier_seg*)
  • Calls RegisterCollision() on first barrier hit → episode terminates immediately
  • Pre-allocated Collider[8] buffer as a field (no GC per frame)
  • Both forward raycast AND OverlapSphere are active (complementary)
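
To illustrate why a per-wheel sphere test catches contact from any angle, here is a minimal Python sketch of the same idea. It is not the Car.cs code: `Box`, `sphere_overlaps_box`, and `first_barrier_hit` are hypothetical names, and the barriers are assumed to be axis-aligned boxes for simplicity (Unity's `Physics.OverlapSphereNonAlloc` handles arbitrarily oriented colliders).

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Box:
    """Hypothetical stand-in for a barrier BoxCollider (axis-aligned here)."""
    name: str
    center: Vec3
    half_extents: Vec3


def sphere_overlaps_box(center: Vec3, radius: float, box: Box) -> bool:
    """True if the sphere touches the box: compare the squared distance from
    the sphere center to the nearest point on the box with radius squared."""
    dist_sq = 0.0
    for c, b, h in zip(center, box.center, box.half_extents):
        nearest = min(max(c, b - h), b + h)  # clamp onto the box along this axis
        dist_sq += (c - nearest) ** 2
    return dist_sq <= radius * radius


def first_barrier_hit(wheel_centers: Sequence[Vec3], wheel_radius: float,
                      colliders: Sequence[Box], margin: float = 0.05) -> Optional[str]:
    """Return the name of the first barrier any wheel overlaps, else None.
    Mirrors the notes: radius + 0.05 margin, filter on 'barrier' in the name."""
    for wheel in wheel_centers:
        for box in colliders:
            if "barrier" in box.name and sphere_overlaps_box(wheel, wheel_radius + margin, box):
                return box.name
    return None


# Illustrative scene: one barrier segment and one non-barrier collider.
barrier = Box("left_barrier_seg3", center=(1.0, 0.3, 0.0), half_extents=(0.1, 0.5, 2.0))
road = Box("road_mesh", center=(0.0, -0.1, 0.0), half_extents=(5.0, 0.1, 5.0))
touching_wheel = (0.6, 0.3, 0.0)   # 0.3 from the barrier face
distant_wheel = (0.0, 0.3, 0.0)    # 0.9 from the barrier face
print(first_barrier_hit([distant_wheel, touching_wheel], 0.3, [road, barrier]))
```

Because the test depends only on distance from each wheel center, it does not matter whether the car approaches nose-first, sideways, or in reverse, and it does not depend on throttle.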

Build: Unity 6000.4.4f1, succeeded at ~20:35 on 2026-05-05

Rsync: Done; runtime folder has new Assembly-CSharp.dll (298,496 bytes)

Effect: Takes effect after sim restart (pending exp24 completion)

What Was Fixed This Session

Barrier physics (previous session, still in effect)

  • RoadBuilder.cs: BoxCollider per segment with overlap to close corner gaps
  • Car.cs: CCD mode prevents tunneling through thin geometry
  • showBarrierMeshes=false in scene YAML files (barriers invisible to RL camera)

Nose-first stuck detection root cause identified and fixed

Why collision detection never fires when perpendicular to barrier:

Unity WheelColliders (used for suspension/steering) don't fire OnCollisionEnter or OnCollisionStay on the car's MonoBehaviour. When the car is nose-first into a barrier, only the front WHEELS make physical contact. The car BODY collider (which Car.cs's callbacks are attached to) is slightly behind the wheels and never touches. At an angle, the car body eventually makes contact — which is why collision detection works for side-contact but NOT for perpendicular contact.

Car.cs: forward raycast fix

/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scripts/Car.cs

  • Added short forward raycast in FixedUpdate():
    • When requestTorque > 0.05f (throttle applied)
    • Casts from car center + 0.3m up, 0.8m forward
    • Calls RegisterCollision() if anything hit
    • This fires whether wheels or body made contact
  • Required a Unity rebuild + sync before exp24 (done at ~20:05; see Unity build status)

StuckTerminationWrapper: speed-based check

agent/multitrack_runner.py — new params low_speed_threshold, max_low_speed_seconds:

  • If info['speed'] < 0.5 for 2 wall-clock seconds → terminate
  • Catches car pressed against barrier regardless of lateral sliding
  • info['speed'] = rb.velocity.magnitude / 8.0 in Unity → stuck car = ~0
  • Smoke-tested and committed
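
The speed-based check above can be sketched as a small timer. This is a minimal illustration, not the code in agent/multitrack_runner.py: `LowSpeedTimer` and its injectable `clock` argument are hypothetical, and only the parameter names `low_speed_threshold` and `max_low_speed_seconds` come from the notes.

```python
import time
from typing import Callable, Optional

SPEED_NORM = 8.0  # from the notes: info['speed'] = rb.velocity.magnitude / 8.0


class LowSpeedTimer:
    """Hypothetical helper: signals termination once speed has stayed below
    the threshold for max_low_speed_seconds of wall-clock time."""

    def __init__(self, low_speed_threshold: float = 0.5,
                 max_low_speed_seconds: float = 2.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.low_speed_threshold = low_speed_threshold
        self.max_low_speed_seconds = max_low_speed_seconds
        self._clock = clock
        self._slow_since: Optional[float] = None

    def reset(self) -> None:
        """Call on every episode reset so time does not carry over."""
        self._slow_since = None

    def update(self, speed: float) -> bool:
        """Feed info['speed'] once per step; returns True when stuck."""
        now = self._clock()
        if speed >= self.low_speed_threshold:
            self._slow_since = None  # moving again, so clear the timer
            return False
        if self._slow_since is None:
            self._slow_since = now  # first slow sample starts the timer
        return (now - self._slow_since) >= self.max_low_speed_seconds


# Threshold expressed in Unity units: 0.5 normalized corresponds to 4.0 m/s.
threshold_mps = 0.5 * SPEED_NORM
```

In a wrapper's `step()`, the result of `update(info['speed'])` would be OR-ed into the termination flag. Using a monotonic clock rather than a step counter matches the "wall-clock seconds" wording in the notes.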

Exp24: discrete steering + road regeneration

agent/experiments/exp24_generated_road_discrete.py:

  • Action space: Discrete(7) via DiscretizedActionWrapper(n_steer=7, n_throttle=1)
  • Steering bins: -1, -0.67, -0.33, 0, 0.33, 0.67, 1 (throttle fixed at 0.2)
  • Road regeneration: after each 10k-step segment, env.close() + reconnect
    • Reconnecting reloads the scene → sdsandbox generates new random road
    • Each eval runs on a freshly generated road (proper generalization test)
    • ~5s overhead per checkpoint = ~100s total for 200k run
  • Speed-based stuck: LOW_SPEED_THRESHOLD=0.5, MAX_LOW_SPEED_SECONDS=2.0
  • MAX_EPISODE_SECONDS=30 (reduced from 120s — speed check handles barrier stuck)
  • LR=0.0003, 200k total steps
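
As a quick check on the numbers above, the following sketch reproduces the seven steering bins and the reconnect overhead. It assumes the bins are evenly spaced over [-1, 1], which matches the listed values to two decimals; `steering_bins` and `action_from_index` are hypothetical helpers, not the `DiscretizedActionWrapper` implementation.

```python
from typing import List


def steering_bins(n_steer: int = 7) -> List[float]:
    """Evenly spaced steering values from -1 to 1 inclusive (assumed layout)."""
    step = 2.0 / (n_steer - 1)
    return [-1.0 + i * step for i in range(n_steer)]


def action_from_index(index: int, bins: List[float], throttle: float = 0.2) -> List[float]:
    """Map a Discrete(n) action index to a [steering, throttle] pair."""
    return [bins[index], throttle]


BINS = steering_bins(7)
print([round(b, 2) for b in BINS])

# Reconnect overhead: one reconnect per 10k-step segment, about 5 s each.
TOTAL_STEPS = 200_000
SEGMENT_STEPS = 10_000
RECONNECT_SECONDS = 5
n_segments = TOTAL_STEPS // SEGMENT_STEPS
overhead_seconds = n_segments * RECONNECT_SECONDS
print(n_segments, overhead_seconds)
```

With throttle fixed, the policy only chooses among seven steering values, so index 3 is straight ahead and indices 0 and 6 are full lock.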

Road generation clarification

The road is NOT regenerated each episode reset. generated_road creates a fixed random layout when the scene LOADS (i.e., when you gym.make()). Within a session, all episodes use the same road. The exp23 eval variance (107 to 1951 steps) was due to Unity physics non-determinism, NOT road variety.

Reward wrapper

agent/reward_wrapper.py v7 — unchanged, still in effect:

  • Reward: speed_norm × CTE_quality gated by efficiency
  • No Python-side exploit bandaids (physics enforces containment)

Current State

Exp 23 status — COMPLETE

  • Finished at 18:12 on 2026-05-05
  • Final mean: 2000 steps / 408.6 reward
  • High variance throughout: some episodes crashed after 105 to 171 steps (nose-first barrier), others ran 1684 to 1951 steps. Best eval: 403.8r / 2000s at ~40k steps.
  • This confirmed the nose-first stuck issue that exp24 is designed to fix.

Unity build status — DONE

  • Rebuilt with Unity 6000.4.4f1 at ~20:05 on 2026-05-05
  • Assembly-CSharp.dll updated: includes Car.cs forward raycast fix
  • Rsync'd to: /mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin/
  • Sim restarted on port 9091 with new binary

Exp 24 status — RUNNING (~100 min remaining as of 20:22)

  • PID 733053, launched at 20:09 on 2026-05-05
  • Monitor: tail -f /tmp/exp24.out
  • Results so far (reward / episode steps): 10k→250.4r/2000s, 20k→320.9r/2000s (NEW BEST both times)
  • NOTE: log file is 0 bytes (logging.basicConfig no-op). All output in /tmp/exp24.out.
  • Auto-transition to exp25 is armed — when exp24 finishes, monitor kills sim, restarts with wheel-fix binary, and launches exp25 automatically.
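
On the 0-byte log: `logging.basicConfig` silently does nothing when the root logger already has handlers, which typically happens when an imported module configured logging first. Assuming that is the cause here, a minimal sketch of a fix is to pass `force=True` (Python 3.8+). `configure_file_logging` and the demo path are hypothetical; the experiment scripts may set up logging differently.

```python
import logging
import os
import tempfile


def configure_file_logging(log_path: str) -> None:
    """Send root-logger output to log_path even if handlers already exist."""
    logging.basicConfig(
        filename=log_path,
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
        force=True,  # remove and close existing root handlers before configuring
    )


# Simulate an earlier import that already configured the root logger.
logging.basicConfig(level=logging.WARNING)

demo_path = os.path.join(tempfile.mkdtemp(), "demo.log")
configure_file_logging(demo_path)
logging.info("eval checkpoint reached")

# The file handler flushes after each record, so the size is already non-zero.
print(os.path.getsize(demo_path) > 0)
```

Without `force=True`, the second `basicConfig` call would be ignored and the file would never be created, which matches the symptom of an empty run log while all output lands in the redirected stdout file.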

Exp 25 status — PENDING

  • Script: agent/experiments/exp25_wheel_collision_fix.py
  • Model dir: agent/models/exp25-wheel-fix/
  • Monitor: tail -f /tmp/exp25.out (once launched)
  • Key fix: per-wheel OverlapSphere in Car.cs catches any-angle barrier contact
  • Binary: already rsync'd, sim restart needed (auto-transition handles this)

Useful Commands

Check build log

tail -20 /mnt/c/Users/Paul/AppData/Local/Temp/unity_rebuild.log
grep "Exiting batchmode\|Build failed\|error\|Error" /mnt/c/Users/Paul/AppData/Local/Temp/unity_rebuild.log | tail -5

Monitor exp23 (finished at 18:12; the log is now static)

tail -f agent/models/exp23-generated-road-clean/run_2026-05-05_160718_clean.log

Monitor exp24

# run_*_discrete.log is 0 bytes (logging.basicConfig no-op); live output goes to /tmp/exp24.out
tail -f /tmp/exp24.out

Verify port 9091

python3 - <<'PY'
import socket
for p in (9091,):
    s = socket.socket(); s.settimeout(3)
    try: s.connect(('127.0.0.1', p)); print(f'PORT {p}: OK')
    except Exception as e: print(f'PORT {p}: FAIL {e}')
    finally: s.close()
PY

Check exp23 progress

grep "Eval\|BEST" agent/models/exp23-generated-road-clean/run_2026-05-05_160718_clean.log | tail -20

Success Criteria (exp24)

  • Steering is visually smoother (7 discrete bins vs continuous Gaussian)
  • Car stuck against barrier terminates within 2-3 seconds (speed check)
  • Eval scores are more meaningful — each checkpoint tests a DIFFERENT road layout
  • ep_len_mean should continue increasing from the baseline exp23 established

Notes for Next Session

  • Unity rebuild is DONE. No rebuild needed unless Car.cs or other scripts change.
  • Assembly-CSharp.dll confirmed in runtime folder. Port 9091 sim uses new binary.
  • The second sim (port 9093) is not needed — only port 9091.
  • Exp24 is running. DO NOT kill it. Let it run to 200k steps.
  • Exp24's road regeneration adds ~5s per checkpoint = ~100s extra total. This is by design. The "Reconnecting for fresh road" log lines confirm it's working.
  • info['speed'] from telemetry = rb.velocity.magnitude / 8.0. The LOW_SPEED_THRESHOLD=0.5 corresponds to 4 Unity m/s, which is slow but not zero. A truly stuck car reads ~0.0. Tight corners might temporarily read 0.1 to 0.3. The 2-second timer provides enough grace for normal slow driving.
  • Unity version used: 6000.4.4f1 (NOT 2020.3.x; the project is on Unity 6). Build command:
    "/mnt/c/Program Files/Unity/Hub/Editor/6000.4.4f1/Editor/Unity.exe" -quit -batchmode -projectPath "C:/Users/Paul/Documents/projects/sdsandbox/sdsim" -executeMethod PlayerBuilder.WinBuild -logFile "C:/Users/Paul/AppData/Local/Temp/unity_rebuild.log"
  • To kill/restart sim from WSL2: taskkill.exe is at /mnt/c/Windows/System32/taskkill.exe (not in PATH — use full path). Restart: /mnt/c/Windows/System32/cmd.exe /c start "" "C:\...\donkey_sim.exe" --port 9091