# RL Donkeycar Session Handoff
Last updated: 2026-05-05 20:09 America/Toronto (updated during exp24 launch)
## Autonomy Instruction
Use this as the standing instruction for follow-on sessions:
`Continue the Donkeycar RL/sim work autonomously. Rebuild, sync, relaunch, run diagnostics, patch code, and restart experiments as needed. Keep going until you either have a verified fix and a running experiment, or a concrete blocker that truly requires the user. Do not stop just to ask for permission on ordinary reversible steps. Only pause for real risk of data loss, destructive actions, missing credentials/access, or major strategy tradeoffs that require a user decision.`
If the user says only `continue`, interpret it using the instruction above.
## Current Goal
Monitor exp24 on `generated_road`. All setup complete and running.
- **Exp24 is RUNNING** (PID 733053) since 20:09 on 2026-05-05
- Log: `agent/models/exp24-discrete/run_2026-05-05_200903_discrete.log`
- All fixes are in effect: discrete steering (7 bins), speed-based stuck detection,
road regeneration, Unity raycast fix compiled into Assembly-CSharp.dll (20:05)
## Important Paths
Project:
- `/home/paulh/projects/donkeycar-rl-autoresearch`
Unity source project:
- `/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim`
Unity build output:
- `/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Builds/DonkeySimWin`
Current runtime simulator folders in use:
- `/mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin`
- `/mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin - Copy`
Unity build log:
- `C:\Users\Paul\AppData\Local\Temp\unity_rebuild.log`
## Car.cs Collision Fix (this session)
### Per-wheel OverlapSphere (exp25+)
`/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scripts/Car.cs`
**Root cause of remaining stuck episodes:** The forward raycast added in the previous
session only covered nose-first contact when `requestTorque > 0.05`. Side contact,
diagonal contact, rear contact, and low-throttle nose contact were all missed.
**Fix:** Added per-wheel `OverlapSphere` check in `FixedUpdate()`:
- Calls `wc.GetWorldPose()` to get each wheel's world position
- Runs `Physics.OverlapSphereNonAlloc(wheelCenter, wc.radius + 0.05f, _wheelOverlapBuf)`
- Filters hits by name containing `"barrier"` (matches `left_barrier_seg*`, `right_barrier_seg*`)
- Calls `RegisterCollision()` on first barrier hit → episode terminates immediately
- Pre-allocated `Collider[8]` buffer as a field (no GC per frame)
- Both forward raycast AND OverlapSphere are active (complementary)
**Build:** Unity 6000.4.4f1, succeeded at ~20:35 on 2026-05-05
**Rsync:** Done — runtime folder has new Assembly-CSharp.dll (298,496 bytes)
**Effect:** Takes effect after sim restart (pending exp24 completion)
## What Was Fixed This Session
### Barrier physics (previous session, still in effect)
- `RoadBuilder.cs`: BoxCollider per segment with overlap to close corner gaps
- `Car.cs`: CCD mode prevents tunneling through thin geometry
- `showBarrierMeshes=false` in scene YAML files (barriers invisible to RL camera)
### Nose-first stuck detection root cause identified and fixed
**Why collision detection never fires when perpendicular to barrier:**
Unity `WheelColliders` (used for suspension/steering) don't fire `OnCollisionEnter`
or `OnCollisionStay` on the car's `MonoBehaviour`. When the car is nose-first into
a barrier, only the front WHEELS make physical contact. The car BODY collider (which
`Car.cs`'s callbacks are attached to) is slightly behind the wheels and never touches.
At an angle, the car body eventually makes contact — which is why collision detection
works for side-contact but NOT for perpendicular contact.
### Car.cs: forward raycast fix
`/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scripts/Car.cs`
- Added short forward raycast in `FixedUpdate()`:
  - When `requestTorque > 0.05f` (throttle applied)
  - Casts from car center + 0.3m up, 0.8m forward
  - Calls `RegisterCollision()` if anything hit
  - This fires whether wheels or body made contact
- **Requires a Unity rebuild + sync before exp24**
### StuckTerminationWrapper: speed-based check
`agent/multitrack_runner.py` — new params `low_speed_threshold`, `max_low_speed_seconds`:
- If `info['speed'] < 0.5` for 2 wall-clock seconds → terminate
- Catches car pressed against barrier regardless of lateral sliding
- `info['speed']` = `rb.velocity.magnitude / 8.0` in Unity → stuck car = ~0
- Smoke-tested and committed
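
The speed-based check above can be sketched as a small timer that a wrapper's `step()` would consult. This is a minimal illustration, not the code in `agent/multitrack_runner.py`: the class name `LowSpeedTimer`, its methods, and the injectable `clock` argument are all hypothetical. Only the two parameter names and their values (0.5, 2.0 s) come from the notes above.

```python
import time


class LowSpeedTimer:
    """Tracks how long the reported speed has stayed below a threshold.

    Hypothetical helper: names are illustrative, not taken from
    agent/multitrack_runner.py.
    """

    def __init__(self, low_speed_threshold=0.5, max_low_speed_seconds=2.0,
                 clock=time.monotonic):
        self.low_speed_threshold = low_speed_threshold
        self.max_low_speed_seconds = max_low_speed_seconds
        self.clock = clock        # injectable so the timer can be tested
        self._slow_since = None   # wall-clock time when speed first dropped

    def reset(self):
        """Call on env.reset() so each episode starts with a clean timer."""
        self._slow_since = None

    def update(self, speed):
        """Return True once speed has stayed low for max_low_speed_seconds."""
        if speed >= self.low_speed_threshold:
            # Any reading at or above the threshold clears the timer.
            self._slow_since = None
            return False
        now = self.clock()
        if self._slow_since is None:
            self._slow_since = now
        return (now - self._slow_since) >= self.max_low_speed_seconds
```

In a wrapper's `step()`, the returned flag would be OR-ed into the episode's termination signal, with `speed` read from `info['speed']`. Because the timer uses wall-clock time rather than step counts, the grace period stays at 2 seconds regardless of how fast the sim is stepping.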
### Exp24: discrete steering + road regeneration
`agent/experiments/exp24_generated_road_discrete.py`:
- Action space: `Discrete(7)` via `DiscretizedActionWrapper(n_steer=7, n_throttle=1)`
- Steering bins: -1, -0.67, -0.33, 0, 0.33, 0.67, 1 (throttle fixed at 0.2)
- **Road regeneration**: after each 10k-step segment, `env.close()` + reconnect
  - Reconnecting reloads the scene → sdsandbox generates new random road
  - Each eval runs on a freshly generated road (proper generalization test)
  - ~5s overhead per checkpoint = ~100s total for 200k run
- Speed-based stuck: `LOW_SPEED_THRESHOLD=0.5`, `MAX_LOW_SPEED_SECONDS=2.0`
- `MAX_EPISODE_SECONDS=30` (reduced from 120s — speed check handles barrier stuck)
- LR=0.0003, 200k total steps
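
The steering bins listed above are evenly spaced over [-1, 1]. The sketch below shows one way such a mapping from a `Discrete(7)` index to a `[steering, throttle]` pair could be computed. The function names `steering_bins` and `decode_action` are hypothetical and are not the API of `DiscretizedActionWrapper`. The bin count and the fixed throttle of 0.2 are taken from the list above.

```python
def steering_bins(n_steer=7):
    """Evenly spaced steering values from -1 to 1 inclusive, rounded to 2 dp."""
    if n_steer < 2:
        raise ValueError("need at least two steering bins")
    step = 2.0 / (n_steer - 1)
    return [round(-1.0 + i * step, 2) for i in range(n_steer)]


def decode_action(index, bins, throttle=0.2):
    """Map a discrete action index to a [steering, throttle] pair."""
    return [bins[index], throttle]
```

With `n_steer=7` the middle index (3) maps to straight ahead, and indices 0 and 6 map to full left and full right lock.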
### Road generation clarification
The road is NOT regenerated each episode reset. `generated_road` creates a fixed
random layout when the scene LOADS (i.e., when you `gym.make()`). Within a session,
all episodes use the same road. The exp23 eval variance (107 to 1951 steps) was due
to Unity physics non-determinism, NOT road variety.
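
Because the layout is fixed at scene load, getting a new road means closing the environment and creating it again. The sketch below illustrates that segment-then-reconnect loop under stated assumptions: `make_env` and `train_segment` are hypothetical callables supplied by the caller (for example, a function that calls `gym.make()` and one that trains for a given number of steps), and the function name is illustrative, not the one used in exp24.

```python
def train_with_fresh_roads(make_env, train_segment, total_steps=200_000,
                           segment_steps=10_000):
    """Train in fixed-size segments, reconnecting after each segment.

    make_env and train_segment are hypothetical callables; make_env() is
    assumed to load the scene and therefore build a new random road.
    Returns the number of reconnects performed.
    """
    reconnects = 0
    steps_done = 0
    env = make_env()
    try:
        while steps_done < total_steps:
            n = min(segment_steps, total_steps - steps_done)
            train_segment(env, n)
            steps_done += n
            # Close and reconnect so the scene reloads with a new layout.
            env.close()
            env = make_env()
            reconnects += 1
    finally:
        env.close()
    return reconnects
```

For a 200k-step run in 10k-step segments this performs 20 reconnects, which at roughly 5 s each matches the ~100 s overhead estimated for exp24.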
### Reward wrapper
`agent/reward_wrapper.py` v7 — unchanged, still in effect:
- Reward: `speed_norm × CTE_quality` gated by efficiency
- No Python-side exploit bandaids (physics enforces containment)
## Current State
### Exp 23 status — COMPLETE
- Finished at 18:12 on 2026-05-05
- Final mean: 2000 steps / 408.6 reward
- High variance throughout: some episodes crashed within 105 to 171 steps (nose-first barrier),
others ran 1684 to 1951 steps. Best eval: 403.8r / 2000s ✅ at ~40k steps.
- This confirmed the nose-first stuck issue that exp24 is designed to fix.
### Unity build status — DONE
- Rebuilt with Unity 6000.4.4f1 at ~20:05 on 2026-05-05
- Assembly-CSharp.dll updated: includes Car.cs forward raycast fix
- Rsync'd to: `/mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin/`
- Sim restarted on port 9091 with new binary
### Exp 24 status — COMPLETE
- Finished at 22:03 on 2026-05-05
- 19 consecutive full episodes (2000 steps each), zero crashes during training
- Best checkpoint: 170k → 365.5r / 2000s ✅
- Full eval curve: 250→320→333→327→323→344→352→346→340→334→347→355→356→345→363→353→365→354→354→354
- Final 3-road eval (best_model): Set1=305r/1680s❌, Set2=368r/2000s✅, Set3=365r/2000s✅
Mean: 1893 steps / 346.5 reward
- Log: `/tmp/exp24.out` (the run log file was 0 bytes because `logging.basicConfig` was a no-op; fixed in exp25)
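
For context on the empty log file: `logging.basicConfig` does nothing when the root logger already has handlers, which commonly happens when an imported library configures logging first. These notes do not record how exp25 fixed it, so the sketch below shows one common remedy rather than the actual change: passing `force=True` (Python 3.8+), which removes existing root handlers before configuring. The helper name `configure_file_logging` is hypothetical.

```python
import logging
import os


def configure_file_logging(log_path, force):
    """Call logging.basicConfig and report whether the root logger
    ended up with a FileHandler that writes to log_path."""
    logging.basicConfig(filename=log_path, level=logging.INFO, force=force)
    root = logging.getLogger()
    target = os.path.abspath(log_path)
    return any(
        isinstance(h, logging.FileHandler) and h.baseFilename == target
        for h in root.handlers
    )
```

If the root logger already has a handler, calling this with `force=False` returns `False` and nothing is written to the file. Calling it with `force=True` replaces the existing handlers and returns `True`.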
### Exp 25 status — RUNNING
- PID 776352, launched at 22:12 on 2026-05-05
- Log: `agent/models/exp25-wheel-fix/run_2026-05-05_221255_wheel_fix.log` ← writes correctly!
- Monitor: `tail -f agent/models/exp25-wheel-fix/run_2026-05-05_221255_wheel_fix.log`
- Running on patched sim with wheel OverlapSphere fix (any-angle barrier detection)
## Useful Commands
### Check build log
```bash
tail -20 /mnt/c/Users/Paul/AppData/Local/Temp/unity_rebuild.log
grep "Exiting batchmode\|Build failed\|error\|Error" /mnt/c/Users/Paul/AppData/Local/Temp/unity_rebuild.log | tail -5
```
### Monitor exp23 (while still running)
```bash
tail -f agent/models/exp23-generated-road-clean/run_2026-05-05_160718_clean.log
```
### Monitor exp24
```bash
tail -f agent/models/exp24-discrete/run_*_discrete.log
```
### Verify port 9091
```bash
python3 - <<'PY'
import socket
for p in (9091,):
    s = socket.socket()
    s.settimeout(3)
    try:
        s.connect(('127.0.0.1', p))
        print(f'PORT {p}: OK')
    except Exception as e:
        print(f'PORT {p}: FAIL {e}')
    finally:
        s.close()
PY
```
### Check exp23 progress
```bash
grep "Eval\|BEST" agent/models/exp23-generated-road-clean/run_2026-05-05_160718_clean.log | tail -20
```
## Success Criteria (exp24)
- Steering is visually smoother (7 discrete bins vs continuous Gaussian)
- Car stuck against barrier terminates within 2-3 seconds (speed check)
- Eval scores are more meaningful — each checkpoint tests a DIFFERENT road layout
- ep_len_mean should continue increasing from the baseline exp23 established
## Notes for Next Session
- Unity rebuild is DONE. No rebuild needed unless Car.cs or other scripts change.
- Assembly-CSharp.dll confirmed in runtime folder. Port 9091 sim uses new binary.
- The second sim (port 9093) is not needed — only port 9091.
- Exp24 has finished (see Current State). Exp25 is now the running experiment. DO NOT kill it.
- Exp24's road regeneration adds ~5s per checkpoint = ~100s extra total. This is
by design. The "Reconnecting for fresh road" log lines confirm it's working.
- `info['speed']` from telemetry = `rb.velocity.magnitude / 8.0`. The
`LOW_SPEED_THRESHOLD=0.5` corresponds to 4 Unity m/s, which is slow but not zero.
A truly stuck car reads ~0.0. Tight corners might temporarily read 0.1 to 0.3.
The 2-second timer provides enough grace for normal slow driving.
- Unity version used: 6000.4.4f1 (NOT 2020.3.x — the project is on Unity 6).
Build command: `"/mnt/c/Program Files/Unity/Hub/Editor/6000.4.4f1/Editor/Unity.exe"
-quit -batchmode -projectPath "C:/Users/Paul/Documents/projects/sdsandbox/sdsim"
-executeMethod PlayerBuilder.WinBuild -logFile "C:/Users/Paul/AppData/Local/Temp/unity_rebuild.log"`
- To kill/restart sim from WSL2: `taskkill.exe` is at `/mnt/c/Windows/System32/taskkill.exe`
(not in PATH — use full path). Restart: `/mnt/c/Windows/System32/cmd.exe /c start "" "C:\...\donkey_sim.exe" --port 9091`