# RL Donkeycar Session Handoff

Last updated: 2026-05-05 America/Toronto

## Autonomy Instruction

Use this as the standing instruction for follow-on sessions:

`Continue the Donkeycar RL/sim work autonomously. Rebuild, sync, relaunch, run diagnostics, patch code, and restart experiments as needed. Keep going until you either have a verified fix and a running experiment, or a concrete blocker that truly requires the user. Do not stop just to ask for permission on ordinary reversible steps. Only pause for real risk of data loss, destructive actions, missing credentials/access, or major strategy tradeoffs that require a user decision.`

If the user says only `continue`, interpret it using the instruction above.

## Current Goal

Stabilize the Unity simulator geometry and collision behavior enough that:

- `generated_road` and `generated_track` both run without bad invisible barrier placement
- barrier contacts terminate episodes appropriately
- RL can restart from a trustworthy simulator build

## Important Paths

Project:

- `/home/paulh/projects/donkeycar-rl-autoresearch`

Unity source project:

- `/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim`

Unity build output:

- `/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Builds/DonkeySimWin`

Current runtime simulator folders in use:

- `/mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin`
- `/mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin - Copy`

## Current RL Experiment Files

- `agent/experiments/exp21_generated_pair_warm_v4.py`
- `agent/experiments/exp22_generated_pair_warm_v6.py`

Latest model/output folder:

- `agent/models/exp22-generated-pair-warm-v6`

Current training run:

- launched `agent/experiments/exp22_generated_pair_warm_v6.py`
- PID file: `agent/models/exp22-generated-pair-warm-v6/current.pid`
- current PID at launch time: `609054`
- log: `agent/models/exp22-generated-pair-warm-v6/run_2026-05-05_141929_strictcte.log`
- startup verified: connected to `localhost:9091` and `localhost:9093`, loaded `generated_road` and `generated_track`, attached warm-start model, reached `Starting training...`

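A quick way to confirm the run above is still alive is to read the PID file and look at the tail of the log. The sketch below assumes the PID file holds only the integer PID as text and that it runs on the Linux/WSL side; `pid_is_alive` and `run_status` are hypothetical helper names, not part of the project.

```python
import os
from pathlib import Path


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence; nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def run_status(pid_file: Path, log_file: Path, tail_lines: int = 5) -> dict:
    """Summarize a training run from its PID file and log (hypothetical helper)."""
    # Assumption: the PID file contains just the PID, optionally followed by a newline.
    pid = int(pid_file.read_text().strip()) if pid_file.exists() else None
    tail = []
    if log_file.exists():
        tail = log_file.read_text(errors="replace").splitlines()[-tail_lines:]
    return {"pid": pid, "alive": pid is not None and pid_is_alive(pid), "log_tail": tail}


if __name__ == "__main__":
    # Paths taken from the handoff notes, relative to the project root.
    run_dir = Path("agent/models/exp22-generated-pair-warm-v6")
    print(run_status(run_dir / "current.pid",
                     run_dir / "run_2026-05-05_141929_strictcte.log"))
```

A live PID only shows that some process with that number exists; the log tail is the better signal that training is actually progressing.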
Latest urgent exploit fix:

- User observed `generated_road` still performing the large outside-circle exploit.
- Stopped the previous run immediately.
- Patched `agent/reward_wrapper.py` so that high cross-track error (CTE) receives negative reward immediately during the patience window instead of falling through to the positive speed reward.
- Patched `agent/experiments/exp22_generated_pair_warm_v6.py`:
  - `MAX_CTE_TERMINATE = 2.5`
  - `CTE_PATIENCE = 3`
- Added regression test `test_high_cte_never_gets_positive_speed_reward_before_termination`.
- Verified `python3 -m pytest -q tests/test_reward_wrapper.py`: `21 passed`.

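The intent of the high-CTE patch can be sketched as follows. This is not the code in `agent/reward_wrapper.py`: the class name, the penalty magnitude, and the exact meaning of "patience" (terminate on the third consecutive high-CTE step) are assumptions. Only the two constants come from the notes above.

```python
from dataclasses import dataclass
from typing import Tuple

MAX_CTE_TERMINATE = 2.5  # value from the exp22 patch
CTE_PATIENCE = 3         # value from the exp22 patch


@dataclass
class CtePenaltyTracker:
    """Hypothetical stand-in for the high-CTE branch of the reward wrapper."""
    max_cte: float = MAX_CTE_TERMINATE
    patience: int = CTE_PATIENCE
    penalty: float = -1.0      # assumed magnitude, not taken from the source
    high_cte_steps: int = 0

    def step(self, cte: float, speed_reward: float) -> Tuple[float, bool]:
        """Return (reward, terminated) for one environment step."""
        if abs(cte) > self.max_cte:
            # Penalize from the first high-CTE step, so the patience window
            # can never pay out the positive speed reward.
            self.high_cte_steps += 1
            return self.penalty, self.high_cte_steps >= self.patience
        self.high_cte_steps = 0
        return speed_reward, False


if __name__ == "__main__":
    tracker = CtePenaltyTracker()
    for cte in (0.4, 3.0, 3.1, 3.2):
        print(cte, tracker.step(cte, speed_reward=0.8))
```

The property the regression test name describes is that no step with CTE above the threshold ever returns a positive reward, whether or not the episode has terminated yet.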
## What Was Learned

### Training status

The latest meaningful `exp22` run was poor and should not be resumed as-is.

From `agent/models/exp22-generated-pair-warm-v6/run_2026-04-28_2132_openfix.log`:

- best `generated_track` eval reached only about `92` steps
- run was not trustworthy due to ongoing barrier-placement concerns

### Simulator behavior

- Invisible barriers are collider-only by default, so the user cannot see them in the standalone player
- Diagnostic probe showed both tracks could advance from the start before hitting `left_barrier`, so there was no obvious full-width blocker across the road start
- User screenshot suggested the car was getting trapped near the shoulder/edge, consistent with the barrier corridor being too close to the drivable edge
- User also reported that barrier contact sometimes blocks the car without promptly ending the episode

### Collision semantics

The user does **not** want every barrier brush to terminate the episode.

Desired behavior:

- light brush: can continue
- sustained contact: terminate
- head-on / abrupt stop: terminate quickly

## Code Changes Already Made

### Unity / simulator side

`/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scripts/RoadBuilder.cs`

Implemented structural refactor:

- explicit `closeLoop` support
- explicit road-edge generation
- barrier edges derived from the left/right road edges instead of a guessed centerline offset
- open tracks do not force wraparound
- debug polyline support via gizmos

Added runtime-visible debug barrier support:

- `showBarrierMeshes`
- `barrierDebugColor`
- barrier objects now include `MeshFilter`
- optional `MeshRenderer` added for visible translucent barriers

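The edge-derived barrier idea can be illustrated in 2D. This is a Python sketch of the geometry only, not a port of `RoadBuilder.cs`: the function names, the plain vertex-normal offset (no miter correction), and the left/right sign convention are all assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def _unit(dx: float, dy: float) -> Point:
    """Normalize a 2D vector; duplicate neighbouring points are rejected."""
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("degenerate segment: neighbouring points coincide")
    return dx / length, dy / length


def offset_polyline(points: List[Point], offset: float, close_loop: bool) -> List[Point]:
    """Shift each vertex along its left-hand normal by `offset` (negative = right)."""
    n = len(points)
    out = []
    for i, (x, y) in enumerate(points):
        if close_loop:
            # Closed track: neighbours wrap around the ends.
            prev_pt, next_pt = points[(i - 1) % n], points[(i + 1) % n]
        else:
            # Open track: clamp at the ends instead of forcing wraparound.
            prev_pt, next_pt = points[max(i - 1, 0)], points[min(i + 1, n - 1)]
        tx, ty = _unit(next_pt[0] - prev_pt[0], next_pt[1] - prev_pt[1])
        nx, ny = -ty, tx  # tangent rotated 90 degrees counter-clockwise
        out.append((x + nx * offset, y + ny * offset))
    return out


def build_edges(centerline: List[Point], half_width: float,
                barrier_margin: float, close_loop: bool) -> dict:
    """Derive road edges from the centerline, then barriers from the road edges."""
    left = offset_polyline(centerline, half_width, close_loop)
    right = offset_polyline(centerline, -half_width, close_loop)
    return {
        "left_edge": left,
        "right_edge": right,
        # Barriers are offset outward from the edges themselves, so the
        # clearance to the drivable surface is explicit.
        "left_barrier": offset_polyline(left, barrier_margin, close_loop),
        "right_barrier": offset_polyline(right, -barrier_margin, close_loop),
    }


if __name__ == "__main__":
    road = build_edges([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], 1.0, 0.5, close_loop=False)
    print(road["left_barrier"], road["right_barrier"])
```

Because the barrier is a second offset applied to the road edge, `barrier_margin` is the shoulder clearance directly, which is the quantity in question when the car gets trapped near the edge.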
`/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scenes/generated_road.unity`

- `closeLoop = 0`
- `doAddBarriers = 1`
- `showBarrierMeshes = 1`
- pinned road variation arrays to one entry
- `roadOffsets.Array.data[0] = 2.2`

`/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scenes/generated_track.unity`

- `showBarrierMeshes = 1`
- `roadOffsetW = 2.2`
- barriers still enabled

### Python / RL side

`/home/paulh/projects/donkeycar-rl-autoresearch/agent/reward_wrapper.py`

Latest intent:

- do **not** terminate instantly on every barrier hit
- terminate on sustained obstacle contact
- terminate on head-on style stop

Current patch in file:

- tracks `_solid_hit_steps`
- tracks `_prev_speed`
- classifies solid hits via `hit` containing `barrier`, `wall`, or `tree`
- immediate terminate on abrupt speed collapse while colliding
- terminate after several consecutive solid-hit frames

This was meant to replace the too-aggressive “any barrier hit = immediate death” logic.

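A minimal sketch of that contact logic follows. It is not the contents of `agent/reward_wrapper.py`: the class name and all three thresholds are assumed, and `hit` is assumed to be the collision-name string the environment reports. Only the keyword list and the two tracked fields come from the notes above.

```python
from typing import Optional

SOLID_KEYWORDS = ("barrier", "wall", "tree")  # from the handoff notes


class SolidHitTracker:
    """Hypothetical sketch of the contact-termination rules; thresholds are assumed."""

    def __init__(self, max_hit_steps: int = 5, speed_drop_ratio: float = 0.5,
                 min_prev_speed: float = 1.0) -> None:
        self.max_hit_steps = max_hit_steps        # "several consecutive frames" (assumed value)
        self.speed_drop_ratio = speed_drop_ratio  # fraction of speed lost that counts as a collapse
        self.min_prev_speed = min_prev_speed      # ignore collapses when already nearly stopped
        self._solid_hit_steps = 0
        self._prev_speed = 0.0

    @staticmethod
    def is_solid_hit(hit: Optional[str]) -> bool:
        """True when the collision name mentions a solid obstacle."""
        name = (hit or "").lower()
        return any(keyword in name for keyword in SOLID_KEYWORDS)

    def reset(self) -> None:
        """Call at the start of every episode."""
        self._solid_hit_steps = 0
        self._prev_speed = 0.0

    def step(self, hit: Optional[str], speed: float) -> bool:
        """Return True if the episode should terminate on this step."""
        solid = self.is_solid_hit(hit)
        self._solid_hit_steps = self._solid_hit_steps + 1 if solid else 0
        # Head-on style stop: speed collapses in a single step while colliding.
        abrupt_stop = (
            solid
            and self._prev_speed >= self.min_prev_speed
            and speed <= self._prev_speed * (1.0 - self.speed_drop_ratio)
        )
        # Sustained contact: the obstacle has been touched for several frames in a row.
        sustained = self._solid_hit_steps >= self.max_hit_steps
        self._prev_speed = speed
        return abrupt_stop or sustained


if __name__ == "__main__":
    tracker = SolidHitTracker()
    for hit, speed in [("none", 4.0), ("left_barrier", 3.8), ("none", 3.9)]:
        print(hit, speed, tracker.step(hit, speed))
```

With these assumed thresholds a one-frame brush at nearly constant speed does not end the episode, while a head-on stop or five consecutive contact frames does.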
## Most Recent Verified Build Status

Unity batch build for the debug-visible barrier version completed successfully.

Evidence:

- build log ended with `Exiting batchmode successfully now!`
- return code `0`

The successful build has now been synced into both `Downloads` runtime folders and both simulators have been relaunched.

Current verified runtime state:

- main folder process owns port `9091`
- main folder also owns private API port `9092`
- copy folder process owns port `9093`
- copy folder also owns private API port `9094`
- Linux socket probe reported `PORT 9091: OK`, `PORT 9092: OK`, `PORT 9093: OK`, and `PORT 9094: OK`
- latest runtime build renders the debug barrier meshes double-sided (through the material with culling off, not duplicated triangles; see the latest simulator-side patch)

Note: the Windows profile uses shared Unity PlayerPrefs/registry values under `HKCU:\Software\DonkeyCar\donkey_sim`. Explicit `--port` args bind the servers correctly, but the in-sim UI can still show the saved PlayerPrefs value. Before launch, set `port_h2088097884`/`portPrivateAPI_h1325370089` to `9091`/`9092`, start the main sim, then set them to `9093`/`9094` and start the copy. Also keep passing explicit `--port 9091` and `--port 9093`.

Latest user visual inspection before double-sided patch:

- `generated_road`: barriers visible on both sides except missing on left side at the very start before the first curve
- `generated_track`: barrier visible only on the right/inside side when driving clockwise; no visible left/outside barrier

Likely diagnosis: barrier mesh was generated as a single-sided vertical plane and the Standard shader culled backfaces, so some debug barrier surfaces existed but were invisible from the road/camera side.

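The backface-culling diagnosis comes down to a sign test on the triangle normal. The sketch below is generic vector math in Python, not Unity code: the helper names are hypothetical and the sign convention for "front" is an assumption. The point is only that reversing the winding order flips which side is drawn.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def is_front_facing(tri: Triangle, camera_pos: Vec3) -> bool:
    """True when the winding-order normal points toward the camera."""
    a, b, c = tri
    normal = cross(sub(b, a), sub(c, a))
    return dot(normal, sub(camera_pos, a)) > 0.0


def is_drawn(tri: Triangle, camera_pos: Vec3, cull_backfaces: bool) -> bool:
    """With culling on, only front faces are drawn; with culling off, both sides are."""
    return is_front_facing(tri, camera_pos) or not cull_backfaces


if __name__ == "__main__":
    tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
    for cam in ((0.0, 0.0, 5.0), (0.0, 0.0, -5.0)):
        print(cam, is_drawn(tri, cam, cull_backfaces=True),
              is_drawn(tri, cam, cull_backfaces=False))
```

A single-sided barrier plane is therefore visible from only one side of the road unless culling is disabled or a second, reverse-wound triangle set is added.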
Latest simulator-side patch:

- `/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scripts/RoadBuilder.cs`
  - first attempt: `CreateBarrier(...)` emitted reverse-facing triangles for every barrier quad so that debug barrier meshes would be visible from both sides; this was later reverted (see the failed attempts below)
  - failed attempt: `Unlit/Transparent` made both tracks' barriers black in the standalone player
  - failed attempt: duplicating reverse-facing triangles made `generated_track` barriers black, likely due to coplanar transparent overdraw/z-fighting on the closed/scaled track
  - current state: the debug barrier mesh is back to one triangle set per quad; the material uses `Standard` transparent mode with a forced pale fallback color, alpha blending, culling off, and emission enabled, so barriers should stay light/translucent while remaining visible from both sides
- Unity Windows batch build succeeded after this patch
- rebuilt output synced to both runtime folders and relaunched with explicit ports

## Immediate Next Steps

1. Monitor current `exp22` training log/checkpoints.

2. Determine:
   - are barriers too close to the road edge globally?
   - or only wrong at specific bends / first-corner geometry?

3. Fix geometry if needed before restarting RL.

4. Only after geometry is visually verified, restart `exp22` or a successor experiment.

## Useful Commands

### Sync latest build into runtime folders

```bash
rsync -a --delete '/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Builds/DonkeySimWin/' '/mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin/'
rsync -a --delete '/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Builds/DonkeySimWin/' '/mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin - Copy/'
```

### Launch sims from Windows side

```powershell
$key = 'HKCU:\Software\DonkeyCar\donkey_sim'

Set-ItemProperty -Path $key -Name 'port_h2088097884' -Value 9091 -Type DWord
Set-ItemProperty -Path $key -Name 'portPrivateAPI_h1325370089' -Value 9092 -Type DWord
Start-Process -FilePath 'C:\Users\Paul\Downloads\DonkeySimWin\DonkeySimWin\donkey_sim.exe' -ArgumentList '--port','9091' -WorkingDirectory 'C:\Users\Paul\Downloads\DonkeySimWin\DonkeySimWin'

Start-Sleep -Seconds 4

Set-ItemProperty -Path $key -Name 'port_h2088097884' -Value 9093 -Type DWord
Set-ItemProperty -Path $key -Name 'portPrivateAPI_h1325370089' -Value 9094 -Type DWord
Start-Process -FilePath 'C:\Users\Paul\Downloads\DonkeySimWin\DonkeySimWin - Copy\donkey_sim.exe' -ArgumentList '--port','9093' -WorkingDirectory 'C:\Users\Paul\Downloads\DonkeySimWin\DonkeySimWin - Copy'
```

### Verify ports

```bash
python3 - <<'PY'
import socket

# Probe the public sim ports; add 9092 and 9094 to also check the private API ports.
for p in (9091, 9093):
    s = socket.socket()
    s.settimeout(3)
    try:
        s.connect(('127.0.0.1', p))
        print(f'PORT {p}: OK')
    except Exception as e:
        print(f'PORT {p}: FAIL {e}')
    finally:
        s.close()
PY
```

## Notes for Next Session

- If the user says `continue`, do not ask broad questions. Start with the immediate next steps above.
- Prefer direct verification over more RL training.
- Do not restart long training until the user has visually confirmed the debug-visible barriers look correct.