RL Donkeycar Session Handoff
Last updated: 2026-05-05 America/Toronto
Autonomy Instruction
Use this as the standing instruction for follow-on sessions:
Continue the Donkeycar RL/sim work autonomously. Rebuild, sync, relaunch, run diagnostics, patch code, and restart experiments as needed. Keep going until you either have a verified fix and a running experiment, or a concrete blocker that truly requires the user. Do not stop just to ask for permission on ordinary reversible steps. Only pause for real risk of data loss, destructive actions, missing credentials/access, or major strategy tradeoffs that require a user decision.
If the user says only continue, interpret it using the instruction above.
Current Goal
Stabilize the Unity simulator geometry and collision behavior enough that:
- generated_road and generated_track both run without bad invisible barrier placement
- barrier contacts terminate episodes appropriately
- RL can restart from a trustworthy simulator build
Important Paths
Project:
/home/paulh/projects/donkeycar-rl-autoresearch
Unity source project:
/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim
Unity build output:
/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Builds/DonkeySimWin
Current runtime simulator folders in use:
/mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin
/mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin - Copy
Current RL Experiment Files
agent/experiments/exp21_generated_pair_warm_v4.py
agent/experiments/exp22_generated_pair_warm_v6.py
Latest model/output folder:
agent/models/exp22-generated-pair-warm-v6
Current training run:
- launched agent/experiments/exp22_generated_pair_warm_v6.py
- PID file: agent/models/exp22-generated-pair-warm-v6/current.pid
- current PID at launch time: 609054
- log: agent/models/exp22-generated-pair-warm-v6/run_2026-05-05_141929_strictcte.log
- startup verified: connected to localhost:9091 and localhost:9093, loaded generated_road and generated_track, attached warm-start model, reached "Starting training..."
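Before reading the log, it can help to confirm that the process recorded in current.pid is still alive. The following is a minimal sketch, assuming the PID file holds a single integer and that the check runs on the same Linux/WSL host as the training process. The helper names read_pid and pid_is_running are hypothetical and not part of the project.

```python
import os
from pathlib import Path

# Path taken from the handoff notes; relative to the project root.
PID_FILE = Path("agent/models/exp22-generated-pair-warm-v6/current.pid")


def read_pid(path):
    """Return the integer PID stored in path, or None if missing or unparseable."""
    try:
        return int(Path(path).read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def pid_is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks existence; no signal is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


if __name__ == "__main__":
    pid = read_pid(PID_FILE)
    state = "running" if pid_is_running(pid) else "not running"
    print(f"PID {pid}: {state}")
```

A live PID only shows that the process exists; a stale PID file can point at an unrelated process if the number has been reused, so the log tail is still worth checking.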
Latest urgent exploit fix:
- User observed generated_road still doing the large outside circle exploit.
- Stopped the previous run immediately.
- Patched agent/reward_wrapper.py so high CTE receives negative reward immediately during the patience window instead of falling through to positive speed reward.
- Patched agent/experiments/exp22_generated_pair_warm_v6.py: MAX_CTE_TERMINATE = 2.5, CTE_PATIENCE = 3
- Added regression test test_high_cte_never_gets_positive_speed_reward_before_termination.
- Verified python3 -m pytest -q tests/test_reward_wrapper.py: 21 passed.
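The intent of the high-CTE patch can be illustrated with a small sketch. It is not the code in agent/reward_wrapper.py: the class name, the penalty value, and the placeholder speed reward are assumptions. Only MAX_CTE_TERMINATE and CTE_PATIENCE come from the notes above.

```python
MAX_CTE_TERMINATE = 2.5   # from exp22_generated_pair_warm_v6.py
CTE_PATIENCE = 3          # from exp22_generated_pair_warm_v6.py
HIGH_CTE_PENALTY = -1.0   # assumption: the real penalty may differ


class CtePatienceSketch:
    """Hypothetical illustration of penalising high CTE during the patience window."""

    def __init__(self):
        self.high_cte_steps = 0

    def step(self, cte, speed):
        """Return (reward, terminated) for one environment step."""
        if abs(cte) > MAX_CTE_TERMINATE:
            # Penalise on every high-CTE step, including those inside the
            # patience window, so speed far off-track is never rewarded.
            self.high_cte_steps += 1
            terminated = self.high_cte_steps >= CTE_PATIENCE
            return HIGH_CTE_PENALTY, terminated
        self.high_cte_steps = 0
        return max(speed, 0.0), False  # placeholder for the real speed reward
```

With these assumed values, three consecutive high-CTE steps each return a negative reward and the third one terminates the episode, which is the property the regression test name describes.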
What Was Learned
Training status
The latest meaningful exp22 run was poor and should not be resumed as-is.
From agent/models/exp22-generated-pair-warm-v6/run_2026-04-28_2132_openfix.log:
- best generated_track eval reached only about 92 steps
- run was not trustworthy due to ongoing barrier-placement concerns
Simulator behavior
- Invisible barriers are collider-only by default, so the user cannot see them in the standalone player
- Diagnostic probe showed both tracks could advance from the start before hitting left_barrier, so there was no obvious full-width blocker across the road start
- User screenshot suggested the car was getting trapped near the shoulder/edge, consistent with barrier corridor too close to the drivable edge
- User also reported that barrier contact sometimes blocks the car without promptly ending the episode
Collision semantics
The user does not want every barrier brush to terminate the episode.
Desired behavior:
- light brush: can continue
- sustained contact: terminate
- head-on / abrupt stop: terminate quickly
Code Changes Already Made
Unity / simulator side
/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scripts/RoadBuilder.cs
Implemented structural refactor:
- explicit closeLoop support
- explicit road-edge generation
- barrier edges derived from left/right road edges instead of guessed centerline offset
- open tracks do not force wraparound
- debug polyline support via gizmos
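The idea behind the refactor (road edges first, barriers offset from those edges, no wraparound on open tracks) can be sketched in Python on the ground plane. This is an illustration only and not a port of RoadBuilder.cs: the function name, the half_width and margin parameters, and the 2D (x, z) point format are all assumptions.

```python
import math


def edges_from_centerline(points, half_width, margin, close_loop):
    """Return road and barrier edge polylines for a list of (x, z) centerline points.

    Hypothetical helper: barriers are placed `margin` beyond each road edge,
    not at a guessed offset from the centerline.
    """
    n = len(points)
    result = {"road_left": [], "road_right": [],
              "barrier_left": [], "barrier_right": []}
    for i, (x, z) in enumerate(points):
        if close_loop:
            prev_pt, next_pt = points[(i - 1) % n], points[(i + 1) % n]
        else:
            # Open tracks clamp at the ends and never wrap around.
            prev_pt, next_pt = points[max(i - 1, 0)], points[min(i + 1, n - 1)]
        tx, tz = next_pt[0] - prev_pt[0], next_pt[1] - prev_pt[1]
        length = math.hypot(tx, tz) or 1.0
        # Left-hand normal of the travel direction (facing +z, left is -x).
        nx, nz = -tz / length, tx / length
        outer = half_width + margin
        result["road_left"].append((x + nx * half_width, z + nz * half_width))
        result["road_right"].append((x - nx * half_width, z - nz * half_width))
        result["barrier_left"].append((x + nx * outer, z + nz * outer))
        result["barrier_right"].append((x - nx * outer, z - nz * outer))
    return result
```

For a straight centerline along +z with half_width 1.0 and margin 0.5, the left road edge sits at x = -1.0 and the left barrier at x = -1.5, so the barrier always clears the drivable edge by the chosen margin.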
Added runtime-visible debug barrier support:
- showBarrierMeshes
- barrierDebugColor
- barrier objects now include MeshFilter
- optional MeshRenderer added for visible translucent barriers
/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scenes/generated_road.unity
- closeLoop = 0
- doAddBarriers = 1
- showBarrierMeshes = 1
- pinned road variation arrays to one entry
- roadOffsets.Array.data[0] = 2.2
/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scenes/generated_track.unity
- showBarrierMeshes = 1
- roadOffsetW = 2.2
- barriers still enabled
Python / RL side
/home/paulh/projects/donkeycar-rl-autoresearch/agent/reward_wrapper.py
Latest intent:
- do not terminate instantly on every barrier hit
- terminate on sustained obstacle contact
- terminate on head-on style stop
Current patch in file:
- tracks _solid_hit_steps
- tracks _prev_speed
- classifies solid hits via hit containing barrier, wall, or tree
- immediate terminate on abrupt speed collapse while colliding
- terminate after several consecutive solid-hit frames
This was meant to replace the too-aggressive “any barrier hit = immediate death” logic.
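The three desired behaviors (light brush continues, sustained contact terminates, head-on stop terminates quickly) can be sketched as a small state machine. This is a sketch under stated assumptions and not the actual patch: the class name and all three thresholds are hypothetical. Only the _solid_hit_steps and _prev_speed fields and the barrier/wall/tree keywords come from the notes above.

```python
SOLID_KEYWORDS = ("barrier", "wall", "tree")  # from the handoff notes
MAX_SOLID_HIT_STEPS = 5        # assumption: "several consecutive frames"
SPEED_COLLAPSE_RATIO = 0.3     # assumption: speed falls to 30% or less in one step
MIN_SPEED_FOR_COLLAPSE = 1.0   # assumption: ignore collapses from near standstill


class SolidHitMonitor:
    """Hypothetical sketch of brush / sustained / head-on classification."""

    def __init__(self):
        self._solid_hit_steps = 0
        self._prev_speed = 0.0

    def update(self, hit, speed):
        """Return True if the episode should terminate on this step."""
        solid = any(k in (hit or "").lower() for k in SOLID_KEYWORDS)
        prev = self._prev_speed
        self._prev_speed = speed
        if not solid:
            self._solid_hit_steps = 0  # contact ended: a light brush is forgiven
            return False
        self._solid_hit_steps += 1
        # Head-on style stop: speed collapses abruptly while in solid contact.
        abrupt = prev >= MIN_SPEED_FOR_COLLAPSE and speed <= prev * SPEED_COLLAPSE_RATIO
        # Sustained contact: too many consecutive solid-hit steps.
        return abrupt or self._solid_hit_steps >= MAX_SOLID_HIT_STEPS
```

Resetting the counter as soon as contact ends is what lets a single brush pass, while the speed-collapse check ends head-on impacts without waiting for the full contact window.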
Most Recent Verified Build Status
Unity batch build for the debug-visible barrier version completed successfully.
Evidence:
- build log ended with "Exiting batchmode successfully now!"
- return code 0
The successful build has now been synced into both Downloads runtime folders and both simulators have been relaunched.
Current verified runtime state:
- main folder process owns port 9091
- main folder also owns private API port 9092
- copy folder process owns port 9093
- copy folder also owns private API port 9094
- Linux socket probe reported PORT 9091: OK, PORT 9092: OK, PORT 9093: OK, and PORT 9094: OK
- latest runtime build includes double-sided barrier mesh triangles for visual/debug barrier rendering
Note: the Windows profile uses shared Unity PlayerPrefs/registry values under HKCU:\Software\DonkeyCar\donkey_sim. Explicit --port args bind the servers correctly, but the in-sim UI can still show the saved PlayerPrefs value. Before launch, set port_h2088097884/portPrivateAPI_h1325370089 to 9091/9092, start the main sim, then set them to 9093/9094 and start the copy. Also keep passing explicit --port 9091 and --port 9093.
Latest user visual inspection before double-sided patch:
- generated_road: barriers visible on both sides except missing on left side at the very start before the first curve
- generated_track: barrier visible only on the right/inside side when driving clockwise; no visible left/outside barrier
Likely diagnosis: barrier mesh was generated as a single-sided vertical plane and the Standard shader culled backfaces, so some debug barrier surfaces existed but were invisible from the road/camera side.
Latest simulator-side patch:
/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scripts/RoadBuilder.cs
- CreateBarrier(...) now emits reverse-facing triangles for every barrier quad, making debug barrier meshes visible from both sides
- failed attempt: Unlit/Transparent made both tracks' barriers black in the standalone player
- failed attempt: duplicating reverse-facing triangles made generated_track barriers black, likely due to coplanar transparent overdraw/z-fighting on the closed/scaled track
- current debug barrier mesh is back to one triangle set per quad; material uses Standard transparent mode with forced pale fallback color, alpha blend, culling off, and emission enabled so barriers should stay light/translucent while remaining visible from both sides
- Unity Windows batch build succeeded after this patch
- rebuilt output synced to both runtime folders and relaunched with explicit ports
Immediate Next Steps
- Monitor current exp22 training log/checkpoints.
- Determine:
  - are barriers too close to the road edge globally?
  - or only wrong at specific bends / first-corner geometry?
- Fix geometry if needed before restarting RL.
- Only after geometry is visually verified, restart exp22 or a successor experiment.
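One way to answer the global-versus-local question in the steps above is to measure the gap between each road-edge point and its matching barrier point. The sketch below assumes both polylines have been exported as equal-length lists of (x, z) points, for example from the debug polylines. That export does not exist yet, and the function name and the min_clearance threshold are hypothetical.

```python
import math
import statistics


def clearance_report(road_edge, barrier_edge, min_clearance):
    """Summarise the gap between paired road-edge and barrier points.

    Hypothetical diagnostic: a high tight_fraction suggests barriers are too
    close globally; a few isolated tight indices suggest a local bend problem.
    """
    gaps = [math.dist(r, b) for r, b in zip(road_edge, barrier_edge)]
    tight = [i for i, gap in enumerate(gaps) if gap < min_clearance]
    return {
        "min": min(gaps),
        "median": statistics.median(gaps),
        "tight_indices": tight,
        "tight_fraction": len(tight) / len(gaps),
    }
```

If tight_indices cluster around the first curve of generated_road, that points at bend geometry. If nearly every index is tight, the edge-to-barrier offset itself is the likelier culprit.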
Useful Commands
Sync latest build into runtime folders
rsync -a --delete '/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Builds/DonkeySimWin/' '/mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin/'
rsync -a --delete '/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Builds/DonkeySimWin/' '/mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin - Copy/'
Launch sims from Windows side
$key = 'HKCU:\Software\DonkeyCar\donkey_sim'
Set-ItemProperty -Path $key -Name 'port_h2088097884' -Value 9091 -Type DWord
Set-ItemProperty -Path $key -Name 'portPrivateAPI_h1325370089' -Value 9092 -Type DWord
Start-Process -FilePath 'C:\Users\Paul\Downloads\DonkeySimWin\DonkeySimWin\donkey_sim.exe' -ArgumentList '--port','9091' -WorkingDirectory 'C:\Users\Paul\Downloads\DonkeySimWin\DonkeySimWin'
Start-Sleep -Seconds 4
Set-ItemProperty -Path $key -Name 'port_h2088097884' -Value 9093 -Type DWord
Set-ItemProperty -Path $key -Name 'portPrivateAPI_h1325370089' -Value 9094 -Type DWord
Start-Process -FilePath 'C:\Users\Paul\Downloads\DonkeySimWin\DonkeySimWin - Copy\donkey_sim.exe' -ArgumentList '--port','9093' -WorkingDirectory 'C:\Users\Paul\Downloads\DonkeySimWin\DonkeySimWin - Copy'
Verify ports
python3 - <<'PY'
import socket

# Probe each simulator port with a short timeout.
for p in (9091, 9093):
    s = socket.socket()
    s.settimeout(3)
    try:
        s.connect(('127.0.0.1', p))
        print(f'PORT {p}: OK')
    except Exception as e:
        print(f'PORT {p}: FAIL {e}')
    finally:
        s.close()
PY
Notes for Next Session
- If the user says continue, do not ask broad questions. Start with the immediate next steps above.
- Prefer direct verification over more RL training.
- Do not restart long training until the user has visually confirmed the debug-visible barriers look correct.