donkeycar-rl-autoresearch/agent/SESSION_HANDOFF.md

RL Donkeycar Session Handoff

Last updated: 2026-05-05 America/Toronto

Autonomy Instruction

Use this as the standing instruction for follow-on sessions:

Continue the Donkeycar RL/sim work autonomously. Rebuild, sync, relaunch, run diagnostics, patch code, and restart experiments as needed. Keep going until you either have a verified fix and a running experiment, or a concrete blocker that truly requires the user. Do not stop just to ask for permission on ordinary reversible steps. Only pause for real risk of data loss, destructive actions, missing credentials/access, or major strategy tradeoffs that require a user decision.

If the user says only "continue", interpret it using the instruction above.

Current Goal

Stabilize the Unity simulator geometry and collision behavior enough that:

  • generated_road and generated_track both run without misplaced invisible barriers
  • barrier contacts terminate episodes appropriately
  • RL can restart from a trustworthy simulator build

Important Paths

Project:

  • /home/paulh/projects/donkeycar-rl-autoresearch

Unity source project:

  • /mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim

Unity build output:

  • /mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Builds/DonkeySimWin

Current runtime simulator folders in use:

  • /mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin
  • /mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin - Copy

Current RL Experiment Files

  • agent/experiments/exp21_generated_pair_warm_v4.py
  • agent/experiments/exp22_generated_pair_warm_v6.py

Latest model/output folder:

  • agent/models/exp22-generated-pair-warm-v6

Current training run:

  • launched agent/experiments/exp22_generated_pair_warm_v6.py
  • PID file: agent/models/exp22-generated-pair-warm-v6/current.pid
  • current PID at launch time: 609054
  • log: agent/models/exp22-generated-pair-warm-v6/run_2026-05-05_141929_strictcte.log
  • startup verified: connected to localhost:9091 and localhost:9093, loaded generated_road and generated_track, attached warm-start model, and reached "Starting training..."
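
Before monitoring or restarting, it helps to confirm that the PID recorded at launch still belongs to a live process. A minimal sketch, assuming current.pid holds a single integer and that the check runs on the Linux/WSL side from the project root; the helper names read_pid and is_running are hypothetical and not part of the repo:

```python
import os
from pathlib import Path
from typing import Optional

# Path from the notes above, relative to the project root.
PID_FILE = Path("agent/models/exp22-generated-pair-warm-v6/current.pid")


def read_pid(pid_file: Path) -> Optional[int]:
    """Return the PID stored in pid_file, or None if missing or malformed."""
    try:
        return int(pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def is_running(pid: int) -> bool:
    """On POSIX, signal 0 only checks existence/permission; nothing is sent."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


if __name__ == "__main__":
    pid = read_pid(PID_FILE)
    if pid is None:
        print(f"no usable PID in {PID_FILE}")
    else:
        print(f"PID {pid}: {'running' if is_running(pid) else 'not running'}")
```

A live PID only shows that some process holds that number; checking that the run log is still growing is a stronger signal that training is progressing.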

Latest urgent exploit fix:

  • User observed generated_road still doing the large outside circle exploit.
  • Stopped the previous run immediately.
  • Patched agent/reward_wrapper.py so high CTE receives negative reward immediately during the patience window instead of falling through to positive speed reward.
  • Patched agent/experiments/exp22_generated_pair_warm_v6.py:
    • MAX_CTE_TERMINATE = 2.5
    • CTE_PATIENCE = 3
  • Added regression test test_high_cte_never_gets_positive_speed_reward_before_termination.
  • Verified python3 -m pytest -q tests/test_reward_wrapper.py: 21 passed.
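
The intent of the exploit fix can be sketched as follows. This is not the code in agent/reward_wrapper.py: the function name shaped_reward, the penalty value, and the reading of CTE_PATIENCE as "consecutive high-CTE steps before termination" are assumptions made for illustration. Only the two constants come from the experiment file.

```python
MAX_CTE_TERMINATE = 2.5   # from exp22_generated_pair_warm_v6.py
CTE_PATIENCE = 3          # from exp22_generated_pair_warm_v6.py
HIGH_CTE_PENALTY = -1.0   # hypothetical penalty value


def shaped_reward(cte, speed, high_cte_steps):
    """Return (reward, terminated, high_cte_steps) for a single step.

    While |cte| exceeds the limit, the reward is negative on every step of
    the patience window, so the speed reward can never be collected there.
    """
    if abs(cte) > MAX_CTE_TERMINATE:
        high_cte_steps += 1
        terminated = high_cte_steps >= CTE_PATIENCE
        return HIGH_CTE_PENALTY, terminated, high_cte_steps
    # Back inside the corridor: reset the counter and pay the speed reward.
    return max(speed, 0.0), False, 0


if __name__ == "__main__":
    steps = 0
    for cte in (3.0, 3.0, 3.0):
        reward, done, steps = shaped_reward(cte, speed=5.0, high_cte_steps=steps)
        print(reward, done, steps)
```

The property the regression test guards is that no step with CTE above the limit yields a positive reward, whether or not the episode has terminated yet.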

What Was Learned

Training status

The latest meaningful exp22 run was poor and should not be resumed as-is.

From agent/models/exp22-generated-pair-warm-v6/run_2026-04-28_2132_openfix.log:

  • best generated_track eval reached only about 92 steps
  • run was not trustworthy due to ongoing barrier-placement concerns

Simulator behavior

  • Invisible barriers are collider-only by default, so the user cannot see them in the standalone player
  • Diagnostic probe showed both tracks could advance from the start before hitting left_barrier, so there was no obvious full-width blocker across the road start
  • User screenshot suggested the car was getting trapped near the shoulder/edge, consistent with the barrier corridor being too close to the drivable edge
  • User also reported that barrier contact sometimes blocks the car without promptly ending the episode

Collision semantics

The user does not want every barrier brush to terminate the episode.

Desired behavior:

  • light brush: can continue
  • sustained contact: terminate
  • head-on / abrupt stop: terminate quickly

Code Changes Already Made

Unity / simulator side

/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scripts/RoadBuilder.cs

Implemented structural refactor:

  • explicit closeLoop support
  • explicit road-edge generation
  • barrier edges derived from left/right road edges instead of guessed centerline offset
  • open tracks do not force wraparound
  • debug polyline support via gizmos
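
The idea of deriving barriers from the road edges rather than from a guessed centerline offset can be illustrated in a few lines. This is a Python sketch of the geometry only, not a port of RoadBuilder.cs (which is C#): the function name offset_barriers, the 2D (x, z) point format, and the margin parameter are all assumptions.

```python
import math


def offset_barriers(left_edge, right_edge, margin):
    """Push each edge point outward along the across-road direction.

    left_edge and right_edge are equal-length lists of (x, z) points sampled
    at the same stations along the road. The outward direction is taken from
    the matching right/left edge pair, so no centerline offset is guessed.
    """
    left_barrier, right_barrier = [], []
    for (lx, lz), (rx, rz) in zip(left_edge, right_edge):
        dx, dz = lx - rx, lz - rz            # vector from right edge to left edge
        width = math.hypot(dx, dz)
        if width == 0.0:
            raise ValueError("degenerate road cross-section")
        ux, uz = dx / width, dz / width      # unit vector pointing left
        left_barrier.append((lx + ux * margin, lz + uz * margin))
        right_barrier.append((rx - ux * margin, rz - uz * margin))
    return left_barrier, right_barrier


if __name__ == "__main__":
    left = [(-1.0, 0.0), (-1.0, 1.0)]
    right = [(1.0, 0.0), (1.0, 1.0)]
    print(offset_barriers(left, right, margin=0.5))
```

Because the clearance between road edge and barrier is an explicit margin here, a corridor that is too tight shows up as a single number to adjust rather than as a side effect of road width.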

Added runtime-visible debug barrier support:

  • showBarrierMeshes
  • barrierDebugColor
  • barrier objects now include MeshFilter
  • optional MeshRenderer added for visible translucent barriers

/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scenes/generated_road.unity

  • closeLoop = 0
  • doAddBarriers = 1
  • showBarrierMeshes = 1
  • pinned road variation arrays to one entry
  • roadOffsets.Array.data[0] = 2.2

/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scenes/generated_track.unity

  • showBarrierMeshes = 1
  • roadOffsetW = 2.2
  • barriers still enabled

Python / RL side

/home/paulh/projects/donkeycar-rl-autoresearch/agent/reward_wrapper.py

Latest intent:

  • do not terminate instantly on every barrier hit
  • terminate on sustained obstacle contact
  • terminate on head-on style stop

Current patch in file:

  • tracks _solid_hit_steps
  • tracks _prev_speed
  • classifies a hit as solid when the hit value contains barrier, wall, or tree
  • immediate terminate on abrupt speed collapse while colliding
  • terminate after several consecutive solid-hit frames

This was meant to replace the too-aggressive “any barrier hit = immediate death” logic.
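
The three desired outcomes (light brush continues, sustained contact terminates, abrupt stop terminates quickly) can be sketched as a small tracker. The field names _solid_hit_steps and _prev_speed and the three keywords come from the notes above; the class name ContactTracker and every threshold value are hypothetical, and the hit value is assumed to be a string. This is not the actual patch in agent/reward_wrapper.py.

```python
SOLID_KEYWORDS = ("barrier", "wall", "tree")  # from the notes above
SUSTAINED_HIT_STEPS = 5       # hypothetical: "several consecutive" frames
SPEED_COLLAPSE_RATIO = 0.3    # hypothetical: speed falls below 30% of previous
MIN_SPEED_FOR_COLLAPSE = 1.0  # hypothetical: ignore collapses from near-standstill


class ContactTracker:
    def __init__(self):
        self._solid_hit_steps = 0
        self._prev_speed = 0.0

    def update(self, hit: str, speed: float) -> bool:
        """Return True when the episode should terminate on this step."""
        solid = any(keyword in hit.lower() for keyword in SOLID_KEYWORDS)
        prev_speed, self._prev_speed = self._prev_speed, speed
        if not solid:
            self._solid_hit_steps = 0  # a light brush is forgiven once contact ends
            return False
        self._solid_hit_steps += 1
        abrupt_stop = (prev_speed >= MIN_SPEED_FOR_COLLAPSE
                       and speed < prev_speed * SPEED_COLLAPSE_RATIO)
        return abrupt_stop or self._solid_hit_steps >= SUSTAINED_HIT_STEPS


if __name__ == "__main__":
    tracker = ContactTracker()
    tracker.update("none", 4.0)
    print(tracker.update("left_barrier", 0.5))  # head-on style stop
```

Resetting the counter on the first contact-free frame is what lets a brief brush continue; if brushes that flicker in and out of contact should also count as sustained, a decaying counter would be the alternative.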

Most Recent Verified Build Status

Unity batch build for the debug-visible barrier version completed successfully.

Evidence:

  • build log ended with "Exiting batchmode successfully now!"
  • return code 0

The successful build has now been synced into both Downloads runtime folders and both simulators have been relaunched.

Current verified runtime state:

  • main folder process owns port 9091
  • main folder also owns private API port 9092
  • copy folder process owns port 9093
  • copy folder also owns private API port 9094
  • Linux socket probe reported PORT 9091: OK, PORT 9092: OK, PORT 9093: OK, and PORT 9094: OK
  • latest runtime build includes double-sided barrier mesh triangles for visual/debug barrier rendering

Note: the Windows profile uses shared Unity PlayerPrefs/registry values under HKCU:\Software\DonkeyCar\donkey_sim. Explicit --port args bind the servers correctly, but the in-sim UI can still show the saved PlayerPrefs value. Before launch, set port_h2088097884/portPrivateAPI_h1325370089 to 9091/9092, start the main sim, then set them to 9093/9094 and start the copy. Also keep passing explicit --port 9091 and --port 9093.

Latest user visual inspection before double-sided patch:

  • generated_road: barriers visible on both sides except missing on left side at the very start before the first curve
  • generated_track: barrier visible only on the right/inside side when driving clockwise; no visible left/outside barrier

Likely diagnosis: barrier mesh was generated as a single-sided vertical plane and the Standard shader culled backfaces, so some debug barrier surfaces existed but were invisible from the road/camera side.
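
Why a single-sided plane disappears from one side can be shown with plain vector math. This sketch uses the right-handed cross product; Unity's coordinate and winding conventions are mirrored relative to it, so which side counts as "front" may be flipped there, but the one-sidedness is the same. The function name faces_camera and the sample coordinates are illustrative only.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def faces_camera(tri, camera):
    """True if the normal implied by the vertex order points toward the camera."""
    a, b, c = tri
    normal = cross(sub(b, a), sub(c, a))
    return dot(normal, sub(camera, a)) > 0


if __name__ == "__main__":
    # One triangle of a vertical barrier quad lying in the z = 0 plane.
    tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
    road_side = (0.2, 0.2, 5.0)
    far_side = (0.2, 0.2, -5.0)
    print(faces_camera(tri, road_side), faces_camera(tri, far_side))
    reversed_tri = (tri[0], tri[2], tri[1])  # same vertices, opposite winding
    print(faces_camera(reversed_tri, road_side))
```

With backface culling on, a triangle is drawn only from the side its winding faces, so a barrier wound the wrong way for a given track direction would exist as a collider yet be invisible from the road.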

Latest simulator-side patch:

  • /mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Assets/Scripts/RoadBuilder.cs
  • first attempt: CreateBarrier(...) emitted reverse-facing triangles for every barrier quad so the debug barrier meshes would be visible from both sides; this was later reverted (see the current state below)
  • failed attempt: Unlit/Transparent made both tracks' barriers black in the standalone player
  • failed attempt: duplicating reverse-facing triangles made generated_track barriers black, likely due to coplanar transparent overdraw/z-fighting on the closed/scaled track
  • current debug barrier mesh is back to one triangle set per quad; material uses Standard transparent mode with forced pale fallback color, alpha blend, culling off, and emission enabled so barriers should stay light/translucent while remaining visible from both sides
  • Unity Windows batch build succeeded after this patch
  • rebuilt output synced to both runtime folders and relaunched with explicit ports

Immediate Next Steps

  1. Monitor current exp22 training log/checkpoints.

  2. Determine whether the barriers are:

    • too close to the road edge globally, or
    • wrong only at specific bends / first-corner geometry.

  3. Fix geometry if needed before restarting RL.

  4. Only after geometry is visually verified, restart exp22 or a successor experiment.

Useful Commands

Sync latest build into runtime folders

rsync -a --delete '/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Builds/DonkeySimWin/' '/mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin/'
rsync -a --delete '/mnt/c/Users/Paul/Documents/projects/sdsandbox/sdsim/Builds/DonkeySimWin/' '/mnt/c/Users/Paul/Downloads/DonkeySimWin/DonkeySimWin - Copy/'

Launch sims from Windows side

$key = 'HKCU:\Software\DonkeyCar\donkey_sim'

Set-ItemProperty -Path $key -Name 'port_h2088097884' -Value 9091 -Type DWord
Set-ItemProperty -Path $key -Name 'portPrivateAPI_h1325370089' -Value 9092 -Type DWord
Start-Process -FilePath 'C:\Users\Paul\Downloads\DonkeySimWin\DonkeySimWin\donkey_sim.exe' -ArgumentList '--port','9091' -WorkingDirectory 'C:\Users\Paul\Downloads\DonkeySimWin\DonkeySimWin'

Start-Sleep -Seconds 4

Set-ItemProperty -Path $key -Name 'port_h2088097884' -Value 9093 -Type DWord
Set-ItemProperty -Path $key -Name 'portPrivateAPI_h1325370089' -Value 9094 -Type DWord
Start-Process -FilePath 'C:\Users\Paul\Downloads\DonkeySimWin\DonkeySimWin - Copy\donkey_sim.exe' -ArgumentList '--port','9093' -WorkingDirectory 'C:\Users\Paul\Downloads\DonkeySimWin\DonkeySimWin - Copy'

Verify ports

python3 - <<'PY'
import socket

# Public ports (9091, 9093) and private API ports (9092, 9094) for both sims.
for p in (9091, 9092, 9093, 9094):
    s = socket.socket()
    s.settimeout(3)
    try:
        s.connect(('127.0.0.1', p))
        print(f'PORT {p}: OK')
    except OSError as e:
        print(f'PORT {p}: FAIL {e}')
    finally:
        s.close()
PY

Notes for Next Session

  • If the user says "continue", do not ask broad questions. Start with the immediate next steps above.
  • Prefer direct verification over more RL training.
  • Do not restart long training until the user has visually confirmed the debug-visible barriers look correct.