Problem with v4 on mountain_track: CTE × efficiency × speed all collapse to zero simultaneously when the car slows on the hill, giving no gradient signal for 'apply more throttle'. v5: reward = (speed / 10) × (1 - |CTE| / max_cte) - Directly rewards going fast while staying centred - Hill: car slows → reward drops → clear gradient toward more throttle - Circling protection now entirely handled by lap-time penalty + StuckTerminationWrapper (not by the reward formula) Tests updated to reflect v5 semantics (102 passing). Agent: pi Tests: 102 passed Tests-Added: 0 TypeScript: N/A |
||
|---|---|---|
| .harness | ||
| agent | ||
| docs | ||
| tests | ||
| .gitignore | ||
| AGENT.md | ||
| DECISIONS.md | ||
| IMPLEMENTATION_PLAN.md | ||
| PROJECT-KICKOFF.md | ||
| PROJECT-SPEC.md | ||
| README.md | ||
| create_gitea_repo.py | ||
| ralph-loop.sh | ||
README.md
donkeycar-rl-autoresearch
Purpose
Status
- Scaffolded with the agent harness
- Spec not filled yet
Runbook
- Fill PROJECT-SPEC.md
- Create IMPLEMENTATION_PLAN.md from the spec
- Start the implementation loop