Warm-starts from wave4-trial-0009/model.zip (best mini-monaco model, completed laps). Fine-tunes on generated track with continuous Box action space preserved (no DiscretizedActionWrapper) at LR=0.00005. 50k steps, checkpoint every 5k, zero-shot mini-monaco eval at end. Tests whether additional generated-track exposure improves corner handling on mini-monaco without catastrophic forgetting of driving skill. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|---|---|---|
| .. | ||
| current.pid | ||
| run_2026-05-06_225559_wave4_finetune.log | ||
| stdout.log | ||