Warm-starts from wave4-trial-0009/model.zip (best mini-monaco model, completed
laps). Fine-tunes on generated track with continuous Box action space preserved
(no DiscretizedActionWrapper) at LR=0.00005. 50k steps, checkpoint every 5k,
zero-shot mini-monaco eval at end.
Tests whether additional generated-track exposure improves corner handling on
mini-monaco without catastrophic forgetting of driving skill.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>