Every test run now saves its output to
agent/test-results/YYYY-MM-DD_HH-MM_<model>.log, so results are never lost.
Also added the 3-set Exp9 eval results to TEST_HISTORY.

Usage:
    python3 agent/run_eval.py --model models/exp9-.../best_model.zip --sets 3

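
For illustration, the log naming scheme could be produced along these lines. This is a minimal sketch, not the code in run_eval.py: the helper name result_log_path, its parameters, and the way the <model> part is passed in are all assumptions made for the example.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

# Hypothetical default; matches the directory named in this commit message.
RESULTS_DIR = Path("agent/test-results")


def result_log_path(model_name: str,
                    now: Optional[datetime] = None,
                    root: Path = RESULTS_DIR) -> Path:
    """Build a path of the form <root>/YYYY-MM-DD_HH-MM_<model>.log.

    How run_eval.py actually derives <model> is not stated here, so the
    caller supplies it directly (assumption).
    """
    now = now or datetime.now()
    # Minute resolution, matching the YYYY-MM-DD_HH-MM pattern.
    stamp = now.strftime("%Y-%m-%d_%H-%M")
    return root / f"{stamp}_{model_name}.log"


if __name__ == "__main__":
    path = result_log_path("exp9")
    # Create the results directory on first use so the write cannot fail
    # on a fresh checkout.
    path.parent.mkdir(parents=True, exist_ok=True)
    print(path)
```

Note that with minute resolution, two runs of the same model within one minute would map to the same file name; whether run_eval.py appends, overwrites, or avoids this is not covered by the sketch.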
Agent: pi
Tests: 102 passed
Tests-Added: 0
TypeScript: N/A