Commit Graph

3 Commits

Author SHA1 Message Date
Paul Huliganga 298cd1790a fix: LR override was not reaching the optimizer — all trials ran at 0.000225
PPO.load() restores the saved optimizer state (lr=0.000225 from Phase 2
champion).  Setting model.learning_rate alone is insufficient because
_update_learning_rate() may not fire before the first gradient step, and
the optimizer's param_groups still hold the old value.

Fix: after PPO.load(), explicitly set lr on every optimizer param_group:
  model.learning_rate = lr
  for pg in model.policy.optimizer.param_groups:
      pg['lr'] = lr

Impact: all 8 previous Wave 3 trials actually trained at LR=0.000225
regardless of GP proposal.  Results archived as:
  autoresearch_results_phase3_CONTAMINATED_wrong_lr.jsonl
Phase 3 results cleared; autoresearch restarting from scratch.

Agent: pi
Tests: 83 passed
Tests-Added: 0
TypeScript: N/A
2026-04-14 20:37:48 -04:00
Paul Huliganga 2a747bb97c wave3: autoresearch trial 5 results
Agent: pi
Tests: N/A
Tests-Added: 0
TypeScript: N/A
2026-04-14 18:22:44 -04:00
Paul Huliganga 349396f967 fix: stream runner output in real-time instead of buffering
Replace subprocess.run(capture_output=True) with Popen + line-by-line
iteration so every line from multitrack_runner.py appears in the nohup
log immediately rather than only after the trial completes (~35-90 min).

- stdout/stderr merged via stderr=STDOUT
- line-buffered (bufsize=1)
- deadline-based timeout replaces subprocess timeout kwarg
- output accumulated in list for parse_runner_output() as before

Agent: pi
Tests: 30 passed
Tests-Added: 0
TypeScript: N/A
2026-04-14 15:13:10 -04:00