Compression mismatch analysis
When rollout and training policies stop matching.
This report isolates compression-induced mismatch from anchor staleness and outlines where token importance sampling is enough, and where a stronger correction may be needed.
Report
Train-Inference Mismatch Report
Evidence and candidate mitigations for the regime where compressed rollout behavior diverges from the policy used for training updates.
train_inference_mismatch_report