Compression mismatch analysis

When rollout and training policies stop matching.

This report isolates compression-induced mismatch from anchor staleness and outlines where token importance sampling is enough, and where a stronger correction may be needed.

Report

Train-Inference Mismatch Report

Evidence and candidate mitigations for the regime where compressed rollout behavior diverges from the policy used for training updates.

mismatch compression TIS
train_inference_mismatch_report
Train-inference mismatch overview chart.