Anchor latency, drift, and train-inference mismatch.
Reports on the communication-efficient GRPO circuit: where the anchor merger works, why training collapses at K=20, and how compression-induced train-inference mismatch shows up as a separate failure mode.
Topic Areas
Each topic opens to a focused index with direct links to the underlying reports.
Stale anchor gradients drive the K=20 collapse.
The anchor analysis separates delay, cadence, Q-basis stability, and drift evidence across the K=5 and K=20 runs.
Compression changes the policy seen at rollout time.
This report treats mismatch as its own failure mode, separate from anchor staleness and cadence effects.
Reports
Direct entry points for the current report set.
Delay Failure Report
Why the EMA merger stays close to the dense baseline at K=5 but becomes destructive at K=20.
anchor-delay/delay_failure_report
K-Instability Q-Basis
One-pager showing that the Q basis still compresses while stale gradients fail.
anchor-delay/k_instability_q_basis
Dense Drift Joint
Joint drift evidence across GSM8K and Big-Math for the stale-gradient route.
anchor-delay/dense-drift-joint
Train-Inference Mismatch
Evidence that compression-induced rollout mismatch needs its own handling.
train-inference-mis-match/train_inference_mismatch_report