Not All Disagreement Is Learnable: Token Teachability in On-Policy Distillation Paper • 2605.26844 • Published 7 days ago • 22
OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond Paper • 2605.19660 • Published 14 days ago • 40
Geometry Conflict: Explaining and Controlling Forgetting in LLM Continual Post-Training Paper • 2605.09608 • Published 23 days ago • 52