Not All Disagreement Is Learnable: Token Teachability in On-Policy Distillation Paper • 2605.26844 • Published 8 days ago • 23
OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond Paper • 2605.19660 • Published 15 days ago • 40
Geometry Conflict: Explaining and Controlling Forgetting in LLM Continual Post-Training Paper • 2605.09608 • Published 24 days ago • 52