STAPO: Stabilizing Reinforcement Learning for LLMs by Silencing Rare Spurious Tokens Paper • 2602.15620 • Published 6 days ago • 3