$D^2$-Monitor: Dynamic Safety Monitoring for Diffusion LLMs via Hesitation-Aware Routing
Paper • 2605.25893 • Published • 39
Addressing Performance Saturation for LLM RL via Precise Entropy Curve Control
Operator Learning Using Weak Supervision from Walk-on-Spheres