Multi-Turn Reflective Masking Elicits Reasoning in Mask Diffusion Models Paper • 2606.16700 • Published 13 days ago • 14
SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation Paper • 2510.06303 • Published Oct 7, 2025 • 15
SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation Paper • 2510.06303 • Published Oct 7, 2025 • 15
SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation Paper • 2510.06303 • Published Oct 7, 2025 • 15
SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation Paper • 2510.06303 • Published Oct 7, 2025 • 15
view article Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge NormalUhr • Feb 7, 2025 • 295