Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders Paper • 2601.16208 • Published 3 days ago • 49
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models Paper • 2601.15165 • Published 4 days ago • 62
VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents Paper • 2601.16973 • Published 2 days ago • 13
Toward Efficient Agents: Memory, Tool learning, and Planning Paper • 2601.14192 • Published 5 days ago • 49
Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models Paper • 2601.14004 • Published 6 days ago • 44
Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey Paper • 2601.11655 • Published 10 days ago • 59
Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization Paper • 2601.12993 • Published 7 days ago • 74
WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling Paper • 2512.14614 • Published Dec 16, 2025 • 70
FreqEdit: Preserving High-Frequency Features for Robust Multi-Turn Image Editing Paper • 2512.01755 • Published Dec 1, 2025 • 1
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published Nov 27, 2025 • 229
Unified Diffusion VLA: Vision-Language-Action Model via Joint Discrete Denoising Diffusion Process Paper • 2511.01718 • Published Nov 3, 2025 • 7