DREAM: Dense Retrieval Embeddings via Autoregressive Modeling Paper • 2606.24667 • Published 4 days ago • 4
ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing? Paper • 2606.19531 • Published 10 days ago • 19
Qwen-RobotWorld Technical Report: Unifying Embodied World Modeling through Language-Conditioned Video Generation Paper • 2606.17030 • Published 12 days ago • 30
UniDDT: Unifying Multimodal Understanding and Generation with Decoupled Diffusion Transformer Paper • 2606.16255 • Published 12 days ago • 14