TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published 15 days ago • 109
Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models Paper • 2603.25750 • Published Mar 20 • 36
MMFace-DiT: A Dual-Stream Diffusion Transformer for High-Fidelity Multimodal Face Generation Paper • 2603.29029 • Published 21 days ago • 13
DreamLite: A Lightweight On-Device Unified Model for Image Generation and Editing Paper • 2603.28713 • Published 22 days ago • 20
MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens Paper • 2603.23516 • Published Mar 6 • 48
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published Mar 17 • 95
VINCIE Collection A diffusion transformer model for in-context image generation and editing • 6 items • Updated Mar 19 • 12
SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers Paper • 2411.10510 • Published Nov 15, 2024 • 9
DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation Paper • 2601.22153 • Published Jan 29 • 74
Transition Matching Distillation for Fast Video Generation Paper • 2601.09881 • Published Jan 14 • 34
Reinforcement Learning for Self-Improving Agent with Skill Library Paper • 2512.17102 • Published Dec 18, 2025 • 42