JavisDiT-v1.0 Collection Unified Modeling and Optimization for Joint Audio-Video Generation • 2 items • Updated 7 days ago • 1
JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation Paper • 2602.19163 • Published 11 days ago • 14
SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model Paper • 2602.21818 • Published 8 days ago • 52
DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation Paper • 2602.12160 • Published 21 days ago • 38
BitDance: Scaling Autoregressive Generative Models with Binary Tokens Paper • 2602.14041 • Published 18 days ago • 52
BitDance Collection BitDance: Open-source autoregressive model with binary visual tokens. A research project for building powerful multimodal autoregressive models. • 10 items • Updated 3 days ago • 11
JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion Paper • 2601.22143 • Published Jan 29 • 7
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders Paper • 2601.16208 • Published Jan 22 • 53
SALAD: Achieve High-Sparsity Attention via Efficient Linear Attention Tuning for Video Diffusion Transformer Paper • 2601.16515 • Published Jan 23 • 15
Memory-V2V: Augmenting Video-to-Video Diffusion Models with Memory Paper • 2601.16296 • Published Jan 22 • 28
Skywork UniPic 2.0: Building Kontext Model with Online RL for Unified Multimodal Model Paper • 2509.04548 • Published Sep 4, 2025 • 5
Skywork-Unipic3 Collection Unified Multi-Image Composition with Sequence Modeling • 9 items • Updated 3 days ago • 12
UniVideo: Unified Understanding, Generation, and Editing for Videos Paper • 2510.08377 • Published Oct 9, 2025 • 81
LTX-2 Collection LTX-2 base models and accompanying LoRAs and IC-LoRAs • 13 items • Updated Jan 29 • 55