Song

songhan

https://songhan.mit.edu

AI & ML interests

efficient AI computing

Recent Activity

authored a paper about 6 hours ago

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers

authored a paper about 6 hours ago

SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer

authored a paper about 6 hours ago

SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation

authored 10 papers about 6 hours ago

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers

Paper • 2410.10629 • Published Oct 14, 2024 • 13

SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer

Paper • 2501.18427 • Published Jan 30, 2025 • 27

SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer

Paper • 2605.15178 • Published May 14 • 91

LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation

Paper • 2605.18739 • Published May 18 • 116

SANA-Streaming: Real-time Streaming Video Editing with Hybrid Diffusion Transformer

Paper • 2605.30409 • Published May 28 • 42

LongLive-RAG: A General Retrieval-Augmented Framework for Long Video Generation

Paper • 2606.02553 • Published about 1 month ago • 20

Cosmos 3: Omnimodal World Models for Physical AI

Paper • 2606.02800 • Published about 1 month ago • 138

upvoted a collection 8 months ago

SANA-Video

Collection

🎬 SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer • 10 items • Updated Mar 16 • 11

upvoted a paper 8 months ago

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published Oct 17, 2025 • 93

upvoted a paper 9 months ago

SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer

Paper • 2509.24695 • Published Sep 29, 2025 • 54

upvoted a paper 12 months ago

Scaling RL to Long Videos

Paper • 2507.07966 • Published Jul 10, 2025 • 161

authored a paper over 1 year ago

LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

Paper • 2502.14866 • Published Feb 20, 2025 • 13

upvoted a paper over 1 year ago

NVILA: Efficient Frontier Visual Language Models

Paper • 2412.04468 • Published Dec 5, 2024 • 62

authored 2 papers almost 2 years ago

Wolf: Captioning Everything with a World Summarization Framework

Paper • 2407.18908 • Published Jul 26, 2024 • 33

$VILA^2$: VILA Augmented VILA

Paper • 2407.17453 • Published Jul 24, 2024 • 41

authored 2 papers over 2 years ago

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Paper • 2402.10193 • Published Feb 15, 2024 • 21

VILA: On Pre-training for Visual Language Models

Paper • 2312.07533 • Published Dec 12, 2023 • 22
