Attention
Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via
Semantic-Aware Permutation
Paper
• 2505.18875
• Published • 42
PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and
Quantized Attention in Visual Generation Models
Paper
• 2506.16054
• Published • 60
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference
Acceleration
Paper
• 2410.02367
• Published • 50
Radial Attention: O(n log n) Sparse Attention with Energy Decay for
Long Video Generation
Paper
• 2506.19852
• Published • 42
∇NABLA: Neighborhood Adaptive Block-Level Attention
Paper
• 2507.13546
• Published • 126
SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference
Paper
• 2502.18137
• Published • 60
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable
Sparse-Linear Attention
Paper
• 2509.24006
• Published • 119
SANA-Video: Efficient Video Generation with Block Linear Diffusion
Transformer
Paper
• 2509.24695
• Published • 47
Why Low-Precision Transformer Training Fails: An Analysis on Flash
Attention
Paper
• 2510.04212
• Published • 26
Native Hybrid Attention for Efficient Sequence Modeling
Paper
• 2510.07019
• Published • 17
Sparser Block-Sparse Attention via Token Permutation
Paper
• 2510.21270
• Published • 25
LiteAttention: A Temporal Sparse Attention for Diffusion Transformers
Paper
• 2511.11062
• Published • 32
Fast Autoregressive Video Diffusion and World Models with Temporal Cache Compression and Sparse Attention
Paper
• 2602.01801
• Published • 28
SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning
Paper
• 2602.13515
• Published • 44
SLA2: Sparse-Linear Attention with Learnable Routing and QAT
Paper
• 2602.12675
• Published • 57
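A recurring theme across these papers (SpargeAttn, PAROAttention, Sparser Block-Sparse Attention) is block-sparse attention: score coarse query/key blocks first, then compute exact attention only inside the top-scoring blocks. The sketch below is a minimal, hypothetical illustration of that general idea, not the exact algorithm of any listed paper; the block size and top-k parameters are assumptions for demonstration.

```python
import numpy as np

def block_sparse_attention(q, k, v, block=4, topk=2):
    """Illustrative block-sparse attention sketch (hypothetical, not any
    paper's exact method): mean-pool Q and K into blocks, keep the top-k
    key blocks per query block, and run exact softmax attention only
    over the kept blocks."""
    n, d = q.shape
    nb = n // block  # assumes n is divisible by block for simplicity
    # Block-level relevance from mean-pooled queries and keys.
    qb = q.reshape(nb, block, d).mean(axis=1)
    kb = k.reshape(nb, block, d).mean(axis=1)
    scores = qb @ kb.T                             # (nb, nb) block scores
    keep = np.argsort(-scores, axis=1)[:, :topk]   # top-k key blocks per row
    out = np.zeros_like(q)
    for i in range(nb):
        rows = slice(i * block, (i + 1) * block)
        cols = np.concatenate([np.arange(j * block, (j + 1) * block)
                               for j in keep[i]])
        # Exact attention restricted to the selected key/value tokens.
        s = q[rows] @ k[cols].T / np.sqrt(d)
        p = np.exp(s - s.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        out[rows] = p @ v[cols]
    return out
```

With topk equal to the number of blocks this reduces to dense attention; the papers above differ mainly in how the block mask is chosen (semantic permutation, energy decay, temporal reuse, learned top-k/top-p routing).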