MiniCPM-SALA: Hybridizing Sparse and Linear Attention for Efficient Long-Context Modeling Paper • 2602.11761 • Published 1 day ago • 4
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning Paper • 2602.08234 • Published 5 days ago • 64
Gemma 3 QAT Collection Quantization-Aware Trained (QAT) Gemma 3 checkpoints. The model preserves quality similar to half precision while using 3x less memory • 15 items • Updated Jul 10, 2025 • 217