TINY MODELS WITH BIG INTELLIGENCE Collection Tiny (<30B) models that tend to outperform their same-parameter counterparts. • 12 items • Updated about 24 hours ago • 3
WTF GENIUS PAPERS Collection Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are about improving tiny models. • 67 items • Updated 1 day ago • 8
Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts Paper • 2602.13367 • Published 5 days ago • 17
Horizon-LM: A RAM-Centric Architecture for LLM Training Paper • 2602.04816 • Published 14 days ago • 17
DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs Paper • 2503.07067 • Published Mar 10, 2025 • 32
PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters Paper • 2504.08791 • Published Apr 7, 2025 • 139
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing Paper • 2602.02437 • Published 16 days ago • 76
FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale Paper • 2601.22146 • Published 20 days ago • 9
Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models Paper • 2601.18734 • Published 23 days ago • 2
Self-Improving Pretraining: using post-trained models to pretrain better models Paper • 2601.21343 • Published 21 days ago • 17
Cache What Lasts: Token Retention for Memory-Bounded KV Cache in LLMs Paper • 2512.03324 • Published Dec 3, 2025 • 1
Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives Paper • 2601.20833 • Published 21 days ago • 177
Statistical Estimation of Adversarial Risk in Large Language Models under Best-of-N Sampling Paper • 2601.22636 • Published 20 days ago • 21
Pushing the Boundaries of Natural Reasoning: Interleaved Bonus from Formal-Logic Verification Paper • 2601.22642 • Published 20 days ago • 9
DreamActor-M2: Universal Character Image Animation via Spatiotemporal In-Context Learning Paper • 2601.21716 • Published 20 days ago • 13
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published 19 days ago • 187
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published Nov 26, 2025 • 124