Article Exploring New Frontiers of LLMs: Adaptive Dual-Search Distillation (ADS) and the 30B Model Open Beta 5 days ago • 2
Quantized Qwen3.5 Collection Verified models. Compatible with Transformers v5.3 and vLLM v0.16.1rc1 (nightly). Under evaluation. • 10 items • Updated 3 days ago • 8
pplx-embed Collection Diffusion-Pretrained Dense and Contextual Embeddings • 7 items • Updated 8 days ago • 85
Claude 4.5 Opus Collection Distilled models and datasets for Claude 4.5 Opus. • 14 items • Updated 4 days ago • 28
Article LateOn-Code & ColGrep: LightOn unveils state-of-the-art code retrieval models and code search tooling 22 days ago • 47
Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published Aug 21, 2025 • 268
PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning Paper • 2601.05593 • Published Jan 9 • 85
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability Paper • 2601.18778 • Published Jan 26 • 41
OptiMind: Teaching LLMs to Think Like Optimization Experts Paper • 2509.22979 • Published Sep 26, 2025 • 4
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models Paper • 2601.15165 • Published Jan 21 • 72
Clara-Molecular Collection NVIDIA Clara Models for Molecular Science • 8 items • Updated 3 days ago • 7
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 228