Mercury: Ultra-Fast Language Models Based on Diffusion Paper • 2506.17298 • Published Jun 17, 2025 • 10
Article BigCodeArena: Judging code generations end to end with code executions Oct 7, 2025 • 22
Nemotron-Pre-Training-Datasets Collection Large scale pre-training datasets used in the Nemotron family of models. • 12 items • Updated 4 days ago • 126
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model Paper • 2508.14444 • Published Aug 20, 2025 • 46
Deepseek v3.2 Speciale Collection Distilled models and datasets for Deepseek v3.2 Speciale. • 11 items • Updated Dec 20, 2025 • 8
Gemini 3 Pro Collection Distilled models and datasets for Gemini 3 Pro. • 9 items • Updated Dec 20, 2025 • 7
Article Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset Mar 15, 2024 • 13
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 263
GPT-4 generated datasets Collection Collection of some GPT-4 generated datasets. It may be useful for those looking for the best-quality datasets to train competitive LLMs. • 18 items • Updated Apr 16, 2024 • 10
DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 15 items • Updated 10 days ago • 542
Article PaliGemma – Google's Cutting-Edge Open Vision Language Model May 14, 2024 • 285
Cut and Learn for Unsupervised Object Detection and Instance Segmentation Paper • 2301.11320 • Published Jan 26, 2023 • 1
ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction Paper • 2504.21855 • Published Apr 30, 2025 • 13