TRUEBench 🔥 • Running • 37 likes • Explore and compare language model performance across categories and languages
meta-llama/Llama-3.1-8B-Instruct • Text Generation • 8B • Updated Sep 25, 2024 • 7.63M downloads • 5.58k likes
meta-llama/Meta-Llama-3-70B-Instruct • Text Generation • 71B • Updated Jun 18, 2025 • 85.1k downloads • 1.51k likes