Hello_zjt

hellozjt

AI & ML interests

None yet

Recent Activity

upvoted an article about 1 month ago

KV Caching Explained: Optimizing Transformer Inference Efficiency

liked a dataset 5 months ago

Starscream-11813/HaSPeR

upvoted a collection 6 months ago

Qwen2.5-VL

Organizations

None yet

upvoted an article about 1 month ago

KV Caching Explained: Optimizing Transformer Inference Efficiency

Article • Jan 30, 2025 • 225

upvoted a collection 6 months ago

Qwen2.5-VL

Collection

Vision-language model series based on Qwen2.5 • 11 items • Updated Dec 31, 2025 • 555

upvoted 2 collections 10 months ago

Open LLM Leaderboard best models ❤️‍🔥

Collection

A daily uploaded list of models with best evaluations on the LLM leaderboard: • 65 items • Updated Mar 20, 2025 • 661

The Big Benchmarks Collection

Collection

Gathering benchmark spaces on the hub (beyond the Open LLM Leaderboard) • 13 items • Updated Nov 18, 2024 • 257

upvoted 2 papers 12 months ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22, 2025 • 437

DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 75

upvoted 2 collections over 1 year ago

Llama 3.1 Evals

Collection

This collection provides detailed information on how we derived the reported benchmark metrics for the Llama 3.1 models, including the configurations, • 6 items • Updated Dec 6, 2024 • 19

Llama 3.1

Collection

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 705
