Hongli Zhou

Joe-Hall-Lee

https://Joe-Hall-Lee.github.io

AI & ML interests

Large Language Models

Recent Activity

upvoted a paper about 15 hours ago

Think-J: Learning to Think for Generative LLM-as-a-Judge

upvoted a paper about 15 hours ago

Mitigating the Bias of Large Language Model Evaluation

upvoted a paper 5 days ago

RM-Distiller: Exploiting Generative LLM for Reward Model Distillation

Organizations

None yet

upvoted 2 papers about 15 hours ago

Think-J: Learning to Think for Generative LLM-as-a-Judge

Paper • 2505.14268 • Published May 20, 2025 • 1

Mitigating the Bias of Large Language Model Evaluation

Paper • 2409.16788 • Published Sep 25, 2024 • 1

upvoted 2 papers 5 days ago

RM-Distiller: Exploiting Generative LLM for Reward Model Distillation

Paper • 2601.14032 • Published Jan 20 • 1

Toward Robust LLM-Based Judges: Taxonomic Bias Evaluation and Debiasing Optimization

Paper • 2603.08091 • Published 20 days ago • 1

authored 2 papers 5 days ago

RM-Distiller: Exploiting Generative LLM for Reward Model Distillation

Paper • 2601.14032 • Published Jan 20 • 1

Toward Robust LLM-Based Judges: Taxonomic Bias Evaluation and Debiasing Optimization

Paper • 2603.08091 • Published 20 days ago • 1

upvoted a paper 5 months ago

Lost in Benchmarks? Rethinking Large Language Model Benchmarking with Item Response Theory

Paper • 2505.15055 • Published May 21, 2025 • 1

liked a Space 7 months ago

Reward Bench Leaderboard

📐

423

Explore RewardBench model rankings and scores

authored 3 papers 8 months ago

Lost in Benchmarks? Rethinking Large Language Model Benchmarking with Item Response Theory

Paper • 2505.15055 • Published May 21, 2025 • 1

Mitigating the Bias of Large Language Model Evaluation

Paper • 2409.16788 • Published Sep 25, 2024 • 1

Think-J: Learning to Think for Generative LLM-as-a-Judge

Paper • 2505.14268 • Published May 20, 2025 • 1

upvoted a paper 8 months ago

An Empirical Study of LLM-as-a-Judge for LLM Evaluation: Fine-tuned Judge Models are Task-specific Classifiers

Paper • 2403.02839 • Published Mar 5, 2024 • 2

New activity in hkunlp/instructor-xl 12 months ago

TypeError: INSTRUCTOR._load_sbert_model() got an unexpected keyword argument 'token' when using hkunlp/instructor-xl

#35 opened about 1 year ago by sethanimesh

upvoted 2 papers over 1 year ago

Iterative Reasoning Preference Optimization

Paper • 2404.19733 • Published Apr 30, 2024 • 49

Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs

Paper • 2406.10216 • Published Jun 14, 2024 • 2

liked a model about 2 years ago

BAAI/JudgeLM-7B-v1.0

Text Generation • Updated Oct 27, 2023 • 260 • 18

liked a dataset about 2 years ago

BAAI/JudgeLM-100K

Preview • Updated Oct 27, 2023 • 59 • 51
