Dan Zhang

zd21

https://zhangdan0602.github.io/

Recent Activity

authored a paper 5 days ago

ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search

authored a paper 5 days ago

Parameter-Efficient Fine-Tuning for Foundation Models

authored a paper 5 days ago

ReST-RL: Achieving Accurate Code Reasoning of LLMs with Optimized Self-Training and Decoding

upvoted a paper 5 days ago

Unifying Group-Relative and Self-Distillation Policy Optimization via Sample Routing

Paper • 2604.02288 • Published 10 days ago • 27

upvoted a paper 5 months ago

TDRM: Smooth Reward Models with Temporal Difference for LLM RL and Inference

Paper • 2509.15110 • Published Sep 18, 2025 • 1

upvoted 2 collections 9 months ago

TDRM

Learning Smooth Reward Models with Temporal Difference for LLM RL and Inference • 14 items • Updated Mar 2 • 2

GLM-4.5

GLM-4.5: An open-source large language model designed for intelligent agents by Z.ai • 8 items • Updated Mar 2 • 253

upvoted a paper over 1 year ago

AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents

Paper • 2410.24024 • Published Oct 31, 2024 • 49