Hugging Face
Zhanming (Allan) Jie
allanjie
7 followers · 11 following
https://allanj.github.io/
humbnlp
allanj
AI & ML interests
NLP, semantic parsing, named entity recognition
Recent Activity
Upvoted a paper (1 day ago): Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
Liked a model (2 months ago): zai-org/GLM-5
Upvoted a paper (3 months ago): GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
Organizations
Papers
11
arxiv:2512.17260
arxiv:2507.23726
arxiv:2407.21018
arxiv:2401.08967
models
11
allanjie/agent_reft_warmup_ep5 • Text Generation • 7B • Updated Jul 23, 2024
allanjie/agent_reft_warmup_ep4 • Text Generation • 7B • Updated Jul 23, 2024
allanjie/agent_reft_warmup_ep3 • Text Generation • 7B • Updated Jul 23, 2024
allanjie/agent_reft_warmup_ep2 • Text Generation • 7B • Updated Jul 23, 2024
allanjie/agent_reft_warmup_ep1 • Text Generation • 7B • Updated Jul 23, 2024
allanjie/chat_robot_qwen • Text Generation • 8B • Updated Jun 29, 2024 • 1
allanjie/chat_robot • Text Generation • 8B • Updated Jun 29, 2024
allanjie/ppo-LunarLander-v2 • Reinforcement Learning • Updated Dec 27, 2022 • 3
allanjie/ppo-LunarLander-v2-test • Updated Dec 27, 2022
allanjie/math23k_train_test_roberta-base • Updated Sep 18, 2022 • 3 • 2
datasets
4
allanjie/obt_and_mma_dataset • Viewer • Updated Sep 9, 2024 • 195k • 14 • 1
allanjie/mma • Viewer • Updated Aug 9, 2024 • 333k • 12
allanjie/agent_reft_feedback_based_actor • Viewer • Updated Jul 23, 2024 • 6.2k • 3
allanjie/agent_reft_feedback_warmup • Viewer • Updated Jul 23, 2024 • 13.2k • 3