Arabic LLM Checkpoints
Mingzhe Du PRO
AI & ML interests
Code Generation / Preference Alignment
Recent Activity
upvoted a paper 3 days ago
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning
upvoted a paper 3 days ago
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs
updated a dataset 6 days ago
Elfsong/Qwen3_4B_Arabic_200-responses-Syrian