zuijiang

AI & ML interests

None yet

Recent Activity

upvoted a paper 8 days ago

Breaking Entropy Bounds: Accelerating RL Training via MTP with Rejection Sampling

Paper • 2606.12370 • Published 10 days ago • 21

upvoted a paper 10 days ago

Combinatorial Synthesis: Scaling Code RLVR via Atomic Decomposition and Recombination

Paper • 2605.31058 • Published 22 days ago • 2

upvoted a paper 11 days ago

dots.tts Technical Report

Paper • 2606.07080 • Published 15 days ago • 15

upvoted a paper 17 days ago

Domino: Decoupling Causal Modeling from Autoregressive Drafting in Speculative Decoding

Paper • 2605.29707 • Published 23 days ago • 147

upvoted a paper 21 days ago

LiteCoder-Terminal: Scaling Long-Horizon Terminal Environments for Learning Language Agents

Paper • 2605.29559 • Published 23 days ago • 17

upvoted a paper 24 days ago

MetaphorVU: Towards Metaphorical Video Understanding

Paper • 2605.25461 • Published 26 days ago • 8

upvoted a paper 28 days ago

MolmoAct2: Action Reasoning Models for Real-world Deployment

Paper • 2605.02881 • Published May 4 • 353

upvoted 4 papers about 1 month ago

GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment

Paper • 2605.19577 • Published May 19 • 59

upvoted a paper about 2 months ago

Large Language Models Explore by Latent Distilling

Paper • 2604.24927 • Published Apr 27 • 74

upvoted an article 2 months ago

Releasing LiteCoder-Terminal-SFT

Article • Lite-Coder • Apr 13 • 4

upvoted a paper 2 months ago

RAGEN-2: Reasoning Collapse in Agentic RL

Paper • 2604.06268 • Published Apr 7 • 68

liked a model 3 months ago

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled

Image-Text-to-Text • 28B • Updated Apr 6 • 148k • 2.88k

upvoted 5 papers 3 months ago

Composer 2 Technical Report

Paper • 2603.24477 • Published Mar 25 • 19

Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?

Paper • 2603.24472 • Published Mar 25 • 57

Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation

Paper • 2603.19220 • Published Mar 19 • 69

Complementary Reinforcement Learning

Paper • 2603.17621 • Published Mar 18 • 37

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published Mar 10 • 157
