Xiaoji Zheng

Student-Xiaoji

https://www.zhihu.com/people/dong-dong-dong-49-89-76

SEU-zxj

AI & ML interests

None yet

Recent Activity

upvoted a paper 13 days ago

MHPO: Modulated Hazard-aware Policy Optimization for Stable Reinforcement Learning

upvoted a paper 16 days ago

Efficient Exploration at Scale

upvoted a paper 16 days ago

GigaWorld-Policy: An Efficient Action-Centered World-Action Model

Organizations

None yet

upvoted a paper 13 days ago

MHPO: Modulated Hazard-aware Policy Optimization for Stable Reinforcement Learning

Paper • 2603.16929 • Published 22 days ago • 13

upvoted 3 papers 16 days ago

commented on a paper 16 days ago

AdaMem: Adaptive User-Centric Memory for Long-Horizon Dialogue Agents

Paper • 2603.16496 • Published 18 days ago • 13

upvoted a paper 16 days ago

AdaMem: Adaptive User-Centric Memory for Long-Horizon Dialogue Agents

Paper • 2603.16496 • Published 18 days ago • 13

upvoted 2 papers 18 days ago

Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models

Paper • 2603.13985 • Published 21 days ago • 10

RS-WorldModel: a Unified Model for Remote Sensing Understanding and Future Sense Forecasting

Paper • 2603.14941 • Published 20 days ago • 8

liked a dataset 19 days ago

ropedia-ai/xperience-10m

Updated 15 days ago • 2.22M • 154

upvoted a paper 19 days ago

Visual-ERM: Reward Modeling for Visual Equivalence

Paper • 2603.13224 • Published 22 days ago • 21

upvoted 3 papers 22 days ago

Neural Field Thermal Tomography: A Differentiable Physics Framework for Non-Destructive Evaluation

Paper • 2603.11045 • Published 24 days ago • 2

Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation

Paper • 2603.12247 • Published 23 days ago • 23

Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training

Paper • 2603.12255 • Published 23 days ago • 91

liked a model 23 days ago

nvidia/GR00T-N1.6-3B

Robotics • 3B • Updated Dec 15, 2025 • 41.4k • 80

upvoted 2 papers 23 days ago

CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR

Paper • 2603.10101 • Published 25 days ago • 5

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published 25 days ago • 148

upvoted a paper 25 days ago

AutoResearch-RL: Perpetual Self-Evaluating Reinforcement Learning Agents for Autonomous Neural Architecture Discovery

Paper • 2603.07300 • Published 28 days ago • 17

upvoted 3 papers 26 days ago

Physics Informed Viscous Value Representations

Paper • 2602.23280 • Published Feb 26 • 1

Physical Simulator In-the-Loop Video Generation

Paper • 2603.06408 • Published 29 days ago • 12

π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs

Paper • 2603.02083 • Published Mar 2 • 9
