Runpeng Dai

Leo-Dai

AI & ML interests

None yet

Recent Activity

upvoted a paper 19 days ago

OmniTacTune: Policy-Agnostic Real-World RL for Tactile Residual Adaptation of Visual Policies

Paper • 2607.03723 • Published 25 days ago • 5

upvoted a paper about 1 month ago

Trust the Right Teacher: Quality-Aware Self-Distillation for GUI Grounding

Paper • 2606.18101 • Published Jun 16 • 15

authored a paper about 2 months ago

Small RL Controller, Large Language Model: RL-Guided Adaptive Sampling for Test-Time Scaling

Paper • 2606.03102 • Published Jun 2 • 14

upvoted a paper about 2 months ago

Small RL Controller, Large Language Model: RL-Guided Adaptive Sampling for Test-Time Scaling

Paper • 2606.03102 • Published Jun 2 • 14

authored 3 papers 3 months ago

DeltaRubric: Generative Multimodal Reward Modeling via Joint Planning and Verification

Paper • 2605.09269 • Published May 10 • 6

Reinforcing Multimodal Reasoning Against Visual Degradation

Paper • 2605.09262 • Published May 10 • 7

G-Zero: Self-Play for Open-Ended Generation from Zero Data

Paper • 2605.09959 • Published May 11 • 17

upvoted 3 papers 3 months ago

G-Zero: Self-Play for Open-Ended Generation from Zero Data

Paper • 2605.09959 • Published May 11 • 17

Reinforcing Multimodal Reasoning Against Visual Degradation

Paper • 2605.09262 • Published May 10 • 7

DeltaRubric: Generative Multimodal Reward Modeling via Joint Planning and Verification

Paper • 2605.09269 • Published May 10 • 6

authored a paper 3 months ago

LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling

Paper • 2605.08083 • Published May 8 • 70

upvoted a paper 3 months ago

LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling

Paper • 2605.08083 • Published May 8 • 70

upvoted a paper 4 months ago

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published Mar 17 • 141

updated 3 datasets 5 months ago

liked a Space 6 months ago

Efficient Reasoning Online Judgement

📉

upvoted a paper 6 months ago

Training Data Efficiency in Multimodal Process Reward Models

Paper • 2602.04145 • Published Feb 4 • 80

authored a paper 6 months ago

Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing

Paper • 2602.03845 • Published Feb 3 • 27

upvoted a paper 6 months ago

Learning Query-Specific Rubrics from Human Preferences for DeepResearch Report Generation

Paper • 2602.03619 • Published Feb 3 • 28
