Yi Shan

awangaddd

AI & ML interests

None yet

Recent Activity

Organizations

None yet

upvoted a paper 5 days ago

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published 7 days ago • 127

upvoted a paper 7 days ago

WebVR: Benchmarking Multimodal LLMs for WebPage Recreation from Videos via Human-Aligned Visual Rubrics

Paper • 2603.13391 • Published 14 days ago • 19

upvoted 2 papers 8 days ago

Multimodal OCR: Parse Anything from Documents

Paper • 2603.13032 • Published 11 days ago • 34

Visual-ERM: Reward Modeling for Visual Equivalence

Paper • 2603.13224 • Published 11 days ago • 21

upvoted 2 papers 9 days ago

UniG2U-Bench: Do Unified Models Advance Multimodal Understanding?

Paper • 2603.03241 • Published 21 days ago • 86

SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale

Paper • 2602.23866 • Published 26 days ago • 88

upvoted 2 papers 10 days ago

Video-Based Reward Modeling for Computer-Use Agents

Paper • 2603.10178 • Published 14 days ago • 42

T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning

Paper • 2603.03790 • Published 21 days ago • 121

upvoted a paper 12 days ago

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published 14 days ago • 139

upvoted 2 papers 13 days ago

MiniAppBench: Evaluating the Shift from Text to Interactive HTML Responses in LLM-Powered Assistants

Paper • 2603.09652 • Published 15 days ago • 15

Stepping VLMs onto the Court: Benchmarking Spatial Intelligence in Sports

Paper • 2603.09896 • Published 14 days ago • 26

upvoted a paper 14 days ago

PIRA-Bench: A Transition from Reactive GUI Agents to GUI-based Proactive Intent Recommendation Agents

Paper • 2603.08013 • Published 16 days ago • 14

upvoted a paper 15 days ago

Reasoning Models Struggle to Control their Chains of Thought

Paper • 2603.05706 • Published 19 days ago • 34

upvoted a paper 16 days ago

RubricBench: Aligning Model-Generated Rubrics with Human Standards

Paper • 2603.01562 • Published 23 days ago • 60

upvoted a paper 25 days ago

GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL

Paper • 2602.22190 • Published 27 days ago • 16

upvoted 3 papers 26 days ago

AutoWebWorld: Synthesizing Infinite Verifiable Web Environments via Finite State Machines

Paper • 2602.14296 • Published Feb 15 • 51

AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration

Paper • 2602.03786 • Published Feb 3 • 89

Weak-Driven Learning: How Weak Agents make Strong Agents Stronger

Paper • 2602.08222 • Published Feb 9 • 283

upvoted 2 papers 27 days ago

GEBench: Benchmarking Image Generation Models as GUI Environments

Paper • 2602.09007 • Published Feb 9 • 39

Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents

Paper • 2602.16855 • Published Feb 15 • 50
