UniPET: a universal network for high-quality PET image denoising across varied dose reduction factors Paper • 2606.11131 • Published 10 days ago • 5
Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks Paper • 2606.12344 • Published 9 days ago • 66
Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models Paper • 2606.03988 • Published 16 days ago • 122
Comprehensive Benchmarking of Long-Form Speech Generation in Diverse Scenarios Paper • 2605.28618 • Published 23 days ago • 32
3DCodeBench: Benchmarking Agentic Procedural 3D Modeling Via Code Paper • 2606.01057 • Published 19 days ago • 8
Beyond Final Answers: Auditing Trajectory-Level Hallucinations in Multi-Agent Industrial Workflows Paper • 2605.24219 • Published 24 days ago • 9
junbrro/n1_5_gr1_cotrain_optionY_dust_fk_temp_b636_bsz128x2_15000_layer12_lr3e-5_A_full_scale_mimic Updated 22 days ago • 1
Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality? Paper • 2605.22109 • Published 29 days ago • 170
SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise Paper • 2602.12783 • Published Feb 13 • 246
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents Paper • 2604.07429 • Published Apr 8 • 122
ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers Paper • 2603.24414 • Published Mar 25 • 183
SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models Paper • 2603.16859 • Published Mar 17 • 249