Bin Wang

wanderkid

https://wangbindl.github.io/

wangbinDL

AI & ML interests

Computer Vision, Multimodal Large Language Model

Recent Activity

liked a model 9 days ago

stepfun-ai/Step-3.7-Flash

Image-Text-to-Text • 201B • Updated 16 days ago • 74.9k • 381

liked a model 29 days ago

opendatalab/MinerU2.5-Pro-2605-1.2B

Image-Text-to-Text • 1B • Updated 3 days ago • 27.6k • 23

upvoted a paper about 1 month ago

CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence

Paper • 2605.12882 • Published May 13 • 272

authored a paper 2 months ago

MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

Paper • 2604.04771 • Published Apr 6 • 123

liked a model 2 months ago

opendatalab/MinerU2.5-Pro-2604-1.2B

Image-Text-to-Text • 1B • Updated Apr 14 • 463k • 154

authored 4 papers 2 months ago

TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition

Paper • 2512.01248 • Published Dec 1, 2025 • 12

InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery

Paper • 2602.08990 • Published Feb 9 • 78

MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding

Paper • 2603.22458 • Published Mar 23 • 137

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Paper • 2603.25040 • Published Mar 26 • 133

upvoted a paper 2 months ago

MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

Paper • 2604.04771 • Published Apr 6 • 123

submitted a paper to Daily Papers 2 months ago

MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

Paper • 2604.04771 • Published Apr 6 • 123

liked a Space 3 months ago

MinerU Diffusion V1 0320 2.5B

🦀

demo of MinerU-Diffusion

liked a model 3 months ago

opendatalab/MinerU-Diffusion-V1-0320-2.5B

Image-to-Text • 3B • Updated Mar 25 • 4.59k • 23

upvoted a paper 3 months ago

MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding

Paper • 2603.22458 • Published Mar 23 • 137

upvoted a paper 4 months ago

InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery

Paper • 2602.08990 • Published Feb 9 • 78

upvoted 3 papers 5 months ago

Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs

Paper • 2601.17058 • Published Jan 22 • 190

ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

Paper • 2512.05111 • Published Dec 4, 2025 • 50

DocDancer: Towards Agentic Document-Grounded Information Seeking

Paper • 2601.05163 • Published Jan 8 • 7

liked a model 6 months ago

opendatalab/TRivia-3B

Image-Text-to-Text • 4B • Updated Dec 2, 2025 • 267 • 10

upvoted a paper 6 months ago

TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition

Paper • 2512.01248 • Published Dec 1, 2025 • 12
