Haiwen Diao

Paranioar

https://Paranioar.github.io/

AI & ML interests

Vision-and-Language, Parameter-efficient Transfer Learning, Multi-modal Large Language Model

Recent Activity

authored a paper 5 days ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

commented on a paper 6 days ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

updated a collection 6 days ago

NEO1_5

Organizations

authored a paper 5 days ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

Paper • 2605.28820 • Published 7 days ago • 70

commented on a paper 6 days ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

Paper • 2605.28820 • Published 7 days ago • 70

updated a collection 6 days ago

NEO1_5

Collection

From Pixels to Words -- Towards Native One-Vision Models at Scale • 3 items • Updated 6 days ago • 6

upvoted a paper 6 days ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

Paper • 2605.28820 • Published 7 days ago • 70

submitted a paper to Daily Papers 6 days ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

Paper • 2605.28820 • Published 7 days ago • 70

liked 2 models 6 days ago

Paranioar/NEO1_5-2B-SFT

Image-Text-to-Text • 3B • Updated 6 days ago • 63 • 2

Paranioar/NEO1_5-9B-SFT

Image-Text-to-Text • 10B • Updated 6 days ago • 65 • 3

upvoted a collection 6 days ago

NEO1_5

Collection

From Pixels to Words -- Towards Native One-Vision Models at Scale • 3 items • Updated 6 days ago • 6

updated 2 models 6 days ago

Paranioar/NEO1_5-9B-SFT

Image-Text-to-Text • 10B • Updated 6 days ago • 65 • 3

Paranioar/NEO1_5-2B-SFT

Image-Text-to-Text • 3B • Updated 6 days ago • 63 • 2

published 2 models 6 days ago

Paranioar/NEO1_5-2B-SFT

Image-Text-to-Text • 3B • Updated 6 days ago • 63 • 2

Paranioar/NEO1_5-9B-SFT

Image-Text-to-Text • 10B • Updated 6 days ago • 65 • 3

upvoted 2 papers 6 days ago

LLaVA-OneVision-2: Towards Next-Generation Perceptual Intelligence

Paper • 2605.25979 • Published 9 days ago • 27

SpatialBench: Is Your Spatial Foundation Model an All-Round Player?

Paper • 2605.27367 • Published 8 days ago • 70

upvoted a paper 12 days ago

PhysX-Omni: Unified Simulation-Ready Physical 3D Generation for Rigid, Deformable, and Articulated Objects

Paper • 2605.21572 • Published 14 days ago • 52

liked a model 17 days ago

sensenova/SenseNova-U1-8B-MoT-Infographic

Any-to-Any • 18B • Updated 18 days ago • 5.5k • 42

authored 3 papers 19 days ago

DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation

Paper • 2601.22153 • Published Jan 29 • 75

VISTA-Bench: Do Vision-Language Models Really Understand Visualized Text as Well as Pure Text?

Paper • 2602.04802 • Published Feb 4 • 2

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Paper • 2605.12500 • Published 22 days ago • 191

updated a collection 21 days ago

SenseNova-U1

Collection

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-Unify Architecture • 9 items • Updated 6 days ago • 69
