KaijingMa

fallenleaves

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 5 days ago

JoyAI-VL-Interaction: Real-Time Vision-Language Interaction Intelligence

Paper • 2606.14777 • Published 12 days ago • 195

upvoted 2 papers about 1 month ago

SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer

Paper • 2605.15178 • Published May 14 • 89

Warp-as-History: Generalizable Camera-Controlled Video Generation from One Training Video

Paper • 2605.15182 • Published May 14 • 39

liked a dataset about 2 months ago

lmms-lab-si/EASI-Leaderboard-Data

Preview • Updated Feb 12 • 698 • 1

liked a model about 2 months ago

deepseek-ai/DeepSeek-V4-Pro

Text Generation • 862B • Updated 13 days ago • 2.61M • 5k

liked a model 5 months ago

internlm/Intern-S1-Pro

Image-Text-to-Text • Updated Mar 30 • 263k • 279

upvoted an article 5 months ago

Build awesome datasets for video generation

Article • sayakpaul • Feb 12, 2025 • 36

upvoted a paper 6 months ago

VINO: A Unified Visual Generator with Interleaved OmniModal Context

Paper • 2601.02358 • Published Jan 5 • 30

liked a dataset 6 months ago

HuggingFaceM4/FineVisionMax

Viewer • Updated Oct 21, 2025 • 24.2M • 28.1k • 27

upvoted 2 papers 7 months ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published Dec 2, 2025 • 269

TiDAR: Think in Diffusion, Talk in Autoregression

Paper • 2511.08923 • Published Nov 12, 2025 • 129

upvoted an article 7 months ago

~Don't~ Repeat Yourself

Article • patrickvonplaten • Apr 5, 2022 • 55

upvoted 2 papers 8 months ago

Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing

Paper • 2510.19808 • Published Oct 22, 2025 • 30

FineVision: Open Data Is All You Need

Paper • 2510.17269 • Published Oct 20, 2025 • 81

authored a paper 8 months ago

OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling

Paper • 2509.12201 • Published Sep 15, 2025 • 107

liked a model 8 months ago

Qwen/Qwen3-VL-8B-Thinking

Image-Text-to-Text • 9B • Updated Nov 26, 2025 • 242k • 211

upvoted a paper 8 months ago

BRIDGE - Building Reinforcement-Learning Depth-to-Image Data Generation Engine for Monocular Depth Estimation

Paper • 2509.25077 • Published Sep 29, 2025 • 15

upvoted a paper 9 months ago

Self-Forcing++: Towards Minute-Scale High-Quality Video Generation

Paper • 2510.02283 • Published Oct 2, 2025 • 98

liked 2 models 9 months ago

tencent/HunyuanImage-3.0

Text-to-Image • 83B • Updated Jan 28 • 696k • 1.09k

InternRobotics/InternVLA-M1

Robotics • 4B • Updated Oct 15, 2025 • 402 • 28