Karan

karansapra

https://karansapra.github.io/

AI & ML interests

None yet

Recent Activity

published an article about 2 months ago

Article

Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents

nvidia • Apr 28 • 62

authored a paper 3 months ago

MMOU: A Massive Multi-Task Omni Understanding and Reasoning Benchmark for Long and Complex Real-World Videos

Paper • 2603.14145 • Published Mar 14 • 15

liked 2 models 7 months ago

nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16

Image-Text-to-Text • 13B • Updated Dec 2, 2025 • 147k • 84

nvidia/NVIDIA-Nemotron-Parse-v1.1

Image-Text-to-Text • 1.0B • Updated May 7 • 385k • 169

published an article 12 months ago

Article

Welcome the NVIDIA Llama Nemotron Nano VLM to Hugging Face Hub

nvidia • Jun 27, 2025 • 31

liked a model about 1 year ago

nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1

Image-Text-to-Text • 9B • Updated Dec 4, 2025 • 1.15M • 181

upvoted a paper over 1 year ago

Éclair -- Extracting Content and Layout with Integrated Reading Order for Documents

Paper • 2502.04223 • Published Feb 6, 2025 • 10

updated a dataset almost 2 years ago

karansapra/semantic-segmentation

Preview • Updated Sep 19, 2024 • 2