Video-Text-to-Text
Transformers
Safetensors
English
qwen3_5
text-generation
video
multimodal
video-captioning
temporal-grounding
qwen
VLM
custom_code
Instructions to use NemoStation/Marlin-2B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use NemoStation/Marlin-2B with Transformers:
# Load model directly
from transformers import AutoProcessor, AutoModelForCausalLM

processor = AutoProcessor.from_pretrained("NemoStation/Marlin-2B", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("NemoStation/Marlin-2B", trust_remote_code=True)
- Notebooks
- Google Colab
- Kaggle
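The Transformers loading snippet above can be wrapped in a small helper that also chooses a device. This is only a sketch: the names `pick_device` and `load_marlin` are hypothetical and not part of the model's repository, and it assumes `torch` and a `transformers` release recent enough to support this model are installed. It covers loading only; how video inputs are passed to the processor depends on the model's custom code and is not shown here.

```python
MODEL_ID = "NemoStation/Marlin-2B"


def pick_device(cuda_available: bool) -> str:
    """Return "cuda" when a GPU is available, otherwise "cpu"."""
    return "cuda" if cuda_available else "cpu"


def load_marlin(model_id: str = MODEL_ID):
    """Hypothetical helper: load the processor and model, then move the model to a device.

    trust_remote_code=True runs Python code shipped in the model repository,
    so review that code before enabling it.
    """
    # Heavy imports stay inside the function so importing this module is cheap.
    import torch
    from transformers import AutoModelForCausalLM, AutoProcessor

    device = pick_device(torch.cuda.is_available())
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    model = model.to(device).eval()  # inference mode on the chosen device
    return processor, model, device
```

Calling `processor, model, device = load_marlin()` downloads the weights on first use, so expect the first call to take a while.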
Use Marlin on a remote GPU machine? Third-party providers?
#9 opened 5 days ago by smoonz (1 comment)
Inference speed
#8 opened 7 days ago by tintwotin (2 comments)
Does this model work on frames sampled from a video, or on the raw video file?
#7 opened 7 days ago by CT-Ati (1 comment)
Collaboration Opportunity
#6 opened 8 days ago by Phase-Technologies
Train on own videos / labels?
#5 opened 9 days ago by horsto (2 comments, 🔥 1)
Can you use this model with image and text-only inputs apart from video?
#4 opened 12 days ago by lunahr (3 comments)
Question about the evaluation metrics for captioning benchmarks
#3 opened 14 days ago by ygyjrc (2 comments)