How to use from
vLLM
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "mtgv/MobileVLM-1.7B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "mtgv/MobileVLM-1.7B",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
Use Docker
docker model run hf.co/mtgv/MobileVLM-1.7B
Quick Links

Model Summery

MobileVLM is a competent multimodal vision language model (MMVLM) targeted to run on mobile devices. It is an amalgamation of a myriad of architectural designs and techniques that are mobile-oriented, which comprises a set of language models at the scale of 1.4B and 2.7B parameters, trained from scratch, a multimodal vision model that is pre-trained in the CLIP fashion, cross-modality interaction via an efficient projector. We evaluate MobileVLM on several typical VLM benchmarks. Our models demonstrate on par performance compared with a few much larger models. More importantly, we measure the inference speed on both a Qualcomm Snapdragon 888 CPU and an NVIDIA Jeston Orin GPU, and we obtain state-of-the-art performance of 21.5 tokens and 65.3 tokens per second, respectively.

The MobileVLM-1.7B was built on our MobileLLaMA-1.4B-Chat to facilitate the off-the-shelf deployment.

Model Sources

How to Get Started with the Model

Inference examples can be found at Github.

Training Details

Please refer to our paper: MobileVLM: A Fast, Strong and Open Vision Language Assistant for Mobile Devices

Downloads last month
1,074
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Paper for mtgv/MobileVLM-1.7B