# SmolVLM-256M-Instruct-GGUF

GGUF conversion of HuggingFaceTB/SmolVLM-256M-Instruct for use with llama.cpp and Ollama.
## Files

| File | Description | Size |
|---|---|---|
| SmolVLM-256M-Instruct-Q4_K_M.gguf | Main model (Q4_K_M quantized) | 119.3 MB |
| SmolVLM-256M-Instruct-f16.gguf | Main model (F16 full precision) | 312.6 MB |
| mmproj-SmolVLM-256M-Instruct-f16.gguf | Vision projector (F16 full precision) | 181.2 MB |
## Usage

### With llama.cpp

```bash
# Basic inference
./llama-mtmd-cli -m SmolVLM-256M-Instruct-f16.gguf \
    --mmproj mmproj-SmolVLM-256M-Instruct-f16.gguf \
    --image screenshot.png -p "What do you see?"

# With the quantized main model (same F16 projector)
./llama-mtmd-cli -m SmolVLM-256M-Instruct-Q4_K_M.gguf \
    --mmproj mmproj-SmolVLM-256M-Instruct-f16.gguf \
    --image screenshot.png -p "Click the Submit button"
```
### With Ollama

```
# Modelfile
FROM ./SmolVLM-256M-Instruct-Q4_K_M.gguf
PROJECTOR ./mmproj-SmolVLM-256M-Instruct-f16.gguf
PARAMETER num_ctx 4096
PARAMETER temperature 0.1
SYSTEM "You are a GUI grounding assistant. Output click coordinates as JSON."
```

```bash
# Create the model
ollama create smolvlm_256m_instruct -f Modelfile

# Run
ollama run smolvlm_256m_instruct --image screenshot.png "Click the Submit button"
```
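The created model can also be called programmatically. A minimal sketch using only the standard library, assuming a local `ollama serve` instance on the default port and the payload shape of Ollama's `/api/generate` endpoint (the model name matches the `ollama create` step above):

```python
import base64
import json
import urllib.request

def build_request(prompt: str, image_bytes: bytes) -> dict:
    """Build an Ollama /api/generate payload with a base64-encoded image."""
    return {
        "model": "smolvlm_256m_instruct",
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }

def generate(prompt: str, image_path: str) -> str:
    """Send one non-streaming request and return the model's text reply."""
    with open(image_path, "rb") as f:
        payload = build_request(prompt, f.read())
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

For example, `generate("Click the Submit button", "screenshot.png")` mirrors the `ollama run` invocation above.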
## For GUI Grounding (Claude Computer Use Format)

The model can be prompted to output coordinates in Claude's computer-use format:

```json
{"action": "left_click", "coordinate": [847, 523]}
```
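A reply in this shape can be consumed with a few lines of standard-library Python; the `raw` string below is a hypothetical model reply, not guaranteed output:

```python
import json

# Hypothetical model reply in Claude's computer-use format
raw = '{"action": "left_click", "coordinate": [847, 523]}'

action = json.loads(raw)
x, y = action["coordinate"]
print(f"{action['action']} at ({x}, {y})")  # → left_click at (847, 523)
```

A low temperature (as in the Modelfile above) helps keep the output parseable as strict JSON.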
## Model Details

- Base Model: HuggingFaceTB/SmolVLM-256M-Instruct
- Architecture: Idefics3 (SmolVLM)
- License: Apache 2.0
## Conversion

Converted using llama.cpp's `convert_hf_to_gguf.py`:

```bash
# Main model
python convert_hf_to_gguf.py HuggingFaceTB/SmolVLM-256M-Instruct \
    --outfile SmolVLM-256M-Instruct-f16.gguf --outtype f16

# Vision projector
python convert_hf_to_gguf.py HuggingFaceTB/SmolVLM-256M-Instruct \
    --mmproj --outfile mmproj-SmolVLM-256M-Instruct-f16.gguf --outtype f16

# Quantize the main model
./llama-quantize SmolVLM-256M-Instruct-f16.gguf SmolVLM-256M-Instruct-Q4_K_M.gguf Q4_K_M
```
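As a quick sanity check on a converted file: every GGUF file starts with the 4-byte magic `GGUF`, followed by a little-endian 32-bit format version. A minimal sketch that verifies this header:

```python
import struct

def gguf_version(path: str) -> int:
    """Return the GGUF format version, or raise if the magic bytes are wrong."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"{path}: not a GGUF file (magic={magic!r})")
        (version,) = struct.unpack("<I", f.read(4))
        return version
```

For example, `gguf_version("SmolVLM-256M-Instruct-Q4_K_M.gguf")` should succeed on the quantized output above.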
## Acknowledgments

- Original model by Hugging Face (HuggingFaceTB)
- GGUF conversion tools from llama.cpp