SmolVLM-256M-Instruct-GGUF

GGUF conversion of HuggingFaceTB/SmolVLM-256M-Instruct for use with llama.cpp and Ollama.

Files

| File | Description | Size |
|------|-------------|------|
| SmolVLM-256M-Instruct-Q4_K_M.gguf | Main model (Q4_K_M quantized) | 119.3 MB |
| SmolVLM-256M-Instruct-f16.gguf | Main model (F16, unquantized) | 312.6 MB |
| mmproj-SmolVLM-256M-Instruct-f16.gguf | Vision projector (F16, unquantized) | 181.2 MB |

Usage

With llama.cpp

# Basic inference
./llama-mtmd-cli -m SmolVLM-256M-Instruct-f16.gguf --mmproj mmproj-SmolVLM-256M-Instruct-f16.gguf --image screenshot.png -p "What do you see?"

# With quantized version
./llama-mtmd-cli -m SmolVLM-256M-Instruct-Q4_K_M.gguf --mmproj mmproj-SmolVLM-256M-Instruct-f16.gguf --image screenshot.png -p "Click the Submit button"
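
For scripted use, the same invocation can be wrapped in Python. The following is a minimal sketch and not part of this repository: the helper names `build_mtmd_command` and `describe_image` and the default binary path are assumptions, and only the flags already shown in the commands above are used.

```python
import subprocess


def build_mtmd_command(model, mmproj, image, prompt, binary="./llama-mtmd-cli"):
    """Return the argument list for one llama-mtmd-cli call (hypothetical helper)."""
    return [
        binary,
        "-m", model,          # main GGUF model (F16 or Q4_K_M)
        "--mmproj", mmproj,   # vision projector GGUF
        "--image", image,     # input image path
        "-p", prompt,         # text prompt
    ]


def describe_image(model, mmproj, image, prompt):
    """Run the CLI once and return its stdout; raises CalledProcessError on failure."""
    cmd = build_mtmd_command(model, mmproj, image, prompt)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the prompt contains spaces or quotes.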

With Ollama

# Modelfile
FROM ./SmolVLM-256M-Instruct-Q4_K_M.gguf
PROJECTOR ./mmproj-SmolVLM-256M-Instruct-f16.gguf
PARAMETER num_ctx 4096
PARAMETER temperature 0.1
SYSTEM "You are a GUI grounding assistant. Output click coordinates as JSON."

# Create model
ollama create smolvlm_256m_instruct -f Modelfile

# Run
ollama run smolvlm_256m_instruct "Click the Submit button ./screenshot.png"
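
The model can also be queried through Ollama's local REST API, which accepts base64-encoded images on the `/api/generate` endpoint. The sketch below assumes an Ollama server on the default address `http://localhost:11434` and the model name created above; the function names are illustrative, not part of this repository.

```python
import base64
import json
import urllib.request


def build_generate_payload(model, prompt, image_bytes):
    """Build the JSON body for /api/generate with one attached image."""
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # request a single JSON response instead of a stream
    }


def generate(model, prompt, image_path, host="http://localhost:11434"):
    """Send one image and prompt to a running Ollama server and return the reply text."""
    with open(image_path, "rb") as f:
        payload = build_generate_payload(model, prompt, f.read())
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

For example, `generate("smolvlm_256m_instruct", "Click the Submit button", "screenshot.png")` would return the model's text output, provided the server is running and the model has been created.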

For GUI Grounding (Claude Computer Use Format)

The model can be prompted to output coordinates in Claude's computer use format:

{"action": "left_click", "coordinate": [847, 523]}
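
Model output may contain extra text around the JSON object, so it is worth validating before acting on it. The following is a minimal sketch of one way to do that; `parse_click_action` is a hypothetical helper, and how the coordinates map to screen pixels is not specified here and must be checked against your own setup.

```python
import json
import re


def parse_click_action(text):
    """Extract the first {"action": ..., "coordinate": [x, y]} object from text.

    Returns (action, x, y), or None if no valid object is found.
    """
    # Scan for flat {...} candidates; the expected object has no nested braces.
    for match in re.finditer(r"\{[^{}]*\}", text):
        try:
            obj = json.loads(match.group(0))
        except json.JSONDecodeError:
            continue
        coord = obj.get("coordinate")
        if (
            isinstance(obj.get("action"), str)
            and isinstance(coord, list)
            and len(coord) == 2
            and all(
                isinstance(v, (int, float)) and not isinstance(v, bool)
                for v in coord
            )
        ):
            return obj["action"], coord[0], coord[1]
    return None
```

Returning `None` for malformed output lets the caller retry or skip the step instead of clicking at an arbitrary position.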

Model Details

Conversion

Converted using llama.cpp's convert_hf_to_gguf.py:

# Main model
python convert_hf_to_gguf.py HuggingFaceTB/SmolVLM-256M-Instruct --outfile SmolVLM-256M-Instruct-f16.gguf --outtype f16

# Vision projector
python convert_hf_to_gguf.py HuggingFaceTB/SmolVLM-256M-Instruct --mmproj --outfile mmproj-SmolVLM-256M-Instruct-f16.gguf --outtype f16

# Quantize
./llama-quantize SmolVLM-256M-Instruct-f16.gguf SmolVLM-256M-Instruct-Q4_K_M.gguf Q4_K_M

Acknowledgments
