Instructions for using winninghealth/WiNGPT-DocLoom-lite with the Transformers library and with local inference servers (vLLM, SGLang, Docker Model Runner).

Libraries

How to use winninghealth/WiNGPT-DocLoom-lite with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="winninghealth/WiNGPT-DocLoom-lite")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
# Run the pipeline and print the generated output
print(pipe(text=messages))

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("winninghealth/WiNGPT-DocLoom-lite")
model = AutoModelForImageTextToText.from_pretrained("winninghealth/WiNGPT-DocLoom-lite")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
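# Apply the chat template and tokenize the prompt and image into model inputs
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Generate, then decode only the tokens produced after the prompt
outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))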
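
The final slice in the snippet above drops the prompt tokens so that only newly generated text is decoded. For decoder-only models, `generate` returns the prompt followed by the completion. The following is a minimal, library-free sketch of that indexing. Plain lists stand in for tensors, the token IDs are made up, and `strip_prompt` is a hypothetical helper, not part of Transformers.

```python
# Illustrative only: plain Python lists stand in for the tensors returned by
# processor.apply_chat_template(...) and model.generate(...).
# The token IDs are made up; strip_prompt is a hypothetical helper.

def strip_prompt(generated_ids, prompt_length):
    """Return only the tokens produced after the prompt."""
    return generated_ids[prompt_length:]

prompt_ids = [101, 7592, 2088]             # stands in for inputs["input_ids"][0]
generated_ids = prompt_ids + [2023, 2003]  # prompt followed by the completion

new_tokens = strip_prompt(generated_ids, len(prompt_ids))
print(new_tokens)
```

Without the slice, the decoded string would start with the chat-template text of the prompt itself.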

Local Apps

vLLM

How to use winninghealth/WiNGPT-DocLoom-lite with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "winninghealth/WiNGPT-DocLoom-lite"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "winninghealth/WiNGPT-DocLoom-lite",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
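
The same request can be issued from Python. The sketch below uses only the standard library and mirrors the curl call above. The helper names `build_chat_request` and `send` are hypothetical, and the response parsing assumes the usual OpenAI-style `choices[0].message.content` layout.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt, image_url):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=120):
    """Send the request and return the text of the first choice.

    Assumes an OpenAI-style response body.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


request = build_chat_request(
    base_url="http://localhost:8000",
    model="winninghealth/WiNGPT-DocLoom-lite",
    prompt="Describe this image in one sentence.",
    image_url="https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
)
# print(send(request))  # uncomment once the vLLM server is running
```

Building the request separately from sending it makes the payload easy to inspect before a server is available.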

Use Docker

docker model run hf.co/winninghealth/WiNGPT-DocLoom-lite

SGLang

How to use winninghealth/WiNGPT-DocLoom-lite with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "winninghealth/WiNGPT-DocLoom-lite" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "winninghealth/WiNGPT-DocLoom-lite",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "winninghealth/WiNGPT-DocLoom-lite" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "winninghealth/WiNGPT-DocLoom-lite",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Docker Model Runner
How to use winninghealth/WiNGPT-DocLoom-lite with Docker Model Runner:
```
docker model run hf.co/winninghealth/WiNGPT-DocLoom-lite
```

DocLoom-lite

Benchmark Performance (olmOCR-Bench)

OCR Model Performance Table

| Model | Params | ArXiv | Old scans math | Tables | Old scans | Headers and footers | Multi column | Long tiny text | Base | Overall |
|---|---|---|---|---|---|---|---|---|---|---|
| Marker 1.10.1 | -- | 83.8 | 66.8 | 72.9 | 33.5 | 86.6 | 80 | 85.7 | 99.3 | 76.1±1.1 |
| MinerU 2.5.4 | -- | 76.6 | 54.6 | 84.9 | 33.7 | 96.6 | 78.2 | 83.5 | 93.7 | 75.2±1.1 |
| DeepSeek-OCR | -- | 77.2 | 73.6 | 80.2 | 33.3 | 96.1 | 66.4 | 79.4 | 99.8 | 75.7±1.0 |
| Nanonets-OCR2-3B | 3B | 75.4 | 46.1 | 86.8 | 40.9 | 32.1 | 81.9 | 93 | 99.6 | 69.5±1.1 |
| Mistral OCR | -- | 77.2 | 67.5 | 60.6 | 29.3 | 93.6 | 71.3 | 77.1 | 99.4 | 72.0±1.1 |
| MonkeyOCR-pro-3B | 3B | 83.8 | 68.8 | 74.6 | 36.1 | 91.2 | 76.6 | 80.1 | 95.3 | 75.8±1.0 |
| Qwen3-VL-4B-Instruct | 4B | 83.1 | 74.5 | 83.9 | 40.5 | 35.5 | 81.7 | 88.7 | 99.3 | 73.4±1.0 |
| olmOCR pipeline v0.4.0 with olmOCR-2-7B-1025 | 7B | 82.9 | 82.1 | 84.3 | 48.3 | 95.7 | 84.3 | 81.4 | 99.7 | 82.3±1.1 |
| Qwen3-VL-2B-Instruct | 2B | 66.7 | 50.9 | 66.6 | 36.3 | 48 | 63.2 | 73.5 | 98.9 | 63.0±1.2 |
| Qwen3.5-0.8B | 0.8B | 46.4 | 32.5 | 39.8 | 32.5 | 67.6 | 29.6 | 41 | 92 | 46.6±1.1 |
| Qwen3.5-2B | 2B | 65.8 | 59 | 62 | 33.5 | 65 | 53.3 | 65.6 | 90.8 | 61.9±1.2 |
| Qwen3.5-4B | 4B | 76.7 | 83.6 | 78.2 | 42.6 | 30.8 | 76.5 | 85.3 | 91.3 | 70.6±1.0 |
| FireRed-OCR-2B | 2B | 81.5 | 75.1 | 84.1 | 33.5 | 26.8 | 78.6 | 84.8 | 97.3 | 70.2±1.0 |
| Logics-Parsing-v2-4B | 4B | 79.7 | 80.3 | 85.6 | 37.6 | 89.6 | 74.5 | 91.2 | 98.9 | 79.7±1.0 |
| LightOnOCR-2-1B | 1B | 90.6 | 83.8 | 88.4 | 42.6 | 19.6 | 85.1 | 90 | 99.6 | 75.0±1.0 |
| GLM-OCR-0.9B | 0.9B | 75.8 | 57.6 | 43.3 | 28.7 | 83.9 | 69.5 | 52.5 | 90 | 67.2±1.1 |
| DocLoom (2025-12-31) | 4B | 74.3 | 66.6 | 80.9 | 45.1 | 91.4 | 82.9 | 89.1 | 99.7 | 78.8±1.0 |
| DocLoom-lite | 2B | 71.4 | 69.4 | 75 | 45.2 | 90.4 | 79.6 | 91 | 98.6 | 77.6±1.0 |
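
The Overall column appears to be the unweighted mean of the eight category scores. This is inferred from the numbers themselves and is not stated on this page. The sketch below recomputes it for three rows copied from the table above; the `mean` helper and the `ROWS` layout are illustrative.

```python
# Category scores copied from the table above, in column order:
# ArXiv, Old scans math, Tables, Old scans, Headers and footers,
# Multi column, Long tiny text, Base. The second item is the reported Overall.
ROWS = {
    "Qwen3-VL-2B-Instruct": ([66.7, 50.9, 66.6, 36.3, 48, 63.2, 73.5, 98.9], 63.0),
    "DocLoom-lite": ([71.4, 69.4, 75, 45.2, 90.4, 79.6, 91, 98.6], 77.6),
    "olmOCR pipeline v0.4.0 with olmOCR-2-7B-1025": (
        [82.9, 82.1, 84.3, 48.3, 95.7, 84.3, 81.4, 99.7],
        82.3,
    ),
}


def mean(scores):
    """Unweighted average of the category scores."""
    return sum(scores) / len(scores)


for name, (scores, reported) in ROWS.items():
    print(f"{name}: computed {mean(scores):.1f}, reported {reported}")
```

By the reported Overall figures, DocLoom-lite (77.6) sits 14.6 points above Qwen3-VL-2B-Instruct (63.0), the model it is fine-tuned from.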

How to Use

vLLM (Recommended)

For high-performance inference and deployment, we recommend vLLM. We also provide a standalone script, DocLoom_test.py, for processing multi-page PDF documents. It runs without the official olmOCR toolkit, which makes it a lightweight way to perform OCR on entire documents.

python DocLoom_test.py <pdf_file_path>

Note: We use olmOCR’s no_anchoring_v4 prompt. Include the following instruction in every request:

Attached is one page of a document that you must process. Just return the plain text representation of this document as if you were reading it naturally. Convert equations to LateX and tables to HTML.\nIf there are any figures or charts, label them with the following markdown syntax ![Alt text describing the contents of the figure](page_startx_starty_width_height.png)\nReturn your output as markdown.
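
As a rough illustration of how the instruction above could be attached to a page image, the sketch below builds OpenAI-style chat messages for one page. It rests on several assumptions: the `\n` sequences in the quoted instruction are newline escapes, the server accepts base64 data URLs in `image_url`, and text-before-image ordering (as in the curl examples) is acceptable. The helper names are hypothetical and this is not the code of DocLoom_test.py.

```python
import base64

# The instruction quoted above. Assumption: "\n" in the quoted text is a newline
# escape. The spelling "LateX" is kept exactly as given in the source.
DOCLOOM_PROMPT = (
    "Attached is one page of a document that you must process. "
    "Just return the plain text representation of this document as if you were reading it naturally. "
    "Convert equations to LateX and tables to HTML.\n"
    "If there are any figures or charts, label them with the following markdown syntax "
    "![Alt text describing the contents of the figure](page_startx_starty_width_height.png)\n"
    "Return your output as markdown."
)


def page_to_data_url(png_bytes):
    """Encode a rendered page (PNG bytes) as a base64 data URL."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"


def build_page_messages(png_bytes):
    """Chat messages for one page: the fixed instruction plus the page image."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": DOCLOOM_PROMPT},
                {"type": "image_url", "image_url": {"url": page_to_data_url(png_bytes)}},
            ],
        }
    ]


# Placeholder bytes; in practice these would be a PNG rendered from a PDF page.
messages = build_page_messages(b"\x89PNG placeholder")
print(messages[0]["content"][0]["text"][:40])
```

The resulting `messages` list can be placed in the `"messages"` field of a `/v1/chat/completions` request, one request per page.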

Acknowledgement

We express our gratitude to the teams that developed olmOCR and Qwen3-VL, which were instrumental in our research.

Safetensors

Model size: 2B params

Tensor type: BF16

Model tree for winninghealth/WiNGPT-DocLoom-lite

Base model: Qwen/Qwen3-VL-2B-Instruct (this model is a fine-tune of it)