Instructions to use winninghealth/WiNGPT-DocLoom-lite with libraries and local apps.
- Libraries
- Transformers
How to use winninghealth/WiNGPT-DocLoom-lite with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="winninghealth/WiNGPT-DocLoom-lite")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("winninghealth/WiNGPT-DocLoom-lite")
model = AutoModelForImageTextToText.from_pretrained("winninghealth/WiNGPT-DocLoom-lite")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
# Tokenize the chat messages and move the tensors to the model's device
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- Local Apps
- vLLM
How to use winninghealth/WiNGPT-DocLoom-lite with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "winninghealth/WiNGPT-DocLoom-lite"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "winninghealth/WiNGPT-DocLoom-lite",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this image in one sentence."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                        }
                    }
                ]
            }
        ]
    }'
```
Use Docker
```shell
docker model run hf.co/winninghealth/WiNGPT-DocLoom-lite
```
- SGLang
How to use winninghealth/WiNGPT-DocLoom-lite with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "winninghealth/WiNGPT-DocLoom-lite" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "winninghealth/WiNGPT-DocLoom-lite",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this image in one sentence."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                        }
                    }
                ]
            }
        ]
    }'
```
Use Docker images
```shell
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "winninghealth/WiNGPT-DocLoom-lite" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "winninghealth/WiNGPT-DocLoom-lite",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this image in one sentence."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                        }
                    }
                ]
            }
        ]
    }'
```
- Docker Model Runner
How to use winninghealth/WiNGPT-DocLoom-lite with Docker Model Runner:
```shell
docker model run hf.co/winninghealth/WiNGPT-DocLoom-lite
```
DocLoom-lite
Benchmark Performance (olmOCR-Bench)
| Model | Size | ArXiv | Old scans math | Tables | Old scans | Headers and footers | Multi column | Long tiny text | Base | Overall |
|---|---|---|---|---|---|---|---|---|---|---|
| Marker 1.10.1 | -- | 83.8 | 66.8 | 72.9 | 33.5 | 86.6 | 80 | 85.7 | 99.3 | 76.1±1.1 |
| MinerU 2.5.4 | -- | 76.6 | 54.6 | 84.9 | 33.7 | 96.6 | 78.2 | 83.5 | 93.7 | 75.2±1.1 |
| DeepSeek-OCR | -- | 77.2 | 73.6 | 80.2 | 33.3 | 96.1 | 66.4 | 79.4 | 99.8 | 75.7±1.0 |
| Nanonets-OCR2-3B | 3B | 75.4 | 46.1 | 86.8 | 40.9 | 32.1 | 81.9 | 93 | 99.6 | 69.5±1.1 |
| Mistral OCR | -- | 77.2 | 67.5 | 60.6 | 29.3 | 93.6 | 71.3 | 77.1 | 99.4 | 72.0±1.1 |
| MonkeyOCR-pro-3B | 3B | 83.8 | 68.8 | 74.6 | 36.1 | 91.2 | 76.6 | 80.1 | 95.3 | 75.8±1.0 |
| Qwen3-VL-4B-Instruct | 4B | 83.1 | 74.5 | 83.9 | 40.5 | 35.5 | 81.7 | 88.7 | 99.3 | 73.4±1.0 |
| olmOCR pipeline v0.4.0 with olmOCR-2-7B-1025 | 7B | 82.9 | 82.1 | 84.3 | 48.3 | 95.7 | 84.3 | 81.4 | 99.7 | 82.3±1.1 |
| Qwen3-VL-2B-Instruct | 2B | 66.7 | 50.9 | 66.6 | 36.3 | 48 | 63.2 | 73.5 | 98.9 | 63.0±1.2 |
| Qwen3.5-0.8B | 0.8B | 46.4 | 32.5 | 39.8 | 32.5 | 67.6 | 29.6 | 41 | 92 | 46.6±1.1 |
| Qwen3.5-2B | 2B | 65.8 | 59 | 62 | 33.5 | 65 | 53.3 | 65.6 | 90.8 | 61.9±1.2 |
| Qwen3.5-4B | 4B | 76.7 | 83.6 | 78.2 | 42.6 | 30.8 | 76.5 | 85.3 | 91.3 | 70.6±1.0 |
| FireRed-OCR-2B | 2B | 81.5 | 75.1 | 84.1 | 33.5 | 26.8 | 78.6 | 84.8 | 97.3 | 70.2±1.0 |
| Logics-Parsing-v2-4B | 4B | 79.7 | 80.3 | 85.6 | 37.6 | 89.6 | 74.5 | 91.2 | 98.9 | 79.7±1.0 |
| LightOnOCR-2-1B | 1B | 90.6 | 83.8 | 88.4 | 42.6 | 19.6 | 85.1 | 90 | 99.6 | 75.0±1.0 |
| GLM-OCR-0.9B | 0.9B | 75.8 | 57.6 | 43.3 | 28.7 | 83.9 | 69.5 | 52.5 | 90 | 67.2±1.1 |
| DocLoom (2025-12-31) | 4B | 74.3 | 66.6 | 80.9 | 45.1 | 91.4 | 82.9 | 89.1 | 99.7 | 78.8±1.0 |
| DocLoom-lite | 2B | 71.4 | 69.4 | 75 | 45.2 | 90.4 | 79.6 | 91 | 98.6 | 77.6±1.0 |
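
The Overall values in the table are consistent with an unweighted mean of the eight category scores, rounded to one decimal place. The sketch below checks this for the DocLoom-lite row. The helper name `overall_score` is hypothetical, and the averaging rule is inferred from the numbers in the table rather than taken from the benchmark's documentation; the ± interval is not reproduced.

```python
from statistics import fmean


def overall_score(category_scores):
    """Unweighted mean of per-category scores, rounded to one decimal.

    Assumption: Overall is a plain average of the category columns,
    which matches the rows in the table above.
    """
    return round(fmean(category_scores), 1)


# DocLoom-lite row: ArXiv, Old scans math, Tables, Old scans,
# Headers and footers, Multi column, Long tiny text, Base
docloom_lite = [71.4, 69.4, 75, 45.2, 90.4, 79.6, 91, 98.6]
print(overall_score(docloom_lite))  # 77.6, matching the reported Overall
```

One consequence of a plain average is that a single weak category (for example "Old scans", where every listed system scores below 50) pulls the Overall figure down as much as any other column.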
How to Use
vLLM (Recommended)
For high-performance inference and deployment, we recommend using vLLM. We also provide a standalone script for efficiently processing multi-page PDF documents. This script operates independently and does not require the official olmOCR toolkit, offering a lightweight and fast way to perform OCR on entire documents.
```shell
python DocLoom_test.py <pdf_file_path>
```
Note: We use olmOCR’s no_anchoring_v4 prompt. Please include the following instruction in the request.
Attached is one page of a document that you must process. Just return the plain text representation of this document as if you were reading it naturally. Convert equations to LateX and tables to HTML.\nIf there are any figures or charts, label them with the following markdown syntax \nReturn your output as markdown.
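
As a rough illustration of including the instruction in a request, the sketch below builds an OpenAI-style chat payload that pairs one page image with the prompt and posts it to a vLLM server. Several things here are assumptions rather than part of this model card: the helper names `build_ocr_request` and `send_request`, passing the page as a base64 PNG data URL, and the endpoint `http://localhost:8000/v1/chat/completions` (the default address used in the vLLM snippet earlier on this page). The prompt is taken as a parameter so that the complete instruction text can be supplied by the caller.

```python
import base64
import json
import urllib.request

MODEL = "winninghealth/WiNGPT-DocLoom-lite"
# Assumed endpoint: a local vLLM server started with `vllm serve`
ENDPOINT = "http://localhost:8000/v1/chat/completions"


def build_ocr_request(page_image_png: bytes, prompt: str) -> dict:
    """Build a chat payload with the instruction text and one page image."""
    image_b64 = base64.b64encode(page_image_png).decode("ascii")
    return {
        "model": MODEL,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        # Assumption: the page is sent inline as a PNG data URL
                        "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                    },
                ],
            }
        ],
    }


def send_request(payload: dict, endpoint: str = ENDPOINT) -> str:
    """POST the payload and return the text of the first completion choice."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

A call would look like `send_request(build_ocr_request(png_bytes, prompt))`, where `png_bytes` is a rendered page image and `prompt` is the full instruction text; both are supplied by the caller.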
Acknowledgement
We express our gratitude to the teams that developed olmOCR and Qwen3-VL, which were instrumental in our research.
Model tree for winninghealth/WiNGPT-DocLoom-lite
Base model: Qwen/Qwen3-VL-2B-Instruct