Using LoRA for idefics2-8b-chatty fine-tuning with two RTX4080 32G, gather_map error

#78

by shuminzhou26803586 - opened Oct 12, 2024

Discussion

shuminzhou26803586

Oct 12, 2024

Here is the trace info:
Traceback (most recent call last):
  File "D:\vsdev\pythontest\main.py", line 201, in <module>
    trainer.train()
  File "c:\ProgramData\anaconda3\envs\pytorchcp310cu124\lib\site-packages\transformers\trainer.py", line 1859, in train
    return inner_training_loop(
  File "c:\ProgramData\anaconda3\envs\pytorchcp310cu124\lib\site-packages\transformers\trainer.py", line 2203, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "c:\ProgramData\anaconda3\envs\pytorchcp310cu124\lib\site-packages\transformers\trainer.py", line 3138, in training_step
    loss = self.compute_loss(model, inputs)
  File "c:\ProgramData\anaconda3\envs\pytorchcp310cu124\lib\site-packages\transformers\trainer.py", line 3161, in compute_loss
    outputs = model(**inputs)
  File "c:\ProgramData\anaconda3\envs\pytorchcp310cu124\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "c:\ProgramData\anaconda3\envs\pytorchcp310cu124\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "c:\ProgramData\anaconda3\envs\pytorchcp310cu124\lib\site-packages\torch\nn\parallel\data_parallel.py", line 187, in forward
    return self.gather(outputs, self.output_device)
  File "c:\ProgramData\anaconda3\envs\pytorchcp310cu124\lib\site-packages\torch\nn\parallel\data_parallel.py", line 204, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "c:\ProgramData\anaconda3\envs\pytorchcp310cu124\lib\site-packages\torch\nn\parallel\scatter_gather.py", line 113, in gather
    res = gather_map(outputs)
  File "c:\ProgramData\anaconda3\envs\pytorchcp310cu124\lib\site-packages\torch\nn\parallel\scatter_gather.py", line 102, in gather_map
    return type(out)((k, gather_map([d[k] for d in outputs]))
  File "<string>", line 9, in __init__
  File "c:\ProgramData\anaconda3\envs\pytorchcp310cu124\lib\site-packages\transformers\utils\generic.py", line 393, in __post_init__
    for idx, element in enumerate(iterator):
  File "c:\ProgramData\anaconda3\envs\pytorchcp310cu124\lib\site-packages\torch\nn\parallel\scatter_gather.py", line 102, in <genexpr>
    return type(out)((k, gather_map([d[k] for d in outputs]))
  File "c:\ProgramData\anaconda3\envs\pytorchcp310cu124\lib\site-packages\torch\nn\parallel\scatter_gather.py", line 108, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
TypeError: DynamicCache.__init__() takes 1 positional argument but 2 were given

I stepped through the error and found the cause: the idefics2 model returns past_key_values as a DynamicCache, and gather_map tries to rebuild it by calling type(out)(...) with a positional argument, which DynamicCache.__init__() does not accept. Is there any solution for this?
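To make the failure mode concrete, here is a minimal sketch in plain Python that mimics the last branch of gather_map shown in the traceback. FakeCache and gather_like are hypothetical stand-ins written for illustration only; they are not the real DynamicCache or the real torch code, and the numeric "gather" is just a placeholder for gathering tensors.

```python
# Hypothetical stand-in for a cache class whose constructor accepts no
# positional arguments (which is what the TypeError in the traceback suggests).
class FakeCache:
    def __init__(self):
        self.layers = []

    def __iter__(self):
        return iter(self.layers)


def gather_like(outputs):
    """Simplified imitation of the final branch of gather_map."""
    out = outputs[0]
    if isinstance(out, (int, float)):
        # Placeholder for the real per-tensor gather across devices.
        return sum(outputs)
    # Rebuild the container from the gathered elements. This passes one
    # positional argument (the map object) to the container's constructor.
    return type(out)(map(gather_like, zip(*outputs)))


# Tuples can be rebuilt this way.
print(gather_like([(1, 2), (3, 4)]))  # prints (4, 6)

# A cache-like object cannot: its constructor rejects the positional argument.
try:
    gather_like([FakeCache(), FakeCache()])
except TypeError as err:
    print("TypeError:", err)
```

If this reading is right, the error comes from nn.DataParallel gathering per-GPU outputs, not from LoRA itself. Two directions that might avoid it, both untested here, are keeping the cache out of the training outputs (for example by disabling use_cache on the model config) or launching training so that nn.DataParallel is not used.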
