Error while loading the model
I'm coming across an error while trying to load the idefics-9b-instruct model. Even after downloading the model weights and loading them using model = IdeficsForVisionText2Text.from_pretrained(model_path, torch_dtype=torch.bfloat16).to("cuda:1"), I'm getting this error:
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory model_dir/idefics_9b/.
I haven't changed the names of the weights after downloading, so I'm not sure what's causing this error.
I got it working by combining the weights into a single pytorch_model.bin file. I'm now trying to run the model on a V100 GPU but I'm getting a CUDA out-of-memory error. I have two V100 GPUs available and I'm loading the model on a single GPU. Is it possible to split the weights among the two GPUs?
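For the question about splitting the weights across two GPUs, here is a minimal sketch, assuming the accelerate package is installed so that from_pretrained accepts device_map="auto" and max_memory. The helper names build_max_memory and load_sharded and the memory limits are hypothetical and only for illustration. float16 is used instead of bfloat16 on the assumption that the V100 has no native bfloat16 support.

```python
def build_max_memory(num_gpus: int, per_gpu: str = "14GiB", cpu: str = "32GiB") -> dict:
    """Per-device memory caps (hypothetical values; adjust to your cards).

    Keeping the cap below the physical GPU memory leaves room for
    activations next to the weights.
    """
    limits = {gpu_index: per_gpu for gpu_index in range(num_gpus)}
    limits["cpu"] = cpu  # overflow layers may be placed in CPU RAM
    return limits


def load_sharded(model_path: str, num_gpus: int = 2):
    """Load the checkpoint with its layers spread over the visible GPUs."""
    # Imported lazily so build_max_memory can be used without torch installed.
    import torch
    from transformers import IdeficsForVisionText2Text

    return IdeficsForVisionText2Text.from_pretrained(
        model_path,
        torch_dtype=torch.float16,   # assumption: fp16 rather than bf16 on V100
        device_map="auto",           # requires accelerate
        max_memory=build_max_memory(num_gpus),
    )
```

When device_map is used, the trailing .to("cuda:1") call should be dropped, since the layers are already placed on their devices at load time.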
Hi @AL58763
can you ls the content of your model_dir/idefics_9b/ folder? i am assuming that model_path = model_dir/idefics_9b/
Thanks for the reply. Yes, model_path = model_dir/idefics_9b/. Here are the contents of the folder:
config.json
model-00002-of-00002.safetensors
pytorch_model-00002-of-00002.bin
tokenizer.json
generation_config.json
preprocessor_config.json
pytorch_model.bin
tokenizer.model
model-00001-of-00002.safetensors
pytorch_model-00001-of-00002.bin
tokenizer_config.json
urls.txt
As I said above, I combined both the pytorch bin files into one pytorch_model.bin
@VictorSanh I can provide local paths of images in the prompts as well, right? I provided a local image and then checked inputs['pixel_values']; it's just black pixels.
I think you are missing some files in your folder. For instance, model.safetensors.index.json is the file that maps each weight to a specific location. Not having this file means that the loading logic does not know where to get each weight (i.e. from model-00002-of-00002.safetensors or model-00001-of-00002.safetensors), and as such you had to create a merged pytorch_model.bin. There might be other files missing, but that's the first one that came to mind.
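To check a local folder for this problem, a small sketch like the following may help. It assumes the usual index layout, a JSON object with a "weight_map" key that maps each parameter name to a shard file name. The function name missing_shards is hypothetical.

```python
import json
from pathlib import Path


def missing_shards(model_dir: str, index_name: str = "model.safetensors.index.json") -> list:
    """Return the shard files named in the index that are absent from model_dir.

    Raises FileNotFoundError if the index file itself is missing, which is
    the situation described above.
    """
    folder = Path(model_dir)
    index_path = folder / index_name
    if not index_path.is_file():
        raise FileNotFoundError(f"{index_name} not found in {model_dir}")

    # weight_map: parameter name -> shard file that stores it
    weight_map = json.loads(index_path.read_text())["weight_map"]
    shard_names = sorted(set(weight_map.values()))
    return [name for name in shard_names if not (folder / name).is_file()]
```

An empty list means every shard referenced by the index is present. The same check works for the .bin shards by passing index_name="pytorch_model.bin.index.json".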
@VictorSanh i can provide local paths of images in the prompts as well right?
Off the top of my head, I think you need to load the image yourself (and put it into a PIL object) if it's local.
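A minimal sketch of that suggestion, assuming Pillow is installed and that the processor accepts PIL images interleaved with text in the prompt list, as in the model card examples. The helper names load_local_image and build_prompts and the file name photo.jpg are hypothetical.

```python
def load_local_image(path: str):
    """Open a local file and return it as an RGB PIL image."""
    # Lazy import: Pillow is only needed when an image is actually loaded.
    from PIL import Image

    with Image.open(path) as img:
        # convert() returns a fully loaded copy, so the file can be closed.
        return img.convert("RGB")


def build_prompts(image, question: str) -> list:
    """Build a one-item batch in the User/Assistant prompt layout."""
    return [[f"User: {question}", image, "<end_of_utterance>", "\nAssistant:"]]


# Hypothetical usage, with `processor` loaded via AutoProcessor.from_pretrained:
# prompts = build_prompts(load_local_image("photo.jpg"), "What is in this image?")
# inputs = processor(prompts, add_end_of_utterance_token=False, return_tensors="pt")
```

Passing the PIL object instead of the path string means the processor no longer has to fetch or interpret the path itself, which may be what produced the all-black pixel_values.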
great! i'll close that discussion. feel free to re-open (or create another one) if you have other questions!