Unable to load the model
I followed the steps below (as specified in the documentation) to load the model, but it fails with this error: `stanford-oval/Llama-2-7b-WikiChat does not appear to have a file named pytorch_model-00001-of-00002.bin.` I'd appreciate any input on this.
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("stanford-oval/Llama-2-7b-WikiChat")
model = AutoModelForCausalLM.from_pretrained("stanford-oval/Llama-2-7b-WikiChat")
```
I tested this model with both HuggingFace's TGI (https://github.com/huggingface/text-generation-inference) and vLLM (https://github.com/vllm-project/vllm) and it works just fine. I'm not sure why it doesn't work directly with `transformers`; we normally don't test with it because it is much slower at inference.
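For reference, this is roughly how it can be loaded with vLLM (a minimal sketch; the prompt and sampling settings are just illustrative):

```python
# Minimal vLLM sketch: serve the model offline and run one generation.
# Assumes vLLM is installed; prompt and sampling settings are illustrative.
from vllm import LLM, SamplingParams

llm = LLM(model="stanford-oval/Llama-2-7b-WikiChat")
params = SamplingParams(temperature=0.0, max_tokens=64)

outputs = llm.generate(["Who directed the film Oppenheimer?"], params)
print(outputs[0].outputs[0].text)
```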
OK, there seems to have been an issue when converting the model weights to the `.safetensors` format. Apparently, TGI and vLLM don't rely on `model.safetensors.index.json`, but the `transformers` library does.

I've fixed both models and they should work with `transformers` now.
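If you want to double-check on your end, here is a quick sketch (assuming `huggingface_hub` is installed) that verifies every shard referenced by the index file actually exists in the repo, which is exactly what the original error was about:

```python
# Sketch: confirm that model.safetensors.index.json only references
# weight shards that are actually present in the repo.
import json
from huggingface_hub import hf_hub_download, list_repo_files

repo = "stanford-oval/Llama-2-7b-WikiChat"
files = set(list_repo_files(repo))

# transformers resolves sharded checkpoints through this index file.
index_path = hf_hub_download(repo, "model.safetensors.index.json")
with open(index_path) as f:
    index = json.load(f)

shards = set(index["weight_map"].values())
missing = shards - files
print("missing shards:", missing or "none")
```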