Runtime issue when deploying on a SageMaker endpoint #7
opened by krokoko
Hi, I'm trying to deploy the model to a SageMaker endpoint using the SDK. I extended the latest available Hugging Face DLC to install the correct version of the transformers library (4.28.0). I'm deploying the model with:
```python
from sagemaker.huggingface import HuggingFaceModel

hub = {
    'HF_MODEL_ID': 'nomic-ai/gpt4all-13b-snoozy',
    'HF_TASK': 'text-generation'
}

# create Hugging Face Model Class
huggingface_model_snoozy = HuggingFaceModel(
    image_uri=ecr_image,           # custom DLC extended with transformers 4.28.0
    transformers_version='4.28.0',
    pytorch_version='1.13.1',
    py_version='py39',
    env=hub,
    role=role,
)
```
When running a prediction, I get:
```
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "Could not load model /.sagemaker/mms/models/nomic-ai__gpt4all-13b-snoozy with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForCausalLM'>, <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>)."
}"
```
Any idea what could be happening? Thanks!
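One thing worth ruling out (a guess on my part, not confirmed): the traceback lists `transformers.models.llama.modeling_llama.LlamaForCausalLM`, and that class only exists in transformers >= 4.28.0. If the container that actually serves the model resolved to an older transformers than the one you baked into the extended DLC, the load would fail exactly like this. A stdlib-only check you could log from inside the container at startup:

```python
# Assumption: the serving container may silently be running an older
# transformers than the extended DLC intends. LlamaForCausalLM was only
# added in transformers 4.28.0, so any older version cannot load this model.
def supports_llama(transformers_version: str) -> bool:
    """Return True if the given transformers version ships LlamaForCausalLM."""
    major, minor = (int(part) for part in transformers_version.split(".")[:2])
    return (major, minor) >= (4, 28)

# To use inside the container (e.g. at the top of a custom inference script):
#   import transformers
#   print(transformers.__version__, supports_llama(transformers.__version__))
```

If the printed version is below 4.28.0, the extended image isn't the one SageMaker is actually pulling, which would be the first thing to fix.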