Cannot load model on SageMaker

#22
by jamie-relive - opened

I've just deployed the facebook/detr-resnet-50 model using the provided SageMaker SDK Python script. I hit the same issue mentioned here, so I added the 'HF_MODEL_REVISION':'no_timm' parameter. However, when I try to use the SageMaker endpoint, I get the following error message:

An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary and could not load the entire response body

Looking through the CloudWatch logs I found these two messages:
Could not load model /.sagemaker/mms/models/facebook__detr-resnet-50.no_timm with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForObjectDetection'>, <class 'transformers.models.detr.modeling_detr.DetrForObjectDetection'>).
and
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory /.sagemaker/mms/models/facebook__detr-resnet-50.no_timm.

Has anyone come across this or know how this can be resolved?
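
One way to narrow down an error like this is to check which weight files the no_timm revision actually contains and compare them with the file names listed in the OSError. The following is a minimal sketch, assuming the huggingface_hub package is installed. The helper names find_weight_files and list_revision_files are hypothetical, and treating model.safetensors as an additional weight file name is an assumption that does not come from the log.

```python
# File names the loader in the CloudWatch log looked for.
LEGACY_WEIGHT_FILES = {
    "pytorch_model.bin",
    "tf_model.h5",
    "model.ckpt.index",
    "flax_model.msgpack",
}
# Assumption: safetensors weights, if present, are stored under this name.
SAFETENSORS_FILE = "model.safetensors"


def find_weight_files(filenames):
    """Split a repo file listing into legacy and safetensors weight files."""
    names = set(filenames)
    return {
        "legacy": sorted(names & LEGACY_WEIGHT_FILES),
        "safetensors": sorted(n for n in names if n == SAFETENSORS_FILE),
    }


def list_revision_files(repo_id, revision):
    """Fetch the file listing for one revision of a Hub repo (needs network)."""
    from huggingface_hub import list_repo_files  # imported lazily

    return list_repo_files(repo_id, revision=revision)


# Example (requires network access):
# files = list_revision_files("facebook/detr-resnet-50", "no_timm")
# print(find_weight_files(files))
```

If the "legacy" list comes back empty, the file the container is looking for is simply not in that revision, which would match the OSError above.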

After further investigation, I realised that this model was created with PyTorch v1, not v2, which is what I had been trying to use. An older version of the Hugging Face inference container that runs PyTorch 1.13.1 did load the model, but then gave a new error, "Object of type ResNetConfig is not JSON serializable", so I still could not use the model.

This may be an issue with the transformers version included in that container. Ultimately, there doesn't seem to be a clear way to get this model hosted and working on SageMaker using this method.
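
For reference, the setup described in this post can be sketched roughly as follows. This is a minimal sketch, assuming the HuggingFaceModel class from the SageMaker Python SDK. The helper names build_hub_env and create_model are hypothetical, the 'object-detection' task value is an assumption, and the version triple (transformers 4.26.0, PyTorch 1.13.1, py39) is a guess at which older container was used, not something stated in the thread.

```python
def build_hub_env(model_id, task, revision=None):
    """Build the env dict passed to the endpoint container."""
    env = {"HF_MODEL_ID": model_id, "HF_TASK": task}
    if revision is not None:
        # Pin a specific branch/revision of the model repo, e.g. "no_timm".
        env["HF_MODEL_REVISION"] = revision
    return env


def create_model(role, env,
                 transformers_version="4.26.0",  # assumed version triple
                 pytorch_version="1.13.1",
                 py_version="py39"):
    """Create a HuggingFaceModel pinned to a specific container version."""
    from sagemaker.huggingface import HuggingFaceModel  # imported lazily

    return HuggingFaceModel(
        env=env,
        role=role,
        transformers_version=transformers_version,
        pytorch_version=pytorch_version,
        py_version=py_version,
    )


# Task name is an assumption; the revision is the one mentioned above.
hub = build_hub_env("facebook/detr-resnet-50", "object-detection", revision="no_timm")
```

Changing the version arguments is how the SDK selects a different inference container, which is the knob being turned when switching between PyTorch 1.x and 2.x images.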

@jamie-relive I was able to host this model on SageMaker with the following steps:

  1. Create a custom Docker image with the following Dockerfile (it just adds the missing timm library)
FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference:2.1.0-transformers4.37.0-cpu-py310-ubuntu22.04

RUN pip install --no-cache-dir timm==1.0.14
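  2. Build the image and push it to ECR
  3. Deploy the model with the URI of the image you just pushed
from sagemaker.huggingface import HuggingFaceModel

image_uri = "your_image_uri"
# hub, role and sagemaker_session are assumed to be defined as in the original deployment script
huggingface_model = HuggingFaceModel(
    image_uri=image_uri,
    env=hub,
    role=role,
    sagemaker_session=sagemaker_session,
)
  4. Invoke the model (you need to add a deserializer as well)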
from sagemaker.predictor import Predictor
from sagemaker.serializers import IdentitySerializer
from sagemaker.deserializers import JSONDeserializer

# Send raw JPEG bytes and parse the JSON response
predictor = Predictor(
    endpoint_name="your_endpoint_name",
    serializer=IdentitySerializer("image/jpeg"),
    deserializer=JSONDeserializer(),
)

# Read the image from disk and run inference
image_path = "example.jpg"
with open(image_path, "rb") as f:
    image_bytes = f.read()
result = predictor.predict(image_bytes)
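
The build, push and deploy steps above can be sketched as follows. This is a minimal sketch and not necessarily the exact procedure used here: the account ID, region, repository name and tag are placeholders, the helper names ecr_image_uri, login_command and build_and_push_commands are hypothetical, and it assumes the Docker and AWS CLIs are installed and the target ECR repository already exists. The code only assembles the image URI and the shell commands; running them is left to you.

```python
# Registry that hosts the base image referenced in the Dockerfile above.
BASE_REGISTRY = "763104351884.dkr.ecr.us-east-1.amazonaws.com"
BASE_REGION = "us-east-1"


def ecr_registry(account_id, region):
    """Return the ECR registry host for an account and region."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Return the full image URI to pass as image_uri when deploying."""
    return f"{ecr_registry(account_id, region)}/{repository}:{tag}"


def login_command(registry, region):
    """Return the shell command that logs Docker in to an ECR registry."""
    return (
        f"aws ecr get-login-password --region {region} | "
        f"docker login --username AWS --password-stdin {registry}"
    )


def build_and_push_commands(account_id, region, repository, tag="latest"):
    """Return the shell commands to build the image and push it to ECR."""
    uri = ecr_image_uri(account_id, region, repository, tag)
    return [
        login_command(BASE_REGISTRY, BASE_REGION),  # needed to pull the base image
        login_command(ecr_registry(account_id, region), region),  # needed to push
        f"docker build -t {uri} .",
        f"docker push {uri}",
    ]


# Placeholder values, replace with your own account, region and repository.
for cmd in build_and_push_commands("123456789012", "us-east-1", "detr-timm"):
    print(cmd)
```

The URI returned by ecr_image_uri is the value to use for image_uri. Deploying then typically means calling huggingface_model.deploy(initial_instance_count=1, instance_type=...) with an instance type of your choice.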

Hope this helps!
