How do I pass the text prompt and the image parameters to the predictor?
The SageMaker deploy card shows this invocation:
```python
predictor.predict({
    "inputs": "Can you please let us know more details about your ",
})
```
But this is an image+text multimodal model. How do I pass in both the text prompt and the image?
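For context, here is the payload shape I would expect to try: a base64-encoded image alongside the prompt under `"inputs"`. This is a guess on my part, not something the deploy card or the docs confirm, and the key names (`"text"`, `"image"`) are made up:

```python
import base64

# Placeholder bytes standing in for a real image,
# e.g. open("photo.jpg", "rb").read()
image_bytes = b"\x89PNG\r\n\x1a\n"

# Hypothetical payload shape -- the deploy card only shows a plain
# "inputs" string, so I don't know whether the inference container
# would accept a dict like this.
payload = {
    "inputs": {
        "text": "Can you please let us know more details about your image?",
        "image": base64.b64encode(image_bytes).decode("utf-8"),
    }
}

# predictor.predict(payload)  # does the default handler accept this?
```

Is something like this supported by the default Hugging Face inference container, or does it require a custom `inference.py` handler?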
HF Discussion 1: https://discuss.huggingface.co/t/can-text-to-image-models-be-deployed-to-a-sagemaker-endpoint/20120
HF Discussion 2: https://discuss.huggingface.co/t/how-to-use-llava-with-huggingface/52315
My SO entry: https://stackoverflow.com/questions/77193088/how-to-perform-an-inference-on-a-llava-llama-model-deployed-to-sagemake-from-hug
SO entry about a serverless deployment: https://stackoverflow.com/questions/76197446/how-to-do-model-inference-on-a-multimodal-model-from-hugginface-using-sagemaker
GitHub Discussion: https://github.com/haotian-liu/LLaVA/discussions/454
I got fed up and used replicate.com: https://stackoverflow.com/a/77364236/292502