Model never stops generating during content extraction
I asked the model to extract tabular content from a 3x5 area of a spreadsheet image, but the model never stopped generating and eventually went OOM. Is this a known issue?
I see the same thing, but when asking the model to describe a photo with tags.
@tankstarwar @Alex01837178373 Can you share your image and prompt?
Thanks for sharing! The Phi-3 Vision model is intended for use in English. In the cases above, it looks like the text inputs are in Russian.
In this case, I used Phi-3 Vision to write captions for the photo in the form of tags, and I used English.
Hi Alex,
I used this dummy tabular image created in a spreadsheet.
And a single prompt like:
{"role": "user", "content": "<|image_1|>\nextract the sales data from the table above and output in json format."}
Thanks for sharing your example.
I tried the example on Azure AI: https://ai.azure.com/explore/models/Phi-3-vision-128k-instruct/version/2/registry/azureml
The output looks pretty reasonable.
OK, thanks for the feedback. I tried the AzureML version and it does work. Previously, I was testing with a Hugging Face Space and on my local machine (no CUDA, so I disabled flash attention) and hit the issue there. I guess something could be wrong with the environment setup.
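For anyone reproducing this locally, here is a minimal sketch of a run on a machine without CUDA that caps the output length, so a runaway generation stops early instead of running out of memory. It does not diagnose the root cause, which this thread never pinned down. Everything not quoted above is an assumption: the Hugging Face model ID `microsoft/Phi-3-vision-128k-instruct`, the `_attn_implementation="eager"` setting as the way to disable flash attention, the 500-token cap, and the hypothetical helper names `build_messages`, `build_generation_kwargs`, and `run_extraction`.

```python
from typing import Any, Dict, List

# Assumed Hugging Face model ID; the thread only names the Azure AI listing.
MODEL_ID = "microsoft/Phi-3-vision-128k-instruct"


def build_messages(instruction: str) -> List[Dict[str, str]]:
    """Single-turn chat message with one image placeholder, as in the prompt above."""
    return [{"role": "user", "content": f"<|image_1|>\n{instruction}"}]


def build_generation_kwargs(eos_token_id: int, max_new_tokens: int = 500) -> Dict[str, Any]:
    """Generation settings that bound the output length (the cap is an assumed value)."""
    return {
        "max_new_tokens": max_new_tokens,  # hard stop even if EOS is never produced
        "eos_token_id": eos_token_id,      # normal stop condition
        "do_sample": False,                # greedy decoding for repeatable output
    }


def run_extraction(image_path: str, instruction: str, max_new_tokens: int = 500) -> str:
    """Load the model with eager attention and generate a length-capped answer."""
    # Heavy imports are kept local so the helpers above work without them.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        trust_remote_code=True,
        torch_dtype="auto",
        _attn_implementation="eager",  # assumed setting for machines without flash attention
    )

    prompt = processor.tokenizer.apply_chat_template(
        build_messages(instruction), tokenize=False, add_generation_prompt=True
    )
    inputs = processor(prompt, [Image.open(image_path)], return_tensors="pt")

    kwargs = build_generation_kwargs(processor.tokenizer.eos_token_id, max_new_tokens)
    output_ids = model.generate(**inputs, **kwargs)

    # Drop the prompt tokens so only the newly generated text is decoded.
    new_ids = output_ids[:, inputs["input_ids"].shape[1]:]
    return processor.batch_decode(
        new_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False
    )[0]
```

Calling `run_extraction("table.png", "extract the sales data from the table above and output in json format.")` (with a hypothetical `table.png`) should return at most 500 new tokens. If the answer is still cut off at the cap, that points at the environment rather than the prompt.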