The model can't stop generating during content extraction

by tankstarwar - opened May 22

May 22

I asked the model to extract tabular content from a 3x5 area of a spreadsheet image, but it never stopped generating and eventually ran out of memory (OOM). Is this a known issue?

donniems

Microsoft org May 22

@tankstarwar We have noticed this looping issue, and our language team is actively working on it.

Alex01837178373

May 24

I see the same issue, but when asking the model to describe a photo with tags.

nguyenbh

Microsoft org May 24

@tankstarwar @Alex01837178373 Can you share your image and prompt?

Alex01837178373

May 24

@tankstarwar @Alex01837178373 Can you share your image and prompt?

nguyenbh

Microsoft org May 24

Thanks for sharing! The Phi-3 Vision model is intended for use in English. In the cases above, it looks like the text inputs are in Russian.

Alex01837178373

May 25

Thanks for sharing! The Phi-3 Vision model is intended for use in English. In the cases above, it looks like the text inputs are in Russian.

In this case, I used Phi-3 Vision to write captions for the photo in the form of tags, and my prompt was in English.

tankstarwar

May 28

@tankstarwar @Alex01837178373 Can you share your image and prompt?

Hi Alex,

I used this dummy tabular image, created in a spreadsheet.

And a single prompt like:
{"role": "user", "content": "<|image_1|>\nextract the sales data from the table above and output in json format."}
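
For anyone reproducing this, here is a minimal sketch of how a message like the one above could be built, together with generation settings that cap the output length. The helper names `build_messages` and `bounded_generation_args` are hypothetical and used only for illustration. `max_new_tokens` and `do_sample` are standard Hugging Face `generate()` arguments, but the value 512 is an assumption, not a recommendation from the model authors. A cap will not fix looping, but it should keep a runaway generation from growing until memory is exhausted.

```python
from __future__ import annotations


def build_messages(instruction: str, image_index: int = 1) -> list[dict]:
    """Build a single-turn chat message with an image placeholder.

    The "<|image_N|>" placeholder format is taken from the prompt quoted
    in this thread; the helper itself is hypothetical.
    """
    return [{"role": "user", "content": f"<|image_{image_index}|>\n{instruction}"}]


def bounded_generation_args(max_new_tokens: int = 512) -> dict:
    """Return generation settings with a hard cap on output length.

    Assumption: 512 tokens is enough for a small table in JSON form.
    Adjust the cap for larger outputs.
    """
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    # Greedy decoding (do_sample=False) keeps runs reproducible for debugging.
    return {"max_new_tokens": max_new_tokens, "do_sample": False}


messages = build_messages(
    "extract the sales data from the table above and output in json format."
)
generation_args = bounded_generation_args()
```

The resulting `generation_args` dictionary could then be unpacked into a `model.generate(...)` call, for example `model.generate(**inputs, **generation_args)`, assuming `inputs` comes from the model's processor.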

nguyenbh

Microsoft org Jun 17

Thanks for sharing your example.
I am trying the example on Azure AI https://ai.azure.com/explore/models/Phi-3-vision-128k-instruct/version/2/registry/azureml

It looks pretty reasonable.

tankstarwar

Jun 20

Thanks for sharing your example.
I am trying the example on Azure AI https://ai.azure.com/explore/models/Phi-3-vision-128k-instruct/version/2/registry/azureml

It looks pretty reasonable.

OK, thanks for the feedback. I tried the AzureML version and it does work. Previously, I was testing with a Hugging Face Space and also on my local machine (no CUDA, so I disabled flash attention), and I hit the issue in both. I guess something could be wrong with my environment setup.
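
In case it helps others who hit the same looping locally, below is a rough sketch of a generic guard that checks whether the generated token IDs have started repeating. This is not a fix from the model authors and is not specific to Phi-3 Vision. The function name `tail_repeats` and the thresholds `max_period=64` and `min_repeats=4` are assumptions chosen for illustration.

```python
from typing import Sequence


def tail_repeats(token_ids: Sequence[int], max_period: int = 64, min_repeats: int = 4) -> bool:
    """Return True if the sequence ends with one block repeated back to back.

    A "block" is any run of 1..max_period tokens. The check passes when the
    last (period * min_repeats) tokens consist of that block repeated
    min_repeats times.
    """
    n = len(token_ids)
    for period in range(1, max_period + 1):
        needed = period * min_repeats
        if needed > n:
            break  # longer periods need even more tokens, so stop here
        tail = token_ids[n - needed:]
        block = tail[:period]
        if all(tail[i] == block[i % period] for i in range(needed)):
            return True
    return False


# A three-token block repeated four times is flagged as a loop.
looping = tail_repeats([1, 2, 3] * 4)
# A sequence with no repetition is not flagged.
normal = tail_repeats([1, 2, 3, 4, 5, 6, 7, 8])
```

One possible way to use this would be to call it periodically on the newly generated tokens, for example from a custom `StoppingCriteria` subclass in `transformers`, and stop generation once it returns True. Legitimate output such as a table with many identical cells can also repeat, so the thresholds would need tuning.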

nguyenbh changed discussion status to closed Jul 9
