RaushanTurganbay (HF staff) committed
Commit 12a6750 · verified · 1 parent: 07f3bcb

Update pipeline example

Files changed (1): README.md (+7 -20)
README.md CHANGED
@@ -10,7 +10,6 @@ tags:
 datasets:
 - lmms-lab/LLaVA-OneVision-Data
 pipeline_tag: image-text-to-text
-inference: false
 arxiv: 2408.03326
 library_name: transformers
 ---
@@ -58,36 +57,24 @@ Below we used [`"llava-hf/llava-onevision-qwen2-0.5b-ov-hf"`](https://huggingfac
 
 ```python
 from transformers import pipeline
-from PIL import Image
-import requests
-from transformers import AutoProcessor
-
-
-model_id = "llava-hf/llava-onevision-qwen2-0.5b-ov-hf"
-processor = AutoProcessor.from_pretrained(model_id)
-pipe = pipeline("image-to-text", model=model_id)
-url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/ai2d-demo.jpg"
-image = Image.open(requests.get(url, stream=True).raw)
 
-# Define a chat history and use `apply_chat_template` to get correctly formatted prompt
-# Each value in "content" has to be a list of dicts with types ("text", "image")
-conversation = [
+pipe = pipeline("image-text-to-text", model="llava-hf/llava-onevision-qwen2-0.5b-ov-hf")
+messages = [
     {
-
         "role": "user",
         "content": [
+            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/ai2d-demo.jpg"},
             {"type": "text", "text": "What does the label 15 represent? (1) lava (2) core (3) tunnel (4) ash cloud"},
-            {"type": "image"},
         ],
     },
 ]
-prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
 
-outputs = pipe(image, prompt=prompt, generate_kwargs={"max_new_tokens": 200})
-print(outputs)
->>> {"generated_text": "user\n\nWhat does the label 15 represent? (1) lava (2) core (3) tunnel (4) ash cloud\nassistant\nLava"}
+out = pipe(text=messages, max_new_tokens=20)
+print(out)
+>>> [{'input_text': [{'role': 'user', 'content': [{'type': 'image', 'url': 'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/ai2d-demo.jpg'}, {'type': 'text', 'text': 'What does the label 15 represent? (1) lava (2) core (3) tunnel (4) ash cloud'}]}], 'generated_text': 'Lava'}]
 ```
 
+
 ### Using pure `transformers`:
 
 Below is an example script to run generation in `float16` precision on a GPU device:
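
Note on the pipeline change in this commit (separate from the `float16` script referenced above): the new example passes the whole chat, image URL included, as `text=messages`, so the caller no longer loads an `AutoProcessor` or calls `apply_chat_template` as the removed lines did. The following is a minimal sketch of how that payload could be assembled and handed to the pipeline. `build_messages`, `run_pipeline`, and `MODEL_ID` are hypothetical names introduced here for illustration, not part of the README or of `transformers`, and the repo id assumes the `llava-hf/` namespace shown in the hunk header.

```python
from typing import Any, Dict, List

# Assumption: full repo id, including the `llava-hf/` namespace from the hunk header.
MODEL_ID = "llava-hf/llava-onevision-qwen2-0.5b-ov-hf"


def build_messages(image_url: str, question: str) -> List[Dict[str, Any]]:
    """Return a single-turn chat with one image part followed by one text part."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]


def run_pipeline(messages: List[Dict[str, Any]], max_new_tokens: int = 20):
    """Send the chat to the image-text-to-text pipeline (downloads the model on first use)."""
    # Imported lazily so build_messages stays usable without transformers installed.
    from transformers import pipeline

    pipe = pipeline("image-text-to-text", model=MODEL_ID)
    return pipe(text=messages, max_new_tokens=max_new_tokens)


# Same image and question as the README example in the diff.
messages = build_messages(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/ai2d-demo.jpg",
    "What does the label 15 represent? (1) lava (2) core (3) tunnel (4) ash cloud",
)
```

Calling `run_pipeline(messages)` would then mirror the `pipe(text=messages, max_new_tokens=20)` call added in the diff. Keeping the payload construction separate makes it easy to check the message structure without downloading the checkpoint.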