Commit 18c3026 (verified) by jymcc
1 Parent(s): 46a736b

Update README.md

Files changed (1)
  1. README.md +101 -3
README.md CHANGED
@@ -1,3 +1,101 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ datasets:
+ - FreedomIntelligence/PubMedVision
+ language:
+ - en
+ - zh
+ pipeline_tag: text-generation
+ tags:
+ - vision
+ - image-text-to-text
+ ---
+ <div align="center">
+ <h1>
+ HuatuoGPT-Vision-34B-hf
+ </h1>
+ </div>
+
+ <div align="center">
+ <a href="https://github.com/FreedomIntelligence/HuatuoGPT-Vision" target="_blank">GitHub</a> | <a href="https://arxiv.org/abs/2406.19280" target="_blank">Paper</a>
+ </div>
+
+ ## Introduction
+
+ This is the Hugging Face LLaVA version of HuatuoGPT-Vision-34B, which is compatible with vLLM and other frameworks. The original model is available here: [HuatuoGPT-Vision-34B](https://huggingface.co/FreedomIntelligence/HuatuoGPT-Vision-34B).
+
+ # Quick Start
29
+ ### 1. Deploy the model using [VLLM](https://github.com/vllm-project/vllm/tree/main)
30
+ ```bash
31
+ python -m vllm.entrypoints.openai.api_server \
32
+ --model huatuogpt_vision_model_path \
33
+ --tensor_parallel_size 2 \
34
+ --gpu_memory_utilization 0.8 \
35
+ --served-model-name huatuogpt_vision_34b \
36
+ --chat-template "{%- if messages[0]['role'] == 'system' -%}\n {%- set system_message = messages[0]['content'] -%}\n {%- set messages = messages[1:] -%}\n{%- else -%}\n {% set system_message = '' -%}\n{%- endif -%}\n\n{%- for message in messages -%}\n {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}\n {{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}\n {%- endif -%}\n\n {%- if message['role'] == 'user' -%}\n {{ '<|user|>\n' + message['content'] + '\n' }}\n {%- elif message['role'] == 'assistant' -%}\n {{ '<|assistant|>\n' + message['content'] + '\n' }}\n {%- endif -%}\n{%- endfor -%}\n\n{%- if add_generation_prompt -%}\n {{ '<|assistant|>' }}\n{% endif %}" \
37
+ --port 9559 --max-model-len 2048 > vllm_openai_server.log 2>&1 &
38
+ ```
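The chat template in the command above is hard to read as a single escaped line. The following is a minimal sketch of the prompt it appears to produce, written as plain Python so it can be inspected without a running server. The function name `render_prompt` is hypothetical and not part of vLLM or this repository. It assumes the server decodes the `\n` escapes into newlines and that message content is a plain string; the exact trailing whitespace may differ depending on how the server's Jinja environment is configured.

```python
def render_prompt(messages, add_generation_prompt=True):
    """Hypothetical pure-Python mirror of the Jinja chat template shown above."""
    # The template reads a leading system message but never emits it.
    if messages and messages[0]["role"] == "system":
        messages = messages[1:]

    parts = []
    for index, message in enumerate(messages):
        # Turns must alternate user/assistant, starting with a user turn.
        if (message["role"] == "user") != (index % 2 == 0):
            raise ValueError(
                "Conversation roles must alternate user/assistant/user/assistant/..."
            )
        if message["role"] == "user":
            parts.append("<|user|>\n" + message["content"] + "\n")
        elif message["role"] == "assistant":
            parts.append("<|assistant|>\n" + message["content"] + "\n")

    if add_generation_prompt:
        # Assumed trailing newline: the line break before the final endif tag.
        parts.append("<|assistant|>\n")
    return "".join(parts)


if __name__ == "__main__":
    demo = [{"role": "user", "content": "<image>\nWhat does the picture show?"}]
    print(render_prompt(demo))
```

Note that, read this way, a system message is accepted but does not reach the model, so any instructions would need to go into the first user turn.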
+
+ ### 2. Model inference
+ ```python
+ from openai import OpenAI
+ from PIL import Image
+ import base64
+ import io
+
+ def get_image(image_path):
+     image = Image.open(image_path)
+     # Read the format before convert(): the converted copy reports format=None.
+     img_type = (image.format or 'PNG').lower()
+     byte_arr = io.BytesIO()
+     # Re-encode as RGB in the source format (PNG if none was detected).
+     image.convert('RGB').save(byte_arr, format=img_type)
+     # Base64-encode the bytes so they can be embedded in a data URL.
+     image = base64.b64encode(byte_arr.getvalue()).decode()
+     return image, img_type
+
+
+ client = OpenAI(
+     base_url="http://localhost:9559/v1",
+     api_key="token-abc123"
+ )
+ image_path = 'your_image_path'
+ image, img_type = get_image(image_path)
+
+
+ inputcontent = [{
+     "type": "text",
+     "text": '<image>\nWhat does the picture show?'
+ }]
+
+ inputcontent.append({
+     "type": "image_url",
+     "image_url": {
+         "url": f"data:image/{img_type};base64,{image}"
+     }
+ })
+
+ response = client.chat.completions.create(
+     model="huatuogpt_vision_34b",
+     messages=[
+         {"role": "user", "content": inputcontent}
+     ],
+     temperature=0.2
+ )
+ print(response.choices[0].message.content)
+ ```
89
+ # <span id="Start">Citation</span>
90
+
91
+ ```
92
+ @misc{chen2024huatuogptvisioninjectingmedicalvisual,
93
+ title={HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale},
94
+ author={Junying Chen and Ruyi Ouyang and Anningzhe Gao and Shunian Chen and Guiming Hardy Chen and Xidong Wang and Ruifei Zhang and Zhenyang Cai and Ke Ji and Guangjun Yu and Xiang Wan and Benyou Wang},
95
+ year={2024},
96
+ eprint={2406.19280},
97
+ archivePrefix={arXiv},
98
+ primaryClass={cs.CV},
99
+ url={https://arxiv.org/abs/2406.19280},
100
+ }
101
+ ```