---
license: other
pipeline_tag: text-generation
---
|
|
|
|
|
<p align="center"> |
|
<img src="logo_en.png" width="400"/> |
|
</p>
|
|
|
<p align="center"> |
|
<b><font size="6">InternLM-XComposer2</font></b> |
|
</p>
|
|
|
<div align="center"> |
|
|
|
[💻Github Repo](https://github.com/InternLM/InternLM-XComposer) |
|
|
|
[Paper](https://arxiv.org/abs/2401.16420) |
|
|
|
</div> |
|
|
|
**InternLM-XComposer2** is a vision-language large model (VLLM) based on [InternLM2](https://github.com/InternLM/InternLM) for advanced text-image comprehension and composition. |
|
|
|
We release the InternLM-XComposer2 series in two versions:

- InternLM-XComposer2-VL: The pretrained VLLM, with its LLM initialized from InternLM2, achieving strong performance on various multimodal benchmarks.
- InternLM-XComposer2: The finetuned VLLM for *Free-form Interleaved Text-Image Composition*.
|
|
|
### Import from Transformers |
|
To load the InternLM-XComposer2-VL-7B model using Transformers, use the following code: |
|
```python
import torch
from PIL import Image  # unused while loading; kept from the original for later image handling
from transformers import AutoTokenizer, AutoModelForCausalLM

ckpt_path = "internlm/internlm-xcomposer2-vl-7b"

# Tokenizers are plain Python objects with no `.cuda()` method, so load without it.
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)

# Set `torch_dtype=torch.float16` to load the model in float16; otherwise it is
# loaded as float32 and may cause an out-of-memory (OOM) error.
model = AutoModelForCausalLM.from_pretrained(ckpt_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()
model = model.eval()
```
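To make the float16 advice concrete, here is a rough back-of-the-envelope sketch of the memory needed for the weights alone. The parameter count of 7e9 is an assumption read off the "7B" in the model name, not an exact figure, and `weight_memory_gib` is a hypothetical helper rather than part of any library. Activations and the vision encoder's working memory are ignored.

```python
# Approximate size of one parameter in bytes for each dtype.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the approximate memory for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


if __name__ == "__main__":
    # 7e9 is an assumed parameter count based on the "7B" model name.
    for dtype in ("float32", "float16"):
        print(f"{dtype}: ~{weight_memory_gib(7e9, dtype):.1f} GiB")
```

Under that assumption the weights take roughly 26 GiB in float32 and about 13 GiB in float16, which is why a float32 load can exhaust a 24 GiB GPU while a float16 load may still fit.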
|
|
|
|
### Open Source License |
|
The code is licensed under Apache-2.0. The model weights are fully open for academic research and also allow free commercial usage. To apply for a commercial license, please fill in the application form (English) / application form (Chinese). For other questions or collaborations, please contact [email protected].