---
language:
- ko
- en
license: cc-by-nc-sa-4.0
library_name: transformers
---
# Llama-3-KoEn-8B-xtuner-llava-preview
<!-- Provide a quick summary of what the model is/does. -->
Llama-3-KoEn-8B-xtuner-llava-preview is a Korean multimodal model based on the Llava architecture, merged using the [ChatVector](https://arxiv.org/abs/2310.04799) method from two models:
1) [beomi/Llama-3-KoEn-8B-preview](https://huggingface.co./beomi/Llama-3-KoEn-8B-preview)
2) [xtuner/llava-llama-3-8b-transformers](https://huggingface.co./xtuner/llava-llama-3-8b-transformers)
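
As a rough illustration of the ChatVector idea cited above: the paper forms a "chat vector" by subtracting a base model's weights from those of its tuned counterpart, then adds that difference to a continually pre-trained model. The sketch below applies this arithmetic to toy weights held in plain Python dicts. The parameter names, the numbers, the `skip` option, and the choice of which checkpoint plays which role are illustrative assumptions, not the exact recipe used for this model.

```python
# Toy ChatVector-style merge on plain Python lists (no real checkpoints involved).
# All names and numbers below are hypothetical and only illustrate the arithmetic.

def chat_vector_merge(base, tuned, target, skip=()):
    """Return target + (tuned - base) for every parameter not listed in `skip`."""
    merged = {}
    for name, target_w in target.items():
        if name in skip or name not in base or name not in tuned:
            # Leave the target weights untouched for skipped or unmatched parameters.
            merged[name] = list(target_w)
            continue
        merged[name] = [
            t + (c - b) for t, c, b in zip(target_w, tuned[name], base[name])
        ]
    return merged


base = {"layer.0.weight": [1.0, 2.0], "embed.weight": [0.5, 0.5]}
tuned = {"layer.0.weight": [1.5, 1.0], "embed.weight": [0.75, 0.25]}
target = {"layer.0.weight": [2.0, 2.0], "embed.weight": [4.0, 4.0]}

merged = chat_vector_merge(base, tuned, target, skip=("embed.weight",))
print(merged)
```

With real checkpoints the same per-parameter arithmetic would run over the tensors of each state dict. Which layers to skip is a design choice that this sketch leaves open.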
## Model Details
### Model Description
- **Developed by:** Junbum Lee (Beomi)
- **Model type:** HuggingFace Llava
- **Language(s) (NLP):** Korean, English
- **License:** cc-by-nc-sa-4.0 under Llama3 License
- **Merged from model:** [beomi/Llama-3-KoEn-8B-preview](https://huggingface.co./beomi/Llama-3-KoEn-8B-preview) & [xtuner/llava-llama-3-8b-transformers](https://huggingface.co./xtuner/llava-llama-3-8b-transformers)
### Direct Use
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
![Cat walking on frozen Han-River, Seoul](https://cdn-uploads.huggingface.co/production/uploads/5e56829137cb5b49818287ea/NWfoArWI4UPAxpEnolkwT.jpeg)
> Two versions are recommended:
>
> v1. `revision='a38aac3'`: Basic ChatVector, with the [25B+ trained KoEn checkpoint (rev. d4d25a2)](https://huggingface.co./beomi/Llama-3-KoEn-8B-preview/commit/d4d25a2).
>
> v1-1. `revision='0224971'`: Basic ChatVector, with the [40B+ trained KoEn checkpoint (rev. ad39b32)](https://huggingface.co./beomi/Llama-3-KoEn-8B-preview/commit/ad39b32cd4207f37f61f16e79d3f4020c5b744ef).
>
> v1-2. `revision='170746c'`: Basic ChatVector, with the [80B+ trained KoEn checkpoint (rev. b4c45ab)](https://huggingface.co./beomi/Llama-3-KoEn-8B-preview/commit/b4c45ab3355c6ccb9bb1ecdf8a75ded4d6620c7e).
>
> v2. `revision='4f04d1e'`: Model-diff-based merging (ref. https://huggingface.co./blog/maywell/llm-feature-transfer), with the [25B+ trained KoEn checkpoint (rev. d4d25a2)](https://huggingface.co./beomi/Llama-3-KoEn-8B-preview/commit/d4d25a2).
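
To make the choice of revision explicit in code, one option is a small lookup table keyed by the version labels above. The hashes are copied from that list. The `REVISIONS` name and the `pick_revision` helper are hypothetical conveniences, not part of the model repository.

```python
# Hypothetical helper: map the version labels listed above to their revision hashes.
REVISIONS = {
    "v1": "a38aac3",    # Basic ChatVector, 25B+ trained KoEn checkpoint
    "v1-1": "0224971",  # Basic ChatVector, 40B+ trained KoEn checkpoint
    "v1-2": "170746c",  # Basic ChatVector, 80B+ trained KoEn checkpoint
    "v2": "4f04d1e",    # Model-diff-based merging, 25B+ trained KoEn checkpoint
}


def pick_revision(version="v1"):
    """Return the revision hash for a version label, or raise on an unknown label."""
    try:
        return REVISIONS[version]
    except KeyError:
        known = ", ".join(sorted(REVISIONS))
        raise ValueError(f"Unknown version {version!r}; expected one of: {known}")


print(pick_revision("v2"))
```

The returned string can be passed as the `revision` argument of `from_pretrained`.
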
```python
import requests
from PIL import Image

import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "beomi/Llama-3-KoEn-8B-xtuner-llava-preview"

model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype='auto',
    device_map='auto',
    revision='a38aac3',  # 'a38aac3' for basic ChatVector, '4f04d1e' for model-diff-based merging (ref. https://huggingface.co./blog/maywell/llm-feature-transfer)
)
processor = AutoProcessor.from_pretrained(model_id)

# Stop generation at either the tokenizer's EOS token or the Llama-3 end-of-turn token.
tokenizer = processor.tokenizer
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

# Korean prompt: "Please describe this image."
# The assistant turn is primed with "이 이미지에는" ("In this image,").
prompt = ("<|start_header_id|>user<|end_header_id|>\n\n<image>\n이 이미지에 대해서 설명해주세요.<|eot_id|>"
          "<|start_header_id|>assistant<|end_header_id|>\n\n이 이미지에는")
image_file = "https://cdn-uploads.huggingface.co/production/uploads/5e56829137cb5b49818287ea/NWfoArWI4UPAxpEnolkwT.jpeg"

raw_image = Image.open(requests.get(image_file, stream=True).raw)
# Keyword arguments keep the call independent of the processor's positional argument order.
inputs = processor(text=prompt, images=raw_image, return_tensors='pt').to(0, torch.float16)

output = model.generate(**inputs, max_new_tokens=400, do_sample=True, eos_token_id=terminators,)
# Skip the first two tokens of the sequence when decoding.
print(processor.decode(output[0][2:], skip_special_tokens=False))

# --- Example Output [v1, Chat Vector] (translated from Korean) ---
# user<|end_header_id|>
# <image>
# Please describe this image.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
# This image shows a cat walking across river water. The cat glides over the river's ripples and looks adept at crossing the river. The cat's paws sink well into the water, and it walks along enjoying it.
# In addition, this image was produced from voice-recorded or video-recorded material and mainly gives a strong view of the cat. Various sound effects are added to convey the cat's story in different ways. The river shows ripples, and the ripples are used to present a more detailed scene so that the cat walking on the water is emphasized more vividly.<|eot_id|>

# --- Example Output [v1-1, Chat Vector] (translated from Korean) ---
# user<|end_header_id|>
# <image>
# Please describe this image.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
# In this image, a cat can be seen walking through the sea off the west coast. The cat is walking from the beach into the sea, and gentle waves are shown rolling in around it. Unlike other cats and dogs, this cat has experienced the sea since birth and is adapting to it. The cat likes the sea and seems to feel happy in this environment. This cat lives in this environment as a part of nature rather than of the human world.<|eot_id|>

# --- Example Output [v1-2, Chat Vector] (translated from Korean) ---
# model.generate(**inputs, max_new_tokens=200, do_sample=True, top_p=0.7, eos_token_id=terminators,)
# user<|end_header_id|>
# <image>
# Please describe this image.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
# This image is a photo capturing a cat walking on water. The cat is walking across the water on two feet. Of its four paws, two are not submerged and two are submerged in the water. The part where the cat's paws are submerged is reflected in the water. The sight of the cat walking on the water is truly cute and lovable. This photo was broadcast on KBS Animal Kingdom. KBS Animal Kingdom is KBS's flagship nature documentary program, which aired for 34 years from 1985 to 2019. KBS Animal Kingdom aims to contribute to understanding and protecting animals' ecology, habits, behavior, and the natural environment.

# --- Example Output [v2, Model diff based merging] (translated from Korean) ---
# user<|end_header_id|>
# <image>
# Please describe this image.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
# This image shows a cat stepping into the water and walking, together with a Korean subtitle. The cat is walking with its right paw dipped in the water, and the Korean subtitle contains the sentence "Cats like water." The subtitle emphasizes that the cat likes water.<|eot_id|>
```
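
The prompt string in the example is assembled by hand. A minimal sketch of a helper that builds the same layout for an arbitrary question follows. `build_prompt` and its parameters are hypothetical names, and the template simply mirrors the string used in the example rather than any official chat template.

```python
def build_prompt(question, answer_prefix=""):
    """Build a single-turn image prompt in the same layout as the example above."""
    return (
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"<image>\n{question}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
        f"{answer_prefix}"
    )


# "Please describe this image.", with the answer primed by "In this image,".
prompt = build_prompt("이 이미지에 대해서 설명해주세요.", answer_prefix="이 이미지에는")
print(prompt)
```

Leaving `answer_prefix` empty lets the model open its reply freely, while a short prefix such as the one above nudges it toward answering in Korean.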