---
language:
- en
- ko
license: llama3.1
library_name: transformers
base_model:
- meta-llama/Meta-Llama-3.1-405B
---
<a href="https://github.com/MLP-Lab/Bllossom">
<img src="https://github.com/teddysum/bllossom/blob/main//bllossom_icon.png?raw=true" width="30%" height="30%">
</a>
# Update!
* [2024.08.08] The preview model has been released for the first time. Training is still in progress on roughly 120 A100 GPUs, so the model will continue to be updated.
# Bllossom | [Demo]() | [Homepage](https://www.bllossom.ai/) | [Github](https://github.com/MLP-Lab/Bllossom) |
<!-- [Colab code example for GPU](https://colab.research.google.com/drive/1fBOzUVZ6NRKk_ugeoTbAOokWKqSN47IG?usp=sharing) | -->
<!-- [Colab code example for the quantized model on CPU](https://colab.research.google.com/drive/129ZNVg5R2NPghUEFHKF0BRdxsZxinQcJ?usp=drive_link) -->
```bash
The Bllossom team is releasing Bllossom-405B, a Korean-English bilingual language model based on Llama 3.1.
This Bllossom3.1-405B is a preview version with the following features:
- Korean performance is 5-10% higher than Llama3.1-405B-Inst (single-turn evaluation).
- It is a fully bilingual model that does not degrade the English performance of Llama 3.1 at all.
- It generates more natural and friendly Korean sentences than existing models.
- In human evaluation and GPT evaluation (MT-Bench, LogicKor score of 9, etc.), it performs on par with or slightly below GPT-4.
This model was built on the following collaborations:
- The lightweight pre-training technique of the MLP Lab at Seoultech was applied.
- Teddysum's refined instruction tuning and RAG techniques were applied.
- HP provided computing support.
- The Oscar team at the Common Crawl Foundation actively provided data support.
As always, this model is available for commercial use. With just six A100s you can build your own model on top of Bllossom; you no longer need GPT-4.
If GPU resources are limited, try the quantized model on three A100s or four A6000s. [Quantized model](https://huggingface.co./MLP-KTLim/llama-3.1-Korean-Bllossom-405B-gguf-Q4_K_M)
1. Bllossom-8B is a free, practically oriented language model built in collaboration with linguists from Seoultech, Teddysum, and the Yonsei University Language Resources Lab. It has been maintained through continuous updates since 2023. Please make good use of it 🙂
2. We also have a far more powerful Advanced-Bllossom model and a vision-language model. (Contact us individually if you are interested.)
3. Bllossom was presented at NAACL 2024 and LREC-COLING 2024 (oral).
4. We will keep releasing better language models. Anyone who wants to collaborate on strengthening Korean (especially on papers) is always welcome.
Teams that can lend even a small number of GPUs are also welcome to contact us at any time; we will help you build what you want.
```
```bash
The Bllossom language model is a Korean-English bilingual language model based on the open-source Llama 3.1. It strengthens the connection of knowledge between Korean and English. It has the following features:
- Korean performance improved by 5-10% compared to Llama 3.1-405B-Inst (on Single Turn Eval).
- A complete bilingual model that does not compromise the English performance of Llama 3.1.
- Generates more natural and friendly Korean sentences compared to existing models.
- Human evaluations and GPT evaluations (MT-Bench, LogicKor scoring 9, etc.) show performance similar to or slightly lower than GPT-4.
```
**This model was developed by [MLPLab at Seoultech](http://mlp.seoultech.ac.kr), [Teddysum](http://teddysum.ai/) and [Yonsei Univ](https://sites.google.com/view/hansaemkim/hansaem-kim)**
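The GPU counts quoted in the introduction can be put in context with a weights-only memory estimate. The sketch below is a back-of-the-envelope calculation, not a measurement: the 405e9 parameter count is taken from the model name, the 80 GB A100 variant and the figure of about 4 bits per weight for the quantized build are assumptions, and the helper names `weights_gb` and `min_gpus` are hypothetical.

```python
import math


def weights_gb(num_params: float, bits_per_weight: float) -> float:
    """Weights-only footprint in gigabytes (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def min_gpus(total_gb: float, gpu_memory_gb: float) -> int:
    """Smallest GPU count whose combined memory holds the weights alone."""
    return math.ceil(total_gb / gpu_memory_gb)


NUM_PARAMS = 405e9  # nominal size, taken from the model name
A100_GB = 80        # assumption: the 80 GB A100 variant

bf16_gb = weights_gb(NUM_PARAMS, 16)  # bfloat16 stores 16 bits per weight
q4_gb = weights_gb(NUM_PARAMS, 4)     # assumption: about 4 bits per weight

print(f"bf16 weights: {bf16_gb:.1f} GB, at least {min_gpus(bf16_gb, A100_GB)} x A100")
print(f"4-bit weights: {q4_gb:.1f} GB, at least {min_gpus(q4_gb, A100_GB)} x A100")
```

The estimate covers weights only and ignores activations, the KV cache, and any CPU or disk offloading, so it is a lower bound on GPU memory rather than a deployment recipe.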
## Example code
### Colab Tutorial
- [Inference-Code-Link](https://colab.research.google.com/drive/1fBOzUVZ6NRKk_ugeoTbAOokWKqSN47IG?usp=sharing)
### Install Dependencies
```bash
pip install torch "transformers>=4.43.0" accelerate
```
### Python code with Pipeline
```python
import torch
import transformers

model_id = "Bllossom/llama-3.1-Korean-Bllossom-405B"

# Build a text-generation pipeline; device_map="auto" lets accelerate place the weights.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)
pipeline.model.eval()

# System prompt in English followed by the same instruction in Korean.
PROMPT = '''You are a helpful AI assistant. Please answer the user's questions kindly. 당신은 유능한 AI 어시스턴트 입니다. 사용자의 질문에 대해 친절하게 답변해주세요.'''
# User request: "Could you put together a tour course of Seoul's famous sights?"
instruction = "서울의 유명한 관광 코스를 만들어줄래?"

messages = [
    {"role": "system", "content": PROMPT},
    {"role": "user", "content": instruction},
]

# Render the messages into one prompt string that ends with the assistant header.
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Stop on either the regular EOS token or the end-of-turn token.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=2048,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# generated_text contains the prompt followed by the completion; print only the completion.
print(outputs[0]["generated_text"][len(prompt):])
```
```
# 물론이죠! 서울은 다양한 문화와 역사, 자연을 겸비한 도시로, 많은 관광 명소를 자랑합니다. 여기 서울의 유명한 관광 코스를 소개해 드릴게요.
### 코스 1: 역사와 문화 탐방
1. **경복궁**
- 서울의 대표적인 궁궐로, 조선 왕조의 역사와 문화를 체험할 수 있는 곳입니다.
2. **북촌 한옥마을**
- 전통 한옥이 잘 보존된 마을로, 조선시대의 생활상을 느낄 수 있습니다.
...
```
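The `apply_chat_template` call in the example turns the message list into a single prompt string, and `<|eot_id|>` appears among the terminators because every turn ends with that token. The sketch below hand-renders a simplified Llama 3 style layout to make this visible. It is an approximation under stated assumptions: the function name `render_llama3_style` is hypothetical, and the template shipped with the model's tokenizer is authoritative and may insert extra text (for example date lines in the system turn).

```python
def render_llama3_style(messages, add_generation_prompt=True):
    """Simplified, hand-written approximation of a Llama 3 style chat layout."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the content, then <|eot_id|>.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content']}<|eot_id|>"
        )
    if add_generation_prompt:
        # An open assistant header asks the model to write the next turn.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Suggest a one-day tour of Seoul."},
]
prompt = render_llama3_style(messages)
print(prompt)

# The pipeline returns prompt + completion, so slicing by len(prompt)
# keeps only the newly generated text (placeholder completion shown here).
generated_text = prompt + "Sure! Start at Gyeongbokgung Palace."
completion = generated_text[len(prompt):]
print(completion)
```

Because the model closes its own turn with `<|eot_id|>`, passing that token id as an additional `eos_token_id` lets generation stop at the end of the assistant's answer rather than running on to `max_new_tokens`.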
## Supported by
- Hewlett Packard (HP) Enterprise <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/4/46/Hewlett_Packard_Enterprise_logo.svg/2880px-Hewlett_Packard_Enterprise_logo.svg.png" width="20%" height="20%">
- Common Crawl <img src="https://cdn.prod.website-files.com/6479b8d98bf5dcb4a69c4f31/649b5869af56f6df617cfb1f_CC_Logo_Blue_Auto.svg" width="20%" height="20%">
- AICA
## Citation
**Language Model**
```text
@misc{bllossom,
  author = {ChangSu Choi and Yongbin Jeong and Seoyoon Park and InHo Won and HyeonSeok Lim and SangMin Kim and Yejee Kang and Chanhyuk Yoon and Jaewan Park and Yiseul Lee and HyeJin Lee and Younggyun Hahm and Hansaem Kim and KyungTae Lim},
  title = {Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean},
  year = {2024},
  journal = {LREC-COLING 2024},
  paperLink = {\url{https://arxiv.org/pdf/2403.10882}},
}
```
**Vision-Language Model**
```text
@misc{bllossom-V,
  author = {Dongjae Shin and Hyunseok Lim and Inho Won and Changsu Choi and Minjun Kim and Seungwoo Song and Hangyeol Yoo and Sangmin Kim and Kyungtae Lim},
  title = {X-LLaVA: Optimizing Bilingual Large Vision-Language Alignment},
  year = {2024},
  publisher = {GitHub},
  journal = {NAACL 2024 findings},
  paperLink = {\url{https://arxiv.org/pdf/2403.11399}},
}
```
## Contact
- 임경태(KyungTae Lim), Professor at Seoultech. `[email protected]`
- 함영균(Younggyun Hahm), CEO of Teddysum. `[email protected]`
- 김한샘(Hansaem Kim), Professor at Yonsei. `[email protected]`
## Contributor
- 최창수(Chansu Choi), [email protected]
- 김상민(Sangmin Kim), [email protected]
- 원인호(Inho Won), [email protected]
- 김민준(Minjun Kim), [email protected]
- 송승우(Seungwoo Song), [email protected]
- 신동재(Dongjae Shin), [email protected]
- 임현석(Hyeonseok Lim), [email protected]
- 육정훈(Jeonghun Yuk), [email protected]
- 유한결(Hangyeol Yoo), [email protected]
- 송서현(Seohyun Song), [email protected]