---
language:
- ko
- en
pipeline_tag: text-generation
tags:
- vicuna
- llama
---
<p align="center" width="100%">
<img src="https://i.imgur.com/snFDU0P.png" alt="KoreanLM icon" style="width: 500px; display: block; margin: auto; border-radius: 10%;">
</p>
# KoreanLM: Korean Language Model Project
KoreanLM is an open-source project for developing Korean language models. Most language models today focus on English, so they are trained on relatively little Korean text and often tokenize Korean inefficiently. The KoreanLM project was started to address these problems and to provide a language model optimized for Korean.
## Project Goals
1. Develop a language model specialized for Korean: Build a language model that understands and generates Korean more accurately by reflecting the grammar, vocabulary, and cultural characteristics of the language.
2. Introduce an efficient tokenization scheme: Improve model performance by adopting a new tokenization scheme that analyzes Korean text efficiently and accurately.
3. Improve the usability of large language models: Today's very large language models are difficult for companies to fine-tune on their own data. To address this, the project adjusts the size of the Korean language model to improve usability and make it easier to apply to natural language processing tasks.
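To make the tokenization concern concrete, here is a minimal sketch that does not use the KoreanLM tokenizer at all. It only measures UTF-8 bytes per character, on the assumption that a tokenizer falling back to raw bytes for out-of-vocabulary characters would emit roughly one token per byte. The helper name `utf8_bytes_per_char` is made up for this illustration.

```python
def utf8_bytes_per_char(text: str) -> float:
    """Return the average number of UTF-8 bytes per character in `text`."""
    return len(text.encode("utf-8")) / len(text)


# ASCII letters take one byte each.
print(utf8_bytes_per_char("language model"))  # 1.0

# Each Hangul syllable takes three bytes in UTF-8, so a byte-level
# fallback would spend about three tokens per syllable.
print(utf8_bytes_per_char("한국어"))  # 3.0
```

A vocabulary that contains Korean syllables or subwords directly avoids this overhead, which is the motivation behind goal 2.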
## Usage
KoreanLM is distributed through its GitHub repository. To use the project, install it as follows.
```bash
git clone https://github.com/quantumaikr/KoreanLM.git
cd KoreanLM
pip install -r requirements.txt
```
## Example
The following example loads the model and tokenizer with the transformers library.
```python
import transformers
model = transformers.AutoModelForCausalLM.from_pretrained("quantumaikr/KoreanLM")
tokenizer = transformers.AutoTokenizer.from_pretrained("quantumaikr/KoreanLM")
```
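Once the model and tokenizer are loaded, text can be generated with the standard `generate` method. The following is a sketch rather than the project's official inference path: the function names `strip_prompt` and `generate_text`, the default `max_new_tokens` value, and the use of a plain prompt without any template are all assumptions made for illustration.

```python
def strip_prompt(decoded: str, prompt: str) -> str:
    """Remove the echoed prompt from a causal LM's decoded output, if present."""
    if decoded.startswith(prompt):
        return decoded[len(prompt):].lstrip()
    return decoded


def generate_text(prompt: str,
                  model_id: str = "quantumaikr/KoreanLM",
                  max_new_tokens: int = 64) -> str:
    """Load the model, generate a continuation, and return only the new text."""
    import transformers  # imported lazily so the helper above has no dependency

    tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)
    model = transformers.AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    # Pass only the tensors generate() needs; some tokenizers return extra keys.
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
    )
    decoded = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return strip_prompt(decoded, prompt)
```

Calling `generate_text("한국의 수도는")` downloads the weights on first use, so expect it to need substantial disk space and memory.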
## Training (Fine-tuning)
```bash
torchrun --nproc_per_node=4 --master_port=1004 train.py \
--model_name_or_path quantumaikr/KoreanLM \
--data_path korean_data.json \
--num_train_epochs 3 \
--cache_dir './data' \
--bf16 True \
--tf32 True \
--per_device_train_batch_size 4 \
--per_device_eval_batch_size 4 \
--gradient_accumulation_steps 8 \
--evaluation_strategy "no" \
--save_strategy "steps" \
--save_steps 500 \
--save_total_limit 1 \
--learning_rate 2e-5 \
--weight_decay 0. \
--warmup_ratio 0.03 \
--lr_scheduler_type "cosine" \
--logging_steps 1 \
--fsdp "full_shard auto_wrap" \
--fsdp_transformer_layer_cls_to_wrap 'OPTDecoderLayer'
```
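The batch-related flags in the command above combine into a single effective batch size per optimizer step. The arithmetic can be sketched as follows; the dataset size of 50,000 examples is a hypothetical number chosen only to show the calculation, and the step count is an approximation of what the trainer would report.

```python
import math


def effective_batch_size(num_gpus: int, per_device_batch: int, grad_accum_steps: int) -> int:
    """Examples consumed per optimizer step across all GPUs."""
    return num_gpus * per_device_batch * grad_accum_steps


def approx_steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Approximate optimizer steps needed to see every example once."""
    return math.ceil(num_examples / batch_size)


# --nproc_per_node=4, --per_device_train_batch_size 4, --gradient_accumulation_steps 8
batch = effective_batch_size(4, 4, 8)
print(batch)  # 128

# Hypothetical dataset of 50,000 examples.
print(approx_steps_per_epoch(50_000, batch))  # 391
```

If you train on fewer GPUs, raising `--gradient_accumulation_steps` by the same factor keeps the effective batch size unchanged.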
```bash
pip install deepspeed
torchrun --nproc_per_node=4 --master_port=1004 train.py \
--deepspeed "./deepspeed.json" \
--model_name_or_path quantumaikr/KoreanLM \
--data_path korean_data.json \
--num_train_epochs 3 \
--cache_dir './data' \
--bf16 True \
--tf32 True \
--per_device_train_batch_size 4 \
--per_device_eval_batch_size 4 \
--gradient_accumulation_steps 8 \
--evaluation_strategy "no" \
--save_strategy "steps" \
--save_steps 2000 \
--save_total_limit 1 \
--learning_rate 2e-5 \
--weight_decay 0. \
--warmup_ratio 0.03
```
## Training (LoRA)
```bash
python finetune-lora.py \
--base_model 'quantumaikr/KoreanLM' \
--data_path './korean_data.json' \
--output_dir './KoreanLM-LoRA' \
--cache_dir './data'
```
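The training commands read their data from `korean_data.json`, but this README does not describe its schema. Because the project credits Stanford Alpaca, one plausible layout is a JSON list of `instruction` / `input` / `output` records. The sketch below writes such a file under that assumption. The field names, the sample records, and the `validate_records` helper are illustrative, so check them against `train.py` and `finetune-lora.py` before relying on them.

```python
import json

# Assumed Alpaca-style schema; not confirmed by the KoreanLM scripts.
REQUIRED_KEYS = {"instruction", "input", "output"}

# Made-up sample records for illustration only.
records = [
    {
        "instruction": "다음 문장을 영어로 번역하세요.",
        "input": "안녕하세요.",
        "output": "Hello.",
    },
    {
        "instruction": "한국의 수도는 어디인가요?",
        "input": "",
        "output": "한국의 수도는 서울입니다.",
    },
]


def validate_records(items: list) -> None:
    """Raise ValueError if any record lacks one of the assumed keys."""
    for index, item in enumerate(items):
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            raise ValueError(f"record {index} is missing keys: {sorted(missing)}")


validate_records(records)

# ensure_ascii=False keeps Hangul readable in the file instead of \u escapes.
with open("korean_data.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)
```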
## Inference
```bash
python generate.py \
--load_8bit \
--share_gradio \
--base_model 'quantumaikr/KoreanLM' \
--lora_weights 'quantumaikr/KoreanLM-LoRA' \
--cache_dir './data'
```
## Pretrained Model Release and Web Demo
[Trained model](https://huggingface.co/quantumaikr/KoreanLM/tree/main)
<i>* The demo link will be announced later.</i>
## How to Contribute
1. Raise issues: Please report problems or suggested improvements related to the KoreanLM project as issues.
2. Write code: You can write code to make improvements or add new features. Please submit your code through a Pull Request.
3. Documentation and translation: Help raise the quality of the project by contributing to documentation or translation.
4. Testing and feedback: Feedback on bugs or possible improvements you find while using the project is a great help.
## License
The KoreanLM project is released under the Apache 2.0 License. Please comply with the terms of the license when using the project.
## Technical Inquiries
If you have questions about the KoreanLM project, please contact us by email or through GitHub issues. We hope this project helps research and development on Korean language models, and we welcome your interest and participation.
Email: [email protected]
---
This repository has implementations inspired by [open_llama](https://github.com/openlm-research/open_llama), [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca) and [alpaca-lora](https://github.com/tloen/alpaca-lora) projects.