metadata
language:
- ko
library_name: transformers
pipeline_tag: text-generation
license: cc-by-nc-sa-4.0
datasets:
- kyujinpy/KOR-OpenOrca-Platypus-v3
PracticeLLM/KoSOLAR-Platypus-10.7B
Model Details
Model Developers Kyujin Han (kyujinpy)
Method
LoRA with quantization.
Dataset
kyujinpy/KOR-OpenOrca-Platypus-v3.
Hyperparameters
python finetune.py \
--base_model yanolja/KoSOLAR-10.7B-v0.2 \
--data-path kyujinpy/KOR-OpenOrca-Platypus-v3 \
--output_dir ./Ko-PlatypusSOLAR-10.7B \
--batch_size 64 \
--micro_batch_size 1 \
--num_epochs 5 \
--learning_rate 2e-5 \
--cutoff_len 2048 \
--val_set_size 0 \
--lora_r 64 \
--lora_alpha 64 \
--lora_dropout 0.05 \
--lora_target_modules '[embed_tokens, q_proj, k_proj, v_proj, o_proj, gate_proj, down_proj, up_proj, lm_head]' \
--train_on_inputs False \
--add_eos_token False \
--group_by_length False \
--prompt_template_name en_simple \
--lr_scheduler 'cosine'
I share everything about this model. That is my belief.
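The flags in the command above imply two derived quantities that are easy to overlook. The sketch below computes them under two assumptions that the card does not state: that `finetune.py` reaches `batch_size` by accumulating gradients over micro-batches, and that it uses the standard LoRA scaling factor `lora_alpha / lora_r`.

```python
# Values copied from the command-line flags above.
batch_size = 64
micro_batch_size = 1
lora_r = 64
lora_alpha = 64

# Assumption: the script accumulates gradients so that one optimizer step
# covers `batch_size` examples (per device).
gradient_accumulation_steps = batch_size // micro_batch_size

# Assumption: standard LoRA scaling applied to the low-rank update B @ A.
lora_scaling = lora_alpha / lora_r

print(gradient_accumulation_steps)  # 64
print(lora_scaling)                 # 1.0
```

If both assumptions hold, each optimizer step would accumulate 64 micro-batches of one example, and the low-rank update would be added to the frozen weights without rescaling.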
Model Benchmark
Open Ko-LLM leaderboard & lm-evaluation-harness (zero-shot)
- For follow-up results, see Ko-link.
| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Ko-CommonGenV2 |
| --- | --- | --- | --- | --- | --- | --- |
| PracticeLLM/KoSOLAR-Platypus-10.7B | --- | --- | --- | --- | --- | --- |
| LDCC/LDCC-SOLAR-10.7B | 59.34 | 55.38 | 65.56 | 53.38 | 64.39 | 57.97 |
| yanolja/KoSOLAR-10.7B-v0.2 | 55.62 | 50.51 | 62.29 | 53.76 | 47.31 | 64.23 |
| megastudyedu/M-SOLAR-10.7B-v1.3 | 56.64 | 51.37 | 60.93 | 54.91 | 48.45 | 67.53 |
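The Average column in the benchmark results is consistent with an unweighted mean of the five task scores. The card does not spell out the formula, so treat that as an inference from the numbers. A short check, using the scores copied from the results above:

```python
# Per-task scores in the order: ARC, HellaSwag, MMLU, TruthfulQA, Ko-CommonGenV2.
scores = {
    "LDCC/LDCC-SOLAR-10.7B": [55.38, 65.56, 53.38, 64.39, 57.97],
    "yanolja/KoSOLAR-10.7B-v0.2": [50.51, 62.29, 53.76, 47.31, 64.23],
    "megastudyedu/M-SOLAR-10.7B-v1.3": [51.37, 60.93, 54.91, 48.45, 67.53],
}


def average(values):
    """Unweighted mean, rounded to two decimals like the reported column."""
    return round(sum(values) / len(values), 2)


for name, values in scores.items():
    print(name, average(values))
```

The same helper can be applied to the PracticeLLM/KoSOLAR-Platypus-10.7B row once its scores are available.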
Implementation Code
### KO-Platypus
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
repo = "PracticeLLM/KoSOLAR-Platypus-10.7B"
OpenOrca = AutoModelForCausalLM.from_pretrained(
    repo,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map='auto'
)
OpenOrca_tokenizer = AutoTokenizer.from_pretrained(repo)
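The loading code above stops before generation. As a minimal sketch of the next step, the hypothetical helper below (`generate_text` is not part of the original card) tokenizes a prompt, calls `generate`, and decodes only the newly produced tokens. It passes the prompt through unchanged; training used the `en_simple` prompt template, whose exact format is not given here, so wrapping the prompt in that template is left to the reader.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=128):
    """Hypothetical helper: return only the text generated after `prompt`."""
    # Tokenize and move the tensors to the device the model was loaded on.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Greedy decoding with default settings; tune generation arguments as needed.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the continuation is decoded.
    prompt_length = inputs["input_ids"].shape[1]
    new_tokens = output_ids[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With the objects loaded above, a call would look like `generate_text(OpenOrca, OpenOrca_tokenizer, "...")`, where the prompt string is up to you.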