---
library_name: peft
base_model: KT-AI/midm-bitext-S-7B-inst-v1
datasets:
- nsmc
---

### Model Description

- KT-AI/midm-bitext-S-7B-inst-v1 fine-tuned on the NSMC dataset.
- Given a prompt that contains a movie review, the model directly generates its prediction as text: '긍정' (positive) or '부정' (negative).
- Training used 2,000 or more samples from the top of the NSMC train split.
- Evaluation covers only the first 1,000 samples of the test split.
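
The generate-then-read-the-label flow described above can be sketched as follows. This is a minimal illustration only: the prompt wording in `PROMPT_TEMPLATE` and the helper names `build_prompt` and `parse_label` are hypothetical, since this card does not state the actual prompt. The 1/0 mapping follows the NSMC label convention (1 = positive, 0 = negative).

```python
from typing import Optional

# Hypothetical prompt wording; the card does not give the real template.
PROMPT_TEMPLATE = (
    "Classify the following movie review as 긍정 or 부정.\n"
    "Review: {review}\n"
    "Answer:"
)


def build_prompt(review: str) -> str:
    """Embed the review text in the (assumed) instruction prompt."""
    return PROMPT_TEMPLATE.format(review=review.strip())


def parse_label(generated: str) -> Optional[int]:
    """Map generated text to an NSMC label: 1 = positive, 0 = negative.

    Returns None when neither label word appears. If both appear,
    the one that occurs first wins.
    """
    pos = generated.find("긍정")
    neg = generated.find("부정")
    if pos == -1 and neg == -1:
        return None
    if neg == -1 or (pos != -1 and pos < neg):
        return 1
    return 0


if __name__ == "__main__":
    print(build_prompt("정말 재미있어요"))
    print(parse_label(" 긍정"))
```

In practice the string passed to `parse_label` would be the text the fine-tuned model generates after the prompt; outputs that contain neither label are worth counting separately rather than silently treating them as one class.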

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 2
- optimizer: adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- training_args.logging_steps: 50
- training_args.max_steps: 300
- trainable params: 16,744,448 || all params: 7,034,347,520 || trainable%: 0.23803839591934178
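
The derived values in this list follow from the others by simple arithmetic. The sketch below reproduces them. It assumes a single training device, and the warmup step count is an approximation obtained by rounding, not necessarily the exact value the trainer used.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 1
gradient_accumulation_steps = 2
max_steps = 300
warmup_ratio = 0.03
trainable_params = 16_744_448
all_params = 7_034_347_520

# Effective batch size per optimizer step (single device assumed).
effective_batch_size = train_batch_size * gradient_accumulation_steps

# Number of training samples consumed over the whole run.
samples_seen = effective_batch_size * max_steps

# Approximate number of warmup steps implied by the warmup ratio.
warmup_steps = round(warmup_ratio * max_steps)

# Share of parameters updated by the PEFT adapter, in percent.
trainable_pct = 100 * trainable_params / all_params

print(effective_batch_size)  # 2
print(samples_seen)          # 600
print(warmup_steps)          # 9
print(f"{trainable_pct:.4f}%")
```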

### Training Results

TrainOutput(
global_step=300, training_loss=2.666887741088867,
metrics={'train_runtime': 961.226,
'train_samples_per_second': 0.624,
'train_steps_per_second': 0.312,
'total_flos': 9315508499251200.0,
'train_loss': 2.666887741088867, 'epoch': 0.3})

### Accuracy

Midm: accuracy 0.88

| | Actual positive | Actual negative |
|---|---|---|
| Predicted positive | 416 | 23 |
| Predicted negative | 92 | 469 |
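
The accuracy figure can be recomputed from the confusion matrix, along with precision, recall, and F1 for the positive class. The sketch assumes that rows are the predicted class and columns are the true class; the extra metrics are derived here and are not reported by the card's author.

```python
# Counts from the confusion matrix above
# (assumed layout: rows = predicted class, columns = true class).
tp = 416  # predicted positive, actually positive
fp = 23   # predicted positive, actually negative
fn = 92   # predicted negative, actually positive
tn = 469  # predicted negative, actually negative

total = tp + fp + fn + tn
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(total)                # 1000
print(f"{accuracy:.3f}")    # 0.885
print(f"{precision:.3f}")   # 0.948
print(f"{recall:.3f}")      # 0.819
print(f"{f1:.3f}")          # 0.879
```

Under this reading the model is more likely to miss a positive review (92 cases) than to flag a negative one as positive (23 cases).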

### Model Card Authors

cxoijve