---
library_name: peft
base_model: KT-AI/midm-bitext-S-7B-inst-v1
datasets:
- nsmc
---
## Model Description
- KT-AI/midm-bitext-S-7B-inst-v1 fine-tuned on the NSMC (Naver Sentiment Movie Corpus) dataset
- Given a movie review included in the prompt, the model directly generates the prediction text '긍정' (positive) or 'λΆ€μ •' (negative)
- At least the first 2,000 samples of the NSMC train split were used for training
- Only the first 1,000 samples of the test split were used for evaluation
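The card does not publish the exact prompt template, so the sketch below is only an illustration of the described setup: the review is embedded in an instruction-style prompt and the model is expected to answer with '긍정' or 'λΆ€μ •'. The instruction wording and the `build_prompt` helper are assumptions, not the author's actual template.

```python
def build_prompt(review: str) -> str:
    """Wrap a movie review in a sentiment-classification instruction.

    Hypothetical template: the label set ('긍정'/'λΆ€μ •') follows the card,
    but the exact wording used during fine-tuning is not published.
    """
    return (
        "λ‹€μŒ μ˜ν™” 리뷰의 감정을 λΆ„λ₯˜ν•˜μ„Έμš”. '긍정' λ˜λŠ” 'λΆ€μ •'으둜만 λ‹΅ν•˜μ„Έμš”.\n"
        f"리뷰: {review}\n"
        "λ‹΅λ³€: "
    )

# The resulting string would be fed to the fine-tuned model for generation.
prompt = build_prompt("이 μ˜ν™” 정말 μž¬λ°Œμ–΄μš”!")
print(prompt)
```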
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 2
- optimizer: Adam (beta1=0.9, beta2=0.999, epsilon=1e-08)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- logging_steps: 50
- max_steps: 300
- trainable params: 16,744,448 || all params: 7,034,347,520 || trainable%: 0.23803839591934178
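The hyperparameters above can be assembled into a training configuration like the one sketched below. Only the `TrainingArguments` values come from this card; the LoRA settings (`r`, `lora_alpha`, `target_modules`) and the output path are assumptions, since the card does not state them.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA adapter settings -- assumptions; the card only reports the
# resulting trainable-parameter count (~16.7M of 7.03B, ~0.24%).
lora_config = LoraConfig(
    r=8,               # assumption: rank not stated in the card
    lora_alpha=16,     # assumption
    task_type="CAUSAL_LM",
)

# Values below are taken from the hyperparameter list above.
training_args = TrainingArguments(
    output_dir="hw_midm_7B_nsmc",   # hypothetical path
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=2,  # total train batch size 2
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    logging_steps=50,
    max_steps=300,
)
```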
### Training Results
- global_step: 300
- training_loss: 2.666887741088867
- train_runtime: 961.226 s
- train_samples_per_second: 0.624
- train_steps_per_second: 0.312
- total_flos: 9,315,508,499,251,200
- epoch: 0.3
### Accuracy
Midm: accuracy 0.88 on the 1,000 test samples ((416 + 469) / 1000 = 0.885)

|                    | Actual Positive | Actual Negative |
|--------------------|-----------------|-----------------|
| Predicted Positive | 416             | 23              |
| Predicted Negative | 92              | 469             |
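The metrics implied by the confusion matrix above can be recomputed directly. The cell values come from the table; the metric definitions are the standard ones, treating '긍정' as the positive class.

```python
# Confusion-matrix cells from the table above:
# TP = predicted positive & actually positive, FP = predicted positive & actually negative,
# FN = predicted negative & actually positive, TN = predicted negative & actually negative.
tp, fp, fn, tn = 416, 23, 92, 469

accuracy = (tp + tn) / (tp + fp + fn + tn)   # fraction of correct predictions
precision = tp / (tp + fp)                   # how many predicted positives were right
recall = tp / (tp + fn)                      # how many actual positives were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Note that the exact accuracy is 0.885, which the card rounds to 0.88.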
### Model Card Authors
cxoijve