---
license: cc-by-nc-4.0
base_model: KT-AI/midm-bitext-S-7B-inst-v1
tags:
- generated_from_trainer
model-index:
- name: dataset_infos_midm
  results: []
---

# dataset_infos_midm

This model is a fine-tuned version of [KT-AI/midm-bitext-S-7B-inst-v1](https://huggingface.co/KT-AI/midm-bitext-S-7B-inst-v1), trained on the NSMC dataset.

## Model description

Midm is a pre-trained Korean-English language model developed by KT. It takes text as input and generates text.

This model was fine-tuned with KT-AI/midm-bitext-S-7B-inst-v1 as the base model.

## Intended uses & limitations

This model is an agent that classifies user-supplied review sentences of the kind found in the NSMC dataset. Given a review sentence, it determines whether the review is 'positive' or 'negative'.
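
A minimal sketch of how such a classifier might be called through the `transformers` library. The prompt template, the `model_id` argument (a placeholder for wherever the fine-tuned weights are stored), and the generation settings are all assumptions made for illustration; this card does not state the prompt format used during fine-tuning.

```python
def build_prompt(review: str) -> str:
    """Wrap a review in a hypothetical instruction template (not taken from this card)."""
    return (
        "Classify the following movie review as positive or negative.\n"
        f"Review: {review.strip()}\n"
        "Sentiment:"
    )


def classify(review: str, model_id: str, max_new_tokens: int = 4) -> str:
    """Return the raw text the model generates for one review.

    model_id is a placeholder; pass the location of the fine-tuned weights.
    """
    # Imported lazily so build_prompt can be used without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code=True is assumed to be needed because the base model
    # is distributed with custom modeling code.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    model.eval()

    input_ids = tokenizer(build_prompt(review), return_tensors="pt")["input_ids"]
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=input_ids,
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding keeps the answer deterministic
        )
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Greedy decoding with a small `max_new_tokens` is chosen here because the expected answer is a single short label; other settings may suit the actual checkpoint better.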

## Training and test data

The training and test data are loaded from the NSMC dataset. No separate evaluation (validation) data is used.
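
A rough sketch of loading those splits with the `datasets` library. The hub id `nsmc`, the `document` and `label` column names, and the 1,000-example test subset are assumptions for illustration; this card does not say how many examples were sampled or how they were selected.

```python
def load_nsmc(test_size: int = 1000):
    """Return (train, test) splits; test_size is an illustrative subset size."""
    # Lazy import: requires the `datasets` package and network access.
    from datasets import load_dataset

    train = load_dataset("nsmc", split="train")
    test = load_dataset("nsmc", split="test").select(range(test_size))
    return train, test


def to_example(row: dict) -> dict:
    """Normalize one NSMC-style row into review text plus an integer label (0 or 1)."""
    # A missing review (None) becomes an empty string instead of raising.
    return {"text": (row["document"] or "").strip(), "label": int(row["label"])}
```

Depending on the installed `datasets` version, script-based hub datasets may need a different id or loading option, so treat the `load_dataset` call as a starting point rather than a guaranteed recipe.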

## Training procedure

The model takes a movie review sentence from the user as input and classifies it as 'positive (1)' or 'negative (0)'.
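
The mapping from generated text to a class label can be sketched as follows. The exact strings the model emits are an assumption here (the Korean words 긍정 for positive and 부정 for negative); any other output is treated as a malformed generation and returned as `None`.

```python
from typing import Optional

POSITIVE, NEGATIVE = 1, 0

# Assumed label words; adjust to whatever the fine-tuned model actually emits.
LABEL_WORDS = {"긍정": POSITIVE, "부정": NEGATIVE}


def parse_label(generated: str) -> Optional[int]:
    """Map generated text to 1 (positive), 0 (negative), or None if malformed."""
    # Drop surrounding whitespace first, then any surrounding quote characters.
    text = generated.strip().strip("'\"")
    return LABEL_WORDS.get(text)
```

Returning `None` instead of guessing keeps malformed generations visible, so they can later be either counted as errors or excluded from scoring.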

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 2
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- training_steps: 300
- mixed_precision_training: Native AMP
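
To make these settings concrete, the sketch below derives the effective batch size and the warmup length, then evaluates the learning rate at a given step. It assumes a single device, the usual linear-warmup-then-cosine-decay shape, and that the warmup ratio is converted to steps by rounding up; the function and constant names are illustrative.

```python
import math

LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 2
TRAINING_STEPS = 300
WARMUP_RATIO = 0.03

# One optimizer update sees batch_size * accumulation_steps examples (single device assumed).
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Assumption: the warmup ratio is turned into a step count by rounding up.
warmup_steps = math.ceil(TRAINING_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    progress = (step - warmup_steps) / (TRAINING_STEPS - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, warmup_steps, TRAINING_STEPS // 2, TRAINING_STEPS):
    print(step, lr_at(step))
```

Under these assumptions the rate climbs to its peak of 0.0001 after the warmup steps and then decays toward zero by step 300.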

### Training results

- Results when malformed generations (e.g., **정**, **' '**) are counted as errors:

- **Binary Confusion Matrix**

|                    | Actual positive | Actual negative |
|:-------------------|:---------------:|:---------------:|
| Predicted positive | 443             | 49              |
| Predicted negative | 57              | 451             |

- **Accuracy**: 0.894

- Results when malformed generations (e.g., **정**, **' '**) are excluded instead of being counted as errors:

- **Binary Confusion Matrix**

|                    | Actual positive | Actual negative |
|:-------------------|:---------------:|:---------------:|
| Predicted positive | 443             | 38              |
| Predicted negative | 44              | 451             |

- **Accuracy**: 0.916
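
The reported accuracies can be re-derived from the two confusion matrices. The sketch below does that arithmetic, assuming rows are predicted labels and columns are actual labels; the helper and variable names are illustrative.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of correct predictions among all counted examples."""
    return (tp + tn) / (tp + fp + fn + tn)


# Predicted-positive row = (tp, fp); predicted-negative row = (fn, tn).
counting_malformed = dict(tp=443, fp=49, fn=57, tn=451)
excluding_malformed = dict(tp=443, fp=38, fn=44, tn=451)

print(round(accuracy(**counting_malformed), 3))   # 0.894
print(round(accuracy(**excluding_malformed), 3))  # 0.916

# The two matrices differ by the examples dropped as malformed generations.
excluded = sum(counting_malformed.values()) - sum(excluding_malformed.values())
print(excluded)  # 24
```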

### Framework versions

- Transformers 4.35.2
- PyTorch 2.1.0+cu118
- Datasets 2.15.0
- Tokenizers 0.15.0