mistral_7B_KoFLAN_ver2_QLoRA
This model is a fine-tuned version of mistralai/Mistral-7B-v0.1. The training dataset is not specified in the card metadata. It achieves the following results on the evaluation set:
- Loss: 1.3025
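For a rough sense of scale, the reported loss can be converted to perplexity. This is a minimal sketch, assuming the loss is the mean per-token cross-entropy in nats (the usual convention for causal language model training, but not stated in this card):

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 1.3025

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the evaluation perplexity is about 3.68.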
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- total_train_batch_size: 32
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 2.0
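The batch-size and scheduler settings above can be sketched in plain Python. This is an illustration, not the training code: it assumes no gradient accumulation (consistent with 16 x 2 = 32), a total of 7400 optimizer steps (read off the last row of the training results table), and a half-cosine decay to zero after linear warmup. The helper name `lr_at` is hypothetical.

```python
import math

# Values copied from the hyperparameter list.
learning_rate = 5e-5
train_batch_size = 16   # per device
num_devices = 2
warmup_steps = 100

# Assumption: total optimizer steps, taken from the final row of the results table.
total_steps = 7400

# Effective batch size, assuming no gradient accumulation.
total_train_batch_size = train_batch_size * num_devices


def lr_at(step: int) -> float:
    """Hypothetical helper: linear warmup, then half-cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print("total_train_batch_size:", total_train_batch_size)
for step in (0, 50, 100, 3750, 7400):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the learning rate peaks at 5e-05 at step 100, is back to half that value around step 3750, and reaches zero at step 7400.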
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
2.017 | 0.03 | 100 | 1.3341 |
1.7547 | 0.05 | 200 | 1.3018 |
1.5125 | 0.08 | 300 | 1.2914 |
1.4717 | 0.11 | 400 | 1.2833 |
1.4686 | 0.14 | 500 | 1.2790 |
1.4038 | 0.16 | 600 | 1.2751 |
1.4086 | 0.19 | 700 | 1.2803 |
1.4371 | 0.22 | 800 | 1.2843 |
21.1904 | 0.24 | 900 | 12.5749 |
30.0185 | 0.27 | 1000 | 5.5864 |
12.0201 | 0.3 | 1100 | 2.9353 |
5.9096 | 0.32 | 1200 | 2.1182 |
7.0684 | 0.35 | 1300 | 2.4859 |
6.7228 | 0.38 | 1400 | 2.2694 |
4.4835 | 0.41 | 1500 | 1.9221 |
5.2842 | 0.43 | 1600 | 1.9740 |
3.7658 | 0.46 | 1700 | 1.7739 |
3.5637 | 0.49 | 1800 | 1.7292 |
3.1788 | 0.51 | 1900 | 1.6524 |
2.8849 | 0.54 | 2000 | 1.5946 |
2.628 | 0.57 | 2100 | 1.5482 |
2.2247 | 0.59 | 2200 | 1.5133 |
2.1339 | 0.62 | 2300 | 1.4882 |
2.0574 | 0.65 | 2400 | 1.4656 |
1.9366 | 0.68 | 2500 | 1.4454 |
1.8554 | 0.7 | 2600 | 1.4273 |
1.7418 | 0.73 | 2700 | 1.4147 |
1.7082 | 0.76 | 2800 | 1.3997 |
1.6028 | 0.78 | 2900 | 1.3867 |
1.5355 | 0.81 | 3000 | 1.3747 |
1.485 | 0.84 | 3100 | 1.3644 |
1.4463 | 0.86 | 3200 | 1.3573 |
1.3714 | 0.89 | 3300 | 1.3526 |
1.4356 | 0.92 | 3400 | 1.3461 |
1.3017 | 0.95 | 3500 | 1.3405 |
1.4453 | 0.97 | 3600 | 1.3362 |
1.3736 | 1.0 | 3700 | 1.3326 |
1.2048 | 1.03 | 3800 | 1.3297 |
1.2779 | 1.05 | 3900 | 1.3274 |
1.3065 | 1.08 | 4000 | 1.3256 |
1.2698 | 1.11 | 4100 | 1.3229 |
1.237 | 1.13 | 4200 | 1.3215 |
1.1412 | 1.16 | 4300 | 1.3203 |
1.234 | 1.19 | 4400 | 1.3186 |
1.194 | 1.22 | 4500 | 1.3174 |
1.2338 | 1.24 | 4600 | 1.3149 |
1.2936 | 1.27 | 4700 | 1.3143 |
1.2571 | 1.3 | 4800 | 1.3135 |
1.2435 | 1.32 | 4900 | 1.3120 |
1.2681 | 1.35 | 5000 | 1.3107 |
1.2248 | 1.38 | 5100 | 1.3097 |
1.1332 | 1.4 | 5200 | 1.3088 |
1.1791 | 1.43 | 5300 | 1.3075 |
1.1646 | 1.46 | 5400 | 1.3069 |
1.2292 | 1.49 | 5500 | 1.3062 |
1.2554 | 1.51 | 5600 | 1.3054 |
1.1609 | 1.54 | 5700 | 1.3052 |
1.176 | 1.57 | 5800 | 1.3043 |
1.1948 | 1.59 | 5900 | 1.3042 |
1.1441 | 1.62 | 6000 | 1.3036 |
1.1674 | 1.65 | 6100 | 1.3035 |
1.1119 | 1.67 | 6200 | 1.3034 |
1.1469 | 1.7 | 6300 | 1.3032 |
1.1834 | 1.73 | 6400 | 1.3033 |
1.0901 | 1.76 | 6500 | 1.3031 |
1.1771 | 1.78 | 6600 | 1.3030 |
1.2767 | 1.81 | 6700 | 1.3028 |
1.1978 | 1.84 | 6800 | 1.3027 |
1.1495 | 1.86 | 6900 | 1.3026 |
1.1291 | 1.89 | 7000 | 1.3026 |
1.1783 | 1.92 | 7100 | 1.3026 |
1.2189 | 1.94 | 7200 | 1.3025 |
1.3167 | 1.97 | 7300 | 1.3025 |
1.1293 | 2.0 | 7400 | 1.3025 |
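Two things stand out in the table: the loss spikes sharply at steps 900 to 1000 and then recovers, and the lowest validation loss (1.2751 at step 600) occurs before that spike, slightly below the final value of 1.3025. The sketch below checks both on a hand-copied excerpt of rows. The 2x threshold for flagging a spike is an arbitrary choice for illustration.

```python
# Excerpt of (step, training loss, validation loss) rows copied from the table.
rows = [
    (100, 2.017, 1.3341),
    (600, 1.4038, 1.2751),
    (800, 1.4371, 1.2843),
    (900, 21.1904, 12.5749),
    (1000, 30.0185, 5.5864),
    (3700, 1.3736, 1.3326),
    (7400, 1.1293, 1.3025),
]

# Row with the lowest validation loss, and the final row.
best_step, _, best_val = min(rows, key=lambda r: r[2])
final_step, _, final_val = rows[-1]

# Flag rows whose validation loss is more than double the previous row's
# (assumption: 2x is an illustrative threshold, not a standard one).
spikes = [
    cur[0]
    for prev, cur in zip(rows, rows[1:])
    if cur[2] > 2 * prev[2]
]

print(f"best validation loss {best_val} at step {best_step}")
print(f"final validation loss {final_val} at step {final_step}")
print("spike detected at steps:", spikes)
```

If intermediate checkpoints were saved, the step-600 checkpoint may be worth comparing against the final one, although the card does not say which checkpoints are available.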
Framework versions
- Transformers 4.34.0
- Pytorch 2.1.0+cu118
- Datasets 2.14.5
- Tokenizers 0.14.1
Model tree for Heerak/mistral_7B_KoFLAN_ver2_QLoRA
- Base model: mistralai/Mistral-7B-v0.1