---
base_model: Daewon0808/prm800k_llama_fulltune
library_name: peft
license: llama3.1
tags:
- generated_from_trainer
model-index:
- name: mmlu_small_noaugse1_llama_lora
  results: []
---
# mmlu_small_noaugse1_llama_lora

This model is a fine-tuned version of [Daewon0808/prm800k_llama_fulltune](https://huggingface.co/Daewon0808/prm800k_llama_fulltune) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.3012
- Prm accuracy: 0.8730
- Prm precision: 0.8824
- Prm recall: 0.9813
- Prm specificity: 0.2632
- Prm npv: 0.7143
- Prm f1: 0.9292
- Prm f1 neg: 0.3846
- Prm f1 auc: 0.6222
- Prm f1 auc (fixed): 0.8866
## Model description
More information needed
## Intended uses & limitations
More information needed
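
Pending fuller documentation, the sketch below shows one way to load this LoRA adapter on top of its base model with PEFT. The adapter repo id is a placeholder (assumption), and the causal-LM head is also an assumption; this card does not document how the PRM scores are produced.

```python
# Minimal loading sketch (assumptions noted inline); not the authors' reference code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Daewon0808/prm800k_llama_fulltune"                # base model from this card
adapter_id = "<namespace>/mmlu_small_noaugse1_llama_lora"    # placeholder repo id (assumption)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(                  # causal-LM head is an assumption
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)           # attach the LoRA adapter
model.eval()
```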
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 0.0001
- train_batch_size: 2
- eval_batch_size: 4
- seed: 908932403
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
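
The values above map onto `TrainingArguments` roughly as sketched below, assuming the Hugging Face `Trainer` was used; the actual training script, output directory, and mixed-precision setting are not documented in this card.

```python
# Sketch only: TrainingArguments mirroring the listed hyperparameters (assumes the HF Trainer).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mmlu_small_noaugse1_llama_lora",  # placeholder output dir (assumption)
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,   # 2 per device * 8 GPUs * 2 steps = 32 total train batch size
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=908932403,
    bf16=True,                       # assumption: precision not documented in this card
)
```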
### Training results
| Training Loss | Epoch | Step | Validation Loss | Prm accuracy | Prm precision | Prm recall | Prm specificity | Prm npv | Prm f1 | Prm f1 neg | Prm f1 auc | Prm f1 auc (fixed) |
|:-------------:|:-----:|:----:|:---------------:|:------------:|:-------------:|:----------:|:---------------:|:-------:|:------:|:----------:|:----------:|:------------------:|
No log | 0 | 0 | 0.3535 | 0.8333 | 0.8772 | 0.9346 | 0.2632 | 0.4167 | 0.9050 | 0.3226 | 0.5989 | 0.8195 |
0.3732 | 0.0229 | 5 | 0.3504 | 0.8333 | 0.8707 | 0.9439 | 0.2105 | 0.4 | 0.9058 | 0.2759 | 0.5772 | 0.8212 |
0.3481 | 0.0459 | 10 | 0.4012 | 0.8571 | 0.856 | 1.0 | 0.0526 | 1.0 | 0.9224 | 0.1 | 0.5263 | 0.8384 |
0.271 | 0.0688 | 15 | 0.3846 | 0.8571 | 0.856 | 1.0 | 0.0526 | 1.0 | 0.9224 | 0.1 | 0.5263 | 0.8596 |
0.2408 | 0.0917 | 20 | 0.3040 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8610 |
0.2908 | 0.1147 | 25 | 0.3220 | 0.8571 | 0.8678 | 0.9813 | 0.1579 | 0.6 | 0.9211 | 0.25 | 0.5696 | 0.8714 |
0.2546 | 0.1376 | 30 | 0.2797 | 0.8651 | 0.8814 | 0.9720 | 0.2632 | 0.625 | 0.9244 | 0.3704 | 0.6176 | 0.8778 |
0.1904 | 0.1606 | 35 | 0.3013 | 0.8571 | 0.8618 | 0.9907 | 0.1053 | 0.6667 | 0.9217 | 0.1818 | 0.5480 | 0.8792 |
0.3147 | 0.1835 | 40 | 0.2761 | 0.8651 | 0.8814 | 0.9720 | 0.2632 | 0.625 | 0.9244 | 0.3704 | 0.6176 | 0.8832 |
0.225 | 0.2064 | 45 | 0.2742 | 0.8651 | 0.8947 | 0.9533 | 0.3684 | 0.5833 | 0.9231 | 0.4516 | 0.6608 | 0.8775 |
0.2551 | 0.2294 | 50 | 0.3032 | 0.8413 | 0.8595 | 0.9720 | 0.1053 | 0.4 | 0.9123 | 0.1667 | 0.5386 | 0.8778 |
0.1994 | 0.2523 | 55 | 0.2781 | 0.8730 | 0.8957 | 0.9626 | 0.3684 | 0.6364 | 0.9279 | 0.4667 | 0.6655 | 0.8770 |
0.2473 | 0.2752 | 60 | 0.3117 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8812 |
0.1736 | 0.2982 | 65 | 0.3300 | 0.8651 | 0.8689 | 0.9907 | 0.1579 | 0.75 | 0.9258 | 0.2609 | 0.5743 | 0.8815 |
0.1567 | 0.3211 | 70 | 0.2807 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8763 |
0.1793 | 0.3440 | 75 | 0.3020 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8768 |
0.1223 | 0.3670 | 80 | 0.3180 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8790 |
0.1849 | 0.3899 | 85 | 0.3032 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8834 |
0.2052 | 0.4128 | 90 | 0.2892 | 0.8651 | 0.8879 | 0.9626 | 0.3158 | 0.6 | 0.9238 | 0.4138 | 0.6392 | 0.8856 |
0.2269 | 0.4358 | 95 | 0.2980 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8861 |
0.169 | 0.4587 | 100 | 0.2658 | 0.8651 | 0.8879 | 0.9626 | 0.3158 | 0.6 | 0.9238 | 0.4138 | 0.6392 | 0.8893 |
0.1243 | 0.4817 | 105 | 0.2611 | 0.8571 | 0.8803 | 0.9626 | 0.2632 | 0.5556 | 0.9196 | 0.3571 | 0.6129 | 0.8866 |
0.2538 | 0.5046 | 110 | 0.2870 | 0.8730 | 0.8760 | 0.9907 | 0.2105 | 0.8 | 0.9298 | 0.3333 | 0.6006 | 0.8837 |
0.1748 | 0.5275 | 115 | 0.2800 | 0.8651 | 0.875 | 0.9813 | 0.2105 | 0.6667 | 0.9251 | 0.32 | 0.5959 | 0.8805 |
0.1598 | 0.5505 | 120 | 0.2816 | 0.8571 | 0.8803 | 0.9626 | 0.2632 | 0.5556 | 0.9196 | 0.3571 | 0.6129 | 0.8780 |
0.2259 | 0.5734 | 125 | 0.2972 | 0.8571 | 0.8678 | 0.9813 | 0.1579 | 0.6 | 0.9211 | 0.25 | 0.5696 | 0.8778 |
0.1927 | 0.5963 | 130 | 0.2995 | 0.8571 | 0.8678 | 0.9813 | 0.1579 | 0.6 | 0.9211 | 0.25 | 0.5696 | 0.8815 |
0.1276 | 0.6193 | 135 | 0.2827 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8802 |
0.1068 | 0.6422 | 140 | 0.2854 | 0.8651 | 0.8814 | 0.9720 | 0.2632 | 0.625 | 0.9244 | 0.3704 | 0.6176 | 0.8834 |
0.1804 | 0.6651 | 145 | 0.2889 | 0.8651 | 0.8814 | 0.9720 | 0.2632 | 0.625 | 0.9244 | 0.3704 | 0.6176 | 0.8851 |
0.1829 | 0.6881 | 150 | 0.2849 | 0.8730 | 0.8957 | 0.9626 | 0.3684 | 0.6364 | 0.9279 | 0.4667 | 0.6655 | 0.8834 |
0.1611 | 0.7110 | 155 | 0.2924 | 0.8651 | 0.8879 | 0.9626 | 0.3158 | 0.6 | 0.9238 | 0.4138 | 0.6392 | 0.8834 |
0.1683 | 0.7339 | 160 | 0.3141 | 0.8730 | 0.8824 | 0.9813 | 0.2632 | 0.7143 | 0.9292 | 0.3846 | 0.6222 | 0.8842 |
0.1726 | 0.7569 | 165 | 0.3316 | 0.8571 | 0.8678 | 0.9813 | 0.1579 | 0.6 | 0.9211 | 0.25 | 0.5696 | 0.8832 |
0.1065 | 0.7798 | 170 | 0.3277 | 0.8571 | 0.8678 | 0.9813 | 0.1579 | 0.6 | 0.9211 | 0.25 | 0.5696 | 0.8832 |
0.2203 | 0.8028 | 175 | 0.3200 | 0.8651 | 0.875 | 0.9813 | 0.2105 | 0.6667 | 0.9251 | 0.32 | 0.5959 | 0.8844 |
0.0881 | 0.8257 | 180 | 0.3158 | 0.8651 | 0.875 | 0.9813 | 0.2105 | 0.6667 | 0.9251 | 0.32 | 0.5959 | 0.8859 |
0.2375 | 0.8486 | 185 | 0.3077 | 0.8651 | 0.875 | 0.9813 | 0.2105 | 0.6667 | 0.9251 | 0.32 | 0.5959 | 0.8856 |
0.1716 | 0.8716 | 190 | 0.3030 | 0.8810 | 0.8898 | 0.9813 | 0.3158 | 0.75 | 0.9333 | 0.4444 | 0.6485 | 0.8856 |
0.0486 | 0.8945 | 195 | 0.3034 | 0.8730 | 0.8824 | 0.9813 | 0.2632 | 0.7143 | 0.9292 | 0.3846 | 0.6222 | 0.8856 |
0.1419 | 0.9174 | 200 | 0.3024 | 0.8810 | 0.8898 | 0.9813 | 0.3158 | 0.75 | 0.9333 | 0.4444 | 0.6485 | 0.8837 |
0.116 | 0.9404 | 205 | 0.3042 | 0.8730 | 0.8824 | 0.9813 | 0.2632 | 0.7143 | 0.9292 | 0.3846 | 0.6222 | 0.8871 |
0.1263 | 0.9633 | 210 | 0.3024 | 0.8810 | 0.8898 | 0.9813 | 0.3158 | 0.75 | 0.9333 | 0.4444 | 0.6485 | 0.8842 |
0.1648 | 0.9862 | 215 | 0.3012 | 0.8730 | 0.8824 | 0.9813 | 0.2632 | 0.7143 | 0.9292 | 0.3846 | 0.6222 | 0.8866 |
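
The reported columns are standard binary-classification metrics over step-level labels (positive class = correct step). Since the evaluation code is not included in this card, the sketch below only shows how such metrics are conventionally derived; the exact definitions of the two AUC columns are assumptions.

```python
# Sketch of the conventional metric definitions; not the authors' evaluation code.
from sklearn.metrics import confusion_matrix, roc_auc_score

def prm_metrics(y_true, y_pred, y_score):
    # y_true/y_pred are 0/1 step labels and predictions; y_score are positive-class scores.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    precision   = tp / (tp + fp)                                   # Prm precision
    recall      = tp / (tp + fn)                                   # Prm recall
    specificity = tn / (tn + fp)                                   # Prm specificity
    npv         = tn / (tn + fn)                                   # Prm npv
    accuracy    = (tp + tn) / (tp + tn + fp + fn)                  # Prm accuracy
    f1          = 2 * precision * recall / (precision + recall)    # Prm f1 (positive class)
    f1_neg      = 2 * npv * specificity / (npv + specificity)      # Prm f1 neg (negative class)
    auc         = roc_auc_score(y_true, y_score)                   # assumed meaning of the AUC column
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "specificity": specificity, "npv": npv, "f1": f1,
            "f1_neg": f1_neg, "auc": auc}
```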
### Framework versions
- PEFT 0.12.0
- Transformers 4.46.0
- Pytorch 2.4.0+cu118
- Datasets 3.0.0
- Tokenizers 0.20.1