metadata
library_name: transformers
license: apache-2.0
base_model: google/flan-t5-large
tags:
- generated_from_trainer
metrics:
- accuracy
- precision
- recall
model-index:
- name: flanT5_large_FINAL_MT
results: []
flanT5_large_FINAL_MT
This model is a fine-tuned version of google/flan-t5-large; the training dataset was not recorded. It achieves the following results on the evaluation set at the final logged checkpoint (step 62500):
- Loss: 2.2262
- Accuracy: 0.7917
- Precision: 0.7946
- Recall: 0.7867
- F1 score: 0.7906
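The reported F1 score is consistent with the harmonic mean of the reported precision and recall. The check below is a minimal sketch that assumes the standard binary F1 definition; the card does not state which averaging mode was used, and the helper name `f1_from_precision_recall` is hypothetical.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the evaluation set.
precision, recall = 0.7946, 0.7867
f1 = f1_from_precision_recall(precision, recall)
print(round(f1, 4))  # 0.7906
```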
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
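With a linear scheduler, the learning rate decays from 0.0001 toward zero over the full run. The sketch below illustrates that decay under two assumptions that the card does not confirm: no warmup steps (none are listed), and roughly 6394 optimizer steps per epoch, estimated from the logged step and epoch values in the training results (for example 2500 steps at epoch 0.3910). The function name `linear_lr` is hypothetical.

```python
BASE_LR = 1e-4          # learning_rate from the hyperparameter list
NUM_EPOCHS = 10         # num_epochs from the hyperparameter list
STEPS_PER_EPOCH = 6394  # assumption: estimated as 2500 steps / 0.3910 epochs
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Linear decay from base_lr to 0, assuming no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


# Learning rate at the start, midpoint, and end of training.
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step))
```

If the run did use warmup or gradient accumulation, the step counts and the shape of the early schedule would differ from this sketch.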
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|---|---|---|
| 1.2184 | 0.3910 | 2500 | 0.9222 | 0.7383 | 0.8824 | 0.5500 | 0.6776 |
| 1.2323 | 0.7820 | 5000 | 1.6840 | 0.6617 | 0.6222 | 0.8233 | 0.7088 |
| 1.0084 | 1.1730 | 7500 | 0.9030 | 0.7583 | 0.7969 | 0.6933 | 0.7415 |
| 0.9192 | 1.5640 | 10000 | 1.3138 | 0.6917 | 0.6322 | 0.9167 | 0.7483 |
| 0.8932 | 1.9550 | 12500 | 1.0240 | 0.7467 | 0.6927 | 0.8867 | 0.7778 |
| 0.7463 | 2.3459 | 15000 | 0.9096 | 0.7650 | 0.7870 | 0.7267 | 0.7556 |
| 0.7129 | 2.7369 | 17500 | 0.9885 | 0.7650 | 0.8046 | 0.7000 | 0.7487 |
| 0.6329 | 3.1279 | 20000 | 1.1713 | 0.7800 | 0.7800 | 0.7800 | 0.7800 |
| 0.5556 | 3.5189 | 22500 | 1.1964 | 0.7783 | 0.7313 | 0.8800 | 0.7988 |
| 0.4961 | 3.9099 | 25000 | 1.0480 | 0.7783 | 0.8081 | 0.7300 | 0.7671 |
| 0.3171 | 4.3009 | 27500 | 1.4131 | 0.7850 | 0.8178 | 0.7333 | 0.7733 |
| 0.2846 | 4.6919 | 30000 | 1.4925 | 0.7550 | 0.7429 | 0.7800 | 0.7610 |
| 0.2499 | 5.0829 | 32500 | 1.6449 | 0.7700 | 0.7396 | 0.8333 | 0.7837 |
| 0.1590 | 5.4739 | 35000 | 1.5721 | 0.7917 | 0.8136 | 0.7567 | 0.7841 |
| 0.1929 | 5.8649 | 37500 | 1.7558 | 0.7667 | 0.7721 | 0.7567 | 0.7643 |
| 0.1236 | 6.2559 | 40000 | 1.9731 | 0.7750 | 0.7778 | 0.7700 | 0.7739 |
| 0.0855 | 6.6469 | 42500 | 2.0351 | 0.7550 | 0.7179 | 0.8400 | 0.7742 |
| 0.0952 | 7.0378 | 45000 | 1.8523 | 0.7833 | 0.7724 | 0.8033 | 0.7876 |
| 0.0646 | 7.4288 | 47500 | 1.8575 | 0.7750 | 0.7603 | 0.8033 | 0.7812 |
| 0.0562 | 7.8198 | 50000 | 1.9564 | 0.7783 | 0.7685 | 0.7967 | 0.7823 |
| 0.0324 | 8.2108 | 52500 | 2.1755 | 0.7767 | 0.7823 | 0.7667 | 0.7744 |
| 0.0467 | 8.6018 | 55000 | 2.0450 | 0.7733 | 0.8015 | 0.7267 | 0.7622 |
| 0.0419 | 8.9928 | 57500 | 1.9761 | 0.7917 | 0.7760 | 0.8200 | 0.7974 |
| 0.0222 | 9.3838 | 60000 | 2.2657 | 0.7900 | 0.7806 | 0.8067 | 0.7934 |
| 0.0303 | 9.7748 | 62500 | 2.2262 | 0.7917 | 0.7946 | 0.7867 | 0.7906 |
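In the table above, training loss falls steadily while validation loss rises after the first few epochs. This pattern is often associated with overfitting, although accuracy and F1 stay roughly flat here. Which checkpoint counts as "best" therefore depends on the metric chosen. The sketch below scans the logged values to illustrate this; the tuples are copied from the table, and the helper `best_by` is hypothetical rather than part of the training code.

```python
# (step, validation_loss, f1) copied from the training results table.
LOG = [
    (2500, 0.9222, 0.6776), (5000, 1.6840, 0.7088), (7500, 0.9030, 0.7415),
    (10000, 1.3138, 0.7483), (12500, 1.0240, 0.7778), (15000, 0.9096, 0.7556),
    (17500, 0.9885, 0.7487), (20000, 1.1713, 0.78), (22500, 1.1964, 0.7988),
    (25000, 1.0480, 0.7671), (27500, 1.4131, 0.7733), (30000, 1.4925, 0.7610),
    (32500, 1.6449, 0.7837), (35000, 1.5721, 0.7841), (37500, 1.7558, 0.7643),
    (40000, 1.9731, 0.7739), (42500, 2.0351, 0.7742), (45000, 1.8523, 0.7876),
    (47500, 1.8575, 0.7812), (50000, 1.9564, 0.7823), (52500, 2.1755, 0.7744),
    (55000, 2.0450, 0.7622), (57500, 1.9761, 0.7974), (60000, 2.2657, 0.7934),
    (62500, 2.2262, 0.7906),
]


def best_by(rows, key_index, lowest):
    """Return the row with the lowest (or highest) value in column key_index."""
    pick = min if lowest else max
    return pick(rows, key=lambda row: row[key_index])


best_loss = best_by(LOG, 1, lowest=True)
best_f1 = best_by(LOG, 2, lowest=False)
print("lowest validation loss:", best_loss)  # step 7500
print("highest F1:", best_f1)                # step 22500
```

The reported headline metrics come from the last logged row (step 62500), which is neither the lowest-loss nor the highest-F1 checkpoint in this log.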
Framework versions
- Transformers 4.44.2
- PyTorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1