---
license: apache-2.0
tags:
- generated_from_trainer
base_model: t5-small
model-index:
- name: results_t5small
  results: []
---

# results_t5small

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0050

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.005
- train_batch_size: 48
- eval_batch_size: 48
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 10
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 0.0235        | 0.12  | 500   | 0.0144          |
| 0.0151        | 0.25  | 1000  | 0.0119          |
| 0.0153        | 0.37  | 1500  | 0.0101          |
| 0.0173        | 0.5   | 2000  | 0.0109          |
| 0.0132        | 0.62  | 2500  | 0.0100          |
| 0.0123        | 0.75  | 3000  | 0.0092          |
| 0.0136        | 0.87  | 3500  | 0.0110          |
| 0.014         | 0.99  | 4000  | 0.0089          |
| 0.0123        | 1.12  | 4500  | 0.0094          |
| 0.0155        | 1.24  | 5000  | 0.0088          |
| 0.0119        | 1.37  | 5500  | 0.0095          |
| 0.0126        | 1.49  | 6000  | 0.0089          |
| 0.0111        | 1.61  | 6500  | 0.0083          |
| 0.0129        | 1.74  | 7000  | 0.0081          |
| 0.0115        | 1.86  | 7500  | 0.0087          |
| 0.0133        | 1.99  | 8000  | 0.0085          |
| 0.0167        | 2.11  | 8500  | 0.0105          |
| 0.0113        | 2.24  | 9000  | 0.0088          |
| 0.0106        | 2.36  | 9500  | 0.0077          |
| 0.0101        | 2.48  | 10000 | 0.0075          |
| 0.0086        | 2.61  | 10500 | 0.0076          |
| 0.0112        | 2.73  | 11000 | 0.0075          |
| 0.0097        | 2.86  | 11500 | 0.0076          |
| 0.0093        | 2.98  | 12000 | 0.0078          |
| 0.0086        | 3.1   | 12500 | 0.0076          |
| 0.0092        | 3.23  | 13000 | 0.0077          |
| 0.0089        | 3.35  | 13500 | 0.0074          |
| 0.0086        | 3.48  | 14000 | 0.0072          |
| 0.0084        | 3.6   | 14500 | 0.0072          |
| 0.0083        | 3.73  | 15000 | 0.0073          |
| 0.0076        | 3.85  | 15500 | 0.0072          |
| 0.0097        | 3.97  | 16000 | 0.0071          |
| 0.0074        | 4.1   | 16500 | 0.0071          |
| 0.0074        | 4.22  | 17000 | 0.0070          |
| 0.0089        | 4.35  | 17500 | 0.0069          |
| 0.0072        | 4.47  | 18000 | 0.0069          |
| 0.0072        | 4.6   | 18500 | 0.0068          |
| 0.0084        | 4.72  | 19000 | 0.0067          |
| 0.0076        | 4.84  | 19500 | 0.0072          |
| 0.0084        | 4.97  | 20000 | 0.0067          |
| 0.0062        | 5.09  | 20500 | 0.0065          |
| 0.007         | 5.22  | 21000 | 0.0064          |
| 0.0075        | 5.34  | 21500 | 0.0063          |
| 0.0067        | 5.46  | 22000 | 0.0064          |
| 0.0063        | 5.59  | 22500 | 0.0065          |
| 0.007         | 5.71  | 23000 | 0.0062          |
| 0.0064        | 5.84  | 23500 | 0.0062          |
| 0.0073        | 5.96  | 24000 | 0.0062          |
| 0.0064        | 6.09  | 24500 | 0.0063          |
| 0.0059        | 6.21  | 25000 | 0.0060          |
| 0.0062        | 6.33  | 25500 | 0.0060          |
| 0.0057        | 6.46  | 26000 | 0.0060          |
| 0.006         | 6.58  | 26500 | 0.0058          |
| 0.0053        | 6.71  | 27000 | 0.0058          |
| 0.0064        | 6.83  | 27500 | 0.0057          |
| 0.0061        | 6.95  | 28000 | 0.0057          |
| 0.0055        | 7.08  | 28500 | 0.0057          |
| 0.0065        | 7.2   | 29000 | 0.0057          |
| 0.0058        | 7.33  | 29500 | 0.0057          |
| 0.0052        | 7.45  | 30000 | 0.0055          |
| 0.0058        | 7.58  | 30500 | 0.0056          |
| 0.0055        | 7.7   | 31000 | 0.0055          |
| 0.0049        | 7.82  | 31500 | 0.0054          |
| 0.0053        | 7.95  | 32000 | 0.0053          |
| 0.005         | 8.07  | 32500 | 0.0054          |
| 0.0051        | 8.2   | 33000 | 0.0053          |
| 0.0047        | 8.32  | 33500 | 0.0053          |
| 0.0057        | 8.45  | 34000 | 0.0052          |
| 0.0049        | 8.57  | 34500 | 0.0053          |
| 0.0051        | 8.69  | 35000 | 0.0052          |
| 0.0047        | 8.82  | 35500 | 0.0052          |
| 0.0051        | 8.94  | 36000 | 0.0051          |
| 0.0044        | 9.07  | 36500 | 0.0052          |
| 0.0042        | 9.19  | 37000 | 0.0052          |
| 0.0045        | 9.31  | 37500 | 0.0051          |
| 0.0043        | 9.44  | 38000 | 0.0051          |
| 0.0041        | 9.56  | 38500 | 0.0051          |
| 0.0042        | 9.69  | 39000 | 0.0051          |
| 0.004         | 9.81  | 39500 | 0.0050          |
| 0.0038        | 9.94  | 40000 | 0.0050          |

### Framework versions

- Transformers 4.37.1
- Pytorch 2.1.2
- Datasets 2.16.1
- Tokenizers 0.15.1
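
The training hyperparameters list a linear scheduler with 500 warmup steps and a peak learning rate of 0.005. As a rough illustration of what that schedule implies at each step, here is a minimal pure-Python sketch. The function name `linear_lr_with_warmup` is hypothetical, and the default `total_steps=40_240` is an assumption: the card does not state the total step count, so it is estimated from the results table (step 40000 is logged at epoch 9.94, which suggests roughly 4024 steps per epoch over 10 epochs).

```python
def linear_lr_with_warmup(step, base_lr=0.005, warmup_steps=500, total_steps=40_240):
    """Learning rate at a given optimizer step for a linear warmup/decay schedule.

    base_lr and warmup_steps come from the card's hyperparameter list.
    total_steps is an ESTIMATE derived from the results table, not a reported value.
    """
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at a few of the steps logged in the results table.
    for step in (0, 250, 500, 20_000, 40_000):
        print(f"step {step:>6}: lr = {linear_lr_with_warmup(step):.6f}")
```

Under these assumptions the learning rate peaks at step 500 (the first logged row) and is close to zero by step 40000, which is consistent with the validation loss flattening out near 0.0050 in the final rows.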