End of training

Files changed (4)

README.md CHANGED

@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.1242
 ## Model description
@@ -35,22 +35,19 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 3e-05
-- train_batch_size: 2
-- eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 5
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.1789        | 1.0   | 430  | 0.1277          |
-| 0.2154        | 2.0   | 860  | 0.1270          |
-| 0.2004        | 3.0   | 1290 | 0.1255          |
-| 0.2039        | 4.0   | 1720 | 0.1252          |
-| 0.2           | 5.0   | 2150 | 0.1242          |
 ### Framework versions

 This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.1216
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 3e-05
+- train_batch_size: 4
+- eval_batch_size: 4
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 2
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 0.1888        | 1.0   | 215  | 0.1245          |
+| 0.1657        | 2.0   | 430  | 0.1216          |
 ### Framework versions
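
The step counts on the two sides of this README diff can be cross-checked against the batch sizes. The sketch below assumes a single device and no gradient accumulation (neither is shown in the hunk), so that one optimizer step consumes exactly `train_batch_size` examples; `example_count_range` is a hypothetical helper, not part of any library.

```python
def example_count_range(steps_per_epoch: int, batch_size: int) -> tuple[int, int]:
    """Return the (low, high) dataset sizes N with ceil(N / batch_size) == steps_per_epoch.

    Assumes one device and no gradient accumulation (not confirmed by the diff).
    """
    high = steps_per_epoch * batch_size
    low = high - batch_size + 1  # the last batch may be partial
    return low, high


# Removed run: 430 steps per epoch at train_batch_size=2, 5 epochs.
old_low, old_high = example_count_range(430, 2)
# Added run: 215 steps per epoch at train_batch_size=4, 2 epochs.
new_low, new_high = example_count_range(215, 4)

# Dataset sizes consistent with both runs.
overlap = (max(old_low, new_low), min(old_high, new_high))

# Total optimizer steps, matching the last "Step" value in each table.
old_total_steps = 430 * 5
new_total_steps = 215 * 2

print(overlap, old_total_steps, new_total_steps)
```

Under those assumptions both runs are consistent with the same training set of roughly 860 examples, and the added run reaches its final validation loss in 430 optimizer steps rather than 2150. The adapter configuration also changed in this commit, so the difference in loss cannot be attributed to the schedule alone.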

adapter_config.json CHANGED

@@ -2,14 +2,14 @@
   "auto_mapping": null,
   "base_model_name_or_path": "google/flan-t5-large",
   "encoder_dropout": 0.0,
-  "encoder_hidden_size": 128,
   "encoder_num_layers": 2,
   "encoder_reparameterization_type": "MLP",
   "inference_mode": true,
   "num_attention_heads": 16,
   "num_layers": 24,
   "num_transformer_submodules": 2,
-  "num_virtual_tokens": 20,
   "peft_type": "P_TUNING",
   "revision": null,
   "task_type": "SEQ_2_SEQ_LM",

   "auto_mapping": null,
   "base_model_name_or_path": "google/flan-t5-large",
   "encoder_dropout": 0.0,
+  "encoder_hidden_size": 1024,
   "encoder_num_layers": 2,
   "encoder_reparameterization_type": "MLP",
   "inference_mode": true,
   "num_attention_heads": 16,
   "num_layers": 24,
   "num_transformer_submodules": 2,
+  "num_virtual_tokens": 32,
   "peft_type": "P_TUNING",
   "revision": null,
   "task_type": "SEQ_2_SEQ_LM",
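
To make the change in this hunk explicit, the two sides can be compared key by key. This is a minimal sketch using only the keys visible in the hunk; `changed_keys` is a hypothetical helper, and the dictionaries are transcribed by hand rather than loaded from the repository.

```python
# Keys and values transcribed from the hunk above; the full file has more keys.
shared = {
    "auto_mapping": None,
    "base_model_name_or_path": "google/flan-t5-large",
    "encoder_dropout": 0.0,
    "encoder_num_layers": 2,
    "encoder_reparameterization_type": "MLP",
    "inference_mode": True,
    "num_attention_heads": 16,
    "num_layers": 24,
    "num_transformer_submodules": 2,
    "peft_type": "P_TUNING",
    "revision": None,
    "task_type": "SEQ_2_SEQ_LM",
}
old_config = {**shared, "encoder_hidden_size": 128, "num_virtual_tokens": 20}
new_config = {**shared, "encoder_hidden_size": 1024, "num_virtual_tokens": 32}


def changed_keys(old: dict, new: dict) -> dict:
    """Map each key whose value differs to an (old_value, new_value) pair."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


diff = changed_keys(old_config, new_config)
for key, (before, after) in diff.items():
    print(f"{key}: {before} -> {after}")
```

Only `encoder_hidden_size` and `num_virtual_tokens` differ; every other visible key is unchanged between the two revisions.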

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5973f40559b9507998719e9a0a4d7d11209af8e414fffa32fcfd55b1323b5a48
-size 164605

 version https://git-lfs.github.com/spec/v1
+oid sha256:d6ad6cf6e1c5691e06f395dd7feec208d03062829a2c431db5bf476731cf5449
+size 262973
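
The size change in this pointer file lines up with the `num_virtual_tokens` change in `adapter_config.json`. The sketch below rests on two assumptions that the diff does not confirm: that the saved adapter holds only the float32 prompt embeddings (one row per virtual token per transformer submodule), and that the embedding width is 1024, the hidden size of `google/flan-t5-large`. `prompt_embedding_bytes` is a hypothetical helper.

```python
BYTES_PER_FLOAT32 = 4
TOKEN_DIM = 1024  # assumption: hidden size of google/flan-t5-large, not shown in the diff


def prompt_embedding_bytes(num_virtual_tokens: int,
                           num_transformer_submodules: int,
                           token_dim: int = TOKEN_DIM) -> int:
    """Estimate the raw size of a float32 prompt-embedding matrix in bytes."""
    rows = num_virtual_tokens * num_transformer_submodules
    return rows * token_dim * BYTES_PER_FLOAT32


# Sizes reported by the two Git LFS pointers above.
old_size, new_size = 164605, 262973

old_estimate = prompt_embedding_bytes(20, 2)
new_estimate = prompt_embedding_bytes(32, 2)

# Whatever remains would be serialization overhead under these assumptions.
old_overhead = old_size - old_estimate
new_overhead = new_size - new_estimate

print(old_estimate, old_overhead)
print(new_estimate, new_overhead)
```

Under these assumptions the estimates fall within about 1 KB of the reported sizes, which suggests the growth comes from the larger number of virtual tokens. The change in `encoder_hidden_size` would not affect the file size if the encoder weights are not stored, though the diff alone cannot establish that.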

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8d4e32eac643d59fc3644a822543bbcd83fab8a4a3c9f6bdf82b660e4c190693
 size 4219

 version https://git-lfs.github.com/spec/v1
+oid sha256:80bd973e4276b0e39c5425190497a1508f752cf6282ab411c0fc7d8f30b19019
 size 4219