End of training


Files changed (6)

README.md +45 -26
config.json +1 -1
generation_config.json +0 -1
model.safetensors +2 -2
tokenizer.json +2 -2
training_args.bin +1 -1

README.md CHANGED

@@ -3,19 +3,26 @@ license: apache-2.0
 base_model: facebook/bart-base
 tags:
 - generated_from_trainer
 model-index:
-- name: Bart-base
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# Bart-base
 This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: nan
 ## Model description
@@ -38,34 +45,46 @@ The following hyperparameters were used during training:
 - train_batch_size: 4
 - eval_batch_size: 4
 - seed: 42
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 6
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 940437.376    | 0.32  | 250  | nan             |
-| 0.0           | 0.64  | 500  | nan             |
-| 0.0           | 0.96  | 750  | nan             |
-| 0.0           | 1.28  | 1000 | nan             |
-| 0.0           | 1.61  | 1250 | nan             |
-| 0.0           | 1.93  | 1500 | nan             |
-| 0.0           | 2.25  | 1750 | nan             |
-| 0.0           | 2.57  | 2000 | nan             |
-| 0.0           | 2.89  | 2250 | nan             |
-| 0.0           | 3.21  | 2500 | nan             |
-| 0.0           | 3.53  | 2750 | nan             |
-| 0.0           | 3.85  | 3000 | nan             |
-| 0.0           | 4.17  | 3250 | nan             |
-| 0.0           | 4.49  | 3500 | nan             |
-| 0.0           | 4.82  | 3750 | nan             |
-| 0.0           | 5.14  | 4000 | nan             |
-| 0.0           | 5.46  | 4250 | nan             |
-| 0.0           | 5.78  | 4500 | nan             |
 ### Framework versions

 base_model: facebook/bart-base
 tags:
 - generated_from_trainer
+metrics:
+- rouge
 model-index:
+- name: bart-base
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# bart-base
 This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.4156
+- Rouge1: 41.7881
+- Rouge2: 19.9952
+- Rougel: 36.4308
+- Rougelsum: 38.1089
+- Gen Len: 18.0
 ## Model description
 - train_batch_size: 4
 - eval_batch_size: 4
 - seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 30
 ### Training results
+| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
+|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
+| 0.3288        | 1.0   | 1557  | 0.2425          | 41.3515 | 19.3193 | 35.4878 | 38.2323   | 18.0    |
+| 0.2455        | 2.0   | 3115  | 0.2323          | 41.1865 | 19.5825 | 35.7587 | 37.8016   | 18.0    |
+| 0.2097        | 3.0   | 4672  | 0.2333          | 41.4261 | 20.2577 | 36.0437 | 38.1275   | 18.0    |
+| 0.1818        | 4.0   | 6230  | 0.2400          | 42.7857 | 21.7524 | 37.4117 | 39.3948   | 18.0    |
+| 0.1591        | 5.0   | 7787  | 0.2489          | 41.9402 | 21.3931 | 36.7999 | 38.7229   | 18.0    |
+| 0.1392        | 6.0   | 9345  | 0.2530          | 42.1993 | 21.3725 | 36.6614 | 38.5415   | 18.0    |
+| 0.1218        | 7.0   | 10902 | 0.2616          | 42.0991 | 20.7834 | 36.6721 | 38.6425   | 18.0    |
+| 0.1061        | 8.0   | 12460 | 0.2794          | 41.4682 | 20.2185 | 36.0528 | 37.7626   | 18.0    |
+| 0.0929        | 9.0   | 14017 | 0.2858          | 41.5178 | 20.0354 | 36.027  | 37.9562   | 18.0    |
+| 0.0813        | 10.0  | 15575 | 0.3001          | 42.1686 | 20.7936 | 36.7589 | 38.6885   | 18.0    |
+| 0.0715        | 11.0  | 17132 | 0.3113          | 41.5616 | 20.6733 | 36.2947 | 38.1556   | 18.0    |
+| 0.0622        | 12.0  | 18690 | 0.3228          | 41.3672 | 20.0432 | 36.1746 | 38.0949   | 18.0    |
+| 0.0544        | 13.0  | 20247 | 0.3296          | 41.4662 | 19.8484 | 35.9521 | 37.7284   | 18.0    |
+| 0.0478        | 14.0  | 21805 | 0.3373          | 41.1417 | 20.1208 | 36.1864 | 37.9314   | 18.0    |
+| 0.0423        | 15.0  | 23362 | 0.3440          | 41.1174 | 19.551  | 35.7777 | 37.5518   | 18.0    |
+| 0.0373        | 16.0  | 24920 | 0.3581          | 40.7365 | 19.5894 | 35.5672 | 37.4447   | 18.0    |
+| 0.0327        | 17.0  | 26477 | 0.3654          | 41.0895 | 19.4995 | 35.7195 | 37.3265   | 18.0    |
+| 0.0294        | 18.0  | 28035 | 0.3750          | 40.8447 | 19.4098 | 35.557  | 37.3456   | 18.0    |
+| 0.0262        | 19.0  | 29592 | 0.3790          | 41.0388 | 19.8022 | 35.946  | 37.6522   | 18.0    |
+| 0.0237        | 20.0  | 31150 | 0.3841          | 41.6747 | 19.6307 | 35.9938 | 37.6853   | 18.0    |
+| 0.0212        | 21.0  | 32707 | 0.3874          | 40.7796 | 19.2156 | 35.3642 | 37.1609   | 18.0    |
+| 0.0192        | 22.0  | 34265 | 0.3942          | 41.2411 | 19.5756 | 35.8442 | 37.5498   | 18.0    |
+| 0.0173        | 23.0  | 35822 | 0.3974          | 41.112  | 19.7216 | 35.8072 | 37.5629   | 18.0    |
+| 0.0159        | 24.0  | 37380 | 0.4042          | 40.6911 | 19.1988 | 35.5312 | 37.3276   | 18.0    |
+| 0.0144        | 25.0  | 38937 | 0.4090          | 41.0017 | 19.3834 | 35.7806 | 37.6217   | 18.0    |
+| 0.0132        | 26.0  | 40495 | 0.4101          | 41.6159 | 19.4447 | 36.1746 | 37.9271   | 18.0    |
+| 0.012         | 27.0  | 42052 | 0.4117          | 41.4618 | 19.3824 | 36.0425 | 37.8597   | 18.0    |
+| 0.0112        | 28.0  | 43610 | 0.4137          | 41.5302 | 19.565  | 36.1323 | 37.8484   | 18.0    |
+| 0.0105        | 29.0  | 45167 | 0.4147          | 41.5432 | 19.9581 | 36.2526 | 38.0642   | 18.0    |
+| 0.0099        | 29.99 | 46710 | 0.4156          | 41.7881 | 19.9952 | 36.4308 | 38.1089   | 18.0    |
 ### Framework versions
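
In the new training table, training loss falls every epoch while validation loss bottoms out early and then climbs, so the final-epoch figures reported at the top of the card are not necessarily the best checkpoint of the run. A minimal sketch, using a few rows copied from the table (the `rows` list and the `best_by` helper are illustrative names, not part of the Trainer API), shows how one might locate the best epoch under each criterion:

```python
# (epoch, training_loss, validation_loss, rouge1) copied from the new table.
# Only a subset of the 30 rows is listed to keep the sketch short.
rows = [
    (1.0, 0.3288, 0.2425, 41.3515),
    (2.0, 0.2455, 0.2323, 41.1865),
    (3.0, 0.2097, 0.2333, 41.4261),
    (4.0, 0.1818, 0.2400, 42.7857),
    (5.0, 0.1591, 0.2489, 41.9402),
    (29.99, 0.0099, 0.4156, 41.7881),
]


def best_by(rows, column, lower_is_better):
    """Return the row with the best value at the given column index."""
    pick = min if lower_is_better else max
    return pick(rows, key=lambda row: row[column])


best_loss_row = best_by(rows, column=2, lower_is_better=True)
best_rouge_row = best_by(rows, column=3, lower_is_better=False)

print("lowest validation loss:", best_loss_row[2], "at epoch", best_loss_row[0])
print("highest Rouge1:", best_rouge_row[3], "at epoch", best_rouge_row[0])
```

Over the full table the same two epochs come out on top: validation loss is lowest at epoch 2 (0.2323) and Rouge1 is highest at epoch 4 (42.7857). Whether the uploaded weights correspond to the last epoch or to an earlier checkpoint is not stated in this diff. If the run were repeated, the `load_best_model_at_end` and `metric_for_best_model` options of `TrainingArguments` would be one way to keep the best checkpoint rather than the last one.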

config.json CHANGED

@@ -68,7 +68,7 @@
       "num_beams": 6
     }
   },
-  "torch_dtype": "float16",
   "transformers_version": "4.39.3",
   "use_cache": true,
   "vocab_size": 50265

       "num_beams": 6
     }
   },
+  "torch_dtype": "float32",
   "transformers_version": "4.39.3",
   "use_cache": true,
   "vocab_size": 50265

generation_config.json CHANGED

@@ -1,5 +1,4 @@
 {
-  "_from_model_config": true,
   "bos_token_id": 0,
   "decoder_start_token_id": 2,
   "early_stopping": true,

 {
   "bos_token_id": 0,
   "decoder_start_token_id": 2,
   "early_stopping": true,

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cd21c00f3c7eefa77bd62c95747e9136f29129f9e2409f40f70bdab2315e0b20
-size 278971098

 version https://git-lfs.github.com/spec/v1
+oid sha256:7c65c719fa4d4600a8d66267dcb12fc1d729ea0635496120729377ee26b706f9
+size 557912620
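
The checkpoint roughly doubles in size, which is consistent with the `torch_dtype` change from `float16` to `float32` in `config.json`: the same number of parameters stored at 4 bytes each instead of 2. A rough check, assuming the file is almost entirely tensor data (the safetensors header is ignored, and `approx_params` is an illustrative helper, not a library function):

```python
# Bytes used to store one parameter for each dtype named in config.json.
BYTES_PER_PARAM = {"float16": 2, "float32": 4}


def approx_params(file_size_bytes, dtype):
    """Estimate the parameter count from file size, ignoring the file header."""
    return file_size_bytes / BYTES_PER_PARAM[dtype]


# File sizes taken from the old and new Git LFS pointers.
old_params = approx_params(278971098, "float16")
new_params = approx_params(557912620, "float32")

print(f"old: ~{old_params / 1e6:.1f}M parameters")
print(f"new: ~{new_params / 1e6:.1f}M parameters")
print(f"size ratio: {557912620 / 278971098:.3f}")
```

Both estimates land at about the same parameter count, so the size change appears to come from the storage precision alone rather than from a change in architecture.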

tokenizer.json CHANGED

@@ -2,13 +2,13 @@
   "version": "1.0",
   "truncation": {
     "direction": "Right",
-    "max_length": 1024,
     "strategy": "LongestFirst",
     "stride": 0
   },
   "padding": {
     "strategy": {
-      "Fixed": 1024
     },
     "direction": "Right",
     "pad_to_multiple_of": null,

   "version": "1.0",
   "truncation": {
     "direction": "Right",
+    "max_length": 176,
     "strategy": "LongestFirst",
     "stride": 0
   },
   "padding": {
     "strategy": {
+      "Fixed": 176
     },
     "direction": "Right",
     "pad_to_multiple_of": null,
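
The new tokenizer state truncates on the right at 176 tokens and pads every sequence on the right to a fixed length of 176, down from 1024. A pure-Python sketch of that behaviour follows. It is a simplification: special tokens are ignored, `truncate_and_pad` is an illustrative helper, and the pad id of 1 is an assumption (BART's usual value) that this diff does not show.

```python
MAX_LENGTH = 176  # "max_length" and "Fixed" value in the new tokenizer.json
PAD_ID = 1        # assumption: BART's usual pad token id, not shown in this diff


def truncate_and_pad(token_ids, max_length=MAX_LENGTH, pad_id=PAD_ID):
    """Cut on the right to max_length, then pad on the right up to max_length."""
    kept = token_ids[:max_length]
    n_pad = max_length - len(kept)
    padded = kept + [pad_id] * n_pad
    attention_mask = [1] * len(kept) + [0] * n_pad
    return padded, attention_mask


# A short input is padded; a long input is truncated.
short_ids, short_mask = truncate_and_pad(list(range(10, 20)))
long_ids, long_mask = truncate_and_pad(list(range(1000)))

print(len(short_ids), sum(short_mask))
print(len(long_ids), sum(long_mask))
```

With the `transformers` library, a state like this is typically what results from calling the tokenizer with `truncation=True, padding="max_length", max_length=176` before saving it, though how the author's preprocessing was actually written is not visible here.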

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7d1e018f0fccd1e1ad2431e2273604546a3d0272617cd7ea7c90ddf57852e63d
 size 5048

 version https://git-lfs.github.com/spec/v1
+oid sha256:753e95f52f26453e14355dcf0c9fba19eb08a11e7d1d43d9d463b92ce378acda
 size 5048