sehilnlf/model_v3_v2

@@ -18,8 +18,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [facebook/bart-large](https://huggingface.co/facebook/bart-large) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.1977
-- Sacrebleu: 66.7256
+- Loss: 0.5669
+- Sacrebleu: 66.8302
 ## Model description
@@ -46,53 +46,16 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 256
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 40
+- num_epochs: 3
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Sacrebleu |
 |:-------------:|:-----:|:----:|:---------------:|:---------:|
-| No log        | 0.99  | 54   | 0.5648          | 65.7974   |
-| No log        | 1.99  | 109  | 0.6224          | 66.8854   |
-| No log        | 3.0   | 164  | 0.6639          | 66.8333   |
-| No log        | 4.0   | 219  | 0.5929          | 66.7857   |
-| No log        | 4.99  | 273  | 0.6427          | 65.8395   |
-| No log        | 5.99  | 328  | 0.6721          | 66.4172   |
-| No log        | 7.0   | 383  | 0.7511          | 66.4660   |
-| No log        | 8.0   | 438  | 0.7662          | 66.6480   |
-| No log        | 8.99  | 492  | 0.7588          | 66.5092   |
-| No log        | 9.99  | 547  | 0.7916          | 66.5144   |
-| No log        | 11.0  | 602  | 0.8172          | 66.6279   |
-| No log        | 12.0  | 657  | 0.8350          | 66.5607   |
-| No log        | 12.99 | 711  | 0.8809          | 66.6095   |
-| No log        | 13.99 | 766  | 0.8843          | 66.4089   |
-| No log        | 15.0  | 821  | 1.0130          | 66.5184   |
-| No log        | 16.0  | 876  | 0.9180          | 66.4269   |
-| No log        | 16.99 | 930  | 0.9794          | 66.5766   |
-| No log        | 17.99 | 985  | 0.9450          | 66.6713   |
-| No log        | 19.0  | 1040 | 0.9880          | 66.7081   |
-| No log        | 20.0  | 1095 | 0.9540          | 66.4440   |
-| No log        | 20.99 | 1149 | 1.0552          | 66.5390   |
-| No log        | 21.99 | 1204 | 0.9806          | 66.5975   |
-| No log        | 23.0  | 1259 | 1.0528          | 66.6404   |
-| No log        | 24.0  | 1314 | 1.0348          | 66.4127   |
-| No log        | 24.99 | 1368 | 1.0758          | 66.6139   |
-| No log        | 25.99 | 1423 | 1.1291          | 66.6778   |
-| No log        | 27.0  | 1478 | 1.1112          | 66.6411   |
-| No log        | 28.0  | 1533 | 1.1305          | 66.5986   |
-| No log        | 28.99 | 1587 | 1.1532          | 66.5047   |
-| No log        | 29.99 | 1642 | 1.1106          | 66.5662   |
-| No log        | 31.0  | 1697 | 1.2084          | 66.6593   |
-| No log        | 32.0  | 1752 | 1.1438          | 66.6117   |
-| No log        | 32.99 | 1806 | 1.1956          | 66.6758   |
-| No log        | 33.99 | 1861 | 1.1630          | 66.7359   |
-| No log        | 35.0  | 1916 | 1.1570          | 66.6989   |
-| No log        | 36.0  | 1971 | 1.1754          | 66.6495   |
-| No log        | 36.99 | 2025 | 1.2456          | 66.7018   |
-| No log        | 37.99 | 2080 | 1.2197          | 66.7990   |
-| No log        | 39.0  | 2135 | 1.1886          | 66.7049   |
-| No log        | 39.45 | 2160 | 1.1977          | 66.7256   |
+| No log        | 0.99  | 54   | 0.6545          | 66.3234   |
+| No log        | 1.99  | 109  | 0.5940          | 66.8342   |
+| No log        | 2.96  | 162  | 0.5669          | 66.8302   |
 ### Framework versions