louis030195 committed
Commit • 7f9c9d2 • 1 Parent(s): 732dafc
update model card README.md

README.md CHANGED
@@ -1,34 +1,20 @@
 ---
-language:
-- en
+license: apache-2.0
 tags:
-- text-generation
 - generated_from_trainer
-datasets:
-- Langame/starter
-license: apache-2.0
 model-index:
 - name: distilgpt2-starter
   results: []
-widget:
-- text: "ice breaker,travel ###"
-  example_title: "Ice breaker"
-- text: "philosophy,science,physic,mathematic,biology ###"
-  example_title: "Scientific"
-- text: "big talk,deep talk ###"
-  example_title: "Big talk"
 ---
-# Generate conversation starters, from [Langame](https://langa.me) :p
-

 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->

 # distilgpt2-starter

-This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on
+This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss:
+- Loss: 5.9961

 ## Model description

@@ -56,15 +42,24 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs:
+- num_epochs: 500.0

 ### Training results

+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| No log        | 66.67  | 200  | 3.6445          |
+| No log        | 133.33 | 400  | 4.5703          |
+| 1.0101        | 200.0  | 600  | 5.2109          |
+| 1.0101        | 266.67 | 800  | 5.5430          |
+| 0.0681        | 333.33 | 1000 | 5.7227          |
+| 0.0681        | 400.0  | 1200 | 5.8672          |
+| 0.0681        | 466.67 | 1400 | 5.9961          |


 ### Framework versions

-- Transformers 4.
+- Transformers 4.17.0.dev0
 - Pytorch 1.10.0+cu111
 - Datasets 1.18.1
 - Tokenizers 0.11.0
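The widget examples removed in this commit preserve the model's prompt format: comma-separated topics terminated by `###`. A minimal generation sketch using that format, assuming the weights are hosted as `Langame/distilgpt2-starter` (the repo id is not stated in this diff):

```python
from transformers import pipeline

# "Langame/distilgpt2-starter" is an assumed repo id, inferred from the
# model name and the Langame links in the old card; adjust if it differs.
generator = pipeline("text-generation", model="Langame/distilgpt2-starter")

# Prompt format taken from the removed widget examples:
# comma-separated topics terminated by "###".
prompt = "ice breaker,travel ###"
out = generator(prompt, max_new_tokens=40, do_sample=True, top_p=0.95)
print(out[0]["generated_text"])
```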
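The hyperparameters listed in the second hunk map back onto `TrainingArguments`; a sketch under stated assumptions, with values not visible in this diff left as placeholders:

```python
from transformers import TrainingArguments

# Sketch reconstructing the hyperparameters listed in the card.
# Assumptions (not stated in the diff): a single training device, so
# per_device_train_batch_size == total_train_batch_size == 8; output_dir
# is a placeholder; eval every 200 steps is inferred from the results table.
args = TrainingArguments(
    output_dir="distilgpt2-starter",
    per_device_train_batch_size=8,
    num_train_epochs=500.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",
    eval_steps=200,
)
```

The fractional epoch numbers in the results table are consistent with these values: 200 steps per 66.67 epochs means 3 steps per epoch, which at a total batch size of 8 suggests a training set of roughly 24 examples.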