:memo: add checkpoint with 6 additional epochs
README.md
CHANGED
@@ -68,12 +68,15 @@ parameters:
 
 > NOTE: this is still a work-in-progress (WIP) and not completed/converged by any means, but sharing to maybe save some time for others :)
 
+## updates
+
+- July 4, 2022: add checkpoint with 6 additional epochs of training with the dataset summary outputs filtered to 1024 **tokens**, resolving prior issue of short summaries.
+
 ## About
 
 - a checkpoint of [Stancld/longt5-tglobal-large-16384-pubmed-3k_steps](https://huggingface.co/Stancld/longt5-tglobal-large-16384-pubmed-3k_steps) trained on `kmfoda/booksum` for about 20 epochs
-- max input lengths during training vary between 8192 and 16384 tokens depending on GPU availability. This checkpoint was **trained with 16384 tokens as the max input length for the final
-
-- the tl;dr of this is that if you use this checkpoint for inference, it will produce short summaries of the input text (perhaps shorter than you wanted).
+- max input lengths during training vary between 8192 and 16384 tokens depending on GPU availability. This checkpoint was **trained with 16384 tokens as the max input length for the final eight epochs**
+
 
 ## Comparisons
 
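For anyone picking up the checkpoint from this commit, a minimal inference sketch with 🤗 `transformers` follows. The model id below is a hypothetical placeholder for this repository (the commit does not name it), and the generation settings are illustrative assumptions rather than values taken from the README.

```python
# Minimal inference sketch for a LongT5 booksum checkpoint (assumptions noted above).
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "<this-model-repo>"  # hypothetical placeholder for this repository's model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

long_text = "..."  # a long document, e.g. a book chapter

# the final training epochs used a max input length of 16384 tokens, so long inputs are allowed
inputs = tokenizer(long_text, return_tensors="pt", truncation=True, max_length=16384)

with torch.no_grad():
    summary_ids = model.generate(
        **inputs,
        max_length=512,           # cap on generated summary tokens (illustrative)
        min_length=64,            # nudge away from very short summaries (illustrative)
        no_repeat_ngram_size=3,
        num_beams=4,
    )

print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

Setting `min_length` and `max_length` explicitly is one way to steer around the short-summary behaviour noted for the earlier checkpoint; the July 4, 2022 checkpoint, trained on summaries filtered to 1024 tokens, should need less of this.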