pszemraj committed
Commit c790326 · 1 Parent(s): 9996867

:memo: add checkpoint with 6 additional epochs

Files changed (1): README.md (+6 -3)
README.md CHANGED
@@ -68,12 +68,15 @@ parameters:
 
  > NOTE: this is still a work-in-progress (WIP) and not completed/converged by any means, but sharing to maybe save some time for others :)
 
+ ## updates
+
+ - July 4, 2022: add checkpoint with 6 additional epochs of training with the dataset summary outputs filtered to 1024 **tokens**, resolving prior issue of short summaries.
+
  ## About
 
  - a checkpoint of [Stancld/longt5-tglobal-large-16384-pubmed-3k_steps](https://huggingface.co/Stancld/longt5-tglobal-large-16384-pubmed-3k_steps) trained on `kmfoda/booksum` for about 20 epochs
- - max input lengths during training vary between 8192 and 16384 tokens depending on GPU availability. This checkpoint was **trained with 16384 tokens as the max input length for the final two epochs**
- - **An important caveat** (and part of why this is WIP) is that the dataset was filtered to only contain summaries of 1024 **characters** or shorter instead of tokens. Other checkpoints I post will have this fixed.
- - the tl;dr of this is that if you use this checkpoint for inference, it will produce short summaries of the input text (perhaps shorter than you wanted).
+ - max input lengths during training vary between 8192 and 16384 tokens depending on GPU availability. This checkpoint was **trained with 16384 tokens as the max input length for the final eight epochs**
+
 
  ## Comparisons
 
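For context on the July 4 update above: the earlier checkpoint filtered reference summaries to 1024 **characters**, a much tighter cut than 1024 **tokens** (a token is usually several characters), which is why it produced short summaries. Below is a minimal sketch of the token-based filter the update describes; the `summary_text` column name and the tokenizer repo are assumptions for illustration, not taken from the actual training script.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Tokenizer of the base model (assumption: the training run may use a different one)
tokenizer = AutoTokenizer.from_pretrained(
    "Stancld/longt5-tglobal-large-16384-pubmed-3k_steps"
)

# booksum summarization data used for fine-tuning
ds = load_dataset("kmfoda/booksum", split="train")

MAX_SUMMARY_TOKENS = 1024  # token-based limit from the July 4 update

def summary_is_short_enough(example):
    # Count tokens in the reference summary; a character-based check
    # (len(text) <= 1024) would keep only roughly 250-token summaries.
    n_tokens = len(tokenizer(example["summary_text"]).input_ids)  # column name assumed
    return n_tokens <= MAX_SUMMARY_TOKENS

filtered = ds.filter(summary_is_short_enough)
print(f"kept {len(filtered)} of {len(ds)} examples")
```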
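Since the checkpoint was trained with 16384 tokens as the max input length, inference can pass similarly long inputs. A rough usage sketch with the standard `transformers` seq2seq API follows; `MODEL_ID` is a placeholder because the repo id of this checkpoint is not shown in this diff, and the generation settings are illustrative rather than recommended values.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_ID = "<this-checkpoint-repo-id>"  # placeholder: substitute the actual repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

long_text = open("chapter.txt").read()  # any long document, e.g. a book chapter

# The checkpoint was trained with inputs up to 16384 tokens, so truncate there.
inputs = tokenizer(long_text, return_tensors="pt", truncation=True, max_length=16384)

with torch.no_grad():
    summary_ids = model.generate(
        **inputs,
        max_length=512,          # illustrative generation settings, not tuned values
        num_beams=4,
        no_repeat_ngram_size=3,
    )

print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```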