pszemraj committed
Commit c790326 · 1 Parent(s): 9996867

:memo: add checkpoint with 6 additional epochs

Files changed (1): README.md (+6 -3)
README.md CHANGED
@@ -68,12 +68,15 @@ parameters:
 
  > NOTE: this is still a work-in-progress (WIP) and not completed/converged by any means, but sharing to maybe save some time for others :)
 
+ ## updates
+
+ - July 4, 2022: add checkpoint with 6 additional epochs of training with the dataset summary outputs filtered to 1024 **tokens**, resolving prior issue of short summaries.
+
  ## About
 
  - a checkpoint of [Stancld/longt5-tglobal-large-16384-pubmed-3k_steps](https://huggingface.co/Stancld/longt5-tglobal-large-16384-pubmed-3k_steps) trained on `kmfoda/booksum` for about 20 epochs
- - max input lengths during training vary between 8192 and 16384 tokens depending on GPU availability. This checkpoint was **trained with 16384 tokens as the max input length for the final two epochs**
- - **An important caveat** (and part of why this is WIP) is that the dataset was filtered to only contain summaries of 1024 **characters** or shorter instead of tokens. Other checkpoints I post will have this fixed.
- - the tl;dr of this is that if you use this checkpoint for inference, it will produce short summaries of the input text (perhaps shorter than you wanted).
+ - max input lengths during training vary between 8192 and 16384 tokens depending on GPU availability. This checkpoint was **trained with 16384 tokens as the max input length for the final eight epochs**
+
 
  ## Comparisons
 
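For context on the July 4 update above: the earlier checkpoint filtered reference summaries to 1024 **characters**, a much tighter cut than 1024 **tokens** (a token is usually several characters), which is why it produced short summaries. Below is a minimal sketch of the token-based filter the update describes; the `summary_text` column name and the tokenizer repo are assumptions for illustration, not taken from the actual training script.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Tokenizer of the base model (assumption: the training run may use a different one)
tokenizer = AutoTokenizer.from_pretrained(
    "Stancld/longt5-tglobal-large-16384-pubmed-3k_steps"
)

# booksum summarization data used for fine-tuning
ds = load_dataset("kmfoda/booksum", split="train")

MAX_SUMMARY_TOKENS = 1024  # token-based limit from the July 4 update

def summary_is_short_enough(example):
    # Count tokens in the reference summary; a character-based check
    # (len(text) <= 1024) would keep only roughly 250-token summaries.
    n_tokens = len(tokenizer(example["summary_text"]).input_ids)  # column name assumed
    return n_tokens <= MAX_SUMMARY_TOKENS

filtered = ds.filter(summary_is_short_enough)
print(f"kept {len(filtered)} of {len(ds)} examples")
```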
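Since the checkpoint was trained with 16384 tokens as the max input length, inference can pass similarly long inputs. A rough usage sketch with the standard `transformers` seq2seq API follows; `MODEL_ID` is a placeholder because the repo id of this checkpoint is not shown in this diff, and the generation settings are illustrative rather than recommended values.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_ID = "<this-checkpoint-repo-id>"  # placeholder: substitute the actual repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

long_text = open("chapter.txt").read()  # any long document, e.g. a book chapter

# The checkpoint was trained with inputs up to 16384 tokens, so truncate there.
inputs = tokenizer(long_text, return_tensors="pt", truncation=True, max_length=16384)

with torch.no_grad():
    summary_ids = model.generate(
        **inputs,
        max_length=512,          # illustrative generation settings, not tuned values
        num_beams=4,
        no_repeat_ngram_size=3,
    )

print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```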