nicolay-r
posted an update Nov 9
📢 Have you ever wondered how exactly Transformers are able to handle long input contexts?
I got a chance to tackle this through the problem of summarizing long documents, and I am delighted to share the related survey and a diagram for quick skimming below:

Preprint: https://nicolay-r.github.io/website/data/preprint-AINL_2023_longt5_summarization.pdf
Springer: https://link.springer.com/article/10.1007/s10958-024-07435-z

🎯 The aim of the survey was the development of the long-document summarizer for mass-media news in Vietnamese language. πŸ‡»πŸ‡³

For quick skimming, I am sharing a performance overview of various LM-based solutions across several datasets, covering domain-oriented advances in the Vietnamese language (see attached screenshots).

As a solution, we consider:
☑️ 1. Adapting the existing google/pegasus-cnn_dailymail to summarize a large dataset, which is then used to arrange training
☑️ 2. Tuning google/long-t5-tglobal-large to perform generative summarization.
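
The two steps above can be sketched with Hugging Face transformers roughly as follows. This is a minimal illustration, not the code from the ViLongT5 repository: it reads step 1 as "use the existing Pegasus checkpoint to produce summaries that become training targets", which is an interpretation of the post. The helper names (`load_pegasus_summarizer`, `build_training_pairs`, `load_longt5`), the 1024-token input limit, and the 128-token summary length are assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, List

PEGASUS_ID = "google/pegasus-cnn_dailymail"  # checkpoint named in the post (step 1)
LONGT5_ID = "google/long-t5-tglobal-large"   # checkpoint named in the post (step 2)


def load_pegasus_summarizer(max_input_tokens: int = 1024) -> Callable[[str], str]:
    """Return a text -> summary function backed by Pegasus (downloads weights)."""
    # Imported lazily so the rest of the sketch runs without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(PEGASUS_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(PEGASUS_ID)

    def summarize(text: str) -> str:
        # Inputs longer than the (assumed) limit are simply truncated here.
        inputs = tokenizer(
            text, truncation=True, max_length=max_input_tokens, return_tensors="pt"
        )
        output_ids = model.generate(**inputs, max_new_tokens=128)
        return tokenizer.decode(output_ids[0], skip_special_tokens=True)

    return summarize


def build_training_pairs(
    documents: Iterable[str], summarize: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Step 1 (as interpreted here): pair each document with a generated summary."""
    pairs = []
    for doc in documents:
        doc = doc.strip()
        if not doc:  # skip empty records
            continue
        pairs.append({"document": doc, "summary": summarize(doc)})
    return pairs


def load_longt5():
    """Step 2 starting point: load LongT5 so it can be tuned on the pairs."""
    from transformers import AutoTokenizer, LongT5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(LONGT5_ID)
    model = LongT5ForConditionalGeneration.from_pretrained(LONGT5_ID)
    return tokenizer, model
```

Because `build_training_pairs` accepts any callable, it can be tried with a stand-in summarizer before downloading the real checkpoints; the resulting pairs would then feed a standard sequence-to-sequence fine-tuning loop for the LongT5 model.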

Implementation details:
🌟 https://github.com/nicolay-r/ViLongT5
(It is simpler to go with Hugging Face rather than flaxformer, which has by now become a legacy engine)