gmurro/bart-large-finetuned-filtered-spotify-podcast-summ

Text2Text Generation

generated_from_keras_callback

gmurro committed on Jun 21, 2022

Commit 2c8901f • 1 Parent(s): 330f5d4

Update README.md

Files changed (1)

README.md +3 -2

@@ -21,8 +21,9 @@ It achieves the following results on the evaluation set:
 This model is intended to be used for automatic podcast summarisation. Given the podcast transcript in input, the objective is to provide a short text summary that a user might read when deciding whether to listen to a podcast. The summary should accurately convey the content of the podcast, be human-readable, and be short enough to be quickly read on a smartphone screen.
 ## Training and evaluation data
-We split the filtered brass set into train/dev sets of 69,336/7,705 episodes.
+In our solution, an extractive module is developed to select salient chunks from the transcript, which serve as the input to an abstractive summarizer.
+An extensive pre-processing on the creator-provided descriptions is performed selecting a subset of the corpus that is suitable for the training supervised model.
+We split the filtered dataset into train/dev sets of 69,336/7,705 episodes.
 The test set consists of 1,027 episodes. Only 1025 have been used because two of them did not contain an episode description.