---
language:
- en
tags:
- summarization
license: mit
datasets:
- wiki_lingua
metrics:
- rouge
---

#### Pre-trained BART Model Fine-tuned on the WikiLingua Dataset
This repository contains a BART model (pre-trained checkpoint by sshleifer) fine-tuned on the English portion of the **wiki_lingua** dataset.

**Purpose:** Examine the performance of a fine-tuned summarization model for research purposes.
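
**Usage:** a minimal sketch of how the checkpoint could be loaded for inference with the `transformers` summarization pipeline. The model id below is a placeholder, since this card does not state the final repository name, and the input text is an illustrative example rather than a WikiLingua article.

```python
from transformers import pipeline

# Placeholder repository id -- replace with the actual id of this model repo.
MODEL_ID = "your-username/bart-wikilingua-en"

# Load the fine-tuned checkpoint as a standard seq2seq summarization pipeline.
summarizer = pipeline("summarization", model=MODEL_ID)

article = (
    "Open the Settings app and tap on the Wi-Fi menu. "
    "Select the network you want to join and enter the password when prompted. "
    "Wait until the Wi-Fi status icon appears at the top of the screen."
)

# Length limits chosen to roughly match WikiLingua-style short summaries.
summary = summarizer(article, max_length=60, min_length=10, do_sample=False)
print(summary[0]["summary_text"])
```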

**Observations:**
- The pre-trained model was originally trained on the XSum dataset, which condenses relatively short documents into one-sentence summaries.
- Fine-tuning this model on WikiLingua is appropriate, since the reference summaries in that dataset are also short.
- In practice, however, the fine-tuned model does not capture the key points much more clearly; it mostly extracts the opening sentence of the document (see the ROUGE scoring sketch after this list).
- The data pre-processing and the model's hyperparameters also need further tuning.
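
Since ROUGE is the metric listed above, the following is a minimal sketch of how generated summaries could be scored with the `evaluate` library. The prediction and reference strings are illustrative placeholders, not actual WikiLingua examples; in a real evaluation they would come from the model outputs and the dataset's reference summaries.

```python
import evaluate

# Load the ROUGE metric implementation from the `evaluate` library.
rouge = evaluate.load("rouge")

# Placeholder model outputs and reference summaries.
predictions = ["Open the Settings app and connect to a Wi-Fi network."]
references = ["Open Settings. Tap Wi-Fi. Choose a network and enter the password."]

# Computes rouge1 / rouge2 / rougeL / rougeLsum scores.
results = rouge.compute(predictions=predictions, references=references)
print(results)
```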