krotima1 committed
Commit ca26836 · 2 Parent(s): 139c0ef c16db27

Merge branch 'main' of https://huggingface.co./krotima1/mbart-ht2a-s into main

Files changed (1)
  1. README.md +62 -2
README.md CHANGED
@@ -1,2 +1,62 @@
- # mBART fine-tuned model for Czech Summarization
- -
+ ---
+ language:
+ - cs
+ tags:
+ - abstractive summarization
+ - mbart-cc25
+ - Czech
+ license: apache-2.0
+ datasets:
+ - SumeCzech dataset news-based
+ metrics:
+ - rouge
+ - rougeraw
+ ---
+
+ # mBART fine-tuned model for Czech abstractive summarization (HT2A-S)
+ This model is a checkpoint of [facebook/mbart-large-cc25](https://huggingface.co/facebook/mbart-large-cc25) fine-tuned on a Czech news dataset to produce Czech abstractive summaries.
+ ## Task
+ The model addresses the ``Headline + Text to Abstract`` (HT2A) task: generating a multi-sentence summary, treated as an abstract, from the headline and text of a Czech news article.
+
+ ## Dataset
+ The model was trained on the [SumeCzech](https://ufal.mff.cuni.cz/sumeczech) dataset, which contains around 1M Czech news documents, each consisting of Headline, Abstract, and Full-text sections. Inputs were truncated and padded to 512 tokens for the encoder and 128 tokens for the decoder.
+
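The fixed-length setup described in the Dataset section can be illustrated with a small, library-free sketch. The helper name `fit_to_length` and the padding id `1` are assumptions made for illustration only; the actual preprocessing was presumably done by the model's tokenizer, and the README does not give its details.

```python
ENCODER_MAX_LEN = 512  # encoder length stated in the README
DECODER_MAX_LEN = 128  # decoder length stated in the README
PAD_ID = 1             # assumption: illustrative padding id, not taken from the README


def fit_to_length(token_ids, max_len, pad_id=PAD_ID):
    """Truncate to max_len, then right-pad with pad_id up to max_len."""
    clipped = list(token_ids[:max_len])
    return clipped + [pad_id] * (max_len - len(clipped))


short_doc = [5, 6, 7]            # shorter than the limit, so it gets padded
long_doc = list(range(10, 710))  # 700 ids, longer than the limit, so it gets truncated

encoder_short = fit_to_length(short_doc, ENCODER_MAX_LEN)
encoder_long = fit_to_length(long_doc, ENCODER_MAX_LEN)
decoder_target = fit_to_length(short_doc, DECODER_MAX_LEN)
```

Every sequence leaves the helper with exactly the configured length, which is what allows documents of different sizes to be batched together.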
+ ## Training
+ The model was trained on 1x NVIDIA Tesla A100 40GB for 20 hours, 1x NVIDIA Tesla V100 32GB for 40 hours, and 4x NVIDIA Tesla A100 40GB for 20 hours. During training, the model saw 6,928K documents, corresponding to roughly 8 epochs.
+
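Taking the Training figures at face value, the total compute and the implied epoch size can be worked out as below. This is plain arithmetic on the README's numbers; the per-epoch count is only implied (the epoch count is stated as approximate), and the README does not say which split was used.

```python
# (number of GPUs, hours) for each training phase, as listed in the README
GPU_RUNS = [(1, 20), (1, 40), (4, 20)]
DOCS_SEEN = 6_928_000  # "6928K documents"
EPOCHS = 8             # "roughly 8 epochs"

gpu_hours = sum(gpus * hours for gpus, hours in GPU_RUNS)
docs_per_epoch = DOCS_SEEN // EPOCHS

print(f"total GPU-hours: {gpu_hours}")              # 140
print(f"implied docs per epoch: {docs_per_epoch}")  # 866000
```

The implied epoch size of about 866K documents is smaller than the roughly 1M documents in the full dataset, which would be consistent with training on a subset such as a training split.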
+ ## Use
+ The example below assumes you are using the provided Summarizer.ipynb notebook, which supplies the `Summarizer` class.
+ ```python
+ from collections import OrderedDict
+
+ from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+
+
+ def summ_config():
+     cfg = OrderedDict([
+         # summarization model - checkpoint from website
+         ("model_name", "krotima1/mbart-ht2a-s"),
+         # generation settings passed to the summarizer
+         ("inference_cfg", OrderedDict([
+             ("num_beams", 4),
+             ("top_k", 40),
+             ("top_p", 0.92),
+             ("do_sample", True),
+             ("temperature", 0.89),
+             ("repetition_penalty", 1.2),
+             ("no_repeat_ngram_size", None),
+             ("early_stopping", True),
+             ("max_length", 128),
+             ("min_length", 10),
+         ])),
+         # texts to summarize
+         ("text", [
+             "Input your Czech text",
+         ]),
+     ])
+     return cfg
+
+
+ cfg = summ_config()
+ # load model
+ model = AutoModelForSeq2SeqLM.from_pretrained(cfg["model_name"])
+ tokenizer = AutoTokenizer.from_pretrained(cfg["model_name"])
+ # init summarizer (Summarizer is defined in Summarizer.ipynb)
+ summarize = Summarizer(model, tokenizer, cfg["inference_cfg"])
+ summarize(cfg["text"])
+ ```
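The `Summarizer` class lives in the notebook and its internals are not shown in this README. As a rough sketch of what an equivalent call with plain `transformers` objects might look like, the helpers below are hypothetical: the names `build_generation_kwargs` and `summarize_texts` are made up for illustration, and the 512-token input limit is taken from the Dataset section. Any mBART language-code handling is not described in the README and is left out.

```python
def build_generation_kwargs(inference_cfg):
    """Copy the inference settings, dropping entries that are set to None."""
    return {key: value for key, value in inference_cfg.items() if value is not None}


def summarize_texts(model, tokenizer, texts, generation_kwargs, max_input_length=512):
    """Tokenize the texts, generate summaries, and decode them to strings."""
    inputs = tokenizer(
        texts,
        return_tensors="pt",          # PyTorch tensors, as expected by model.generate
        padding=True,                 # pad to the longest text in the batch
        truncation=True,              # cut inputs longer than max_input_length
        max_length=max_input_length,  # assumption: mirrors the 512-token encoder limit
    )
    output_ids = model.generate(**inputs, **generation_kwargs)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

With `model`, `tokenizer`, and `cfg` loaded as in the example above, `summarize_texts(model, tokenizer, cfg["text"], build_generation_kwargs(cfg["inference_cfg"]))` would return a list of summary strings. Dropping `None` entries is a design choice of this sketch so that unset options such as `no_repeat_ngram_size` fall back to the model's defaults.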