rowjak committed
Commit 5d1c9d2
1 Parent(s): 1a44ae5

add code sample on readme

Files changed (1)
  README.md +38 -0
README.md CHANGED
@@ -16,3 +16,41 @@ This model is fine-tuned based on the original [BERT2BERT Indonesian Summarizati
   - **Task**: Summarization

   This model was fine-tuned using the [Liputan6_ID](https://huggingface.co/datasets/fajrikoto/id_liputan6) dataset, which contains Indonesian news articles. The model is optimized for summarizing domain-specific texts from the Liputan6 dataset.
+
+ ## Code Sample
+
+ ```python
+ from transformers import BertTokenizer, EncoderDecoderModel
+
+ tokenizer = BertTokenizer.from_pretrained("rowjak/bert-indonesian-news-summarization")
+ tokenizer.bos_token = tokenizer.cls_token
+ tokenizer.eos_token = tokenizer.sep_token
+ model = EncoderDecoderModel.from_pretrained("rowjak/bert-indonesian-news-summarization")
+
+ # article text to summarize
+ ARTICLE = ""
+
+ # generate summary
+ input_ids = tokenizer.encode(ARTICLE, return_tensors='pt')
+ summary_ids = model.generate(input_ids,
+                              max_length=150,
+                              num_beams=10,
+                              repetition_penalty=2.5,
+                              length_penalty=1.0,
+                              early_stopping=True,
+                              no_repeat_ngram_size=2,
+                              use_cache=True,
+                              do_sample=True,
+                              temperature=0.8,
+                              top_k=50,
+                              top_p=0.95)
+
+ summary_text = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
+ print(summary_text)
+ ```
+
+ Output:
+
+ ```
+ ---
+ ```
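
Note: the BERT encoder in this model accepts at most 512 input tokens, so long Liputan6 articles may need to be truncated before generation. A minimal sketch, assuming the `tokenizer` and `model` objects loaded in the sample above:

```python
# Sketch: truncate long articles to the BERT encoder's 512-token limit
# before generating a summary (reuses tokenizer/model from the sample above).
input_ids = tokenizer.encode(
    ARTICLE,
    return_tensors="pt",
    max_length=512,   # BERT position-embedding limit
    truncation=True,
)
summary_ids = model.generate(input_ids, max_length=150, num_beams=10, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```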