Fine-Tuned BERT2BERT Summarization Model
This model is a fine-tuned version of the original BERT2BERT Indonesian summarization model (cahya/bert2bert-indonesian-summarization).
Fine-Tuning Dataset:
- Dataset: Liputan6_ID
- Task: Summarization
This model was fine-tuned on the Liputan6_ID dataset, which contains Indonesian news articles. It is therefore best suited to summarizing Indonesian news text similar to the articles in Liputan6.
Code Sample
from transformers import BertTokenizer, EncoderDecoderModel
tokenizer = BertTokenizer.from_pretrained("rowjak/bert-indonesian-news-summarization")
# BERT has no dedicated BOS/EOS tokens, so reuse [CLS] and [SEP] for generation
tokenizer.bos_token = tokenizer.cls_token
tokenizer.eos_token = tokenizer.sep_token
model = EncoderDecoderModel.from_pretrained("rowjak/bert-indonesian-news-summarization")
# Indonesian news article to summarize (paste the text here)
ARTICLE = ""
# Generate the summary with beam search
input_ids = tokenizer.encode(ARTICLE, return_tensors='pt')
summary_ids = model.generate(
    input_ids,
    max_length=125,
    num_beams=2,
    repetition_penalty=2.5,
    length_penalty=1.0,
    early_stopping=True,
    no_repeat_ngram_size=2,
    use_cache=True,
)
summary_text = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary_text)
Output:
---
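The code sample above passes the whole article to the encoder as-is. News articles can be long, and BERT-style encoders commonly accept at most 512 tokens, so it may help to truncate the input before generating. The following is a minimal sketch under that assumption: the helper name `summarize` and its `max_input_tokens` parameter are hypothetical and not part of this model or the transformers library, and the 512-token limit is assumed rather than confirmed for this checkpoint. The generation defaults are copied from the sample above.

```python
def summarize(article, tokenizer, model, max_input_tokens=512, **generate_kwargs):
    """Summarize one article, truncating the input to an assumed encoder limit.

    `summarize` and `max_input_tokens` are illustrative names (assumptions),
    not an API provided by the model or by transformers.
    """
    if not article.strip():
        return ""  # nothing to summarize

    # Truncate so the input does not exceed the assumed 512-token limit.
    input_ids = tokenizer.encode(
        article,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )

    # Defaults mirror the generation settings in the code sample;
    # any of them can be overridden through keyword arguments.
    settings = dict(
        max_length=125,
        num_beams=2,
        repetition_penalty=2.5,
        length_penalty=1.0,
        early_stopping=True,
        no_repeat_ngram_size=2,
        use_cache=True,
    )
    settings.update(generate_kwargs)

    summary_ids = model.generate(input_ids, **settings)
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```

With `tokenizer` and `model` loaded as in the code sample, `summarize(ARTICLE, tokenizer, model)` would reproduce the same settings, while `summarize(ARTICLE, tokenizer, model, num_beams=4)` would change only the beam count.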
Model tree for rowjak/bert-indonesian-news-summarization
Base model
cahya/bert2bert-indonesian-summarization