---
license: cc-by-sa-4.0
language:
  - ko
tags:
  - korean
---

# KoBigBird-RoBERTa-large

This is a large-sized Korean BigBird model introduced in our paper at IJCNLP-AACL 2023. The model draws heavily on the parameters of klue/roberta-large to ensure high performance, and by employing the BigBird attention mechanism together with the newly proposed TAPER, it accommodates longer input sequences.

## How to Use

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Load the tokenizer and masked-LM model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("vaiv/kobigbird-roberta-large")
model = AutoModelForMaskedLM.from_pretrained("vaiv/kobigbird-roberta-large")
```
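A quick way to sanity-check the masked-LM head is the `fill-mask` pipeline. The sketch below is illustrative only; the example sentence is an assumption, not from the original card.

```python
from transformers import pipeline

# Minimal fill-mask sketch; the Korean example sentence is illustrative only.
fill_mask = pipeline("fill-mask", model="vaiv/kobigbird-roberta-large")

# Build the input with the tokenizer's own mask token.
text = f"서울은 대한민국의 {fill_mask.tokenizer.mask_token}이다."
for prediction in fill_mask(text):
    print(prediction["token_str"], prediction["score"])
```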
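Because the model combines BigBird sparse attention with TAPER-extended positions, it accepts longer inputs than a standard 512-token RoBERTa. Below is a minimal sketch of encoding a long document; reading the limit from `model.config.max_position_embeddings` is an assumption about where the checkpoint stores it, so check the released config for the exact usable length.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("vaiv/kobigbird-roberta-large")
model = AutoModelForMaskedLM.from_pretrained("vaiv/kobigbird-roberta-large")

# Assumed: the usable sequence limit is stored in max_position_embeddings.
max_len = model.config.max_position_embeddings

# Illustrative long document, truncated to the model's maximum length.
long_text = "긴 문서를 처리하는 예시 문장입니다. " * 500
inputs = tokenizer(long_text, truncation=True, max_length=max_len, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

print(inputs["input_ids"].shape)  # (1, sequence_length)
print(outputs.logits.shape)       # (1, sequence_length, vocab_size)
```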

## Hyperparameters

*(Hyperparameter table provided as an image.)*

## Results

Results are measured on the validation sets of the KLUE benchmark datasets.

*(KLUE validation results provided as an image.)*

## Limitations

While our model achieves strong results even without additional pretraining, direct pretraining could further refine its positional representations.

## Citation Information

To Be Announced