---
license: cc-by-sa-4.0
language:
- ko
tags:
- korean
---
# **KoBigBird-RoBERTa-large**
This is a large-sized Korean BigBird model introduced in our [paper]() (IJCNLP-AACL 2023).
The model reuses the parameters of [klue/roberta-large](https://huggingface.co./klue/roberta-large) to ensure high performance.
By adopting the BigBird attention mechanism and incorporating the newly proposed TAPER, the language model accommodates longer input sequences.
### How to Use
```python
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("vaiv/kobigbird-roberta-large")
model = AutoModelForMaskedLM.from_pretrained("vaiv/kobigbird-roberta-large")
```
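As a quick sanity check, the sketch below shows one way to run masked-token prediction with the tokenizer and model loaded above. The Korean example sentence is our own illustration and is not taken from the paper; adapt it to your text.
```python
import torch

# Illustrative sentence with a masked token (replace with your own text).
text = f"한국의 수도는 {tokenizer.mask_token}이다."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the mask position and take the highest-scoring candidate token.
mask_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```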
### Hyperparameters
![image/png](https://cdn-uploads.huggingface.co/production/uploads/62ce3886a9be5c195564fd71/bhuidw3bNQZbE2tzVcZw_.png)
### Results
Measured on the validation sets of the KLUE benchmark datasets.
![image/png](https://cdn-uploads.huggingface.co/production/uploads/62ce3886a9be5c195564fd71/50jMYggkGVUM06n2v1Hxm.png)
### Limitations
While our model achieves strong results even without additional pretraining, direct pretraining could further refine its positional representations.
## Citation Information
To Be Announced