---
license: cc-by-sa-4.0
language:
- ko
tags:
- korean
---

# **KoBigBird-RoBERTa-large**

This is a large-sized Korean BigBird model introduced in our [paper]() (IJCNLP-AACL 2023).
The model is initialized from the parameters of [klue/roberta-large](https://huggingface.co./klue/roberta-large) to retain its strong performance.
By employing the BigBird architecture and incorporating the newly proposed TAPER, the model accommodates much longer input sequences than the original RoBERTa model.
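
To verify the extended input budget on the released checkpoint, a minimal sketch like the one below inspects the published configuration; the exact fields available (e.g. `attention_type`) depend on the checkpoint's config and are assumptions here, not guarantees from the model card.

```python
from transformers import AutoConfig

# A minimal sketch: inspect the released config to see the supported input length
# and attention variant. Field availability is an assumption about the checkpoint.
config = AutoConfig.from_pretrained("vaiv/kobigbird-roberta-large")
print(config.model_type)                         # architecture family of the checkpoint
print(config.max_position_embeddings)            # maximum supported input length
print(getattr(config, "attention_type", "n/a"))  # BigBird configs typically expose this
```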

### How to Use

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("vaiv/kobigbird-roberta-large")
model = AutoModelForMaskedLM.from_pretrained("vaiv/kobigbird-roberta-large")
```
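
Beyond loading the checkpoint, a small masked-token example can confirm everything is wired up correctly. The sample sentence and `top_k` value below are illustrative choices, not part of the official card:

```python
from transformers import pipeline

# Illustrative fill-mask usage; the example sentence is an assumption for demonstration.
fill_mask = pipeline("fill-mask", model="vaiv/kobigbird-roberta-large")

# Query the tokenizer for its mask token instead of hard-coding "[MASK]".
text = f"한국어 언어 모델은 긴 문서도 처리할 {fill_mask.tokenizer.mask_token} 있습니다."
for prediction in fill_mask(text, top_k=3):
    print(prediction["token_str"], round(prediction["score"], 4))
```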

### Hyperparameters

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62ce3886a9be5c195564fd71/bhuidw3bNQZbE2tzVcZw_.png)

### Results

Measured on the validation sets of the KLUE benchmark datasets.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62ce3886a9be5c195564fd71/50jMYggkGVUM06n2v1Hxm.png)

### Limitations

While our model achieves strong results even without additional pretraining, direct pretraining could further refine its positional representations.

## Citation Information

To Be Announced