---
license: cc-by-sa-4.0
language:
- ko
tags:
- korean
---

# **KoBigBird-RoBERTa-large**

This is a large-sized Korean BigBird model introduced in our [paper](https://arxiv.org/abs/2309.10339). The model is initialized from the parameters of [klue/roberta-large](https://huggingface.co./klue/roberta-large) to ensure strong performance. By employing the BigBird architecture and incorporating the newly proposed TAPER, the language model accommodates much longer input sequences.

### How to Use

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("vaiv/kobigbird-roberta-large")
model = AutoModelForMaskedLM.from_pretrained("vaiv/kobigbird-roberta-large")
```

### Hyperparameters

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62ce3886a9be5c195564fd71/bhuidw3bNQZbE2tzVcZw_.png)

### Results

Measured on the validation sets of the KLUE benchmark datasets.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62ce3886a9be5c195564fd71/50jMYggkGVUM06n2v1Hxm.png)

### Limitations

While our model achieves strong results even without additional pretraining, further pretraining can refine the positional representations.

## Citation Information

```bibtex
@article{yang2023kobigbird,
  title={KoBigBird-large: Transformation of Transformer for Korean Language Understanding},
  author={Yang, Kisu and Jang, Yoonna and Lee, Taewoo and Seong, Jinwoo and Lee, Hyungjin and Jang, Hwanseok and Lim, Heuiseok},
  journal={arXiv preprint arXiv:2309.10339},
  year={2023}
}
```
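As a quick sanity check after loading, the masked-language-modeling head can be exercised with the standard `fill-mask` pipeline. This is a minimal sketch, not from the original card; the Korean example sentence ("한국어는 [MASK] 언어입니다." — "Korean is a [MASK] language.") is illustrative only.

```python
# Minimal sketch: fill-mask inference with the released checkpoint.
# The example sentence below is illustrative, not taken from the paper.
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

tokenizer = AutoTokenizer.from_pretrained("vaiv/kobigbird-roberta-large")
model = AutoModelForMaskedLM.from_pretrained("vaiv/kobigbird-roberta-large")

fill = pipeline("fill-mask", model=model, tokenizer=tokenizer)
# Predict the masked token; use the tokenizer's own mask token string.
predictions = fill(f"한국어는 {tokenizer.mask_token} 언어입니다.")
for p in predictions[:3]:
    print(p["token_str"], round(p["score"], 3))
```

Each prediction is a dict with the filled token (`token_str`) and its probability (`score`); longer documents can be passed the same way, since the BigBird attention pattern keeps memory use manageable at extended sequence lengths.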