Question answering model for Estonian
This is an extractive question answering model based on the XLM-RoBERTa base model. It was fine-tuned sequentially on:
- English SQuAD v1.1
- SQuAD v1.1 translated into Estonian
- Small native Estonian dataset (800 samples)
The model has retained good multilingual properties and can be used for extractive QA tasks in all languages covered by XLM-RoBERTa. Performance is best in the two fine-tuning languages, Estonian and English.
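As a rough illustration of extractive QA usage, the sketch below wraps the `question-answering` pipeline from the `transformers` library. It is a minimal sketch, not the authors' own code: `MODEL_ID` is a placeholder for this model's repository id on the Hub (not stated here), and `load_qa` and `answer` are hypothetical helper names introduced only for this example.

```python
from typing import Callable, Dict

# Placeholder (assumption): replace with the Hub repository id of this model.
MODEL_ID = "<this-model-repo-id>"


def load_qa(model_id: str):
    """Build an extractive QA pipeline.

    Requires the `transformers` package and a backend such as PyTorch.
    The import is done lazily so that `answer` can be used without it.
    """
    from transformers import pipeline

    return pipeline("question-answering", model=model_id, tokenizer=model_id)


def answer(qa: Callable[..., Dict], question: str, context: str) -> str:
    """Return the answer span that the pipeline extracts from `context`."""
    result = qa(question=question, context=context)
    # For a single question the pipeline returns a dict with the keys
    # "answer", "score", "start" and "end".
    return result["answer"]


# Example usage (downloads the model on first run):
# qa = load_qa(MODEL_ID)
# print(answer(qa, "Mis on Eesti pealinn?", "Eesti pealinn on Tallinn."))
```

Because the model is extractive, the returned answer is always a span copied from the supplied context, so the context has to contain the answer text.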
| Tested on | F1 | EM |
|---|---|---|
| EstQA test set | 82.4 | 75.3 |
| SQuAD v1.1 dev set | 86.9 | 77.9 |
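The card does not say how F1 and EM were computed. Assuming the usual SQuAD v1.1 definitions (exact match after normalization, and token-overlap F1), a single prediction could be scored roughly as in the sketch below. The function names are illustrative, and the removal of English articles follows the SQuAD convention, which has no direct counterpart in Estonian.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """SQuAD-style normalization: lowercase, drop punctuation and English articles."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under this assumption, the table values would be these per-example scores averaged over the test set and multiplied by 100, taking the best score when a question has several reference answers.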
The Estonian dataset used for fine-tuning and for validating the results is available at https://huggingface.co./datasets/anukaver/EstQA/ (version 1.0).