emBERT is a Hungarian text classification model that labels sentences with one of seven emotions or a neutral state. It uses the huBERT tokenizer and was fine-tuned from the huBERT base model on a proprietary dataset of sentences from Hungarian online news sites. The fine-tuning sentences were labeled manually by experts in a double-blind setup, and annotation disagreements were resolved manually. Validation results of the fine-tuning:
| emotion | precision | recall | f1-score |
|---|---|---|---|
| 0 - Anger | 0.70 | 0.74 | 0.72 |
| 1 - Disgust | 0.72 | 0.73 | 0.73 |
| 2 - Fear | 0.61 | 0.47 | 0.53 |
| 3 - Happiness | 0.38 | 0.37 | 0.38 |
| 4 - Neutral | 0.65 | 0.62 | 0.63 |
| 5 - Sad | 0.74 | 0.72 | 0.73 |
| 6 - Successful | 0.79 | 0.81 | 0.80 |
| 7 - Trustful | 0.76 | 0.78 | 0.77 |
| weighted avg | 0.73 | 0.74 | 0.73 |

Overall accuracy reached 74%.
The emotion categories are based on Plutchik (1980), with anticipation replaced by a neutral category.
To use the model, load the huBERT tokenizer together with the fine-tuned classifier:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# The model was fine-tuned from huBERT, so it uses the huBERT tokenizer.
tokenizer = AutoTokenizer.from_pretrained("SZTAKI-HLT/hubert-base-cc")
model = AutoModelForSequenceClassification.from_pretrained("poltextlab/emBERT")
```
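A minimal inference sketch is shown below. The `ID2LABEL` mapping and the `classify` helper are illustrative assumptions: the label order is taken from the validation table above and should be checked against the model's own `config.id2label`.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed label order, mirroring the validation table above (verify against
# model.config.id2label before relying on it).
ID2LABEL = {0: "Anger", 1: "Disgust", 2: "Fear", 3: "Happiness",
            4: "Neutral", 5: "Sad", 6: "Successful", 7: "Trustful"}

def classify(text, tokenizer, model):
    """Return the predicted emotion label for a Hungarian sentence."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return ID2LABEL[int(logits.argmax(dim=-1))]

if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained("SZTAKI-HLT/hubert-base-cc")
    model = AutoModelForSequenceClassification.from_pretrained("poltextlab/emBERT")
    print(classify("Nagyon örülök ennek a hírnek!", tokenizer, model))
```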
The model was created by György Márk Kis, Orsolya Ring, and Miklós Sebők of the Centre for Social Sciences.