---
model-index:
- name: twitter-roberta-base-hate-multiclass-latest
  results: []
language:
- en
pipeline_tag: text-classification
---

# cardiffnlp/twitter-roberta-base-hate-multiclass-latest

This model is a fine-tuned version of [cardiffnlp/twitter-roberta-base-2022-154m](https://huggingface.co/cardiffnlp/twitter-roberta-base-2022-154m) for multiclass hate-speech classification. It was fine-tuned on a combination of 13 English-language hate-speech datasets.

## Classes available

```
{
  "sexism": 0,
  "racism": 1,
  "disability": 2,
  "sexual_orientation": 3,
  "religion": 4,
  "other": 5,
  "not_hate": 6
}
```
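
The mapping above can be inverted to turn a predicted class index back into a label name. The following is a minimal sketch, assuming the model emits one raw score (logit) per class in the id order listed above; `softmax`, `decode`, and the example logits are hypothetical and only serve as illustration.

```python
import math

# Label-to-id mapping copied from the "Classes available" section.
LABEL2ID = {
    "sexism": 0,
    "racism": 1,
    "disability": 2,
    "sexual_orientation": 3,
    "religion": 4,
    "other": 5,
    "not_hate": 6,
}
# Inverse mapping, used to turn a predicted index back into a class name.
ID2LABEL = {idx: name for name, idx in LABEL2ID.items()}


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} scores, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits, invented for illustration only (not real model output).
example_logits = [0.1, -1.2, 3.4, 0.0, -0.5, 0.2, 1.1]
label, prob = decode(example_logits)
print(label)  # disability (index 2 has the largest score)
```

Keeping the mapping in one dictionary and deriving the inverse from it avoids the two tables drifting apart.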

## Metrics achieved

* Accuracy: 0.9419
* Macro-F1: 0.5752
* Weighted-F1: 0.9390
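
A Macro-F1 far below the Weighted-F1 is what one would expect when some classes are rare and harder to predict: Macro-F1 averages the per-class F1 scores equally, while Weighted-F1 weights each class by its number of examples. The sketch below illustrates the difference on invented toy labels. It is not the model's evaluation data, and the helper names `f1_per_class`, `macro_f1`, and `weighted_f1` are hypothetical.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} for every label that appears in y_true."""
    scores = {}
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of the per-class F1 scores, weighted by class support."""
    scores = f1_per_class(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)


# Toy data, invented for illustration: 18 "not_hate" and 2 "religion" examples.
# One of the two rare-class examples is misclassified as "not_hate".
y_true = ["not_hate"] * 18 + ["religion"] * 2
y_pred = ["not_hate"] * 18 + ["not_hate", "religion"]

print(round(macro_f1(y_true, y_pred), 4))     # 0.8198
print(round(weighted_f1(y_true, y_pred), 4))  # 0.9423
```

A single error on the rare class lowers Macro-F1 noticeably while Weighted-F1 stays close to accuracy, which is why both figures are worth reading together.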

### Usage

Install tweetnlp via pip.

```shell
pip install tweetnlp
```

Load the model in Python.

```python
import tweetnlp

# Load the multiclass hate-speech classifier described in this card.
model = tweetnlp.Classifier("cardiffnlp/twitter-roberta-base-hate-multiclass-latest")

model.predict('Women are trash 2.')
# >> {'label': 'sexism'}

model.predict('@user dear mongoloid respect sentiments & belief refrain totalitarianism. @user')
# >> {'label': 'disability'}
```
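
Since the card declares `pipeline_tag: text-classification`, the checkpoint can presumably also be loaded without tweetnlp. The following is a minimal sketch, assuming the `transformers` package is installed and that the hosted checkpoint works with the generic `pipeline` API; neither is confirmed by this card, and `classify` is a hypothetical helper.

```python
MODEL_NAME = "cardiffnlp/twitter-roberta-base-hate-multiclass-latest"


def classify(texts, model_name=MODEL_NAME):
    """Classify a list of strings with the transformers text-classification pipeline."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    # Assumption: the checkpoint ships a config usable by the generic pipeline.
    classifier = pipeline("text-classification", model=model_name)
    # Each result is a dict with "label" and "score" keys.
    return classifier(texts)
```

Calling `classify(["some tweet text"])` downloads the weights on first use and returns one result per input string. The label strings depend on the checkpoint's configuration.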

### Model based on:

```
@misc{antypas2023robust,
      title={Robust Hate Speech Detection in Social Media: A Cross-Dataset Empirical Evaluation},
      author={Dimosthenis Antypas and Jose Camacho-Collados},
      year={2023},
      eprint={2307.01680},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```