ytsaig committed on
Commit 4ecf503
1 Parent(s): c43edbf

Update README.md

Files changed (1):
  1. README.md +14 -9
README.md CHANGED
@@ -12,22 +12,27 @@ should probably proofread and complete it, then remove this comment. -->
 
 # distilroberta-mbfc-bias
 
-This model is a fine-tuned version of [distilroberta-base](https://huggingface.co/distilroberta-base) on an unknown dataset.
+This model is a fine-tuned version of [distilroberta-base](https://huggingface.co/distilroberta-base) on the Proppy dataset, using political bias from mediabiasfactcheck.com as labels.
+
 It achieves the following results on the evaluation set:
 - Loss: 1.4130
 - Acc: 0.6348
 
-## Model description
-
-More information needed
-
-## Intended uses & limitations
+## Training and evaluation data
 
-More information needed
+The training data used is the [proppy corpus](https://zenodo.org/record/3271522). Articles are labeled for political bias using the political bias of the source publication, as scored by mediabiasfactcheck.com. See [Proppy: Organizing the News Based on Their Propagandistic Content](https://propaganda.qcri.org/papers/elsarticle-template.pdf) for details.
 
-## Training and evaluation data
+To create a more balanced training set, common labels are downsampled to have a maximum of 2000 articles. The resulting label distribution in the training data is as follows:
 
-More information needed
+```
+extremeright    689
+leastbiased    2000
+left            783
+leftcenter     2000
+right          1260
+rightcenter    1418
+unknown        2000
+```
 
 ## Training procedure
 
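For illustration, a minimal sketch (not part of the commit) of the downsampling step described above. It assumes the proppy training split has been loaded into a pandas DataFrame with a `label` column; the file path, format, and column name are assumptions, not taken from the README:

```python
import pandas as pd

MAX_PER_LABEL = 2000  # cap applied to the most common labels

# Hypothetical path/column names for the proppy training split.
train_df = pd.read_csv("proppy_train.tsv", sep="\t")

# Downsample each label group to at most MAX_PER_LABEL articles.
balanced_df = (
    train_df.groupby("label", group_keys=False)
    .apply(lambda g: g.sample(n=min(len(g), MAX_PER_LABEL), random_state=42))
    .reset_index(drop=True)
)

# Labels with fewer than 2000 articles (e.g. extremeright, left) are kept as-is.
print(balanced_df["label"].value_counts())
```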
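And a minimal usage sketch for the resulting classifier, assuming the model is published under the hub ID `ytsaig/distilroberta-mbfc-bias` (inferred from the repository name, not stated in the diff):

```python
from transformers import pipeline

# Hub ID is an assumption based on the repository name.
classifier = pipeline("text-classification", model="ytsaig/distilroberta-mbfc-bias")

preds = classifier("The senate passed the infrastructure bill on a bipartisan vote.")
# Each prediction is a dict with a bias label (one of the MBFC categories above) and a score.
print(preds)
```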