ytsaig committed on
Commit 4ecf503
1 Parent(s): c43edbf

Update README.md

Files changed (1):
  1. README.md +14 -9
README.md CHANGED
@@ -12,22 +12,27 @@ should probably proofread and complete it, then remove this comment. -->
 
 # distilroberta-mbfc-bias
 
-This model is a fine-tuned version of [distilroberta-base](https://huggingface.co/distilroberta-base) on an unknown dataset.
+This model is a fine-tuned version of [distilroberta-base](https://huggingface.co/distilroberta-base) on the Proppy dataset, using political bias from mediabiasfactcheck.com as labels.
+
 It achieves the following results on the evaluation set:
 - Loss: 1.4130
 - Acc: 0.6348
 
-## Model description
-
-More information needed
-
-## Intended uses & limitations
+## Training and evaluation data
 
-More information needed
+The training data used is the [proppy corpus](https://zenodo.org/record/3271522). Articles are labeled for political bias using the political bias of the source publication, as scored by mediabiasfactcheck.com. See [Proppy: Organizing the News Based on Their Propagandistic Content](https://propaganda.qcri.org/papers/elsarticle-template.pdf) for details.
 
-## Training and evaluation data
+To create a more balanced training set, common labels are downsampled to have a maximum of 2000 articles. The resulting label distribution in the training data is as follows:
 
-More information needed
+```
+extremeright    689
+leastbiased    2000
+left            783
+leftcenter     2000
+right          1260
+rightcenter    1418
+unknown        2000
+```
 
 ## Training procedure
 
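For illustration, a minimal sketch (not part of the commit) of the downsampling step described above. It assumes the proppy training split has been loaded into a pandas DataFrame with a `label` column; the file path, format, and column name are assumptions, not taken from the README:

```python
import pandas as pd

MAX_PER_LABEL = 2000  # cap applied to the most common labels

# Hypothetical path/column names for the proppy training split.
train_df = pd.read_csv("proppy_train.tsv", sep="\t")

# Downsample each label group to at most MAX_PER_LABEL articles.
balanced_df = (
    train_df.groupby("label", group_keys=False)
    .apply(lambda g: g.sample(n=min(len(g), MAX_PER_LABEL), random_state=42))
    .reset_index(drop=True)
)

# Labels with fewer than 2000 articles (e.g. extremeright, left) are kept as-is.
print(balanced_df["label"].value_counts())
```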
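And a minimal usage sketch for the resulting classifier, assuming the model is published under the hub ID `ytsaig/distilroberta-mbfc-bias` (inferred from the repository name, not stated in the diff):

```python
from transformers import pipeline

# Hub ID is an assumption based on the repository name.
classifier = pipeline("text-classification", model="ytsaig/distilroberta-mbfc-bias")

preds = classifier("The senate passed the infrastructure bill on a bipartisan vote.")
# Each prediction is a dict with a bias label (one of the MBFC categories above) and a score.
print(preds)
```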