Text Classification
Transformers
PyTorch
English
deberta-v2
Inference Endpoints
ikrysinska committed on
Commit
0ca1b82
1 Parent(s): 471e9d4

Update README.md


Add evaluation description

Files changed (1)
  1. README.md +13 -27
README.md CHANGED
@@ -12,22 +12,22 @@ pipeline_tag: text-classification
12
  ### Model Description
13
 
14
  <!-- Provide a longer summary of what this model is. -->
15
- This is a deBERTa-v3-base model with an adapter trained on X tweets from [More Information Needed] finetuned for text classification. The model predicts whether a tweet supports a given conspiracy theory or not. The model was trained on tweets related to six common COVID-19 conspiracy theories.
16
 
17
- 1. **Vaccines are unsafe** The coronavirus vaccine is either unsafe or part of a larger plot to control people or reduce the population.
18
 
19
- 2. **Governments and politicians spread misinformation** Politicians or government agencies are intentionally spreading false information, or they have some other motive for the way they are responding to the coronavirus.
20
 
21
- 3. **The Chinese intentionally spread the virus** The Chinese government intentionally created or spread the coronavirus to harm other countries.
22
 
23
- 4. **Deliberate strategy to create economic instability or benefit large corporations** The coronavirus or the government's response to it is a deliberate strategy to create economic instability or to benefit large corporations over small businesses.
24
 
25
- 5. **Public intentionally misled about the true nature of the virus and prevention** The public is being intentionally misled about the true nature of the Coronavirus, its risks, or the efficacy of certain treatments or prevention methods.
26
 
27
- 6. **Human made and bioweapon** The Coronavirus was created intentionally, made by humans, or as a bioweapon.
28
 
29
 
30
- This model is suitable for English.
31
 
32
  - **Developed by:** Webimmunication Team
33
  - **Shared by [optional]:** @ikrysinska
@@ -44,10 +44,6 @@ This model is suitable for English.
44
 
45
  <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
46
 
47
- ### Direct Use
48
-
49
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
50
-
51
  [More Information Needed]
52
 
53
  ### Downstream Use [optional]
@@ -64,6 +60,7 @@ This model is suitable for English.
64
 
65
  ## Bias, Risks, and Limitations
66
 
 
67
  <!-- This section is meant to convey both technical and sociotechnical limitations. -->
68
 
69
  [More Information Needed]
@@ -90,26 +87,15 @@ Use the code below to get started with the model.
90
 
91
  ### Training Procedure
92
 
93
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
94
 
95
  #### Preprocessing [optional]
96
 
97
- - hashtags, mentions, punctuation.
98
-
99
-
100
- #### Training Hyperparameters
101
-
102
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
103
-
104
- #### Speeds, Sizes, Times [optional]
105
-
106
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
107
-
108
- [More Information Needed]
109
 
110
  ## Evaluation
111
 
112
- The model was evaluated on a sample
113
 
114
  ### Testing Data, Factors & Metrics
115
 
@@ -121,7 +107,7 @@ The model was evaluated on a sample
121
 
122
  #### Factors
123
 
124
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
125
 
126
  [More Information Needed]
127
 
 
12
  ### Model Description
13
 
14
  <!-- Provide a longer summary of what this model is. -->
15
+ This is a DeBERTa-v3-base-tasksource-nli model with an adapter trained on [More Information Needed], which contains X pairs of a tweet and a conspiracy theory along with class labels: support, denies, neutral. The model was fine-tuned for text classification to predict whether a tweet supports a given conspiracy theory. It was trained on tweets related to six common COVID-19 conspiracy theories:
16
 
17
+ 1. **Vaccines are unsafe.** The coronavirus vaccine is either unsafe or part of a larger plot to control people or reduce the population.
18
 
19
+ 2. **Governments and politicians spread misinformation.** Politicians or government agencies are intentionally spreading false information, or they have some other motive for the way they are responding to the coronavirus.
20
 
21
+ 3. **The Chinese intentionally spread the virus.** The Chinese government intentionally created or spread the coronavirus to harm other countries.
22
 
23
+ 4. **Deliberate strategy to create economic instability or benefit large corporations.** The coronavirus or the government's response to it is a deliberate strategy to create economic instability or to benefit large corporations over small businesses.
24
 
25
+ 5. **Public was intentionally misled about the true nature of the virus and prevention.** The public is being intentionally misled about the true nature of the coronavirus, its risks, or the efficacy of certain treatments or prevention methods.
26
 
27
+ 6. **Human-made and bioweapon.** The coronavirus was created intentionally, was made by humans, or was developed as a bioweapon.
28
 
29
 
30
+ This model is suitable for English only.
31
 
32
  - **Developed by:** Webimmunication Team
33
  - **Shared by [optional]:** @ikrysinska
 
44
 
45
  <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
46
 
 
 
 
 
47
  [More Information Needed]
48
 
49
  ### Downstream Use [optional]
 
60
 
61
  ## Bias, Risks, and Limitations
62
 
63
+
64
  <!-- This section is meant to convey both technical and sociotechnical limitations. -->
65
 
66
  [More Information Needed]
 
87
 
88
  ### Training Procedure
89
 
90
+ The adapter was trained for 5 epochs with a batch size of 16.
91
 
92
  #### Preprocessing [optional]
93
 
94
+ The training data was cleaned before training: all URLs, Twitter user mentions, and non-ASCII characters were removed.
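
The cleaning step described above could look roughly like the following. This is a minimal sketch, not the authors' actual preprocessing code: the regular expressions, the `clean_tweet` name, and the whitespace collapsing are all assumptions, since the README only states which kinds of tokens were removed.

```python
import re

# Assumed patterns: the README does not give the exact rules that were used.
URL_RE = re.compile(r"https?://\S+|www\.\S+")   # http(s) links and bare www links
MENTION_RE = re.compile(r"@\w+")                # Twitter user mentions such as @user
NON_ASCII_RE = re.compile(r"[^\x00-\x7F]+")     # emoji, accented letters, etc.
WHITESPACE_RE = re.compile(r"\s+")              # runs of whitespace left by removals


def clean_tweet(text: str) -> str:
    """Remove URLs, user mentions, and non-ASCII characters from a tweet."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_ASCII_RE.sub(" ", text)
    # Collapsing whitespace is an extra tidy-up step, not mentioned in the README.
    return WHITESPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    print(clean_tweet("@user Vaccines are unsafe 💉 https://t.co/abc123 read this"))
```

Applying the same cleaning to tweets at inference time would keep the input distribution close to the training data. Note that the mention pattern in this sketch also strips the domain part of e-mail addresses, which may or may not match the original behavior.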
 
 
 
 
 
 
 
 
 
 
 
95
 
96
  ## Evaluation
97
 
98
+ The model was evaluated on a sample of tweets collected during the COVID-19 pandemic. Five annotators rated every tweet against each of the six theories. Using sliding scales, they rated each tweet's likelihood of endorsing the respective conspiracy theory from 0% to 100%. The consensus among raters was substantial for every conspiracy theory (see table below).
99
 
100
  ### Testing Data, Factors & Metrics
101
 
 
107
 
108
  #### Factors
109
 
110
+ The evaluation dataset
111
 
112
  [More Information Needed]
113