---
language: en
license: cc-by-4.0
datasets:
- multi_nli
library_name: transformers
pipeline_tag: text-classification
---

# Model Card for COVID-19-CT-tweets-classification

### Model Description

<!-- Provide a longer summary of what this model is. -->
This is a DeBERTa-v3-base-tasksource-nli model with an adapter trained on [More Information Needed], a dataset of X pairs, each consisting of a tweet and a conspiracy theory, labeled with one of three classes: support, deny, or neutral. The model was fine-tuned for text classification to predict whether a tweet supports a given conspiracy theory. The training tweets relate to six common COVID-19 conspiracy theories:

1. **CT6: Vaccines are unsafe.** The coronavirus vaccine is either unsafe or part of a larger plot to control people or reduce the population. 

2. **CT4: Governments and politicians spread misinformation.** Politicians or government agencies are intentionally spreading false information, or they have some other motive for the way they are responding to the coronavirus. 

3. **CT5: The Chinese intentionally spread the virus.** The Chinese government intentionally created or spread the coronavirus to harm other countries. 

4. **CT1: Deliberate strategy to create economic instability or benefit large corporations.** The coronavirus or the government's response to it is a deliberate strategy to create economic instability or to benefit large corporations over small businesses. 

5. **CT2: Public was intentionally misled about the true nature of the virus and prevention.** The public is being intentionally misled about the true nature of the coronavirus, its risks, or the efficacy of certain treatments or prevention methods. 

6. **CT3: Human made and bioweapon.** The coronavirus was created intentionally, made by humans, or developed as a bioweapon. 


This model is suitable for English only.
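
The pairing of one tweet with each of the six theories listed above can be sketched as follows. This is only an illustration: the key names `text` and `text_pair`, the helper `build_pairs`, and the choice to use the long theory descriptions (rather than the short titles) as the second sequence are assumptions, not details confirmed by this card.

```python
# Theory descriptions copied from the list above, keyed by their CT identifiers.
THEORIES = {
    "CT1": "The coronavirus or the government's response to it is a deliberate "
           "strategy to create economic instability or to benefit large "
           "corporations over small businesses.",
    "CT2": "The public is being intentionally misled about the true nature of "
           "the Coronavirus, its risks, or the efficacy of certain treatments "
           "or prevention methods.",
    "CT3": "The Coronavirus was created intentionally, made by humans, or as a "
           "bioweapon.",
    "CT4": "Politicians or government agencies are intentionally spreading "
           "false information, or they have some other motive for the way "
           "they are responding to the coronavirus.",
    "CT5": "The Chinese government intentionally created or spread the "
           "coronavirus to harm other countries.",
    "CT6": "The coronavirus vaccine is either unsafe or part of a larger plot "
           "to control people or reduce the population.",
}


def build_pairs(tweet):
    """Return one (tweet, theory) input per conspiracy theory.

    Hypothetical helper: the tweet is used as the first sequence and the
    theory description as the second (an assumed ordering).
    """
    return [
        {"theory_id": theory_id, "text": tweet, "text_pair": description}
        for theory_id, description in sorted(THEORIES.items())
    ]


pairs = build_pairs("They are hiding the real numbers from us.")
print(len(pairs), [p["theory_id"] for p in pairs])
```

Each resulting pair would then be classified independently as support, deny, or neutral.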

- **Developed by:** Webimmunication Team
- **Shared by [optional]:** @ikrysinska
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** EN
- **License:** CC BY 4.0
- **Finetuned from model [optional]:** https://huggingface.co/sileod/deberta-v3-base-tasksource-nli 

### Model Sources

- **Paper:** [More Information Needed]

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

[More Information Needed]

### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

[More Information Needed]

### Out-of-Scope Use

- Generating or spreading tweets that support a given conspiracy theory.
- Amplifying echo chambers in social subnetworks that are susceptible to believing conspiracy theories.


<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

[More Information Needed]

## Bias, Risks, and Limitations

- Results are distorted for conspiracy theories that are not covered by the training dataset.
- Use of the model may unintentionally stifle legitimate public discourse, for example by eliminating discussion that merely resembles conspiracy theories from social subnetworks.
- Potential bias related to, for example, text style and economic status.
<!-- This section is meant to convey both technical and sociotechnical limitations. -->

[More Information Needed]

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]
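
Until official loading instructions are added, the following is a minimal sketch that scores a tweet against a theory with the base checkpoint named in this card, `sileod/deberta-v3-base-tasksource-nli`. It does not load the fine-tuned adapter, whose repository ID and adapter library are not stated here, so it corresponds to the "model without adapter" baseline. The helper names and the tweet-first input order are assumptions.

```python
def load_nli_pipeline(model_id="sileod/deberta-v3-base-tasksource-nli"):
    """Load the base NLI checkpoint (no adapter attached).

    Requires the `transformers` and `torch` packages; the import is done
    lazily so the rest of this sketch can be used without them.
    """
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify_pair(nli, tweet, theory):
    """Return (label, score) for one tweet/theory pair.

    `nli` is any callable that behaves like a text-classification pipeline.
    Assumed ordering: tweet as the first sequence, theory as the second.
    """
    result = nli({"text": tweet, "text_pair": theory})
    # Depending on the library version, a single input may come back as a
    # dict or as a one-element list; handle both.
    if isinstance(result, list):
        result = result[0]
    return result["label"], float(result["score"])
```

A call such as `classify_pair(load_nli_pipeline(), tweet, theory)` downloads the checkpoint on first use. The base model emits NLI labels; how they map onto support, deny, and neutral is not specified in this card.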

## Training Details

### Training Data

<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

[More Information Needed]

### Training Procedure 

The adapter was trained for 5 epochs with a batch size of 16. 
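
As a rough illustration of what these settings imply, the sketch below computes the number of optimizer steps for a given dataset size. The dataset size used in the example is hypothetical (the card does not state it), and the calculation assumes no gradient accumulation and that the final partial batch of each epoch is kept.

```python
import math

EPOCHS = 5        # from this card
BATCH_SIZE = 16   # from this card


def total_training_steps(num_pairs, epochs=EPOCHS, batch_size=BATCH_SIZE):
    """Number of optimizer steps, assuming one step per batch."""
    steps_per_epoch = math.ceil(num_pairs / batch_size)
    return steps_per_epoch * epochs


# Hypothetical dataset size, for illustration only.
example_steps = total_training_steps(1000)
print(example_steps)
```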

#### Preprocessing

The training data was cleaned before training: all URLs, Twitter user mentions, and non-ASCII characters were removed. 
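
A minimal sketch of such a cleaning step is shown below. The regular expressions and the final whitespace normalization are assumptions; the exact patterns used by the authors are not given in this card.

```python
import re

# Assumed patterns: http(s) links and bare www links; @handles made of word characters.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")


def clean_tweet(text):
    """Remove URLs, user mentions, and non-ASCII characters from a tweet."""
    text = URL_RE.sub("", text)
    text = MENTION_RE.sub("", text)
    # Drop every character that cannot be encoded as ASCII (emoji, accents, ...).
    text = text.encode("ascii", errors="ignore").decode("ascii")
    # Collapse the gaps left behind by the removals (an extra step, not from the card).
    return " ".join(text.split())


print(clean_tweet("Check this out @WHO https://t.co/xyz \u2013 wow!"))
```

Applying the same function to tweets at inference time keeps the inputs consistent with what the adapter saw during training.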

## Evaluation

The model was evaluated on a sample of tweets collected during the COVID-19 pandemic. Five annotators rated every tweet against each of the six theories, using sliding scales to rate each tweet's likelihood of endorsing the respective conspiracy theory from 0% to 100%. The consensus among raters was substantial for every conspiracy theory, and the model's predictions correlate substantially with the human ratings. For all six theories, the model with the fine-tuned adapter correlates more strongly with human ratings than the pre-trained model without the adapter (see table below). 


| Conspiracy theory | Correlation between human raters | Correlation between human ratings and model without adapter | Correlation between human ratings and model with fine-tuned adapter |
|---|---|---|---|
| **Vaccines are unsafe.** | 0.78 | 0.29 | 0.57 |
| **Governments and politicians spread misinformation.** | 0.58 | 0.32 | 0.72 |
| **The Chinese intentionally spread the virus.** | 0.62 | 0.53 | 0.64 |
| **Deliberate strategy to create economic instability or benefit large corporations.** | 0.56 | 0.33 | 0.54 |
| **Public was intentionally misled about the true nature of the virus and prevention.** | 0.66 | 0.37 | 0.68 |
| **Human made and bioweapon.** | 0.67 | 0.15 | 0.78 |
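
The comparison between human ratings and model scores can be sketched as follows. The card does not state which correlation coefficient was used or how the five annotators' ratings were aggregated; this example assumes Pearson's r on the mean rating per tweet, and all numbers in it are made-up illustrations rather than evaluation data.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical data: one row per tweet, five annotator ratings (0-100) each.
human_ratings = [
    [80, 90, 70, 85, 75],
    [10, 5, 20, 15, 0],
    [50, 40, 60, 55, 45],
]
# Hypothetical model scores for the same tweets (e.g. probability of "support").
model_scores = [0.9, 0.1, 0.4]

# Assumed aggregation: average the five ratings per tweet.
consensus = [mean(row) for row in human_ratings]
r = pearson(consensus, model_scores)
print(round(r, 2))
```

In this setup, the correlation would be computed separately for each conspiracy theory, giving one value per table row.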



## Environmental Impact

Carbon emissions are estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** GPU Tesla V100
- **Hours used:** 40
- **Cloud Provider:** Google Cloud Platform
- **Compute Region:** us-east1
- **Carbon Emitted:** 4.44 kg CO2 eq ([equivalent to 17.9 km driven by an average ICE car, 2.22 kg of coal burned, or 0.07 tree seedlings sequestering carbon for 10 years](https://www.epa.gov/energy/greenhouse-gases-equivalencies-calculator-calculations-and-references))
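
The reported figure is consistent with the usual estimate of power draw times runtime times grid carbon intensity. The power draw (300 W) and carbon intensity (0.37 kg CO2 eq per kWh) below are assumptions chosen because they reproduce the stated 4.44 kg; they are not values given in this card.

```python
GPU_POWER_KW = 0.300      # assumed power draw of one Tesla V100, in kW
HOURS_USED = 40           # from this card
CARBON_INTENSITY = 0.37   # assumed kg CO2 eq per kWh for the compute region

# Energy consumed, then emissions.
energy_kwh = GPU_POWER_KW * HOURS_USED
emissions_kg = energy_kwh * CARBON_INTENSITY

print(f"{energy_kwh:.1f} kWh -> {emissions_kg:.2f} kg CO2 eq")
```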


## Citation [optional]

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

[More Information Needed]


## Model Card Authors

@ikrysinska, @wtomi

## Model Card Contact

[email protected]

[email protected]

[email protected]