rabuahmad/cc-tweets-classifier-de

Text Classification


rabuahmad committed on Sep 23, 2024

Commit 3d84b26 · verified · 1 Parent(s): f63b619

Update README.md

Files changed (1)

README.md +67 -3


@@ -1,3 +1,67 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+language:
+- de
+base_model:
+- dbmdz/bert-base-german-uncased
+pipeline_tag: text-classification
+---
+## Social Media Style Classifier for Climate Change Text (German)
+This model is `dbmdz/bert-base-german-uncased` fine-tuned on a binary classification task: determining whether a German text about climate change is written in a social media style.
+Social media texts were gathered from [GerCCT](https://github.com/RobinSchaefer/GerCCT) and [r/Klimawandel](https://www.reddit.com/r/Klimawandel/).
+Non-social-media texts were gathered by splitting 15 Wikipedia articles into sentences:
+1. [Klimawandel](https://de.wikipedia.org/wiki/Klimawandel)
+2. [Globale Erwärmung](https://de.wikipedia.org/wiki/Globale_Erw%C3%A4rmung)
+3. [Forschungsgeschichte des Klimawandels](https://de.wikipedia.org/wiki/Forschungsgeschichte_des_Klimawandels)
+4. [Klimahysterie](https://de.wikipedia.org/wiki/Klimahysterie)
+5. [Klimawandelleugnung](https://de.wikipedia.org/wiki/Klimawandelleugnung)
+6. [Folgen der globalen Erwärmung in der Arktis](https://de.wikipedia.org/wiki/Folgen_der_globalen_Erw%C3%A4rmung_in_der_Arktis)
+7. [Folgen der globalen Erwärmung](https://de.wikipedia.org/wiki/Folgen_der_globalen_Erw%C3%A4rmung)
+8. [Klimamodell](https://de.wikipedia.org/wiki/Klimamodell)
+9. [Anpassung an die globale Erwärmung](https://de.wikipedia.org/wiki/Anpassung_an_die_globale_Erw%C3%A4rmung)
+10. [Kontroverse um die globale Erwärmung](https://de.wikipedia.org/wiki/Kontroverse_um_die_globale_Erw%C3%A4rmung)
+11. [UN-Klimakonferenz in Dubai 2023](https://de.wikipedia.org/wiki/UN-Klimakonferenz_in_Dubai_2023)
+12. [Umweltbewegung](https://de.wikipedia.org/wiki/Umweltbewegung#Klimaschutz)
+13. [Treibhausgas](https://de.wikipedia.org/wiki/Treibhausgas)
+14. [Treibhauseffekt](https://de.wikipedia.org/wiki/Treibhauseffekt)
+15. [Klimaschutz](https://de.wikipedia.org/wiki/Klimaschutz)
+The dataset contained about 8K instances, split 50/50 between the two classes. It was shuffled with a random seed of 42 and split 80/20 into training and test sets.
+The model was trained for three epochs with a batch size of 8 on a V100-16GB GPU. All other hyperparameters were the default values of the Hugging Face Trainer.
+The model was trained to evaluate a text style transfer task that converts formal-language texts to tweets.
+### How to use
+```python
+from transformers import AutoModelForSequenceClassification, AutoTokenizer, TextClassificationPipeline
+model_name = "rabuahmad/cc-tweets-classifier-de"
+model = AutoModelForSequenceClassification.from_pretrained(model_name)
+tokenizer = AutoTokenizer.from_pretrained(model_name, model_max_length=512)  # BERT's input limit is 512 tokens
+classifier = TextClassificationPipeline(model=model, tokenizer=tokenizer, truncation=True, max_length=512)
+text = "Gestern war ein schöner Tag!"
+result = classifier(text)  # a list with one dict containing "label" and "score"
+```
+Label 1 indicates that the text is predicted to be written in a social media (tweet) style; label 0 indicates that it is not.
+### Evaluation
+Evaluation results on the test set:
+| Metric    | Score   |
+|-----------|---------|
+| Accuracy  | 0.96494 |
+| Precision | 0.97552 |
+| Recall    | 0.95564 |
+| F1        | 0.96547 |
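
The scores in the evaluation table above can be cross-checked, since F1 is the harmonic mean of precision and recall. The following is a minimal sketch that recomputes F1 from the reported precision and recall; the helper name `f1_from_precision_recall` is illustrative and not part of the model repository, and the tolerance is an assumption that allows for the five-decimal rounding of the table.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Scores copied from the evaluation table in the README.
precision, recall, reported_f1 = 0.97552, 0.95564, 0.96547

recomputed_f1 = f1_from_precision_recall(precision, recall)

# The table values are rounded, so compare with a small tolerance (assumed: 1e-4).
assert abs(recomputed_f1 - reported_f1) < 1e-4
print(f"recomputed F1 = {recomputed_f1:.5f}, reported F1 = {reported_f1:.5f}")
```

The recomputed value agrees with the reported F1 within rounding, so the three scores in the table are mutually consistent.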