pchatz/palobert-base-greek-social-media-sentiment-v2

Text Classification

pchatz committed on Apr 6, 2023

Commit c0c9452 · 1 Parent(s): e584c07

Create README.md

Files changed (1)

README.md +66 -0

README.md ADDED

	@@ -0,0 +1,66 @@

+---
+language:
+- el
+pipeline_tag: text-classification
+---
+# PaloBERT for Sentiment Analysis
+A Greek [RoBERTa](https://arxiv.org/abs/1907.11692)-based model ([PaloBERT](https://huggingface.co/pchatz/greeksocialbert-base-greek-social-media)) fine-tuned for sentiment analysis.
+## Training data
+The model is pre-trained on a corpus of 458,293 documents collected from Greek social media (Twitter, Instagram, Facebook and YouTube). A RoBERTa tokenizer trained from scratch on the same corpus is also included. Fine-tuning uses a dataset of ~60,000 annotated documents, also collected from Greek social media.
+Both the corpus and the annotated dataset were provided by [Palo LTD](http://www.paloservices.com/).
+## Requirements
+```
+pip install transformers
+pip install torch
+```
+## Pre-processing details
+Before passing text to 'palobert-base-greek-social-media-sentiment', pre-process it as follows:
+* remove all Greek diacritics
+* convert to lowercase
+* remove all punctuation
+```python
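+import re
+import unicodedata
+
+def preprocess(text, default_replace=""):
+    # Lowercase first, then decompose (NFD) so accents become separate combining marks.
+    text = text.lower()
+    text = unicodedata.normalize('NFD', text).translate({ord('\N{COMBINING ACUTE ACCENT}'): None})
+    # Replace every character that is neither a word character nor whitespace.
+    text = re.sub(r'[^\w\s]', default_replace, text)
+    return text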
+```
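To illustrate the three steps, here is a small self-contained sketch that repeats the `preprocess` function and applies it to a few sample strings. The sample inputs are made up for illustration and are not taken from the training data. Note that `translate` only removes the acute accent; other combining marks, such as the dialytika, are not word characters for Python's `re` module and are therefore caught by the punctuation regex instead. With the default `default_replace=""` the result is the same, but a non-empty replacement would end up inside the word.

```python
import re
import unicodedata


def preprocess(text, default_replace=""):
    # Same logic as the function above, repeated so this sketch runs on its own.
    text = text.lower()
    text = unicodedata.normalize("NFD", text).translate(
        {ord("\N{COMBINING ACUTE ACCENT}"): None}
    )
    return re.sub(r"[^\w\s]", default_replace, text)


# Hypothetical sample inputs, chosen only to exercise each step.
greeting = preprocess("Καλημέρα, τι κάνεις;")        # "καλημερα τι κανεις"
product = preprocess("προϊόν")                        # "προιον"
# The decomposed dialytika is replaced by the space, splitting the word.
split_word = preprocess("προϊόν", default_replace=" ")  # "προι ον"
```

Keeping `default_replace` at its default avoids the split shown in the last line.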
+## Load Model
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+tokenizer = AutoTokenizer.from_pretrained("pchatz/palobert-base-greek-social-media-sentiment")
+model = AutoModelForSequenceClassification.from_pretrained("pchatz/palobert-base-greek-social-media-sentiment")
+```
+The fine-tuned model can also be used directly with a `text-classification` pipeline.
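As a sketch of end-to-end inference, the pre-processing step can be combined with a `text-classification` pipeline from `transformers`. The `classify` helper and its `model_id` default are assumptions for illustration: the identifier is copied from the loading snippet, and the label names returned depend on the checkpoint's configuration, which is not documented here.

```python
import re
import unicodedata


def preprocess(text, default_replace=""):
    # Lowercase, strip the acute accent, then drop non-word characters.
    text = text.lower()
    text = unicodedata.normalize("NFD", text).translate(
        {ord("\N{COMBINING ACUTE ACCENT}"): None}
    )
    return re.sub(r"[^\w\s]", default_replace, text)


def classify(texts, model_id="pchatz/palobert-base-greek-social-media-sentiment"):
    """Hypothetical helper: pre-process each text and run the sentiment pipeline."""
    # Imported lazily so preprocess() stays usable without transformers installed.
    from transformers import pipeline

    # Assumption: the checkpoint ships a sequence-classification head and tokenizer.
    classifier = pipeline("text-classification", model=model_id, tokenizer=model_id)
    return classifier([preprocess(t) for t in texts])
```

Calling `classify(["Πολύ καλή εξυπηρέτηση!"])` downloads the checkpoint on first use and returns a list with one dictionary per input, each holding a `label` and a `score`.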
+## Evaluation
+For detailed results, refer to the thesis ['Ανάλυση συναισθήματος κειμένου στα Ελληνικά με χρήση Δικτύων Μετασχηματιστών'](http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18623) ("Sentiment Analysis of Greek Text Using Transformer Networks") (version - p2)
+## Authors
+Pavlina Chatziantoniou, Georgios Alexandridis and Athanasios Voulodimos
+## Citation info
+http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18623