Update README.md
README.md
CHANGED
@@ -5,7 +5,7 @@ pipeline_tag: text-classification
---
# PaloBERT for Sentiment Analysis

-A greek [RoBERTa](https://arxiv.org/abs/1907.11692) based model ([PaloBERT](https://huggingface.co/pchatz/

## Training data

@@ -13,7 +13,6 @@ The model is pre-trained on a corpus of 458,293 documents collected from greek s

The corpus as well as the annotated dataset have been provided by [Palo LTD](http://www.paloservices.com/).

-
## Requirements

```
@@ -41,6 +40,35 @@ def preprocess(text, default_replace=""):
    return text
```


## Evaluation

---
# PaloBERT for Sentiment Analysis

A Greek [RoBERTa](https://arxiv.org/abs/1907.11692)-based model ([PaloBERT](https://huggingface.co/pchatz/palobert-base-greek-social-media)) fine-tuned for sentiment analysis.

## Training data


The corpus as well as the annotated dataset have been provided by [Palo LTD](http://www.paloservices.com/).

## Requirements

```
    return text
```

## Load Model

```python
from transformers import AutoTokenizer, AutoModel

# Load the PaloBERT tokenizer and the pre-trained language model
tokenizer = AutoTokenizer.from_pretrained("pchatz/palobert-base-greek-social-media")
language_model = AutoModel.from_pretrained("pchatz/palobert-base-greek-social-media")
```
Refer to the [GitHub code](https://github.com/Paulinechatz/sentiment-analysis-greek-social-media/blob/main/code/train_classifier_roberta_arch.py#L100) for details on the model class architecture.
```python
import torch

# Load the fine-tuned model as SentimentClassifier_v2 (TheModelClass and PATH are placeholders)
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()  # switch to inference mode
```
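
The checkpoint can only be loaded into the exact class defined in the linked repository. For orientation only, the following is a minimal sketch of how a classification head is commonly placed on top of a RoBERTa-style encoder. `SentimentClassifierSketch` and `TinyEncoder` are hypothetical names, the layer layout (first-token pooling, dropout, one linear layer) is an assumption rather than the author's `SentimentClassifier_v2`, and the stand-in encoder exists only so the snippet runs without downloading weights.

```python
from types import SimpleNamespace

import torch
from torch import nn


class SentimentClassifierSketch(nn.Module):
    """Hypothetical classification head on top of an encoder (not the author's class)."""

    def __init__(self, encoder: nn.Module, hidden_size: int,
                 n_classes: int = 3, dropout: float = 0.1):
        super().__init__()
        self.encoder = encoder
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(hidden_size, n_classes)

    def forward(self, input_ids, attention_mask):
        outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        # Assumption: use the hidden state of the first token as the sentence representation.
        first_token_state = outputs.last_hidden_state[:, 0, :]
        # Returns unnormalised scores of shape (batch, n_classes).
        return self.classifier(self.dropout(first_token_state))


class TinyEncoder(nn.Module):
    """Stand-in encoder so the sketch runs offline; swap in `language_model` in practice."""

    def __init__(self, vocab_size: int = 100, hidden_size: int = 16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)

    def forward(self, input_ids, attention_mask=None):
        # Mimic the `last_hidden_state` attribute of a transformers model output.
        return SimpleNamespace(last_hidden_state=self.embed(input_ids))


model = SentimentClassifierSketch(TinyEncoder(), hidden_size=16)
model.eval()
with torch.no_grad():
    logits = model(torch.tensor([[1, 2, 3, 4]]), torch.ones(1, 4, dtype=torch.long))
print(logits.shape)  # torch.Size([1, 3])
```

With the real encoder, the same wrapper would be built as `SentimentClassifierSketch(language_model, language_model.config.hidden_size)`. The saved `state_dict` will still only load into the class it was trained with, so use the repository's class for actual predictions.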
You can use this sentiment analysis model directly on raw text:
```python
# Example
import torch

class_names = {0: 'neutral', 1: 'positive', 2: 'negative'}
text = 'οι εξετασεις ηταν πολυ καλες'  # "the exams were very good"
encoding = tokenizer(text, return_tensors='pt')

input_ids = encoding['input_ids']
attention_mask = encoding['attention_mask']

with torch.no_grad():  # no gradients needed for inference
    output = model(input_ids, attention_mask)
_, prediction = torch.max(output, dim=1)

print(f'sentiment : {class_names[prediction.item()]}')  # positive
```
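
To report a confidence next to the label, the per-class scores can be normalised with a softmax. This is a minimal sketch in plain Python, assuming the classifier returns one unnormalised score (logit) per class. The helpers `softmax` and `decode` and the example scores are illustrative and not part of the repository; the label mapping is the one from the example.

```python
import math

CLASS_NAMES = {0: "neutral", 1: "positive", 2: "negative"}


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[best], probs[best]


example_logits = [0.2, 2.5, -1.0]  # made-up scores for illustration
label, confidence = decode(example_logits)
print(f"sentiment : {label} ({confidence:.2f})")
```

With the real model, the scores for a single input could be taken from the output tensor with `output[0].tolist()`.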

## Evaluation
