lucas-leme committed
Commit 4d81309
Parent(s): c8c3087
Upload README.md

README.md CHANGED
````diff
@@ -11,9 +11,9 @@ widget:
   example_title: "Example 3"
 ---
 
-# 
+# FinBERT-PT-BR : Financial BERT PT BR
 
-
+FinBERT-PT-BR is a pre-trained NLP model to analyze sentiment of Brazilian Portuguese financial texts.
 
 The model was trained in two main stages: language modeling and sentiment modeling. In the first stage, a language model was trained with more than 1.4 million texts of financial news in Portuguese.
 From this first training, it was possible to build a sentiment classifier with few labeled texts (500) that presented a satisfactory convergence.
@@ -33,21 +33,30 @@ Among the applications, it was demonstrated that the model can be used to build
 ![Inflation Analysis](sentiment_inflation.png)
 
 ## Usage
+
+In order to use the model, you need to get the HuggingFace auth token. You can get it [here](https://huggingface.co/settings/token).
+
 ```python
 from transformers import AutoTokenizer, BertForSequenceClassification
 import numpy as np
 
-pred_mapper = {
+pred_mapper = {
+    0: "POSITIVE",
+    1: "NEGATIVE",
+    2: "NEUTRAL"
+}
+
+huggingface_auth_token = 'AUTH_TOKEN'
 
-tokenizer = AutoTokenizer.from_pretrained("lucas-leme/FinBERT-PT-BR")
-finbertptbr = BertForSequenceClassification.from_pretrained("lucas-leme/FinBERT-PT-BR")
+tokenizer = AutoTokenizer.from_pretrained("lucas-leme/FinBERT-PT-BR", use_auth_token=huggingface_auth_token)
+finbertptbr = BertForSequenceClassification.from_pretrained("lucas-leme/FinBERT-PT-BR", use_auth_token=huggingface_auth_token)
 
 tokens = tokenizer(["Hoje a bolsa caiu", "Hoje a bolsa subiu"], return_tensors="pt",
                    padding=True, truncation=True, max_length=512)
 finbertptbr_outputs = finbertptbr(**tokens)
+
 preds = [pred_mapper[np.argmax(pred)] for pred in finbertptbr_outputs.logits.cpu().detach().numpy()]
 ```
-
 ## Author
 
 - [Lucas Leme](https://www.linkedin.com/in/lucas-leme-santos/)
````
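The `pred_mapper` dictionary added in this commit turns the index of the largest logit into a sentiment label. As a quick sanity check that does not require downloading the model or an auth token, the argmax-to-label step can be exercised on mock logits (the numeric values below are invented for illustration; they stand in for `finbertptbr_outputs.logits.cpu().detach().numpy()`):

```python
import numpy as np

# Label mapping introduced in the updated README snippet
pred_mapper = {0: "POSITIVE", 1: "NEGATIVE", 2: "NEUTRAL"}

# Mock logits standing in for the model's output on two sentences;
# the values are made up for illustration only.
mock_logits = np.array([
    [0.1, 2.3, 0.4],  # largest score at index 1 -> NEGATIVE
    [3.0, 0.2, 0.5],  # largest score at index 0 -> POSITIVE
])

preds = [pred_mapper[int(np.argmax(row))] for row in mock_logits]
print(preds)  # ['NEGATIVE', 'POSITIVE']
```

This mirrors the list comprehension in the README: each row of logits is reduced with `np.argmax` and looked up in `pred_mapper`, so swapping the mock array for real model outputs yields the same kind of label list.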