README.md · cappuch/Emotions_BERT at 8de7ea8ab546b4bc8ab8fff1e76a1c5338ff9718

metadata

language: en
tags:
  - text-classification
  - tensorflow
  - roberta
datasets:
  - go_emotions
license: mit

Connect me on LinkedIn

linkedin.com/in/arpanghoshal

What is GoEmotions

Dataset labelled 58000 Reddit comments with 28 emotions

admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise + neutral

What is RoBERTa

RoBERTa builds on BERT’s language masking strategy and modifies key hyperparameters in BERT, including removing BERT’s next-sentence pretraining objective, and training with much larger mini-batches and learning rates. RoBERTa was also trained on an order of magnitude more data than BERT, for a longer amount of time. This allows RoBERTa representations to generalize even better to downstream tasks compared to BERT.

Hyperparameters

Parameter
Learning rate	5e-5
Epochs	10
Max Seq Length	50
Batch size	16
Warmup Proportion	0.1
Epsilon	1e-8

Results

Best Result of Macro F1 - 49.30%

Usage


from transformers import RobertaTokenizerFast, TFRobertaForSequenceClassification, pipeline

tokenizer = RobertaTokenizerFast.from_pretrained("arpanghoshal/EmoRoBERTa")
model = TFRobertaForSequenceClassification.from_pretrained("arpanghoshal/EmoRoBERTa")

emotion = pipeline('sentiment-analysis', 
                    model='arpanghoshal/EmoRoBERTa')

emotion_labels = emotion("Thanks for using it.")
print(emotion_labels)

Output

[{'label': 'gratitude', 'score': 0.9964383244514465}]