BYRD'S I - ROBERTA BASED TWEET/REVIEW/TEXT ANALYSIS

This is a roBERTa-base model fine-tuned on 8 datasets with ~20M tweets. It is suited to English, though it can do a reasonable job on other languages.

Git Repo: SENTIMENTANALYSIS-PROJECT

Demo: BYRD'S I

Labels: 0 -> Negative; 1 -> Neutral; 2 -> Positive

Model Metrics
Accuracy: ~96%
Sparse Categorical Accuracy: 0.9597
Loss: 0.1144
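Validation loss (val_loss, on last train): 0.1482

Note: Because of discrepancies in the Neutral data across datasets, we published another model, the Byrd's I positive/negative-only model, to isolate neutral data. We combine the two models with the "AdaBoot" method shown in the example below, which averages the logits of the two models, to get a more accurate output.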
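
Judging from the example code in this card, the combination step is not AdaBoost in the usual sense: it resets very low neutral logits from the first model to 0 and then averages the two models' logits. A minimal vectorized sketch of that step follows. The function name `combine_logits`, its parameters, and the sample logits are illustrative assumptions, not part of the published code.

```python
import numpy as np

def combine_logits(logits_a, logits_b, neutral_index=1, floor=-7.0):
    """Average two (batch, 3) logit arrays after resetting very low
    neutral logits in the first array to 0."""
    a = np.array(logits_a, dtype=float)   # copy, so the caller's array is untouched
    b = np.asarray(logits_b, dtype=float)
    neutral = a[:, neutral_index]
    # Where the first model's neutral logit is below the floor, use 0 instead.
    a[:, neutral_index] = np.where(neutral < floor, 0.0, neutral)
    return (a + b) / 2.0

# Made-up logits for two texts; columns are [negative, neutral, positive].
first = np.array([[-2.0, -8.0, 4.0], [1.0, 0.5, -1.0]])
second = np.array([[-1.0, 1.0, 3.0], [2.0, 0.5, -3.0]])

combined = combine_logits(first, second)
print(combined)
print(combined.argmax(axis=1))  # label ids per row
```

The threshold of -7 and the reset value of 0 are taken from the example code; the card does not explain how they were chosen.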

Example of Classification:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import numpy as np
# TensorFlow must be installed because the weights are loaded with from_tf=True below.
import tensorflow

# model 0
tokenizer = AutoTokenizer.from_pretrained("AK776161/birdseye_roberta-base-18", use_fast=True)
model = AutoModelForSequenceClassification.from_pretrained("AK776161/birdseye_roberta-base-18", from_tf=True)
# model 1
tokenizer1 = AutoTokenizer.from_pretrained("AK776161/birdseye_roberta-base-tweet-eval", use_fast=True)
model1 = AutoModelForSequenceClassification.from_pretrained("AK776161/birdseye_roberta-base-tweet-eval", from_tf=True)

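#-----------------------"AdaBoot" technique: average the logits of both models---------------------------
def nparraymeancalc(arr1, arr2):
  # Work on a copy so the caller's logits are not modified in place.
  arr1 = np.array(arr1, copy=True)
  returner = []
  for i in range(len(arr1)):
    # If model 0's neutral logit (index 1) is very low, reset it to 0 before averaging.
    if arr1[i][1] < -7:
      arr1[i][1] = 0
    # Element-wise mean of the two models' logits for this input.
    returner.append(np.mean([arr1[i], arr2[i]], axis=0))

  return np.array(returner)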

def predictions(tokenizedtext):
  # Run both models on the same tokenized batch.
  output1 = model(**tokenizedtext)
  output2 = model1(**tokenizedtext)

  # Convert the logits from torch tensors to NumPy arrays.
  logits1 = output1.logits.detach().numpy()
  logits2 = output2.logits.detach().numpy()

  # Combine the two sets of logits into one array of shape (batch, 3).
  return nparraymeancalc(logits1, logits2)

def labelassign(predictionresult):
  labels = []
  for i in predictionresult:
    label_id = i.argmax()
    labels.append(label_id)
  return labels

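# Replace the placeholder with your own text (a list of strings also works).
tokenizeddata = tokenizer("----YOUR_TEXT---", return_tensors='pt', padding=True, truncation=True)
result = predictions(tokenizeddata)

# Prints a list of label ids, one per input text.
print(labelassign(result))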

Output for "I LOVE YOU":

1) Positive: 0.994
2) Negative: 0.000
3) Neutral: 0.006
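
The example code prints integer label ids, while the scores listed here look like per-class probabilities. The card does not say how these numbers were produced. One way to get comparable scores, assuming a softmax over the combined logits is acceptable for your use, is sketched below. The names `ID2LABEL`, `softmax`, and `describe` and the sample logits are illustrative assumptions.

```python
import numpy as np

# Label mapping as stated in this card.
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def describe(logits):
    """Return one {label name: probability} dict per row of logits."""
    probs = softmax(logits)
    return [
        {ID2LABEL[i]: round(float(p), 3) for i, p in enumerate(row)}
        for row in probs
    ]

# Made-up combined logits for one text; in practice pass the array
# returned by predictions(...) from the example above.
sample_logits = np.array([[-1.5, 0.5, 3.5]])
scores = describe(sample_logits)
print(scores)
```

Because the combination step resets some logits before averaging, the resulting values are best read as relative scores rather than calibrated probabilities.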
