BERT-Based Sentiment Analysis Models

Model Description

This repository contains two versions of BERT-based models fine-tuned for sentiment analysis tasks:

  • BERT-1: Fine-tuned on the IMDB movie reviews dataset.
  • BERT-2: Fine-tuned on a combined dataset of IMDB movie reviews and Twitter comments.

Both models are based on the bert-base-uncased pre-trained model from Hugging Face's Transformers library.

Intended Use

These models are intended for binary sentiment analysis of English text data. They can be used to classify text into positive or negative sentiment categories.

Loading the Models

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# A Hub repo id has the form "namespace/name", so a third path segment is rejected.
# Each checkpoint is assumed to sit in its own subfolder of the repository.
repo_id = "verneylmavt/bert-base-uncased_sentiment-analysis"

# Load BERT-1
tokenizer_bert1 = AutoTokenizer.from_pretrained(repo_id, subfolder="bert-1")
model_bert1 = AutoModelForSequenceClassification.from_pretrained(repo_id, subfolder="bert-1")

# Load BERT-2
tokenizer_bert2 = AutoTokenizer.from_pretrained(repo_id, subfolder="bert-2")
model_bert2 = AutoModelForSequenceClassification.from_pretrained(repo_id, subfolder="bert-2")

Performing Sentiment Analysis

from transformers import pipeline

# Initialize pipelines
sentiment_pipeline_bert1 = pipeline("sentiment-analysis", model=model_bert1, tokenizer=tokenizer_bert1)
sentiment_pipeline_bert2 = pipeline("sentiment-analysis", model=model_bert2, tokenizer=tokenizer_bert2)

# Sample text
text = "I absolutely loved this product! It exceeded my expectations."

# Get predictions
result_bert1 = sentiment_pipeline_bert1(text)
result_bert2 = sentiment_pipeline_bert2(text)

print("BERT-1 Prediction:", result_bert1)
print("BERT-2 Prediction:", result_bert2)
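
Each pipeline call returns a list with one dictionary per input, holding a "label" and a "score". If the checkpoints do not define custom label names, Transformers falls back to generic names such as LABEL_0 and LABEL_1. The sketch below shows one way to turn such output into readable sentiment names. The mapping of LABEL_1 to positive is an assumption (it follows the usual IMDB convention) and should be checked against model.config.id2label; the helper name readable and the sample input are hypothetical.

```python
# Assumed mapping; verify against model.config.id2label before relying on it.
ASSUMED_LABELS = {"LABEL_0": "negative", "LABEL_1": "positive"}


def readable(predictions: list[dict]) -> list[tuple[str, float]]:
    """Convert pipeline-style predictions into (sentiment, score) pairs."""
    return [
        # Unknown label names are passed through unchanged.
        (ASSUMED_LABELS.get(p["label"], p["label"]), p["score"])
        for p in predictions
    ]


# Hand-written stand-in for a pipeline result, not real model output.
example = [{"label": "LABEL_1", "score": 0.9}]
print(readable(example))
```

With the real pipelines, the same helper could be applied as readable(result_bert1).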

Training Details

BERT-1

  • Dataset: IMDB Movie Reviews Dataset
  • Objective: Binary sentiment classification (positive/negative)
  • Optimizer: AdamW with a learning rate lr (value unspecified)
  • Scheduler: Linear scheduler with warmup (get_linear_schedule_with_warmup)
  • Epochs: num_epochs = 3
  • Device: Trained on GPU if available
  • Metrics Monitored: Training loss, training accuracy, testing accuracy per epoch
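
The linear scheduler with warmup named above scales the base learning rate by a factor that rises linearly from 0 to 1 during warmup and then falls linearly back to 0 by the last training step. The following dependency-free sketch reproduces that shape so the schedule can be inspected without a training run. The base learning rate, step counts, and the function name linear_warmup_decay are illustrative assumptions; the model card does not state the values actually used.

```python
def linear_warmup_decay(step: int, warmup_steps: int, total_steps: int) -> float:
    """Multiplier applied to the base learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to 1.
        return step / max(1, warmup_steps)
    # Decay: fall linearly from 1 to 0 at total_steps, never below 0.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


# Hypothetical values for illustration only.
base_lr = 2e-5
total_steps = 1000
warmup_steps = 100

for step in (0, 50, 100, 550, 1000):
    print(step, base_lr * linear_warmup_decay(step, warmup_steps, total_steps))
```

In a PyTorch training loop the same shape is obtained by passing the optimizer, num_warmup_steps, and num_training_steps to get_linear_schedule_with_warmup and calling scheduler.step() after each optimizer step.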

BERT-2

  • Dataset: IMDB movie reviews combined with Twitter comments
  • Objective: Binary sentiment classification (positive/negative)
  • Optimizer: AdamW with weight decay 0.01, applied only to parameters that require gradients
  • Scheduler: Linear scheduler with warmup (10% of total steps)
  • Gradient Clipping: Applied with max_norm=1.0
  • Early Stopping: Implemented with a patience of 2 epochs without improvement in validation loss
  • Epochs: num_epochs = 3; training may end sooner if early stopping triggers
  • Device: Trained on GPU if available
  • Metrics Monitored: Training loss, training accuracy, validation loss, validation accuracy per epoch
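
Two of the settings listed above, early stopping with a patience of 2 and gradient clipping with max_norm=1.0, can be illustrated without any deep learning framework. The sketch below is a simplified stand-in, not the author's training code: the class EarlyStopping, the function clip_by_global_norm, and the validation losses are all hypothetical. In PyTorch, clipping is commonly done with torch.nn.utils.clip_grad_norm_; the model card does not say which call was used.

```python
import math


class EarlyStopping:
    """Signals a stop after `patience` consecutive epochs without a lower validation loss."""

    def __init__(self, patience: int = 2) -> None:
        self.patience = patience
        self.best_loss = math.inf
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def clip_by_global_norm(grads: list[float], max_norm: float = 1.0) -> list[float]:
    """Rescale gradients so their L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


# Made-up validation losses for illustration only.
stopper = EarlyStopping(patience=2)
stopped_at = None
for epoch, val_loss in enumerate([0.40, 0.35, 0.36, 0.37, 0.30], start=1):
    if stopper.step(val_loss):
        stopped_at = epoch
        break

print("stopped at epoch:", stopped_at)
print("clipped:", clip_by_global_norm([3.0, 4.0], max_norm=1.0))
```

The loss list is longer than the three epochs listed above only so that the patience counter has room to reach its limit; with these made-up numbers the loop stops before the final entry is seen.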

Limitations and Biases

  • Data Bias: The models are trained on specific datasets, which may contain inherent biases such as demographic or cultural biases.
  • Language Support: The models support English text only.
  • Generalization: Performance may degrade on text significantly different from the training data (e.g., slang, jargon).
  • Ethical Considerations: Users should be cautious of potential biases in predictions and should not use the model for critical decisions without human oversight.

License

The models are distributed under the same license as the original bert-base-uncased model (Apache License 2.0).

Acknowledgements


Disclaimer: The models are provided "as is" without warranty of any kind. The author is not responsible for any outcomes resulting from the use of these models.
