DistilBERT Base Model for Lithuanian Reviews Sentiment Analysis

Overview

This repository contains a fine-tuned version of the distilbert/distilbert-base-multilingual-cased model for sentiment classification. It was trained on Lithuanian internet reviews from various domains as part of a master's degree research project on the topic "Sentiment Analysis of Lithuanian Online Reviews Using Deep Language Models".

DistilBERT is a smaller, faster version of BERT that retains 97% of BERT’s language understanding while being 60% faster and 40% smaller. The base multilingual DistilBERT model was pre-trained on Wikipedia in 104 languages, including Lithuanian. Because the model is case-sensitive, it distinguishes between 'labai nepatiko' and 'LABAI nepatiko'. For more architectural details, refer to the distilbert/distilbert-base-multilingual-cased model description.

Model Details

Model Description

  • Developed by: Brigita Vileikytė
  • Model type: Transformer-based language model
  • Language(s) (NLP): fine-tuned for Lithuanian, pre-trained on 104 languages
  • License: Apache 2.0
  • Finetuned from model: distilbert/distilbert-base-multilingual-cased

Bias, Risks, and Limitations

While the fine-tuned DistilBERT model shows promising results in classifying sentiments from Lithuanian reviews, it is important to be aware of potential biases and limitations:

Dataset Bias

  1. Imbalance in Sentiment Distribution: The dataset contains more positive reviews than negative or neutral ones. This imbalance can lead the model to perform better on positive sentiments and less accurately on neutral or negative ones.
  2. Source Bias: Reviews were collected from specific sources (Pigu.lt, Atsiliepimai.lt, Google Maps). These sources may not represent the full spectrum of sentiments expressed across all Lithuanian internet domains.

Practical Considerations

  1. Interpretation of Sentiments: Sentiments are subjective, and the model's classification might not always align with human judgment. Users should consider the model's predictions as one of several tools for sentiment analysis.
  2. Updates and Maintenance: The model's performance may degrade as language usage evolves. Regular updates and retraining with new data can help maintain accuracy.

Training Details

Training Data

The dataset for fine-tuning the model was collected from three sources:

  1. Pigu.lt - 5993 reviews
  2. Atsiliepimai.lt - 3212 reviews
  3. Google Maps - 122795 reviews

The reviews were classified into five categories based on a 5-star rating system:

  • 5 stars: Emotionally positive sentiment (Category 4)
  • 4 stars: Rationally positive sentiment (Category 3)
  • 3 stars: Neutral sentiment (Category 2)
  • 2 stars: Rationally negative sentiment (Category 1)
  • 1 star: Emotionally negative sentiment (Category 0)
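The mapping above amounts to subtracting one from the star rating. A minimal sketch of that labeling step follows; the helper name `star_to_category` and the `CATEGORY_NAMES` list are hypothetical and are not taken from the original training code.

```python
# Category names as listed above, indexed by category id (0-4).
CATEGORY_NAMES = [
    "Emotionally negative",
    "Rationally negative",
    "Neutral",
    "Rationally positive",
    "Emotionally positive",
]


def star_to_category(stars: int) -> int:
    """Map a 1-5 star rating to a category id 0-4 (hypothetical helper)."""
    if not 1 <= stars <= 5:
        raise ValueError(f"star rating must be between 1 and 5, got {stars}")
    return stars - 1


# Print the full mapping: stars, category id, category name.
for stars in range(1, 6):
    category = star_to_category(stars)
    print(stars, category, CATEGORY_NAMES[category])
```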

Evaluation

Performance Metrics

Model        Accuracy   F1 Score (Overall)   F1 Scores by Category
DistilBERT   0.6845     0.6751               0.7601, 0.3556, 0.4938, 0.4513, 0.8354

Results

The model's performance was evaluated using a confusion matrix and various metrics. The table below presents the results for all five sentiment categories:

True Category         Emotionally Negative  Rationally Negative  Neutral       Rationally Positive  Emotionally Positive
Emotionally Negative  2135 (80.74%)         248 (9.38%)          197 (7.45%)   82 (3.10%)           83 (3.14%)
Rationally Negative   362 (26.32%)          402 (29.20%)         232 (16.85%)  71 (5.15%)           40 (2.91%)
Neutral               237 (12.76%)          217 (11.69%)         984 (53.00%)  396 (21.31%)         280 (15.08%)
Rationally Positive   48 (2.63%)            32 (1.75%)           299 (16.41%)  1030 (56.51%)        978 (53.60%)
Emotionally Positive  71 (1.14%)            25 (0.40%)           149 (2.37%)   590 (9.39%)          5645 (89.61%)
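For readers who want to see how accuracy and per-category F1 scores relate to a confusion matrix, here is a minimal sketch in plain Python. It assumes rows are true categories and columns are predicted categories. The toy matrix is invented for illustration and is not the model's data, and the function name `metrics_from_confusion` is hypothetical.

```python
def metrics_from_confusion(matrix):
    """Return (accuracy, per-class F1 list) for a square confusion matrix.

    Assumes matrix[i][j] counts samples of true class i predicted as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    f1_scores = []
    for i in range(n):
        true_pos = matrix[i][i]
        actual = sum(matrix[i])                          # row sum
        predicted = sum(matrix[r][i] for r in range(n))  # column sum
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return correct / total, f1_scores


# Toy 3-class matrix, invented for illustration (not the model's results).
toy = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 2, 8],
]
accuracy, f1_scores = metrics_from_confusion(toy)
print(round(accuracy, 4), [round(f, 4) for f in f1_scores])
```

How the overall F1 score is averaged across categories (for example macro or weighted) is not stated here, so the sketch stops at per-class values.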

The table below presents the results for three sentiment categories:

True Category  Negative        Neutral        Positive
Negative       3147 (75.79%)   429 (10.34%)   276 (6.65%)
Neutral        454 (14.90%)    984 (32.18%)   676 (22.09%)
Positive       217 (2.98%)     445 (6.11%)    8243 (91.01%)
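A three-category view can be derived from a five-category confusion matrix by summing the cells that fall into the same group. The sketch below assumes the grouping suggested by the category names (categories 0 and 1 as negative, 2 as neutral, 3 and 4 as positive). The function name `collapse_confusion` and the toy matrix are hypothetical and do not reproduce the reported results.

```python
# Grouping of five categories into three classes, inferred from the category
# names (an assumption): 0-1 negative, 2 neutral, 3-4 positive.
GROUPS = [[0, 1], [2], [3, 4]]


def collapse_confusion(matrix, groups=GROUPS):
    """Sum the cells of a fine-grained confusion matrix into coarser groups."""
    return [
        [sum(matrix[i][j] for i in rows for j in cols) for cols in groups]
        for rows in groups
    ]


# Toy 5x5 matrix, invented for illustration (rows: true, columns: predicted).
toy = [
    [5, 1, 1, 0, 0],
    [2, 3, 1, 1, 0],
    [1, 1, 4, 1, 1],
    [0, 0, 1, 4, 2],
    [0, 0, 1, 1, 6],
]
print(collapse_confusion(toy))
```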

Getting Started

Model Usage

To use the fine-tuned model for sentiment analysis, you can follow the steps below:

from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer

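# Load the fine-tuned model and tokenizer
# NOTE: the identifier below is kept as published and ends in "ByT5";
# if it does not match this DistilBERT repository, substitute this model's repository id.
model_output_dir = "brivil1/lithuanian-sentiment-analysis-ByT5"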
trained_model = AutoModelForSequenceClassification.from_pretrained(model_output_dir)
trained_tokenizer = AutoTokenizer.from_pretrained(model_output_dir)

# Create a sentiment analysis pipeline
sentiment_pipeline = pipeline("text-classification", model=trained_model, tokenizer=trained_tokenizer)

Example

print(sentiment_pipeline("Blogai. ziauru ir nepatiko"))
print(sentiment_pipeline("Labai puiku"))
print(sentiment_pipeline("Nežinau, visai nepatinka"))

Results:

[{'label': 'negative', 'score': 0.9424479007720947}]
[{'label': 'positive', 'score': 0.8821539282798767}]
[{'label': 'neutral', 'score': 0.9761189222335815}]
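
Because the model's predictions may not always align with human judgment, one option is to flag low-confidence predictions for manual review. The sketch below post-processes output in the format shown in the results above. The helper name `confident_label`, the threshold value, and the "uncertain" fallback label are illustrative assumptions, not part of the model or of the transformers library.

```python
def confident_label(prediction, threshold=0.8, fallback="uncertain"):
    """Return the predicted label, or `fallback` if the score is below `threshold`.

    `prediction` is one dict from the pipeline output, for example
    {'label': 'negative', 'score': 0.94}. The threshold and the fallback
    label are illustrative assumptions.
    """
    if prediction["score"] >= threshold:
        return prediction["label"]
    return fallback


# Outputs copied from the results above (scores rounded to four decimals).
results = [
    [{"label": "negative", "score": 0.9424}],
    [{"label": "positive", "score": 0.8822}],
    [{"label": "neutral", "score": 0.9761}],
]
for result in results:
    print(confident_label(result[0], threshold=0.9))
```

With a threshold of 0.9, the second prediction (score 0.8822) is returned as "uncertain" while the other two keep their labels. A suitable threshold depends on the application and would need to be chosen on held-out data.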