metadata
library_name: transformers
tags:
- indobert
- indonlu
- indobenchmark
datasets:
- fahrendrakhoirul/ecommerce-reviews-multilabel-dataset
language:
- id
metrics:
- f1
- precision
- recall
This model combines IndoBERT, which encodes the Indonesian review text, with a Long Short-Term Memory (LSTM) network that captures sequential information in customer reviews. It performs multi-label classification of e-commerce reviews, so a single review can be tagged with any combination of three aspects:
- Produk (Product): Customer satisfaction with product quality, performance, and description accuracy.
- Layanan Pelanggan (Customer Service): Interaction with sellers, their responsiveness, and complaint handling.
- Pengiriman (Shipping/Delivery): Speed of delivery, item condition upon arrival, and timeliness.
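Because the task is multi-label, each aspect gets its own independent score rather than competing for a single prediction. The sketch below shows one way to turn three per-aspect probabilities into label names. The index order in `LABELS` and the 0.5 threshold are assumptions made for illustration (the source does not state either), and `decode_labels` is a hypothetical helper, not part of the model.

```python
# Assumed label order: it follows the list above, but the source does not
# confirm which output index corresponds to which aspect.
LABELS = ["Produk", "Layanan Pelanggan", "Pengiriman"]


def decode_labels(probabilities, threshold=0.5):
    """Return the names of all aspects whose probability meets the threshold.

    `threshold=0.5` is an assumed default, not a value taken from the source.
    """
    return [label for label, p in zip(LABELS, probabilities) if p >= threshold]


# A review scored high on product and shipping, low on customer service.
print(decode_labels([0.91, 0.12, 0.77]))  # ['Produk', 'Pengiriman']
```

A review can therefore end up with zero, one, two, or all three labels, depending on how many scores clear the threshold.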
How to load the model in PyTorch:
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin
from transformers import AutoModelForSequenceClassification, AutoTokenizer


class IndoBertLSTMEcommerceReview(nn.Module, PyTorchModelHubMixin):
    def __init__(self, bert):
        super().__init__()
        self.bert = bert
        # LSTM over the token representations, then a 3-way output layer
        self.lstm = nn.LSTM(bert.config.hidden_size, 128)
        self.linear = nn.Linear(128, 3)
        self.sigmoid = nn.Sigmoid()

    def forward(self, input_ids, attention_mask):
        # A sequence-classification model does not return `last_hidden_state`,
        # so request all hidden states and take the final layer instead.
        outputs = self.bert(
            input_ids=input_ids,
            attention_mask=attention_mask,
            output_hidden_states=True,
        )
        last_hidden_state = outputs.hidden_states[-1]
        lstm_out, _ = self.lstm(last_hidden_state)
        pooled = lstm_out[:, -1, :]  # LSTM output at the last position
        logits = self.linear(pooled)
        probabilities = self.sigmoid(logits)  # one independent score per label
        return probabilities


# Base IndoBERT model that the wrapper is built around
bert = AutoModelForSequenceClassification.from_pretrained(
    "indobenchmark/indobert-base-p1",
    num_labels=3,
    problem_type="multi_label_classification",
)
tokenizer = AutoTokenizer.from_pretrained("fahrendrakhoirul/indobert-lstm-finetuned-ecommerce-reviews")
# Extra keyword arguments (here `bert`) are passed on to `__init__`
model = IndoBertLSTMEcommerceReview.from_pretrained("fahrendrakhoirul/indobert-lstm-finetuned-ecommerce-reviews", bert=bert)
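Once `model` and `tokenizer` are loaded, inference could look like the minimal sketch below. `predict_probabilities` is a hypothetical helper, and `max_length=512` is an assumed truncation limit that the source does not specify. Reviews are scored one at a time so that no padding is needed: the forward pass reads the LSTM output at the final position, which in a padded batch could be a padding token.

```python
import torch


def predict_probabilities(model, tokenizer, reviews, max_length=512):
    """Return one list of per-label probabilities for each review string.

    `max_length=512` is an assumption for illustration, not a documented setting.
    """
    model.eval()  # switch off dropout for inference
    results = []
    with torch.no_grad():  # gradients are not needed for prediction
        for review in reviews:
            # Encode a single review, so the sequence contains no padding
            encoded = tokenizer(
                review,
                truncation=True,
                max_length=max_length,
                return_tensors="pt",
            )
            probs = model(encoded["input_ids"], encoded["attention_mask"])
            results.append(probs[0].tolist())
    return results


# Usage with the objects loaded above (downloads the weights on first run):
# scores = predict_probabilities(model, tokenizer, ["Barang bagus, pengiriman cepat"])
# `scores[0]` then holds three values, one per aspect.
```

Scoring reviews individually is slower than batching, but it keeps each prediction independent of the other inputs.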