---
license: apache-2.0
language:
- en
pipeline_tag: text-classification
tags:
- Sentiment Analysis
- Language Models
---
# DistilSenti-Net42M: Context Distilled Small Language Model For Sentiment Analysis
## Model Architecture
- **Embedding Layer**: Converts input text into dense vectors.
- **CNN Layers**: Extract local features from text sequences.
- **Vanilla RNN and LSTM Layers**: Capture temporal dependencies in text.
- **Dense Layers**: Classify text into sentiment categories.
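The layer stack listed above could look roughly like the following Keras sketch. It is illustrative only: the function name `build_hybrid_sketch`, every layer width, the kernel size, and `num_classes=3` are assumptions rather than the released configuration, so it will not reproduce the published weights or parameter count. The defaults `vocab_size=5000` and `max_len=100` mirror the values used in the usage code of this card.

```python
from tensorflow.keras import layers, models


def build_hybrid_sketch(vocab_size=5000, max_len=100, num_classes=3):
    """Hypothetical Embedding -> CNN -> RNN -> LSTM -> Dense stack."""
    return models.Sequential([
        layers.Input(shape=(max_len,)),
        # Embedding layer: token ids -> dense vectors
        layers.Embedding(vocab_size, 128),
        # CNN layers: local n-gram style features
        layers.Conv1D(64, 5, activation="relu"),
        layers.MaxPooling1D(2),
        # Vanilla RNN followed by LSTM: temporal dependencies
        layers.SimpleRNN(64, return_sequences=True),
        layers.LSTM(64),
        # Dense layers: classification into sentiment categories
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])


model = build_hybrid_sketch()
model.summary()
```

The softmax output yields one score per sentiment class, which is the vector returned as `sentiment_score` in the usage examples.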
## Usage
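You can use this model for sentiment analysis on text data. The sample code below loads the model from the Hub and defines a prediction function. Note that `tokenizer` and `label_encoder` are not created in this snippet; they must be fitted objects that match the model, built as in the second example further down.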
```python
from huggingface_hub import from_pretrained_keras
import re
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences
# Load model
model = from_pretrained_keras("Ravinthiran/DistilSenti-Net42M")
# Example prediction function
def predict_sentiment(text, model, tokenizer, label_encoder):
    # Lowercase the text and strip punctuation
    text = text.lower()
    text = re.sub(r'[^\w\s]', '', text)
    # Convert the text to an integer sequence padded to length 100
    sequence = tokenizer.texts_to_sequences([text])
    padded_sequence = pad_sequences(sequence, maxlen=100)
    # Predict class scores and map the top-scoring class back to its label
    pred = model.predict(padded_sequence)
    sentiment = label_encoder.inverse_transform(pred.argmax(axis=1))
    sentiment_score = pred[0]
    return sentiment[0], sentiment_score
# Example usage
new_text = "I recently started a new fitness program at a local wellness center, and it has been an incredibly positive experience."
predicted_sentiment, sentiment_score = predict_sentiment(new_text, model, tokenizer, label_encoder)
print(f"Predicted Sentiment: {predicted_sentiment}")
print(f"Sentiment Scores: {sentiment_score}")
```
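To make the preprocessing in `predict_sentiment` easier to follow, here is a dependency-free sketch of its two steps: cleaning the text and padding the token sequence. The helper names `clean_text` and `pad_pre` are hypothetical. `pad_pre` imitates the default behaviour of Keras `pad_sequences` (`padding='pre'`, `truncating='pre'`, pad value `0`) for a single sequence and a positive `maxlen`; it is not a replacement for the library call.

```python
import re


def clean_text(text):
    """Lowercase and drop every character that is not a word character or whitespace."""
    return re.sub(r"[^\w\s]", "", text.lower())


def pad_pre(sequence, maxlen, value=0):
    """Left-pad with `value`, or keep only the last `maxlen` items if too long."""
    kept = list(sequence)[-maxlen:]
    return [value] * (maxlen - len(kept)) + kept


print(clean_text("It's GREAT, really!"))
print(pad_pre([7, 8, 9], maxlen=5))
print(pad_pre([1, 2, 3, 4, 5, 6], maxlen=4))
```

Because truncation happens at the front, texts longer than 100 tokens are represented by their final 100 tokens only.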
## Using Keras
Download DistilSentiNet-42M.keras here: https://huggingface.co./Ravinthiran/Distilsenti-Net-42M/blob/main/DistilSentiNet-42M.keras
## Using HDF5 (H5)
Download DistilSentiNet-42M.h5 here: https://huggingface.co./Ravinthiran/DistilSenti-Net42M/blob/main/DistilSentiNet-42M.h5
```python
import numpy as np
import pandas as pd
import re
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import load_model
# Load the saved Keras model (either the .h5 or the .keras file)
model_hybrid = load_model('<path to DistilSentiNet-42M.h5 or DistilSentiNet-42M.keras>')
# Sample data
df = pd.read_csv("<Your Test Dataset>")
# Preprocessing: lowercase the text and strip punctuation
df['text'] = df['text'].str.lower().str.replace(r'[^\w\s]', '', regex=True)
# Encode labels
label_encoder = LabelEncoder()
df['label'] = label_encoder.fit_transform(df['sentiment'])
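# Tokenization and padding
# NOTE: the word-to-index mapping depends on the data the tokenizer is fitted on,
# so it should match the vocabulary the model was trained with.
tokenizer = Tokenizer(num_words=5000)
tokenizer.fit_on_texts(df['text'])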
X = tokenizer.texts_to_sequences(df['text'])
X = pad_sequences(X, maxlen=100)
# Function to predict sentiment of new input text
# (uses the module-level `label_encoder` fitted above)
def predict_sentiment(text, tokenizer, model):
    # Preprocess the input text
    text = text.lower()
    text = re.sub(r'[^\w\s]', '', text)
    sequence = tokenizer.texts_to_sequences([text])
    padded_sequence = pad_sequences(sequence, maxlen=100)
    # Predict sentiment
    pred = model.predict(padded_sequence)
    sentiment = label_encoder.inverse_transform(pred.argmax(axis=1))
    sentiment_score = pred[0]
    return sentiment[0], sentiment_score
# Example usage
new_text = "I recently started a new fitness program at a local wellness center, and it has been an incredibly positive experience. The trainers are highly knowledgeable and provide personalized guidance to help me achieve my fitness goals. The facilities are state-of-the-art, with a wide range of equipment and classes to choose from. The supportive community and motivating environment have made working out enjoyable and rewarding. I have already noticed significant improvements in my health and fitness levels, and the positive changes have greatly enhanced my overall well-being."
predicted_sentiment, sentiment_score = predict_sentiment(new_text, tokenizer, model_hybrid)
print(f"The sentiment of the input text is: {predicted_sentiment} with scores {sentiment_score}")
```
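The returned `sentiment_score` is a vector with one score per class, and `LabelEncoder` assigns class indices in sorted order of the label values. The following pure-Python sketch shows how such a vector maps back to a label. The helper name `decode_prediction`, the class names, and the score values are made up for illustration; the actual label set depends on the `sentiment` column of your dataset.

```python
def decode_prediction(scores, class_names):
    """Return (label, score) for the highest-scoring class.

    Classes are sorted first, mirroring how LabelEncoder orders its classes.
    """
    ordered = sorted(set(class_names))
    best = max(range(len(scores)), key=lambda i: scores[i])
    return ordered[best], scores[best]


# Hypothetical labels and scores, not output from the real model
label, confidence = decode_prediction(
    [0.05, 0.15, 0.80],
    ["positive", "negative", "neutral"],
)
print(label, confidence)
```

If the scores and the label order do not line up (for example, because the encoder was fitted on a dataset with different labels), the decoded sentiment will be wrong even when the model's scores are correct.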