fuchenru
/

Trading-Hero-LLM

Text Classification

sentiment analysis

financial sentiment analysis

Inference Endpoints

Model card Files Files and versions Community

fuchenru commited on May 25

Commit

51ec3c2

•

1 Parent(s): 077edb9

Update README.md

Files changed (1) hide show

README.md +89 -3

README.md CHANGED Viewed

@@ -1,3 +1,89 @@
----
-license: mit
----

+---
+license: mit
+tags:
+- sentiment analysis
+- financial sentiment analysis
+- bert
+- text-classification
+- finance
+- finbert
+- financial
+---
+# Trading Hero Financial Sentiment Analysis
+Model Description: This model is a fine-tuned version of [FinBERT](https://huggingface.co/yiyanghkust/finbert-pretrain), a BERT model pre-trained on financial texts. The fine-tuning process was conducted to adapt the model to specific financial NLP tasks, enhancing its performance on domain-specific applications for sentiment analysis.
+## Model Use
+Primary Users: Financial analysts, NLP researchers, and developers working on financial data.
+## Training Data
+Training Dataset: The model was fine-tuned on a custom dataset of financial communication texts. The dataset was split into training, validation, and test sets as follows:
+Training Set: 10,918,272 tokens
+Validation Set: 1,213,184 tokens
+Test Set: 1,347,968 tokens
+Pre-training Dataset: FinBERT was pre-trained on a large financial corpus totaling 4.9 billion tokens, including:
+Corporate Reports (10-K & 10-Q): 2.5 billion tokens
+Earnings Call Transcripts: 1.3 billion tokens
+Analyst Reports: 1.1 billion tokens
+## Evaluation
+* Test Accuracy = 0.908469
+* Test Precision = 0.927788
+* Test Recall = 0.908469
+* Test F1 = 0.913267
+* **Labels**: 0 -> Neutral; 1 -> Positive; 2 -> Negative
+## Usage
+```
+import torch
+from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
+tokenizer = AutoTokenizer.from_pretrained("fuchenru/Trading-Hero-LLM")
+model = AutoModelForSequenceClassification.from_pretrained("fuchenru/Trading-Hero-LLM")
+nlp = pipeline("text-classification", model=model, tokenizer=tokenizer)
+# Preprocess the input text
+def preprocess(text, tokenizer, max_length=128):
+    inputs = tokenizer(text, truncation=True, padding='max_length', max_length=max_length, return_tensors='pt')
+    return inputs
+# Function to perform prediction
+def predict_sentiment(input_text):
+    # Tokenize the input text
+    inputs = tokenizer(input_text, return_tensors="pt", truncation=True, padding=True)
+    # Perform inference
+    with torch.no_grad():
+        outputs = model(**inputs)
+    # Get predicted label
+    predicted_label = torch.argmax(outputs.logits, dim=1).item()
+    # Map the predicted label to the original labels
+    label_map = {0: 'neutral', 1: 'positive', 2: 'negative'}
+    predicted_sentiment = label_map[predicted_label]
+    return predicted_sentiment
+stock_news = [
+    "Market analysts predict a stable outlook for the coming weeks.",
+    "The market remained relatively flat today, with minimal movement in stock prices.",
+    "Investor sentiment improved following news of a potential trade deal.",
+.......
+]
+for i in stock_news:
+    predicted_sentiment = predict_sentiment(i)
+    print("Predicted Sentiment:", predicted_sentiment)
+```
+```
+Predicted Sentiment: neutral
+Predicted Sentiment: neutral
+Predicted Sentiment: positive
+```