---
library_name: transformers
tags:
- trl
- sft
license: mit
datasets:
- gbharti/finance-alpaca
base_model:
- mistralai/Mistral-7B-v0.1
language:
- en
---

### Model Description

This model is based on the Mistral 7B architecture and was fine-tuned with parameter-efficient fine-tuning (PEFT) on the gbharti/finance-alpaca financial instruction dataset. It is designed to handle finance-related NLP tasks such as financial text analysis, sentiment detection, and market trend analysis. The model combines Mistral's transformer architecture with fine-tuning specialized for financial applications.

- **Developed by:** Cole McIntosh
- **Model type:** Transformer-based large language model (LLM)
- **Language(s) (NLP):** English
- **Finetuned from model:** [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)

## Uses

### Direct Use

The Mistral 7B Finance fine-tuned model is designed to assist users with finance-related natural language processing tasks such as the following (a usage sketch appears after the list):

- Financial report analysis
- Sentiment analysis of financial news
- Forecasting market trends based on textual data
- Analyzing earnings call transcripts
- Extracting structured information from unstructured financial text

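As a quick illustration of the sentiment-analysis use case, the sketch below prompts the model through the `transformers` text-generation pipeline. The prompt wording and generation settings are illustrative assumptions, not a prescribed input format.

```python
from transformers import pipeline

# Load the fine-tuned model into a text-generation pipeline
# (a 7B-parameter model; expect substantial memory requirements).
generator = pipeline(
    "text-generation",
    model="colesmcintosh/mistral_7b_finance_finetuned",
)

# Illustrative sentiment prompt; the exact instruction wording is an assumption.
prompt = (
    "Classify the sentiment of this financial headline as positive, negative, or neutral:\n"
    "'Tech stocks rally as inflation data comes in below expectations.'\n"
    "Sentiment:"
)

result = generator(prompt, max_new_tokens=10, do_sample=False)
print(result[0]["generated_text"])
```

Because this is a causal language model rather than a classifier, the label has to be read out of the generated continuation.
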
### Downstream Use

This model can be fine-tuned further for more specific tasks such as:

- Portfolio analysis based on sentiment scores (see the aggregation sketch below)
- Predictive analysis of stock market movements
- Automated financial report generation

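As a rough illustration of the first item, the snippet below scores a few headlines with the model and tallies the labels into a portfolio-level signal. The prompt template, label parsing, and headlines are illustrative assumptions; a production pipeline would need more robust output handling.

```python
from collections import Counter

from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="colesmcintosh/mistral_7b_finance_finetuned",
)

# Hypothetical headlines about holdings in a portfolio.
headlines = [
    "Company A beats earnings estimates and raises full-year guidance.",
    "Company B faces a regulatory probe over its accounting practices.",
    "Company C announces a share buyback program.",
]

labels = []
for headline in headlines:
    prompt = (
        "Classify the sentiment of this financial headline as positive, negative, or neutral:\n"
        f"'{headline}'\nSentiment:"
    )
    # The pipeline returns the prompt plus the continuation; keep only the continuation.
    generated = generator(prompt, max_new_tokens=5, do_sample=False)[0]["generated_text"]
    completion = generated[len(prompt):].strip().lower()
    labels.append(completion.split()[0] if completion else "unknown")

# Aggregate the per-headline labels into a crude sentiment distribution.
print(Counter(labels))
```
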
### Out-of-Scope Use

This model should not be used for tasks unrelated to finance or for tasks that require a high level of factual accuracy in non-financial domains. It is not suitable for:

- Medical or legal document analysis
- General-purpose conversational chatbots (the model may produce misleading financial interpretations)
- Decision-making without human oversight, especially in high-stakes financial operations

### Recommendations

- Carefully review model outputs, especially when they inform critical financial decisions.
- Use up-to-date fine-tuning datasets to keep the model relevant.
- Cross-validate the model's predictions or insights against alternative data sources or human expertise.

## How to Get Started with the Model

You can use the Hugging Face `transformers` library to load and use this model. Here’s a basic example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model weights from the Hub
tokenizer = AutoTokenizer.from_pretrained("colesmcintosh/mistral_7b_finance_finetuned")
model = AutoModelForCausalLM.from_pretrained("colesmcintosh/mistral_7b_finance_finetuned")

# Tokenize a prompt and generate a continuation
inputs = tokenizer("Analyze the financial outlook for Q3 2024.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

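Since the model was fine-tuned with PEFT, the repository may ship adapter weights rather than a fully merged checkpoint. If that is the case and the `peft` library is installed, the adapter can also be loaded explicitly; the snippet below is a sketch under that assumption, not a statement about how the weights are packaged.

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Loads the base model referenced in the adapter config and applies the adapter on top.
model = AutoPeftModelForCausalLM.from_pretrained("colesmcintosh/mistral_7b_finance_finetuned")
tokenizer = AutoTokenizer.from_pretrained("colesmcintosh/mistral_7b_finance_finetuned")
```
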
## Training Details

### Training Procedure

Fine-tuning used parameter-efficient fine-tuning (PEFT), which trains a small set of adapter parameters on top of the frozen base model instead of updating all 7B weights, keeping the GPU memory and compute requirements of fine-tuning manageable.

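For readers who want to set up a comparable run, the sketch below shows one way a PEFT fine-tune on gbharti/finance-alpaca could be wired together with TRL's `SFTTrainer` (the card's tags list `trl` and `sft`). The LoRA settings, prompt template, and trainer arguments are illustrative assumptions rather than the configuration actually used, and argument names vary across `trl` versions.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTTrainer

# Instruction-tuning data listed in the model card metadata.
dataset = load_dataset("gbharti/finance-alpaca", split="train")

# Assumed LoRA adapter settings; the actual PEFT configuration is not documented here.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

def format_examples(batch):
    # Assumes Alpaca-style "instruction"/"output" columns; adjust to the dataset schema.
    return [
        f"### Instruction:\n{instruction}\n\n### Response:\n{output}"
        for instruction, output in zip(batch["instruction"], batch["output"])
    ]

trainer = SFTTrainer(
    model="mistralai/Mistral-7B-v0.1",  # base model from the card metadata
    train_dataset=dataset,
    peft_config=peft_config,
    formatting_func=format_examples,
)
trainer.train()
```
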
#### Summary

The model performs well on finance-specific tasks such as sentiment analysis and entity recognition. It generalizes well across different sectors but shows slight performance drops on non-English financial texts.

### Model Architecture and Objective

The model is based on the Mistral 7B architecture, an optimized decoder-only transformer. Its training objective is causal language modeling (next-token prediction), applied to text generation and understanding with a focus on financial texts.

### Compute Infrastructure

#### Hardware

The model was fine-tuned using:

- 1 NVIDIA A100 GPU (40 GB)

#### Software

- Hugging Face `transformers` library
- Hugging Face `peft` library for parameter-efficient fine-tuning
- TRL for supervised fine-tuning, per the `trl` and `sft` tags