Bitext-Jamba-1.5-Mini-Banking-Customer-Support

Model Description

This model is a version of ai21labs/AI21-Jamba-1.5-Mini fine-tuned on the Bitext Banking Customer Support dataset, which is specifically tailored to the banking domain. It is optimized to answer questions and assist users with various banking transactions. It was trained on hybrid synthetic data generated with our NLP/NLG technology and our automated Data Labeling (DAL) tools.
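
A minimal inference sketch, assuming the standard transformers chat workflow; the prompt, generation settings, and loading options are illustrative, and at roughly 52B total parameters the model needs substantial GPU memory or quantization:

```python
# Minimal inference sketch; loading options (dtype, device_map) may need
# adjusting for your hardware, and a recent transformers release is required
# for the Jamba architecture.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bitext/Bitext-Jamba-1.5-Mini-Banking-Customer-Support"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Example banking question; the chat template formats it for the model.
messages = [{"role": "user", "content": "How do I check my account balance?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```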

The goal of this model is to show that a generic verticalized model makes customization for a final use case much easier. For example, if you are "ACME Bank", you can create your own customized model by taking this fine-tuned model and running an additional fine-tuning pass on a small amount of your own data, as sketched below. An overview of this approach can be found at: From General-Purpose LLMs to Verticalized Enterprise Models
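
As a hedged illustration of that second step (not Bitext's actual procedure), the sketch below attaches a LoRA adapter with PEFT and runs a short supervised fine-tune; the data file, its single "text" column, and the adapter settings are hypothetical placeholders:

```python
# Illustrative second-stage fine-tuning with PEFT/LoRA; file name, column
# layout, and hyperparameters below are placeholders, not Bitext's setup.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "bitext/Bitext-Jamba-1.5-Mini-Banking-Customer-Support"
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32,
                                         lora_dropout=0.05,
                                         task_type="CAUSAL_LM"))

# Hypothetical in-house data: one formatted dialog per line in a "text" field.
ds = load_dataset("json", data_files="acme_bank_dialogs.jsonl", split="train")
ds = ds.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
            remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="acme-bank-adapter", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```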

Intended Use

  • Recommended applications: This model is designed to be used as the first step in Bitext’s two-step approach to LLM fine-tuning for the creation of chatbots, virtual assistants and copilots for the Banking domain, providing customers with fast and accurate answers about their banking needs.
  • Out-of-scope: This model is not suited to non-banking questions and should not be used to provide health, legal, or critical safety advice.

Training Data

The model was fine-tuned on a dataset covering a range of banking-related intents, including transactions such as balance checks, money transfers, and loan applications, for a total of 89 intents, each represented by approximately 1,000 examples. This comprehensive training helps the model address a broad spectrum of banking-related questions effectively. The dataset follows the same structured approach as our dataset published on Hugging Face as bitext/Bitext-customer-support-llm-chatbot-training-dataset, but with a focus on banking.
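
The published reference dataset can be inspected directly to see that layout (instruction/response pairs plus category, intent, and flags metadata); a quick peek, assuming the standard datasets API:

```python
# Peek at the public reference dataset on the Hub; the banking dataset used
# for this model follows the same column layout.
from datasets import load_dataset

ds = load_dataset(
    "bitext/Bitext-customer-support-llm-chatbot-training-dataset", split="train"
)
print(ds.column_names)  # e.g. ['flags', 'instruction', 'category', 'intent', 'response']
print(ds[0]["instruction"])
print(ds[0]["response"])
```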

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent TrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 3
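
Expressed as transformers TrainingArguments, the list above corresponds roughly to the following sketch; the output path is a placeholder, and precision, logging, and checkpointing settings were not disclosed:

```python
# The published hyperparameters as a TrainingArguments sketch; undisclosed
# settings are left at their defaults.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="jamba-banking-sft",   # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,    # effective train batch size: 8 * 2 = 16
    optim="adamw_torch",              # betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=3,
)
```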

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 0.7424        | 0.9983 | 299  | 0.7416          |
| 0.7224        | 2.0    | 599  | 0.7293          |
| 0.7153        | 2.9950 | 897  | 0.7288          |

Framework versions

  • PEFT 0.13.2
  • Transformers 4.45.0.dev0
  • PyTorch 2.1.0+cu118
  • Datasets 3.1.0
  • Tokenizers 0.19.1

Limitations and Bias

  • The model is trained for banking-specific contexts but may underperform in unrelated areas.
  • Potential biases in the training data could affect the neutrality of the responses; users are encouraged to evaluate responses critically.

Ethical Considerations

It is important to use this technology thoughtfully, ensuring it does not substitute for human judgment where that judgment is required, especially in sensitive financial situations.

Acknowledgments

This model was developed and trained by Bitext using proprietary data and technology.

License

This model, "Bitext-Jamba-1.5-Mini-Banking-Customer-Support", is licensed under the Jamba Open Model License, a permissive license allowing full research use and commercial use under the license terms. If you need to license the model for your needs, talk to us.
