Bitext-Jamba-1.5-Mini-Banking-Customer-Support

Model Description

This model is a version of ai21labs/AI21-Jamba-1.5-Mini fine-tuned on the Bitext Banking Customer Support dataset, which is specifically tailored to the banking domain. It is optimized to answer questions and assist users with various banking transactions. It was trained on hybrid synthetic data generated with our NLP/NLG technology and our automated Data Labeling (DAL) tools.
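
A minimal inference sketch, assuming the standard transformers chat workflow; the prompt, generation settings, and loading options are illustrative, and at roughly 52B total parameters the model needs substantial GPU memory or quantization:

```python
# Minimal inference sketch; loading options (dtype, device_map) may need
# adjusting for your hardware, and a recent transformers release is required
# for the Jamba architecture.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bitext/Bitext-Jamba-1.5-Mini-Banking-Customer-Support"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Example banking question; the chat template formats it for the model.
messages = [{"role": "user", "content": "How do I check my account balance?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```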

The goal of this model is to show that a generic verticalized model makes customization for a final use case much easier. For example, if you are "ACME Bank", you can create your own customized model by taking this fine-tuned model and running an additional fine-tuning pass on a small amount of your own data, as sketched below. An overview of this approach can be found at: From General-Purpose LLMs to Verticalized Enterprise Models
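
As a hedged illustration of that second step (not Bitext's actual procedure), the sketch below attaches a LoRA adapter with PEFT and runs a short supervised fine-tune; the data file, its single "text" column, and the adapter settings are hypothetical placeholders:

```python
# Illustrative second-stage fine-tuning with PEFT/LoRA; file name, column
# layout, and hyperparameters below are placeholders, not Bitext's setup.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "bitext/Bitext-Jamba-1.5-Mini-Banking-Customer-Support"
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32,
                                         lora_dropout=0.05,
                                         task_type="CAUSAL_LM"))

# Hypothetical in-house data: one formatted dialog per line in a "text" field.
ds = load_dataset("json", data_files="acme_bank_dialogs.jsonl", split="train")
ds = ds.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
            remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="acme-bank-adapter", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```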

Intended Use

  • Recommended applications: This model is designed to be used as the first step in Bitext’s two-step approach to LLM fine-tuning for the creation of chatbots, virtual assistants and copilots for the Banking domain, providing customers with fast and accurate answers about their banking needs.
  • Out-of-scope: This model is not suited to non-banking questions and should not be used to provide health, legal, or critical safety advice.

Training Data

The model was fine-tuned on a dataset covering a range of banking-related intents, including transactions such as balance checks, money transfers, and loan applications, for a total of 89 intents, each represented by approximately 1,000 examples. This comprehensive training helps the model address a broad spectrum of banking-related questions effectively. The dataset follows the same structured approach as our dataset published on Hugging Face as bitext/Bitext-customer-support-llm-chatbot-training-dataset, but with a focus on banking.
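
The published reference dataset can be inspected directly to see that layout (instruction/response pairs plus category, intent, and flags metadata); a quick peek, assuming the standard datasets API:

```python
# Peek at the public reference dataset on the Hub; the banking dataset used
# for this model follows the same column layout.
from datasets import load_dataset

ds = load_dataset(
    "bitext/Bitext-customer-support-llm-chatbot-training-dataset", split="train"
)
print(ds.column_names)  # e.g. ['flags', 'instruction', 'category', 'intent', 'response']
print(ds[0]["instruction"])
print(ds[0]["response"])
```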

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent TrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 3
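
Expressed as transformers TrainingArguments, the list above corresponds roughly to the following sketch; the output path is a placeholder, and precision, logging, and checkpointing settings were not disclosed:

```python
# The published hyperparameters as a TrainingArguments sketch; undisclosed
# settings are left at their defaults.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="jamba-banking-sft",   # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,    # effective train batch size: 8 * 2 = 16
    optim="adamw_torch",              # betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=3,
)
```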

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 0.7424        | 0.9983 | 299  | 0.7416          |
| 0.7224        | 2.0    | 599  | 0.7293          |
| 0.7153        | 2.9950 | 897  | 0.7288          |

Framework versions

  • PEFT 0.13.2
  • Transformers 4.45.0.dev0
  • PyTorch 2.1.0+cu118
  • Datasets 3.1.0
  • Tokenizers 0.19.1

Limitations and Bias

  • The model is trained for banking-specific contexts but may underperform in unrelated areas.
  • Potential biases in the training data could affect the neutrality of the responses; users are encouraged to evaluate responses critically.

Ethical Considerations

It is important to use this technology thoughtfully, ensuring it does not substitute for human judgment where that judgment is required, especially in sensitive financial situations.

Acknowledgments

This model was developed and trained by Bitext using proprietary data and technology.

License

This model, "Bitext-Jamba-1.5-Mini-Banking-Customer-Support", is licensed under the Jamba Open Model License, a permissive license allowing full research use and commercial use under the license terms. If you need to license the model for your needs, talk to us.
