---
language:
  - en
library_name: transformers
tags:
  - bert
  - multilabel
  - classification
  - finetune
  - finance
  - regulatory
  - text
  - risk
metrics:
  - f1
pipeline_tag: text-classification
widget:
  - text: >-
      Where an FI employs a technological solution provided by an external party
      to conduct screening of virtual asset transactions and the associated
      wallet addresses, the FI remains responsible for discharging its AML/CFT
      obligations. The FI should conduct due diligence on the solution before
      deploying it, taking into account relevant factors such as:
---

This model is a BERT-based multi-label text classifier for the financial regulatory domain. Starting from the pre-trained ProsusAI/finbert checkpoint, it was fine-tuned on a diverse dataset of financial regulatory texts so that it can assign several relevant categories to a single passage at once.
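
A minimal inference sketch using the transformers library is shown below. The repository id is a placeholder for this model's Hub id, and the 0.5 decision threshold is an assumption rather than a value stated in this card:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "<repo-id>"  # placeholder: replace with this model's Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id, problem_type="multi_label_classification"
)
model.eval()

text = (
    "Where an FI employs a technological solution provided by an external "
    "party to conduct screening of virtual asset transactions and the "
    "associated wallet addresses, the FI remains responsible for "
    "discharging its AML/CFT obligations."
)

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label: apply a sigmoid per label and threshold each independently
# (0.5 is an assumed cut-off, not one documented for this model).
probs = torch.sigmoid(logits).squeeze(0)
predicted = [model.config.id2label[i] for i, p in enumerate(probs) if p > 0.5]
print(predicted)
```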

## Model Architecture

- Base model: BERT
- Pre-trained checkpoint: ProsusAI/finbert
- Task: Multi-label classification

## Performance

Performance metrics on the validation set:

- F1 score: 0.8637
- ROC AUC: 0.9044
- Accuracy: 0.6155
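
For reference, the sketch below shows one common way to compute these multi-label metrics with scikit-learn. The micro averaging and the 0.5 threshold are assumptions, as the card does not specify them; under this reading, "accuracy" is subset accuracy (every label of a sample must match), which would explain why it sits well below the F1 score:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

def multi_label_metrics(logits, labels, threshold=0.5):
    """F1, ROC AUC, and subset accuracy for multi-label predictions."""
    # A sigmoid turns per-label logits into independent probabilities.
    probs = 1 / (1 + np.exp(-logits))
    preds = (probs >= threshold).astype(int)
    return {
        # Micro-averaged F1 pools true/false positives across all labels.
        "f1": f1_score(labels, preds, average="micro"),
        "roc_auc": roc_auc_score(labels, probs, average="micro"),
        # Subset accuracy: all labels of a sample must be correct, which
        # is why it is typically much lower than F1 on multi-label tasks.
        "accuracy": accuracy_score(labels, preds),
    }
```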

## Limitations and Ethical Considerations

- Performance may vary with the nature of the text data and the label distribution.
- The dataset is class-imbalanced, which can bias predictions toward the more frequent labels.

## Dataset Information

- Training set: 6,562 samples
- Validation set: 929 samples
- Test set: 1,884 samples

## Training Details

- Training strategy: Fine-tuning BERT with a randomly initialized classification head
- Optimizer: Adam
- Learning rate: 1e-4
- Batch size: 16
- Number of epochs: 2
- Evaluation strategy: Per epoch
- Weight decay: 0.01
- Metric for best model: F1 score
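
A minimal sketch of this setup with the transformers Trainer is shown below. NUM_LABELS, the tokenized train/validation datasets, and compute_metrics (see the Performance section) are assumed rather than specified in this card, and Trainer's default AdamW optimizer stands in for the Adam optimizer named above:

```python
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "ProsusAI/finbert",
    num_labels=NUM_LABELS,  # assumed: set to the size of your label set
    problem_type="multi_label_classification",
    # finbert ships a 3-way sentiment head; this re-initializes the
    # classification head to match the new label set.
    ignore_mismatched_sizes=True,
)

args = TrainingArguments(
    output_dir="finbert-multilabel",  # assumed output path
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=2,
    evaluation_strategy="epoch",
    save_strategy="epoch",  # needed so the best checkpoint can be reloaded
    weight_decay=0.01,
    load_best_model_at_end=True,
    metric_for_best_model="f1",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # 6,562 samples
    eval_dataset=val_dataset,     # 929 samples
    compute_metrics=compute_metrics,
)
trainer.train()
```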