---
language:
- en
library_name: transformers
tags:
- bert
- multilabel
- classification
- finetune
- finance
- regulatory
- text
- risk
metrics:
- f1
pipeline_tag: text-classification
widget:
- text: >-
    Where an FI employs a technological solution provided by an external party
    to conduct screening of virtual asset transactions and the associated wallet
    addresses, the FI remains responsible for discharging its AML/CFT
    obligations. The FI should conduct due diligence on the solution before
    deploying it, taking into account relevant factors such as:
---

This model is a BERT-based classifier fine-tuned for multi-label classification of financial regulatory text.
It starts from the pre-trained ProsusAI/finbert checkpoint and is further fine-tuned on a diverse dataset of
financial regulatory texts, so a single passage can be assigned several relevant categories at once.

## Model Architecture

- **Base Model**: BERT
- **Pre-trained Model**: ProsusAI/finbert
- **Task**: Multi-label classification
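
Because the task is multi-label, each output logit is scored independently with a sigmoid rather than a softmax over all classes. The sketch below shows one way inference could look. The `model_id` argument is a placeholder for this model's Hub repository ID, and the 0.5 threshold is an assumption; neither is specified in this card.

```python
import math


def decode_multilabel(logits, id2label, threshold=0.5):
    """Return labels whose sigmoid probability meets the threshold (0.5 is an assumed default)."""
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return {id2label[i]: round(p, 4) for i, p in enumerate(probs) if p >= threshold}


def classify(text, model_id, threshold=0.5):
    """Run one passage through the model. `model_id` is a placeholder for the Hub repo ID."""
    # Imported lazily so decode_multilabel stays usable without torch installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # BERT accepts at most 512 tokens, so longer passages are truncated.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return decode_multilabel(logits, model.config.id2label, threshold)
```

Calling `classify("<passage>", "<repo-id>")` returns a dictionary of every label that clears the threshold, which may be empty or contain several entries.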


## Performance

Performance metrics on the validation set:

- F1 Score: 0.8637
- ROC AUC: 0.9044
- Accuracy: 0.6155
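
The card does not say how F1 is averaged or how accuracy is defined. If accuracy is exact-match (subset) accuracy, where a sample counts only when every label is predicted correctly, it will usually sit well below a micro-averaged F1, which could explain the gap above. The toy example below illustrates that effect under those assumptions; the label matrices are made up for illustration.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose full label vector is predicted exactly."""
    exact = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return exact / len(y_true)


def micro_f1(y_true, y_pred):
    """F1 computed from true/false positives pooled over all samples and labels."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Illustrative data only: 4 samples, 3 labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 0, 1]]

print(subset_accuracy(y_true, y_pred))     # 0.5
print(round(micro_f1(y_true, y_pred), 4))  # 0.8333
```

Two of the four samples have a single wrong label each, which halves subset accuracy while micro F1 stays above 0.83.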

## Limitations and Ethical Considerations

- Performance may vary with the nature of the input text and with how labels are distributed in it.
- The training data is class-imbalanced, so predictions for under-represented labels may be less reliable.
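
One way to inspect class imbalance is to compute the positive rate of each label. The sketch below also derives a negatives-to-positives ratio per label, which is a common choice for the `pos_weight` argument of `torch.nn.BCEWithLogitsLoss`. Nothing in this card indicates that such weighting was used; the function name and toy data are hypothetical.

```python
def label_balance(y):
    """Return (positive_rate, neg_to_pos_ratio) per label for a 0/1 label matrix."""
    n_samples = len(y)
    n_labels = len(y[0])
    rates, ratios = [], []
    for j in range(n_labels):
        pos = sum(row[j] for row in y)
        rates.append(pos / n_samples)
        # A label with no positives has no defined ratio, so report None.
        ratios.append((n_samples - pos) / pos if pos else None)
    return rates, ratios


# Illustrative data only: label 0 is common, label 1 is rare.
y = [[1, 0], [1, 0], [1, 1], [0, 0]]
rates, ratios = label_balance(y)
print(rates)  # [0.75, 0.25]
```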

## Dataset Information

- **Training Dataset**: 6,562 samples (about 70% of 9,375 total)
- **Validation Dataset**: 929 samples (about 10%)
- **Test Dataset**: 1,884 samples (about 20%)

## Training Details

- **Training Strategy**: Fine-tuning BERT with a randomly initialized classification head.
- **Optimizer**: Adam
- **Learning Rate**: 1e-4
- **Batch Size**: 16
- **Number of Epochs**: 2
- **Evaluation Strategy**: Epoch
- **Weight Decay**: 0.01
- **Metric for Best Model**: F1 Score
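
The settings above map fairly directly onto the `transformers` `Trainer` API. The following is a minimal sketch under several assumptions not stated in this card: that `Trainer` was used at all, that the batch size is per device, that checkpoints are saved each epoch so the best one can be reloaded, and that the metric is reported under the key `f1`. Note that `Trainer` defaults to AdamW; whether the "Adam" listed above refers to that is not stated. The `output_dir` value and `num_labels` argument are placeholders.

```python
import inspect

# Values taken from the list above.
HYPERPARAMS = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 16,  # assumes the batch size of 16 is per device
    "num_train_epochs": 2,
    "weight_decay": 0.01,
    "metric_for_best_model": "f1",  # assumes compute_metrics returns an "f1" key
}


def build_training_args(output_dir="finbert-multilabel"):
    """Build TrainingArguments; output_dir is a placeholder name."""
    from transformers import TrainingArguments

    # Newer transformers releases renamed evaluation_strategy to eval_strategy.
    params = inspect.signature(TrainingArguments.__init__).parameters
    eval_key = "eval_strategy" if "eval_strategy" in params else "evaluation_strategy"
    return TrainingArguments(
        output_dir=output_dir,
        save_strategy="epoch",        # assumption: must match the evaluation strategy
        load_best_model_at_end=True,  # assumption: implied by "Metric for Best Model"
        **{eval_key: "epoch"},
        **HYPERPARAMS,
    )


def build_model(num_labels):
    """Load ProsusAI/finbert with a fresh multi-label head; num_labels is not given in the card."""
    from transformers import AutoModelForSequenceClassification

    return AutoModelForSequenceClassification.from_pretrained(
        "ProsusAI/finbert",
        num_labels=num_labels,
        problem_type="multi_label_classification",  # selects a BCE-with-logits loss
        ignore_mismatched_sizes=True,  # discard the original 3-class head
    )
```

Setting `ignore_mismatched_sizes=True` is what yields the randomly initialized classification head, since the original checkpoint's sentiment head does not match the new label count.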