---
model-index:
- name: TAPP-multilabel-bge
  results: []
datasets:
- GIZ/policy_classification
co2_eq_emissions:
  emissions: 71.4552917731392
  source: codecarbon
  training_type: fine-tuning
  on_cloud: true
  cpu_model: Intel(R) Xeon(R) CPU @ 2.30GHz
  ram_total_size: 12.6747894287109
  hours_used: 1.36
  hardware_used: 1 x Tesla T4
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# TAPP-multilabel-bge

This model is a fine-tuned version of [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) on the [Policy-Classification](https://huggingface.co/datasets/GIZ/policy_classification) dataset.

*The loss function BCEWithLogitsLoss is modified with a pos_weight to emphasize recall; the evaluation metrics, rather than the loss, are therefore used to assess model performance during training.*

It achieves the following results on the evaluation set:

- Precision-micro: 0.7772
- Precision-samples: 0.7644
- Precision-weighted: 0.7756
- Recall-micro: 0.8329
- Recall-samples: 0.7920
- Recall-weighted: 0.8329
- F1-micro: 0.8041
- F1-samples: 0.7609
- F1-weighted: 0.8029

## Model description

The purpose of this model is to predict multiple labels simultaneously for a given input. Specifically, the model predicts four labels - ActionLabel, PlansLabel, PolicyLabel, and TargetLabel - that are relevant to a particular task or application:

- **Target**: Targets are an intention to achieve a specific result, for example, to reduce GHG emissions to a specific level (a GHG target) or to increase energy efficiency or renewable energy to a specific level (a non-GHG target), typically by a certain date.
- **Action**: Actions are an intention to implement specific means of achieving GHG reductions, usually in the form of concrete projects.
- **Policies**: Policies are domestic planning documents such as policies, regulations or guidelines.
- **Plans**: Plans are broader than specific policies or actions, such as a general intention to ‘improve efficiency’, ‘develop renewable energy’, etc.

*The terms come from the World Bank's NDC platform and WRI's publication.*
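As an illustration of how the four labels are decoded at inference time, here is a minimal sketch of multilabel decoding; the label order in `id2label`, the synthetic logits, and the 0.5 threshold are assumptions — verify the order against `model.config.id2label` on the actual checkpoint:

```python
import torch

# Assumed label order; verify against model.config.id2label on the checkpoint.
id2label = {0: "ActionLabel", 1: "PlansLabel", 2: "PolicyLabel", 3: "TargetLabel"}

# Synthetic logits standing in for model(**inputs).logits on one sentence.
logits = torch.tensor([[2.1, -0.7, -1.3, 1.5]])

# Multilabel decoding: an independent sigmoid + threshold per label,
# so any subset of the four labels can be predicted at once.
probs = torch.sigmoid(logits).squeeze(0)
predicted = [id2label[i] for i, p in enumerate(probs) if p.item() > 0.5]
print(predicted)  # ['ActionLabel', 'TargetLabel']
```

Because each label gets its own sigmoid, the model can assign zero, one, or several of the four labels to a single passage, unlike softmax single-label classification.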

## Intended uses & limitations

More information needed

## Training and evaluation data

- Training Dataset: 10031

| Class  | Positive Count of Class |
|:-------|:------------------------|
| Action | 5416 |
| Plans  | 2140 |
| Policy | 1396 |
| Target | 2911 |

- Validation Dataset: 932

| Class  | Positive Count of Class |
|:-------|:------------------------|
| Action | 513 |
| Plans  | 198 |
| Policy | 122 |
| Target | 256 |
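Class counts like those above are typically what the recall-oriented pos_weight for BCEWithLogitsLoss is derived from. The sketch below uses the negative/positive ratio, a common heuristic — the exact weights used for this model are not documented here:

```python
import torch
from torch import nn

# Per-class positive counts from the training split (Action, Plans, Policy, Target).
pos_counts = torch.tensor([5416.0, 2140.0, 1396.0, 2911.0])
n_train = 10031.0

# Heuristic (assumed, not the documented recipe): weight positives by the
# negative/positive ratio, so under-represented labels contribute more to
# the loss, pushing the model toward higher recall.
pos_weight = (n_train - pos_counts) / pos_counts

loss_fn = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

logits = torch.randn(8, 4)                     # raw model outputs for a batch of 8
targets = torch.randint(0, 2, (8, 4)).float()  # multi-hot ground-truth labels
loss = loss_fn(logits, targets)                # scalar training loss
```

With this heuristic, rare labels such as Policy (1396 positives) receive a much larger weight than frequent ones such as Action (5416 positives).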

## Training procedure

### Training results

| Training Loss | Epoch | Step | Validation Loss | Precision-micro | Precision-samples | Precision-weighted | Recall-micro | Recall-samples | Recall-weighted | F1-micro | F1-samples | F1-weighted |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| 0.0291 | 6.0 | 3762 | 0.8849 | 0.7773 | 0.7640 | 0.7776 | 0.8301 | 0.7890 | 0.8301 | 0.8028 | 0.7597 | 0.8027 |
| 0.0147 | 7.0 | 4389 | 0.9217 | 0.7772 | 0.7644 | 0.7756 | 0.8329 | 0.7920 | 0.8329 | 0.8041 | 0.7609 | 0.8029 |

| label  | precision | recall | f1-score | support |
|:------:|:---------:|:------:|:--------:|:-------:|
| Action | 0.826 | 0.883 | 0.853 | 513.0 |
| Plans  | 0.653 | 0.646 | 0.649 | 198.0 |
| Policy | 0.726 | 0.803 | 0.762 | 122.0 |
| Target | 0.791 | 0.890 | 0.838 | 256.0 |
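Per-label numbers like those in the table above can be computed with scikit-learn's multilabel metrics. A small sketch on illustrative arrays (not the actual validation data):

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

# Illustrative multi-hot arrays; columns are (Action, Plans, Policy, Target).
y_true = np.array([[1, 0, 0, 1],
                   [1, 1, 0, 0],
                   [0, 0, 1, 0],
                   [1, 0, 0, 1]])
y_pred = np.array([[1, 0, 0, 1],
                   [1, 0, 0, 0],
                   [0, 0, 1, 1],
                   [1, 1, 0, 1]])

# average=None returns one (precision, recall, f1, support) entry per label,
# matching the shape of the per-label table above.
precision, recall, f1, support = precision_recall_fscore_support(
    y_true, y_pred, average=None, zero_division=0
)
```

Passing `average="micro"`, `"samples"`, or `"weighted"` instead reproduces the aggregate metrics reported in the training-results table.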

### Framework versions

- Transformers 4.38.1
- Pytorch 2.1.0+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2