devdroide committed on
Commit
22d4bc1
1 Parent(s): e9aacc4

End of training

Files changed (4)
  1. README.md +14 -89
  2. config.json +34 -34
  3. model.safetensors +1 -1
  4. tokenizer.json +1 -1
README.md CHANGED
@@ -10,12 +10,6 @@ metrics:
10
  model-index:
11
  - name: bert-base-spanish-analysis-app-questions
12
  results: []
13
- license: mit
14
- datasets:
15
- - devdroide/MiFirma-Ejemplo
16
- language:
17
- - es
18
- pipeline_tag: text-classification
19
  ---
20
 
21
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -23,9 +17,9 @@ should probably proofread and complete it, then remove this comment. -->
23
 
24
  # bert-base-spanish-analysis-app-questions
25
 
26
- This model is a fine-tuned version of [dccuchile/bert-base-spanish-wwm-uncased](https://huggingface.co/dccuchile/bert-base-spanish-wwm-uncased) on an [devdroide/MiFirma-Ejemplo](https://huggingface.co/datasets/devdroide/MiFirma-Ejemplo) dataset.
27
  It achieves the following results on the evaluation set:
28
- - Loss: 0.0004
29
  - Accuracy: 1.0
30
  - F1: 1.0
31
  - Precision: 1.0
@@ -33,31 +27,17 @@ It achieves the following results on the evaluation set:
33
 
34
  ## Model description
35
 
36
- This model was fine-tuned for question classification in a fictitious app. List label from dataset:
37
-
38
- * informacion_aplicacion
39
- * Perfiles
40
- * Perfil_adminsitrador
41
- * Perfil_cliente
42
- * Procesos
43
- * Productos
44
- * Personas_Firmantes
45
- * Error_324
46
- * Error_339
47
- * Error_507
48
- * Error_532
49
- * Error_517
50
- * Error_517_06
51
- * Error_517_10
52
- * Error_517_45
53
- * Error_517_1120
54
- * Error_301
55
-
56
- ### num_labels: 17
57
 
58
  ## Training and evaluation data
59
 
60
- Set of frequently asked questions for an application. The set of questions consists of approximately 680 questions in Spanish. The set has the split of training, validation and testing.
 
 
61
 
62
  ### Training hyperparameters
63
 
@@ -74,16 +54,16 @@ The following hyperparameters were used during training:
74
 
75
  | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
76
  |:-------------:|:-----:|:----:|:---------------:|:--------:|:---:|:---------:|:------:|
77
- | No log | 1.0 | 22 | 0.0026 | 1.0 | 1.0 | 1.0 | 1.0 |
78
- | No log | 2.0 | 44 | 0.0014 | 1.0 | 1.0 | 1.0 | 1.0 |
79
  | No log | 3.0 | 66 | 0.0010 | 1.0 | 1.0 | 1.0 | 1.0 |
80
  | No log | 4.0 | 88 | 0.0008 | 1.0 | 1.0 | 1.0 | 1.0 |
81
- | No log | 5.0 | 110 | 0.0006 | 1.0 | 1.0 | 1.0 | 1.0 |
82
  | No log | 6.0 | 132 | 0.0006 | 1.0 | 1.0 | 1.0 | 1.0 |
83
  | No log | 7.0 | 154 | 0.0005 | 1.0 | 1.0 | 1.0 | 1.0 |
84
  | No log | 8.0 | 176 | 0.0005 | 1.0 | 1.0 | 1.0 | 1.0 |
85
  | No log | 9.0 | 198 | 0.0005 | 1.0 | 1.0 | 1.0 | 1.0 |
86
- | No log | 10.0 | 220 | 0.0004 | 1.0 | 1.0 | 1.0 | 1.0 |
87
 
88
 
89
  ### Framework versions
@@ -92,58 +72,3 @@ The following hyperparameters were used during training:
92
  - Pytorch 2.3.1+cu121
93
  - Datasets 2.20.0
94
  - Tokenizers 0.19.1
95
-
96
- ## Demo - Basic Usage
97
-
98
- ```python
99
- # Colab
100
-
101
- !pip install transformers
102
-
103
- name_model = "devdroide/bert-base-spanish-analysis-app-questions"
104
-
105
- from transformers import AutoTokenizer, AutoModelForSequenceClassification
106
- tokenizer = AutoTokenizer.from_pretrained(name_model)
107
- model = AutoModelForSequenceClassification.from_pretrained(name_model)
108
-
109
- def classify_question(question):
110
-     inputs = tokenizer(question, padding=True, truncation=True, return_tensors="pt")
111
-     outputs = model(**inputs)
112
-     predictions = outputs.logits.argmax(dim=-1)
113
-     list_label = ['informacion_aplicacion', 'Perfiles', 'Perfil_adminsitrador', 'Perfil_cliente', 'Procesos', 'Productos', 'Personas_Firmantes', 'Error_324', 'Error_339', 'Error_507', 'Error_532', 'Error_517', 'Error_517_06', 'Error_517_10', 'Error_517_45', 'Error_517_1120', 'Error_301']
114
-     return list_label[predictions.item()]
115
-
116
- questions = [
117
- "¿Qué es mi firma?",
118
- "Hola, Al cliente le salió en la aplicación el código de error 517:06 ¿Cuál es la recomendación?",
119
- "Buenas tardes ¿En la herramienta que perfiles hay?",
120
- "Buenos días, ¿Cuál es el listado de perfiles en la aplicación?",
121
- "Buenas tardes al cliente le salió el error 517 06 ¿Cuál es la recomendación",
122
- "Hola Tengo en la herramienta el código de error 517 ¿Cuál es la recomendación?",
123
- ]
124
-
125
- for question in questions:
126
-     category = classify_question(question)
127
-     print(f"Question: {question}")
128
-     print(f"Predicted category: {category}\n")
129
-
130
- # Response example
131
- # Question: ¿Qué es mi firma?
132
- # Predicted category: informacion_aplicacion
133
-
134
- # Question: Hola, Al cliente le salió en la aplicación el código de error 517:06 ¿Cuál es la recomendación?
135
- # Predicted category: Error_517_06
136
-
137
- # Question: Buenas tardes ¿En la herramienta que perfiles hay?
138
- # Predicted category: Perfiles
139
-
140
- # Question: Buenos días, ¿Cuál es el listado de perfiles en la aplicación?
141
- # Predicted category: Perfiles
142
-
143
- # Question: Buenas tardes al cliente le salió el error 517 06 ¿Cuál es la recomendación
144
- # Predicted category: Error_517_06
145
-
146
- # Question: Hola Tengo en la herramienta el código de error 517 ¿Cuál es la recomendación?
147
- # Predicted category: Error_517
148
-
149
- ```
 
10
  model-index:
11
  - name: bert-base-spanish-analysis-app-questions
12
  results: []
13
  ---
14
 
15
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 
17
 
18
  # bert-base-spanish-analysis-app-questions
19
 
20
+ This model is a fine-tuned version of [dccuchile/bert-base-spanish-wwm-uncased](https://huggingface.co/dccuchile/bert-base-spanish-wwm-uncased) on an unknown dataset.
21
  It achieves the following results on the evaluation set:
22
+ - Loss: 0.0005
23
  - Accuracy: 1.0
24
  - F1: 1.0
25
  - Precision: 1.0
 
27
 
28
  ## Model description
29
 
30
+ More information needed
31
+
32
+ ## Intended uses & limitations
33
+
34
+ More information needed
35
 
36
  ## Training and evaluation data
37
 
38
+ More information needed
39
+
40
+ ## Training procedure
41
 
42
  ### Training hyperparameters
43
 
 
54
 
55
  | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
56
  |:-------------:|:-----:|:----:|:---------------:|:--------:|:---:|:---------:|:------:|
57
+ | No log | 1.0 | 22 | 0.0028 | 1.0 | 1.0 | 1.0 | 1.0 |
58
+ | No log | 2.0 | 44 | 0.0015 | 1.0 | 1.0 | 1.0 | 1.0 |
59
  | No log | 3.0 | 66 | 0.0010 | 1.0 | 1.0 | 1.0 | 1.0 |
60
  | No log | 4.0 | 88 | 0.0008 | 1.0 | 1.0 | 1.0 | 1.0 |
61
+ | No log | 5.0 | 110 | 0.0007 | 1.0 | 1.0 | 1.0 | 1.0 |
62
  | No log | 6.0 | 132 | 0.0006 | 1.0 | 1.0 | 1.0 | 1.0 |
63
  | No log | 7.0 | 154 | 0.0005 | 1.0 | 1.0 | 1.0 | 1.0 |
64
  | No log | 8.0 | 176 | 0.0005 | 1.0 | 1.0 | 1.0 | 1.0 |
65
  | No log | 9.0 | 198 | 0.0005 | 1.0 | 1.0 | 1.0 | 1.0 |
66
+ | No log | 10.0 | 220 | 0.0005 | 1.0 | 1.0 | 1.0 | 1.0 |
67
 
68
 
69
  ### Framework versions
 
72
  - Pytorch 2.3.1+cu121
73
  - Datasets 2.20.0
74
  - Tokenizers 0.19.1
config.json CHANGED
@@ -10,44 +10,44 @@
10
  "hidden_dropout_prob": 0.1,
11
  "hidden_size": 768,
12
  "id2label": {
13
- "0": "LABEL_0",
14
- "1": "LABEL_1",
15
- "2": "LABEL_2",
16
- "3": "LABEL_3",
17
- "4": "LABEL_4",
18
- "5": "LABEL_5",
19
- "6": "LABEL_6",
20
- "7": "LABEL_7",
21
- "8": "LABEL_8",
22
- "9": "LABEL_9",
23
- "10": "LABEL_10",
24
- "11": "LABEL_11",
25
- "12": "LABEL_12",
26
- "13": "LABEL_13",
27
- "14": "LABEL_14",
28
- "15": "LABEL_15",
29
- "16": "LABEL_16"
30
  },
31
  "initializer_range": 0.02,
32
  "intermediate_size": 3072,
33
  "label2id": {
34
- "LABEL_0": 0,
35
- "LABEL_1": 1,
36
- "LABEL_10": 10,
37
- "LABEL_11": 11,
38
- "LABEL_12": 12,
39
- "LABEL_13": 13,
40
- "LABEL_14": 14,
41
- "LABEL_15": 15,
42
- "LABEL_16": 16,
43
- "LABEL_2": 2,
44
- "LABEL_3": 3,
45
- "LABEL_4": 4,
46
- "LABEL_5": 5,
47
- "LABEL_6": 6,
48
- "LABEL_7": 7,
49
- "LABEL_8": 8,
50
- "LABEL_9": 9
51
  },
52
  "layer_norm_eps": 1e-12,
53
  "max_position_embeddings": 512,
 
10
  "hidden_dropout_prob": 0.1,
11
  "hidden_size": 768,
12
  "id2label": {
13
+ "0": "informacion_aplicacion",
14
+ "1": "Perfiles",
15
+ "10": "Error_532",
16
+ "11": "Error_517",
17
+ "12": "Error_517_06",
18
+ "13": "Error_517_10",
19
+ "14": "Error_517_45",
20
+ "15": "Error_517_1120",
21
+ "16": "Error_301",
22
+ "2": "Perfil_adminsitrador",
23
+ "3": "Perfil_cliente",
24
+ "4": "Procesos",
25
+ "5": "Productos",
26
+ "6": "Personas_Firmantes",
27
+ "7": "Error_324",
28
+ "8": "Error_339",
29
+ "9": "Error_507"
30
  },
31
  "initializer_range": 0.02,
32
  "intermediate_size": 3072,
33
  "label2id": {
34
+ "Error_301": "16",
35
+ "Error_324": "7",
36
+ "Error_339": "8",
37
+ "Error_507": "9",
38
+ "Error_517": "11",
39
+ "Error_517_06": "12",
40
+ "Error_517_10": "13",
41
+ "Error_517_1120": "15",
42
+ "Error_517_45": "14",
43
+ "Error_532": "10",
44
+ "Perfil_adminsitrador": "2",
45
+ "Perfil_cliente": "3",
46
+ "Perfiles": "1",
47
+ "Personas_Firmantes": "6",
48
+ "Procesos": "4",
49
+ "Productos": "5",
50
+ "informacion_aplicacion": "0"
51
  },
52
  "layer_norm_eps": 1e-12,
53
  "max_position_embeddings": 512,
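
The config.json change above replaces the generic `LABEL_n` names with the 17 category names. In the new version the `id2label` keys are sorted as strings ("1", "10", "11", ...) and the `label2id` values are JSON strings, where the old version used integers. The following is a minimal sketch of how a consumer might read such maps and normalize the ids to integers. The excerpt holds only three of the labels from the diff, and the helper names `load_label_maps` and `maps_are_consistent` are hypothetical, not part of any library.

```python
import json

# Excerpt of the new config.json label maps (three of the 17 labels in the diff).
CONFIG_EXCERPT = """
{
  "id2label": {"0": "informacion_aplicacion", "1": "Perfiles", "12": "Error_517_06"},
  "label2id": {"Error_517_06": "12", "Perfiles": "1", "informacion_aplicacion": "0"}
}
"""


def load_label_maps(config_text):
    """Return (id2label, label2id) with integer ids, whatever type the JSON used."""
    config = json.loads(config_text)
    id2label = {int(idx): label for idx, label in config["id2label"].items()}
    label2id = {label: int(idx) for label, idx in config["label2id"].items()}
    return id2label, label2id


def maps_are_consistent(id2label, label2id):
    """True when the two mappings are exact inverses of each other."""
    return {label: idx for idx, label in id2label.items()} == label2id


id2label, label2id = load_label_maps(CONFIG_EXCERPT)
print(id2label[12])                             # label name for class index 12
print(maps_are_consistent(id2label, label2id))  # inverse check after normalization
```

With the names stored in the config, a caller no longer needs a hard-coded label list like the one in the removed README demo and can look up the predicted index in `id2label` instead.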
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:d7cb23705e90854dbcef51e3ef2d907cd492e2a4de419dcd8f560b4c553c60df
3
  size 439479348
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:db85e3b5162c5def82c89449f435e72ad0011c863854b33116f1ff26e42c83fb
3
  size 439479348
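
The model.safetensors entry above is a Git LFS pointer, not the weights themselves: three `key value` lines giving the spec version, the content hash, and the byte size. As a rough illustration, the sketch below parses the two pointers from this diff and compares them. The function name `parse_lfs_pointer` is hypothetical.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the old and new sides of the diff.
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:d7cb23705e90854dbcef51e3ef2d907cd492e2a4de419dcd8f560b4c553c60df
size 439479348
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:db85e3b5162c5def82c89449f435e72ad0011c863854b33116f1ff26e42c83fb
size 439479348
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)
print(old["size"] == new["size"])      # same byte size on both sides
print(old["digest"] != new["digest"])  # different content hash
```

An unchanged size with a changed digest is what one would expect when the same architecture is retrained: the tensor shapes stay fixed while the weight values differ.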
tokenizer.json CHANGED
@@ -2,7 +2,7 @@
2
  "version": "1.0",
3
  "truncation": {
4
  "direction": "Right",
5
- "max_length": 34,
6
  "strategy": "LongestFirst",
7
  "stride": 0
8
  },
 
2
  "version": "1.0",
3
  "truncation": {
4
  "direction": "Right",
5
+ "max_length": 24,
6
  "strategy": "LongestFirst",
7
  "stride": 0
8
  },
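
The tokenizer.json change lowers the saved truncation `max_length` from 34 to 24 while keeping `"direction": "Right"`. The sketch below is a simplified illustration of what right-side truncation means for a single sequence. It deliberately ignores special tokens, `stride`, and sequence pairs, and `truncate_right` is a hypothetical helper rather than the tokenizer library's own implementation.

```python
def truncate_right(token_ids, max_length):
    """Keep at most max_length ids, dropping any excess from the right end."""
    return token_ids[:max_length]


# A made-up 30-token input, longer than the new limit but shorter than the old one.
token_ids = list(range(30))

old_result = truncate_right(token_ids, 34)  # old setting: nothing is dropped
new_result = truncate_right(token_ids, 24)  # new setting: the last 6 ids are dropped

print(len(old_result), len(new_result))
```

Under this simplification, an input between 25 and 34 tokens long would pass through whole with the old setting and lose its tail with the new one.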