smarsol committed on
Commit
995c906
1 Parent(s): c6357e3

Update README.md

Files changed (1)
  1. README.md +89 -3
README.md CHANGED
@@ -1,3 +1,89 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - es
5
+ pipeline_tag: text-classification
6
+ tags:
7
+ - sentence-transformers
8
+ - text-classification
9
+ - bert
10
+ - biomedical
11
+ - lexical semantics
12
+ - bionlp
13
+ ---
14
+
15
+ # Biomedical term classifier with SetFit in Spanish
16
+
17
+ ## Table of contents
18
+ <details>
19
+ <summary>Click to expand</summary>
20
+
21
+ - [Model description](#model-description)
22
+ - [Intended uses and limitations](#intended-uses-and-limitations)
23
+ - [How to use](#how-to-use)
24
+ - [Training](#training)
25
+ - [Evaluation](#evaluation)
26
+ - [Additional information](#additional-information)
27
+ - [Author](#author)
28
+ - [Licensing information](#licensing-information)
29
+ - [Citation information](#citation-information)
30
+ - [Disclaimer](#disclaimer)
31
+
32
+ </details>
33
+
34
+ ## Model description
35
+ This is a Transformers [AutoModelForSequenceClassification](https://huggingface.co/docs/transformers/model_doc/auto#transformers.AutoModelForSequenceClassification) model trained for multilabel biomedical text classification in Spanish.
36
+
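As a rough illustration of what "multilabel" means for the description above, the sketch below turns one raw score (logit) per class into a set of predicted labels by applying a sigmoid and a threshold independently to each class. The label names, the logits, and the 0.5 threshold are assumptions chosen for illustration; they are not taken from the released model or from KeyCARE.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1), avoiding overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def decode_multilabel(logits, labels, threshold=0.5):
    """Return every label whose sigmoid probability reaches the threshold."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    return [
        label
        for logit, label in zip(logits, labels)
        if sigmoid(logit) >= threshold
    ]


# Hypothetical subset of classes and made-up logits, for illustration only.
LABELS = ["disease", "procedure", "symptom", "drug"]
example_logits = [2.1, -1.3, 0.4, -3.0]

predicted = decode_multilabel(example_logits, LABELS)
print(predicted)
```

Because each class is scored independently, a single term can receive several labels or none at all, unlike single-label classification where a softmax selects exactly one class.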
37
+ ## Intended uses and limitations
38
+ The model classifies medical entities into 21 classes, including diseases, medical procedures, symptoms, and drugs, among others. It does not yet cover some classes, such as body structures.
39
+
40
+ ## How to use
41
+ This model is implemented as part of the KeyCARE library. First install the keycare module so that you can call the SetFit classifier:
42
+
43
+ ```bash
44
+ python -m pip install keycare
45
+ ```
46
+
47
+ You can then run the KeyCARE pipeline that uses the SetFit model:
48
+
49
+ ```python
50
+ from keycare.TermExtractor import TermExtractor
51
+
52
+ # Initialize the TermExtractor object
53
+ termextractor = TermExtractor(categorization_method='transformers')
54
+ # Run the pipeline
55
+ text = """Acude al Servicio de Urgencias por cefalea frontoparietal derecha.
56
+ Mediante biopsia se diagnostica adenocarcinoma de próstata Gleason 4+4=8 con metástasis óseas múltiples.
57
+ Se trata con Ácido Zoledrónico 4 mg iv/4 semanas.
58
+ """
59
+ termextractor(text)
60
+ # You can also access the class storing the SetFit model
61
+ categorizer = termextractor.categorizer
62
+ ```
63
+
64
+ ## Training
65
+ The model was trained on data drawn from NER gold standard corpora created by BSC-NLP4BIA, including [MedProcNER](https://temu.bsc.es/medprocner/), [DISTEMIST](https://temu.bsc.es/distemist/), [SympTEMIST](https://temu.bsc.es/symptemist/), [CANTEMIST](https://temu.bsc.es/cantemist/), and [PharmaCoNER](https://temu.bsc.es/pharmaconer/), among others.
66
+
67
+ ## Evaluation
68
+ To be published
69
+
70
+ ## Additional information
71
+
72
+ ### Author
73
+ NLP4BIA at the Barcelona Supercomputing Center
74
+
75
+ ### Licensing information
76
+ [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
77
+
78
+ ### Citation information
79
+ To be published
80
+
81
+ ### Disclaimer
82
+ <details>
83
+ <summary>Click to expand</summary>
84
+
85
+ The models published in this repository are intended for general-purpose use and are available to third parties. These models may contain biases and/or other undesirable distortions.
86
+
87
+ When third parties deploy or provide systems and/or services to other parties using any of these models (or using systems based on these models), or become users of the models, they should note that it is their responsibility to mitigate the risks arising from their use and, in any event, to comply with applicable regulations, including regulations regarding the use of Artificial Intelligence.
88
+
89
+ </details>