Model Card for medical-ner-koelectra
Model Summary
This model is a fine-tuned version of monologg/koelectra-base-v3-discriminator for Korean named entity recognition (NER).
It was fine-tuned on the Korean Bio-Medical Corpus (KBMC) and the Naver X Changwon Univ NER dataset.
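As a rough usage sketch, the snippet below assumes the checkpoint is published as a standard Hugging Face token-classification model and that its entity groups match the class names in the class-wise table (for example Body, Disease, Treatment). Neither point is confirmed by this card. `load_ner` and `keep_medical` are hypothetical helper names, and the repository ID is left for the caller to supply.

```python
from typing import Dict, Iterable, List

# Assumption: the medical classes are the three KBMC-style names
# that appear in the class-wise performance table.
MEDICAL_GROUPS = ("Body", "Disease", "Treatment")


def load_ner(model_id: str):
    """Build a token-classification pipeline for the given checkpoint ID."""
    # Imported lazily so keep_medical() can be used without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" groups adjacent tokens into entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def keep_medical(entities: Iterable[Dict], groups=MEDICAL_GROUPS) -> List[Dict]:
    """Keep only the spans whose entity group is one of the medical classes."""
    return [e for e in entities if e.get("entity_group") in groups]
```

With the model's repository ID in hand, `keep_medical(load_ner(model_id)(text))` would return only the medical spans. Whether spans are grouped cleanly depends on the label scheme stored in the checkpoint's configuration.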
Model Details
Model Description
- Developed by: Sungjoo Byun (Grace Byun)
- Language(s) (NLP): Korean
- License: Apache 2.0
- Finetuned from model: monologg/koelectra-base-v3-discriminator
Training Data
The model was trained on the Naver X Changwon Univ NER dataset and the Korean Bio-Medical Corpus (KBMC).
Model Performance
Overall Metrics
- F1 Score: 0.8886
- Loss: 0.2949
- Precision: 0.8844
- Recall: 0.8928
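The reported F1 score is consistent with the usual definition of F1 as the harmonic mean of precision and recall. The short check below is a sketch under that assumption; the card does not state how the metrics were computed, and `f1_score` is an illustrative helper, not part of any library used by the authors.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision and recall as reported above.
overall_f1 = f1_score(0.8844, 0.8928)
print(round(overall_f1, 4))  # 0.8886
```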
Class-wise Performance
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
AFW | 0.6676 | 0.6326 | 0.6496 | 362 |
ANM | 0.7476 | 0.7800 | 0.7635 | 600 |
Body | 0.9731 | 0.9813 | 0.9772 | 1068 |
CVL | 0.8492 | 0.8579 | 0.8536 | 4977 |
DAT | 0.9078 | 0.9286 | 0.9181 | 2130 |
Disease | 0.9738 | 0.9872 | 0.9805 | 2109 |
EVT | 0.7332 | 0.7446 | 0.7389 | 1026 |
FLD | 0.6138 | 0.6170 | 0.6154 | 188 |
LOC | 0.8721 | 0.8691 | 0.8706 | 1734 |
MAT | 0.5385 | 0.5000 | 0.5185 | 14 |
NUM | 0.9227 | 0.9305 | 0.9266 | 4660 |
ORG | 0.8917 | 0.8866 | 0.8892 | 3307 |
PER | 0.8918 | 0.9049 | 0.8983 | 3626 |
PLT | 0.2941 | 0.2174 | 0.2500 | 23 |
TIM | 0.8644 | 0.9173 | 0.8901 | 278 |
Treatment | 0.9468 | 0.9852 | 0.9656 | 271 |
Averages
Metric | Micro Avg | Macro Avg | Weighted Avg |
---|---|---|---|
Precision | 0.8844 | 0.7930 | 0.8841 |
Recall | 0.8928 | 0.7963 | 0.8928 |
F1-Score | 0.8886 | 0.7941 | 0.8884 |
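The macro and weighted F1 averages can be reproduced from the class-wise table, assuming the standard definitions: macro is the unweighted mean over classes, and weighted is the mean weighted by each class's support. The sketch below copies the F1 and support columns from the table; the function names are illustrative. The micro average cannot be recomputed this way, because it needs per-class true-positive counts that the card does not report.

```python
# (class, F1, support) rows copied from the class-wise table.
ROWS = [
    ("AFW", 0.6496, 362), ("ANM", 0.7635, 600), ("Body", 0.9772, 1068),
    ("CVL", 0.8536, 4977), ("DAT", 0.9181, 2130), ("Disease", 0.9805, 2109),
    ("EVT", 0.7389, 1026), ("FLD", 0.6154, 188), ("LOC", 0.8706, 1734),
    ("MAT", 0.5185, 14), ("NUM", 0.9266, 4660), ("ORG", 0.8892, 3307),
    ("PER", 0.8983, 3626), ("PLT", 0.2500, 23), ("TIM", 0.8901, 278),
    ("Treatment", 0.9656, 271),
]


def macro_f1(rows) -> float:
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for _, f1, _ in rows) / len(rows)


def weighted_f1(rows) -> float:
    """Mean of the per-class F1 scores, weighted by class support."""
    total_support = sum(support for _, _, support in rows)
    return sum(f1 * support for _, f1, support in rows) / total_support


print(round(macro_f1(ROWS), 4))     # 0.7941
print(round(weighted_f1(ROWS), 4))  # 0.8884
```

The gap between the macro and weighted values reflects the low-support classes (MAT, PLT, FLD, AFW), which score well below the large classes but contribute little to the weighted average.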
Citations
Please cite our KBMC paper:
@misc{byun2024korean,
title={Korean Bio-Medical Corpus (KBMC) for Medical Named Entity Recognition},
author={Sungjoo Byun and Jiseung Hong and Sumin Park and Dongjun Jang and Jean Seo and Minseok Kim and Chaeyoung Oh and Hyopil Shin},
year={2024},
eprint={2403.16158},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Model Card Contact
For any questions or issues, please contact [email protected].