CardioBERTpt - Portuguese Transformer-based Models for Clinical Language Representation in Cardiology

This model card describes CardioBERTpt, a clinical language model for the cardiology domain in Portuguese, intended for downstream NER tasks. The model is a fine-tuned version of bert-base-multilingual-cased on a cardiology text dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4495
  • Accuracy: 0.8864

How to use the model

Load the model via the transformers library:

from transformers import AutoTokenizer, AutoModel

# Download the tokenizer and encoder weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("pucpr-br/cardiobertpt")
model = AutoModel.from_pretrained("pucpr-br/cardiobertpt")
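Continuing from the snippet above, the sketch below shows one way to obtain contextual embeddings with the loaded encoder (AutoModel exposes the encoder without a task head). The example sentence is illustrative only and not taken from the training data.

import torch

# Illustrative Portuguese clinical sentence (assumption, for demonstration only)
text = "Paciente com insuficiência cardíaca congestiva."

# Tokenize and run a forward pass without gradient tracking
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Contextual embeddings per token: shape (batch, sequence_length, hidden_size)
print(outputs.last_hidden_state.shape)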

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15.0
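These values map directly onto the transformers TrainingArguments API. The sketch below is a hypothetical reconstruction, not the authors' training script: the output_dir is a placeholder, and the surrounding Trainer setup (model, datasets, metrics) is not specified in this card.

from transformers import TrainingArguments

# Hypothetical reconstruction of the reported hyperparameters
training_args = TrainingArguments(
    output_dir="./cardiobertpt-finetune",  # placeholder path (assumption)
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=15.0,
)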

Framework versions

  • Transformers 4.17.0.dev0
  • Pytorch 1.8.0
  • Datasets 1.18.3
  • Tokenizers 0.11.0

More Information

Refer to the original paper, "CardioBERTpt: Transformer-based Models for Cardiology Language Representation in Portuguese" (CBMS 2023), for additional details and performance on Portuguese clinical NER tasks.

Acknowledgements

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001, and by Foxconn Brazil and Zerbini Foundation as part of the research project Machine Learning in Cardiovascular Medicine.

Citation

@INPROCEEDINGS{10178779,
  author={Schneider, Elisa Terumi Rubel and Gumiel, Yohan Bonescki and de Souza, João Vitor Andrioli and Mie Mukai, Lilian and Emanuel Silva e Oliveira, Lucas and de Sa Rebelo, Marina and Antonio Gutierrez, Marco and Eduardo Krieger, Jose and Teodoro, Douglas and Moro, Claudia and Paraiso, Emerson Cabrera},
  booktitle={2023 IEEE 36th International Symposium on Computer-Based Medical Systems (CBMS)}, 
  title={CardioBERTpt: Transformer-based Models for Cardiology Language Representation in Portuguese}, 
  year={2023},
  volume={},
  number={},
  pages={378-381},
  doi={10.1109/CBMS58004.2023.00247}
}

Questions?

Post a GitHub issue on the CardioBERTpt repo.
