ernie-health-zh

Introduction

ERNIE-health is a Chinese biomedical language model pre-trained on in-domain text: de-identified online doctor-patient dialogues, electronic medical records, and textbooks.

More details:
https://github.com/PaddlePaddle/Research/tree/master/KG/eHealth
https://github.com/PaddlePaddle/PaddleNLP/tree/develop/model_zoo/ernie-health
https://arxiv.org/pdf/2110.07244.pdf

Released Model Info

Model Name	Language	Model Structure
ernie-health-zh	Chinese	Layer:12, Hidden:768, Heads:12

This released PyTorch model is converted from the officially released PaddlePaddle ERNIE model. A series of experiments was conducted to check the accuracy of the conversion.

Official PaddlePaddle ERNIE repo: https://github.com/PaddlePaddle/Research/tree/master/KG/eHealth
PyTorch conversion repo: https://github.com/nghuyong/ERNIE-Pytorch

How to use

from transformers import AutoTokenizer, AutoModel

# Download (or load from the local cache) the tokenizer and the converted PyTorch weights.
tokenizer = AutoTokenizer.from_pretrained("nghuyong/ernie-health-zh")
model = AutoModel.from_pretrained("nghuyong/ernie-health-zh")
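
Once the tokenizer and model are loaded, a common next step is to turn sentences into fixed-size vectors. The sketch below is one possible way to do that, not part of the official release: it runs the encoder and averages the last-layer token vectors while skipping padding. The helper names `masked_mean` and `embed` are hypothetical, and mean pooling is an assumption chosen for illustration; the original paper and repos do not prescribe a pooling strategy here.

```python
from typing import List, Sequence


def masked_mean(token_vectors: Sequence[Sequence[float]], mask: Sequence[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, m in zip(token_vectors, mask) if m]
    if not kept:
        raise ValueError("mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def embed(texts: List[str], model_name: str = "nghuyong/ernie-health-zh") -> List[List[float]]:
    """Return one mean-pooled vector per input text (hypothetical helper)."""
    # Imported lazily so masked_mean can be used without torch/transformers installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()  # disable dropout so repeated calls give the same output

    # Pad to the longest text in the batch and return PyTorch tensors.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        hidden = model(**batch).last_hidden_state  # (batch, seq_len, hidden)

    # Pool each sequence separately, using its attention mask to ignore padding.
    return [
        masked_mean(h.tolist(), m.tolist())
        for h, m in zip(hidden, batch["attention_mask"])
    ]
```

Calling `embed(["患者主诉头痛三天。"])` (a made-up example sentence) should return a list with one vector whose length matches the hidden size listed above (768). The first call downloads the weights, so it needs network access and the `torch` and `transformers` packages.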

Citation

@article{wang2021building,
  title={Building Chinese Biomedical Language Models via Multi-Level Text Discrimination},
  author={Wang, Quan and Dai, Songtai and Xu, Benfeng and Lyu, Yajuan and Zhu, Yong and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2110.07244},
  year={2021}
}