ClinicalNoteBERT

Using openly available clinical notes, we pretrain ClinicalNoteBERT, a series of encoders in three sizes (110M, 67M, and 14.5M parameters) that account for note contexts and variations during pretraining. We evaluate ClinicalNoteBERT on a range of downstream applications, including fine-tuning tasks, unsupervised semantic textual similarity, retrieval-augmented generation with LLMs, and unimodal and multimodal clinical prediction, and compare against strong baselines. Our models outperform baseline models of similar or larger size on various tasks and datasets. We find that different pretraining choices lead to different improvements on downstream tasks. The small and tiny versions of ClinicalNoteBERT retain over 96% and 91% of the best performance with less than 61% and 14% of the parameters, respectively.
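
The parameter fractions quoted above can be checked directly from the stated model sizes. The following is a minimal sketch; the dictionary and function names are illustrative and not part of any released code, and the base model (110M) is assumed to be the reference size.

```python
# Parameter counts in millions, as stated in this model card.
PARAMS_M = {"base": 110.0, "small": 67.0, "tiny": 14.5}


def param_fraction(variant: str, reference: str = "base") -> float:
    """Return the size of `variant` as a fraction of the `reference` size."""
    return PARAMS_M[variant] / PARAMS_M[reference]


if __name__ == "__main__":
    for name in ("small", "tiny"):
        # Prints 60.9% for small and 13.2% for tiny.
        print(f"{name}: {param_fraction(name):.1%} of base parameters")
```

Both values fall under the 61% and 14% bounds given in the text.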

Overall performance

| Model | # Params | FT | STS | RAG | CP | Fusion |
|---|---|---|---|---|---|---|
| ClinicalNoteBERT-note-only | 110M | 80.0 | 78.9 | 14.0 | 63.8 | 66.5 |
| ClinicalNoteBERT-note-ntp | 110M | 80.6 | 73.6 | 13.0 | 62.9 | 65.8 |
| ClinicalNoteBERT-base | 110M | 80.1 | 79.8 | 12.3 | 64.0 | 66.7 |
| ClinicalNoteBERT-small | 67M | 78.1 | 77.1 | 11.4 | 64.6 | 66.8 |
| ClinicalNoteBERT-tiny | 14.5M | 74.1 | 75.7 | 8.9 | 62.4 | 65.5 |

FT: fine-tuning. STS: semantic textual similarity (ClinicalSTS). RAG: retrieval-augmented generation (GPT2, Llama2). CP: clinical prediction. Fusion: multimodal fusion for clinical prediction.
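
To see which variant leads on each task group, the scores in the table above can be compared column by column. This is a small sketch that assumes higher is better in every column, which the card does not state explicitly; `SCORES`, `COLUMNS`, and `best_per_column` are illustrative names.

```python
# Scores copied from the overall performance table in this model card.
COLUMNS = ("FT", "STS", "RAG", "CP", "Fusion")
SCORES = {
    "ClinicalNoteBERT-note-only": (80.0, 78.9, 14.0, 63.8, 66.5),
    "ClinicalNoteBERT-note-ntp": (80.6, 73.6, 13.0, 62.9, 65.8),
    "ClinicalNoteBERT-base": (80.1, 79.8, 12.3, 64.0, 66.7),
    "ClinicalNoteBERT-small": (78.1, 77.1, 11.4, 64.6, 66.8),
    "ClinicalNoteBERT-tiny": (74.1, 75.7, 8.9, 62.4, 65.5),
}


def best_per_column(scores=SCORES, columns=COLUMNS):
    """Map each column to the (model, score) pair with the highest score.

    Assumes higher is better for every column.
    """
    best = {}
    for i, col in enumerate(columns):
        model = max(scores, key=lambda m: scores[m][i])
        best[col] = (model, scores[model][i])
    return best


if __name__ == "__main__":
    for col, (model, score) in best_per_column().items():
        print(f"{col}: {model} ({score})")
```

Under that assumption no single variant leads everywhere: note-ntp is highest on FT, base on STS, note-only on RAG, and small on CP and Fusion.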

When encoding text sequences for STS, RAG, and CP/Fusion, ClinicalNoteBERT models are adapted through additional unsupervised SimCSE training on sequences of varied lengths and types. The sequence-sentence, sequence-segment, and sequence-note variants are used for STS, RAG, and CP/Fusion, respectively, according to the sequence lengths involved in each. More details can be found in the paper.
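
For readers unfamiliar with unsupervised SimCSE, the sketch below illustrates the contrastive objective it is generally understood to use: each text is encoded twice with different dropout masks, the two embeddings of the same text form a positive pair, and the other texts in the batch act as negatives. This is a pure-Python illustration on toy vectors, not the authors' training code. The function names are hypothetical, and the temperature of 0.05 is the default from the original SimCSE work, not a value confirmed by this card.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def simcse_loss(view_a: Sequence[Vector], view_b: Sequence[Vector],
                temperature: float = 0.05) -> float:
    """Mean in-batch contrastive loss.

    view_a[i] and view_b[i] are two embeddings of the same text (in real
    training, two encoder passes with dropout active). Every view_b[j]
    with j != i serves as a negative for view_a[i].
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Cross-entropy with the matching index i as the target.
        total += log_denominator - logits[i]
    return total / n


if __name__ == "__main__":
    a = [[1.0, 0.0], [0.0, 1.0]]
    aligned = [[0.9, 0.1], [0.1, 0.9]]
    swapped = [[0.1, 0.9], [0.9, 0.1]]
    print(simcse_loss(a, aligned), simcse_loss(a, swapped))
```

The loss is small when matching pairs are the most similar in the batch and large when they are not. Under this reading, the sequence-sentence, sequence-segment, and sequence-note variants would differ in the text units fed to the encoder rather than in the objective itself, though the paper is the authority on those details.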

Citation

Under review
