license: mit
base_model: facebook/bart-large-cnn
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: conversation-summ
results: []
datasets:
- har1/MTS_Dialogue-Clinical_Note
language:
- en
HealthScribe (A Clinical Note Generator)
This model is a fine-tuned version of facebook/bart-large-cnn on a modified version of the MTS-Dialog dataset.
Model description
The model was developed for the HealthScribe project and is integrated into a Flask web application. The application lets users generate clinical notes from ASR (Automatic Speech Recognition) transcripts of conversations between doctors and patients.
Test data sample for inference (more given in test.txt)
Refer to test.txt for further examples of conversations.
"Doctor: Hi there, I love that dress, very pretty!
Patient: Thank you for complementing a seventy-two-year-old patient.
Doctor: No, I mean it, seriously. Okay, so you were admitted here in May two thousand nine. You have a history of hypertension, and on June eighteenth two thousand nine you had bad abdominal pain diarrhea and cramps.
Patient: Yes, they told me I might have C Diff? They did a CT of my abdomen and that is when they thought I got the infection.
Doctor: Yes, it showed evidence of diffuse colitis, so I believe they gave you IV antibiotics?
Patient: Yes they did.
Doctor: Yeah I see here, Flagyl and Levaquin. They started IV Reglan as well for your vomiting.
Patient: Yes, I was very nauseous. Vomited as well.
Doctor: After all this I still see your white blood cells high. Are you still nauseous?
Patient: No, I do not have any nausea or vomiting, but still have diarrhea. Due to all that diarrhea I feel very weak.
Doctor: Okay. Anything else any other symptoms?
Patient: Actually no. Everything's well.
Doctor: Great.
Patient: Yeah."
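A conversation like the one above can be passed to the model through the Transformers summarization pipeline. The following is a minimal sketch under assumptions: `MODEL_ID` is a placeholder for wherever the fine-tuned checkpoint is stored (this card does not state the repository path), and the `format_dialogue` helper and the generation settings are illustrative, not the project's actual Flask code.

```python
from typing import List, Tuple

# Placeholder: replace with the Hub repository or local directory that holds
# the fine-tuned checkpoint. The real path is not stated in this card.
MODEL_ID = "path/to/healthscribe-checkpoint"


def format_dialogue(turns: List[Tuple[str, str]]) -> str:
    """Join (speaker, utterance) pairs into the 'Speaker: text' layout of the sample."""
    return "\n".join(f"{speaker}: {text.strip()}" for speaker, text in turns)


def generate_note(dialogue: str, model_id: str = MODEL_ID) -> str:
    """Summarize a dialogue into a clinical note (loads the model on every call)."""
    # Imported lazily so format_dialogue works without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    # max_length / min_length are illustrative assumptions, not tuned values.
    result = summarizer(dialogue, max_length=128, min_length=10, truncation=True)
    return result[0]["summary_text"]


turns = [
    ("Doctor", "Are you still nauseous?"),
    ("Patient", "No, I do not have any nausea or vomiting, but still have diarrhea."),
]
dialogue = format_dialogue(turns)
# note = generate_note(dialogue)  # needs transformers and access to the checkpoint
```

In a long-running service it would be more sensible to build the pipeline once at startup and reuse it, rather than reloading the model per request as this sketch does.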
Intended uses & limitations
The model generates clinical notes from ASR transcripts of doctor-patient conversations. It has the following limitations:
- It rarely generates an "N/A" output; sometimes "None" is produced instead.
- When the input is very short (only a few tokens) or very long, the model starts to hallucinate.
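Because very short and very long inputs are where hallucination is reported, a caller may want to check the input length before generating. The following is a minimal sketch under assumptions: `MIN_TOKENS` is an arbitrary illustrative threshold, `MAX_TOKENS` reflects the 1024-position limit of facebook/bart-large-cnn, and whitespace splitting is only a rough stand-in for the model's subword tokenizer, which usually yields more tokens.

```python
from typing import Callable, List

MIN_TOKENS = 20    # assumption: illustrative lower bound, not taken from this card
MAX_TOKENS = 1024  # positional limit of facebook/bart-large-cnn


def whitespace_tokenize(text: str) -> List[str]:
    """Rough proxy for tokenization; a subword tokenizer yields more tokens."""
    return text.split()


def check_input_length(
    text: str,
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
    min_tokens: int = MIN_TOKENS,
    max_tokens: int = MAX_TOKENS,
) -> str:
    """Return 'too_short', 'too_long', or 'ok' for the given input text."""
    n_tokens = len(tokenize(text))
    if n_tokens < min_tokens:
        return "too_short"
    if n_tokens > max_tokens:
        return "too_long"
    return "ok"
```

For a closer estimate, the `tokenize` argument can be replaced with the `tokenize` method of the tokenizer loaded for the checkpoint.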
Training Metrics
Training and evaluation data
The model achieves the following results on the evaluation set:
- Loss: 0.1562
- Rouge1: 54.3238
- Rouge2: 34.2678
- Rougel: 46.5847
- Rougelsum: 51.2214
- Generation Length: 77.04
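For readers unfamiliar with the metric, ROUGE-1 measures unigram overlap between a generated note and a reference note. The sketch below is a simplified illustration, not the scoring code behind the numbers above: a standard ROUGE implementation applies its own tokenization and optional stemming, and the scores above appear to be reported on a 0 to 100 scale. The function name and example strings are hypothetical.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between two strings, using lowercase whitespace tokens."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1(
    "patient reports diarrhea and weakness",
    "patient has diarrhea and feels weak",
)
print(round(100 * score, 2))
```

Here three of the five predicted tokens appear in the six-token reference, so precision is 3/5 and recall is 3/6.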
Training procedure
The model was trained on 1201 training samples and 100 validation samples of the modified MTS-Dialog dataset.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 2
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP
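The relationship between these values can be checked directly: the total train batch size is the per-device batch size times the gradient accumulation steps, and dividing the 1201 training samples by it gives about 600 optimizer steps per epoch, in line with the step counts in the results table. The sketch below also shows how the listed values might map onto `Seq2SeqTrainingArguments`. It assumes single-device training; the `output_dir`, `evaluation_strategy`, and `predict_with_generate` values are assumptions, since the original training script is not part of this card.

```python
TRAIN_SAMPLES = 1201  # training set size stated in this card

training_kwargs = {
    "output_dir": "conversation-summ",  # assumption: any writable directory works
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3,
    "fp16": True,                       # native AMP mixed precision; needs a CUDA GPU
    "evaluation_strategy": "epoch",     # assumption: evaluate once per epoch
    "predict_with_generate": True,      # assumption: needed to compute ROUGE in eval
}
# Adam betas=(0.9, 0.999) and epsilon=1e-08 are the Trainer defaults,
# so they need no explicit arguments.

# Effective batch size on a single device (assumption).
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
steps_per_epoch = TRAIN_SAMPLES // effective_batch_size
total_steps = steps_per_epoch * training_kwargs["num_train_epochs"]


def build_training_args():
    """Create the arguments object; imported lazily so the arithmetic runs anywhere."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**training_kwargs)
```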
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
0.4426 | 1.0 | 600 | 0.1588 | 52.8864 | 33.253 | 44.9089 | 50.5072 | 69.38 |
0.1137 | 2.0 | 1201 | 0.1517 | 56.8499 | 35.309 | 48.2171 | 53.6983 | 72.74 |
0.0796 | 3.0 | 1800 | 0.1562 | 54.3238 | 34.2678 | 46.5847 | 51.2214 | 77.04 |
Framework versions
- Transformers 4.39.2
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2