vit-pneumonia-x-ray_3_class
The model is a ViT model, pretrained on ImageNet-21k, fine-tuned for chest X-ray classification.
Model description
The model outputs a probability distribution over 3 classes: Normal, Bacterial Pneumonia, and Viral Pneumonia.
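As a minimal sketch of what "a distribution over 3 classes" means in practice, the snippet below converts one image's raw logits into class probabilities with a softmax. The label order and the logit values are illustrative, not taken from the model:

```python
import torch

# Illustrative label order -- the actual id2label mapping is defined by the model config
labels = ["Bacterial Pneumonia", "Normal", "Viral Pneumonia"]

# Hypothetical raw logits for a single image, as returned by the classification head
logits = torch.tensor([2.1, -0.3, 0.4])

# Softmax turns logits into a probability distribution over the 3 classes
probs = torch.softmax(logits, dim=-1)
prediction = labels[int(probs.argmax())]
```

The probabilities sum to 1, and the predicted class is simply the argmax of the distribution.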
Intended uses & limitations
The intended use is academic only, as the limitations of this model are severe. First, it was trained on a very limited dataset (Kermany et al., 2018), which includes only around 5k chest X-ray images (2306 bacterial, 1224 viral, and 1116 normal). The dataset consists solely of PA (posteroanterior) chest X-rays, so the model should only be used on this view. Additionally, most of the images are marked with the letter R, indicating the right side of the body; however, not all chest X-rays used in practice carry such a marking (some have the letter L). A further problem is that a direct diagnosis of the pneumonia type sometimes cannot be made from a chest X-ray at all, because a patient can be infected with both viral and bacterial pneumonia simultaneously. Moreover, some patients are diagnosed with pneumonia whose underlying cause is non-infectious. Please consult this paper for a deeper understanding of the causes of pneumonia.
Training and evaluation data
The model followed a standard procedure for fine-tuning a ViT model, with one difference: the first 11 layers of the encoder were frozen (consult the code for details). Additionally, the data augmentation applied is deliberately subtle: rotation by roughly (-10, 10) degrees and a very small brightness change. This was a conscious choice, since chest X-ray data is very homogeneous in structure, and a more extreme augmentation scheme could introduce too much noise; see this paper to understand the challenges of data augmentation for this type of data.
Training procedure
The maximum number of epochs was set to 50; early stopping based on eval_loss (with a patience of 5 evaluations) was used to prevent overfitting.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
- class weights: errors are penalized based on the number of instances in each class (to counteract the class imbalance)
- early stopping: 5 evaluation steps with no improvement in validation loss
- validation step: every 100 steps
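The class-weighting entry above can be sketched as an inverse-frequency weighted cross-entropy built from the Kermany et al. (2018) class counts. The exact weighting scheme is not stated in the card, so inverse frequency is an assumption (one common choice):

```python
import torch

# Class counts from the training dataset: bacterial, viral, normal
counts = torch.tensor([2306.0, 1224.0, 1116.0])

# Inverse-frequency weights (assumed scheme): rarer classes get larger weights,
# normalized so the weights average to 1
weights = counts.sum() / (len(counts) * counts)

# Weighted loss penalizes errors on under-represented classes more heavily
loss_fn = torch.nn.CrossEntropyLoss(weight=weights)
```

With these counts, the "normal" class (fewest samples) receives the largest weight and the "bacterial" class (most samples) the smallest.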
Framework versions
- Transformers 4.38.1
- Pytorch 2.3.0
- Datasets 2.19.1
- Tokenizers 0.15.2
Test Metrics
Test metrics computed on the (Kermany et al., 2018) test set.
| Metric | ViT Model (DA) |
|---|---|
| Test Accuracy | 0.8686 |
| Test Precision | 0.8777 |
| Test Recall | 0.8686 |
| Test F1 Score | 0.8697 |
| Bacterial Accuracy | 0.9541 |
| Viral Accuracy | 0.8209 |
| Normal Accuracy | 0.8162 |