---
tags:
  - generated_from_trainer
  - clip
  - bert
  - vision-language models
model-index:
  - name: output_8_clip14_cxrbert
    results: []
language:
  - en
library_name: transformers
pipeline_tag: feature-extraction
---

# RCLIP (CLIP model fine-tuned on radiology images and their captions)

This model is a vision-language dual encoder fine-tuned on the ROCO dataset, using [openai/clip-vit-large-patch14](https://huggingface.co/openai/clip-vit-large-patch14) as the image encoder and [microsoft/BiomedVLP-CXR-BERT-general](https://huggingface.co/microsoft/BiomedVLP-CXR-BERT-general) as the text encoder. It achieves the following results on the evaluation set:

- Loss: 0.3388
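
As a quick usage sketch: assuming the checkpoint is published under a repo id like `kaveh/rclip` (an assumption; substitute the actual hub path) and loads as a standard `VisionTextDualEncoderModel`, embeddings and image-text similarity scores could be obtained roughly like this:

```python
import torch
from PIL import Image
from transformers import AutoProcessor, VisionTextDualEncoderModel

# NOTE: "kaveh/rclip" is an assumed repo id; replace with the real checkpoint path.
model = VisionTextDualEncoderModel.from_pretrained("kaveh/rclip")
processor = AutoProcessor.from_pretrained("kaveh/rclip")

image = Image.open("chest_xray.png")  # placeholder path to any radiology image
texts = [
    "Chest X-ray showing cardiomegaly.",
    "CT scan of the abdomen.",
]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Scaled cosine-similarity logits between the image and each caption
print(outputs.logits_per_image)
```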

## Heatmap

Here is a heatmap of the image-vs-caption similarity scores for the first 30 samples of the ROCO test split:

*(similarity heatmap figure)*
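
A hedged sketch of how such a heatmap can be produced, assuming the same model and (assumed) repo id as above; the image paths and captions below are placeholders for the first 30 test-split pairs:

```python
import torch
import matplotlib.pyplot as plt
from PIL import Image
from transformers import AutoProcessor, VisionTextDualEncoderModel

model = VisionTextDualEncoderModel.from_pretrained("kaveh/rclip")  # assumed repo id
processor = AutoProcessor.from_pretrained("kaveh/rclip")

# Placeholder data: replace with the first 30 ROCO test images and their captions.
images = [Image.open(p) for p in ["img_00.png", "img_01.png"]]
captions = ["caption for img_00", "caption for img_01"]

inputs = processor(text=captions, images=images, return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**inputs)

# L2-normalize embeddings, then take the cosine-similarity matrix (images x captions).
img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
sim = (img @ txt.T).numpy()

plt.imshow(sim, cmap="viridis")
plt.xlabel("captions")
plt.ylabel("images")
plt.colorbar(label="cosine similarity")
plt.show()
```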

## Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch reproducing them follows the list):

- learning_rate: 5e-05
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 500
- num_epochs: 8.0
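
As a sketch, these settings map onto `transformers.TrainingArguments` roughly as follows; the output directory name is taken from the model-index entry above, and everything else mirrors the list:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="output_8_clip14_cxrbert",  # name from the model-index entry
    learning_rate=5e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=500,
    num_train_epochs=8.0,
)
```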

## Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 0.7951        | 0.09  | 500   | 1.1912          |
| 0.5887        | 0.18  | 1000  | 0.9833          |
| 0.5023        | 0.28  | 1500  | 0.8459          |
| 0.4709        | 0.37  | 2000  | 0.8479          |
| 0.4484        | 0.46  | 2500  | 0.7667          |
| 0.4319        | 0.55  | 3000  | 0.8092          |
| 0.4181        | 0.64  | 3500  | 0.6964          |
| 0.4107        | 0.73  | 4000  | 0.6463          |
| 0.3723        | 0.83  | 4500  | 0.7893          |
| 0.3746        | 0.92  | 5000  | 0.6863          |
| 0.3667        | 1.01  | 5500  | 0.6910          |
| 0.3253        | 1.1   | 6000  | 0.6863          |
| 0.3274        | 1.19  | 6500  | 0.6445          |
| 0.3065        | 1.28  | 7000  | 0.5908          |
| 0.2834        | 1.38  | 7500  | 0.6138          |
| 0.293         | 1.47  | 8000  | 0.6515          |
| 0.303         | 1.56  | 8500  | 0.5806          |
| 0.2638        | 1.65  | 9000  | 0.5587          |
| 0.2593        | 1.74  | 9500  | 0.5216          |
| 0.2451        | 1.83  | 10000 | 0.5283          |
| 0.2468        | 1.93  | 10500 | 0.5001          |
| 0.2295        | 2.02  | 11000 | 0.4975          |
| 0.1953        | 2.11  | 11500 | 0.4750          |
| 0.1954        | 2.2   | 12000 | 0.4572          |
| 0.1737        | 2.29  | 12500 | 0.4731          |
| 0.175         | 2.38  | 13000 | 0.4526          |
| 0.1873        | 2.48  | 13500 | 0.4890          |
| 0.1809        | 2.57  | 14000 | 0.4210          |
| 0.1711        | 2.66  | 14500 | 0.4197          |
| 0.1457        | 2.75  | 15000 | 0.3998          |
| 0.1583        | 2.84  | 15500 | 0.3923          |
| 0.1579        | 2.94  | 16000 | 0.3823          |
| 0.1339        | 3.03  | 16500 | 0.3654          |
| 0.1164        | 3.12  | 17000 | 0.3592          |
| 0.1217        | 3.21  | 17500 | 0.3641          |
| 0.119         | 3.3   | 18000 | 0.3553          |
| 0.1151        | 3.39  | 18500 | 0.3524          |
| 0.119         | 3.49  | 19000 | 0.3452          |
| 0.102         | 3.58  | 19500 | 0.3439          |
| 0.1085        | 3.67  | 20000 | 0.3422          |
| 0.1142        | 3.76  | 20500 | 0.3396          |
| 0.1038        | 3.85  | 21000 | 0.3392          |
| 0.1143        | 3.94  | 21500 | 0.3390          |
| 0.0983        | 4.04  | 22000 | 0.3390          |
| 0.0974        | 4.13  | 22500 | 0.3388          |

## Framework versions

- Transformers 4.31.0.dev0
- PyTorch 2.0.1+cu117
- Datasets 2.13.1
- Tokenizers 0.13.3