---

# RCLIP (CLIP model fine-tuned on radiology images and their captions)

This model fine-tunes [openai/clip-vit-large-patch14](https://huggingface.co/openai/clip-vit-large-patch14) as the image encoder and [microsoft/BiomedVLP-CXR-BERT-general](https://huggingface.co/microsoft/BiomedVLP-CXR-BERT-general) as the text encoder on the [ROCO dataset](https://github.com/razorx89/roco-dataset).
It achieves the following results on the evaluation set:
- Loss: 0.3388

## Heatmap
Here is a heatmap of the similarity scores between the first 30 images and their captions from the test split of the ROCO dataset:

![heatmap](https://imgur.com/fPFM694.png)
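
A minimal sketch of how such a score matrix can be computed and plotted (this is not the exact script behind the figure; loading the checkpoint as a `VisionTextDualEncoderModel` and the `<this-model-id>` placeholder are assumptions):

```python
import matplotlib.pyplot as plt
import torch
from transformers import AutoProcessor, VisionTextDualEncoderModel

# "<this-model-id>" stands in for this repository's id on the Hub.
model = VisionTextDualEncoderModel.from_pretrained("<this-model-id>")
processor = AutoProcessor.from_pretrained("<this-model-id>")

def similarity_heatmap(images, captions):
    """Plot pairwise scores for parallel lists of PIL images and caption strings."""
    inputs = processor(text=captions, images=images, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # logits_per_image[i, j] is the scaled cosine similarity of image i and caption j.
    plt.imshow(outputs.logits_per_image.numpy())
    plt.xlabel("captions")
    plt.ylabel("images")
    plt.colorbar()
    plt.show()
```

With matched pairs on both axes, a pronounced diagonal means each image scores highest against its own caption.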

## Image Retrieval
This model can be used for image retrieval, as demonstrated below:

### 1-Save Image Embeddings
<details>
<summary>click to show the code</summary>

```python
from PIL import Image
import numpy as np
# ...
```
</details>
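
The snippet above is abridged. As a rough sketch of the embedding step (not the author's exact code; the checkpoint class, folder layout, and dictionary structure are assumptions, though the original snippet does pickle its result to `embeddings.pkl`):

```python
import pickle
from pathlib import Path

import torch
from PIL import Image
from transformers import AutoProcessor, VisionTextDualEncoderModel

model = VisionTextDualEncoderModel.from_pretrained("<this-model-id>")  # placeholder id
processor = AutoProcessor.from_pretrained("<this-model-id>")

image_paths = sorted(Path("roco/test/images").glob("*.jpg"))  # assumed local layout
embeddings = []
with torch.no_grad():
    for path in image_paths:
        inputs = processor(images=Image.open(path).convert("RGB"), return_tensors="pt")
        features = model.get_image_features(pixel_values=inputs.pixel_values)
        embeddings.append(features.squeeze(0).numpy())

# Persist the embeddings so queries don't have to re-encode every image.
with open("embeddings.pkl", "wb") as f:
    pickle.dump({"images": [str(p) for p in image_paths], "embeddings": embeddings}, f)
```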

### 2-Query for Images
```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
# ...
Image.open(similar_image_names[0])
```
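
This block is also abridged; here is a sketch of the query step consistent with the visible fragment (the final ranking lines match the original, while the checkpoint loading and the example query are assumptions):

```python
import pickle

import numpy as np
import torch
from PIL import Image
from sklearn.metrics.pairwise import cosine_similarity
from transformers import AutoProcessor, VisionTextDualEncoderModel

model = VisionTextDualEncoderModel.from_pretrained("<this-model-id>")  # placeholder id
processor = AutoProcessor.from_pretrained("<this-model-id>")

# Load the image embeddings saved in the previous step.
with open("embeddings.pkl", "rb") as f:
    store = pickle.load(f)
images, image_embeddings = store["images"], np.stack(store["embeddings"])

# Encode a free-text query with the text tower.
query = "chest x-ray showing a pleural effusion"  # example query
inputs = processor(text=query, return_tensors="pt", padding=True)
with torch.no_grad():
    text_embedding = model.get_text_features(
        input_ids=inputs.input_ids, attention_mask=inputs.attention_mask
    ).numpy()

# Rank all images by cosine similarity to the query, best match first.
scores = cosine_similarity(text_embedding, image_embeddings)[0]
similar_image_indices = np.argsort(scores)[::-1]
similar_image_names = [images[index] for index in similar_image_indices]
Image.open(similar_image_names[0])
```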

## Zero-Shot Image Classification
This model can also be used for zero-shot image classification, as exemplified below:
```python
import requests
from PIL import Image
# ...
image
```
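
A sketch of the classification step (the image URL and class names here are placeholders; the original snippet likewise builds `possible_class_names` and prints a probability per class):

```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, VisionTextDualEncoderModel

model = VisionTextDualEncoderModel.from_pretrained("<this-model-id>")  # placeholder id
processor = AutoProcessor.from_pretrained("<this-model-id>")

url = "https://example.com/chest_xray.png"  # hypothetical image URL
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

possible_class_names = ["chest x-ray", "brain MRI", "abdominal CT", "ultrasound"]
inputs = processor(text=possible_class_names, images=image,
                   return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Softmax over the image-to-text similarity logits gives class probabilities.
probs = outputs.logits_per_image.softmax(dim=1).squeeze(0)
print("".join(f"{name}: {prob:.3f}\n"
              for name, prob in zip(possible_class_names, probs.tolist())))
image
```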

## Metrics
| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 0.0974        | 4.13  | 22500 | 0.3388          |

<details>
<summary>expand to view all steps</summary>

</details>

## Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 24
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 500
- num_epochs: 8.0
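
As a sketch, these settings map onto Hugging Face `TrainingArguments` roughly as follows (`output_dir` is hypothetical, and anything not listed above is left at its default):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="rclip-finetune",     # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=24,  # "train_batch_size" above
    lr_scheduler_type="cosine",
    warmup_steps=500,
    num_train_epochs=8.0,
)
```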

## Framework Versions
- Transformers 4.31.0.dev0
- Pytorch 2.0.1+cu117
- Datasets 2.13.1
- Tokenizers 0.13.3

## Citation
```bibtex
@misc{RCLIPmodel,
  doi = {10.57967/HF/0896},