adding HP and training details (introduction.md)
We then show some qualitative examples of images found by the model. **All the code we have written** to run our validation experiments (in combination with code made available by Nils Reimers and by the authors of the original CLIP) is available.
## Training Details

### Dataset Splits

We tried different combinations of split sizes for training and validation. Eventually, we settled on a 95% training split, with the remaining 5% of the data going into validation: each dataset is split into training and validation portions, and the resulting files are then concatenated. Note that the 5% amounts to 70K validation samples, making this set almost as big as the MSCOCO dataset.
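The split-then-concatenate procedure described above can be sketched as follows (the function and dataset names are illustrative, not the project's actual code, and the stand-in datasets are just integer ranges):

```python
import random

def split_dataset(samples, val_fraction=0.05, seed=42):
    """Shuffle one dataset and carve off a validation slice."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    n_val = int(len(samples) * val_fraction)
    return samples[n_val:], samples[:n_val]  # (train, validation)

# Stand-in datasets; in practice these are the captioned-image files.
datasets = {"dataset_a": range(100_000), "dataset_b": range(40_000)}

train, validation = [], []
for name, samples in datasets.items():
    tr, va = split_dataset(samples)   # 95% / 5% per dataset
    train.extend(tr)                  # concatenate the per-dataset splits
    validation.extend(va)
```

Splitting each dataset independently (rather than splitting the concatenation) keeps every source represented in the validation set in the same proportion as in training.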
### Hyper-parameters

The hyper-parameters can be found in the [repository](https://github.com/clip-italian/clip-italian/tree/master/hybrid_clip). We use a maximum sequence length of 95 tokens: to choose this value, we looked at the distribution of caption lengths across the various datasets and found that 95 was an excellent compromise between training speed and data coverage. We use a batch size of 128 and a learning rate of 0.00001.
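A cutoff like this can be sanity-checked by measuring what fraction of captions fit entirely within the limit. A minimal sketch (the token lengths below are simulated stand-ins, not the project's real caption statistics):

```python
import numpy as np

# Simulated caption token lengths; in practice, tokenize every caption
# in the combined datasets and record its length.
rng = np.random.default_rng(0)
caption_lengths = rng.poisson(lam=25, size=100_000)

max_len = 95
coverage = float(np.mean(caption_lengths <= max_len))  # captions kept whole
print(f"coverage at max_len={max_len}: {coverage:.2%}")
```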
### Training

We usually train until we see the validation loss going up, and we then pick the model with the best validation loss. We adjusted the number of training epochs as the project progressed: at first we ran 100 epochs, but after we replaced the optimizer we were able to reduce this number.
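Training until the loss rises and keeping the best checkpoint is essentially early stopping. A sketch of the loop, under the assumption of a per-epoch train/validate interface (the function names here are hypothetical, not the project's actual training code):

```python
def train_with_early_stopping(train_epoch, validate, max_epochs=100, patience=3):
    """Run epochs until validation loss stops improving; keep the best state."""
    best_loss, best_state, bad_epochs = float("inf"), None, 0
    for epoch in range(max_epochs):
        state = train_epoch(epoch)   # returns model parameters after this epoch
        val_loss = validate(state)   # loss on the held-out validation split
        if val_loss < best_loss:
            best_loss, best_state, bad_epochs = val_loss, state, 0
        else:
            bad_epochs += 1          # validation loss went up
            if bad_epochs >= patience:
                break                # stop; fall back to the best checkpoint
    return best_state, best_loss
```

A small `patience` avoids stopping on a single noisy epoch while still ending the run soon after the loss starts climbing.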
## Quantitative Evaluation

Showing great images is definitely cool and interesting, but a model is nothing without validation. Since this is the first CLIP-based model in Italian, we decided to use the multilingual CLIP model as a comparison baseline.