Training time and resources
#1 opened by adendek
Dear authors,
Could you please share the resources (time and hardware) needed to train or fine-tune this model?
We trained the model on a V100 GPU in Google Colab with a batch size of 100. On average, an epoch took about 2 seconds, and we ran 10 epochs per sweep.
However, the model is small enough to fit on even an 8 GB GPU. Training would also work there, just significantly more slowly.
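
For reference, here is a minimal PyTorch sketch of such a training loop. Only the batch size (100), epoch count (10), and GPU usage come from the answer above; the model, dataset, optimizer, and loss are placeholders, not the actual setup used.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-ins: the real model and data are not described in
# this thread, only the batch size (100) and epoch count (10).
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
data = TensorDataset(torch.randn(1000, 32), torch.randint(0, 2, (1000,)))

device = "cuda" if torch.cuda.is_available() else "cpu"  # V100 in the runs above
model = model.to(device)
loader = DataLoader(data, batch_size=100, shuffle=True)  # batch size from the answer
optimizer = torch.optim.Adam(model.parameters())
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):  # 10 epochs per sweep, ~2 s each on a V100
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
```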