Reproducing idefics-8b(instruct)

#61

by Iheb-Chaabane - opened May 28, 2024

May 28, 2024

I’m trying to reproduce the instruct version starting from the base (pretrained) checkpoint.
Could you please provide more details on the proportions of the datasets in the Cauldron and on the training hyperparameters (learning rate, weight decay, number of epochs…)?
Thanks,

HugoLaurencon

May 29, 2024

Most of this is detailed in the appendix of the paper.
