Releasing base model and combined SFT dataset

#13

by SS12444 - opened Aug 27, 2024

Aug 27, 2024

Great work. Are there plans to release the base model and the expanded training dataset, as was done for idefics2? A base model is useful for experimentation. Thanks!

HugoLaurencon

Aug 28, 2024

•

edited Aug 28, 2024

Thanks! No, unfortunately. Since we included large synthetic instruction datasets directly in the pre-training, we didn't really have a base model anymore, so we are only releasing the final instruct version.

SS12444

Sep 6, 2024

I might be missing something in the paper, but is there a model like the base model of idefics2, taken after the pre-training stage but before the SFT stage (Table 3 of the Idefics3 paper)? I understand that pre-training can include wider forms of data, but the resulting model will not necessarily be instruction-tuned.

dipta007

Sep 8, 2024

I was wondering about this as well. The paper suggests that there is pre-training followed by fine-tuning, so there should be a base model between the two stages. Let me know if I am missing something here. By the way, great work.
