Details
#2, opened by isr431
From what I understand, this is identical to the Replete series but with a different name? In that case, is it optimized for function calling, or other purposes? Can you also provide more details about how this is better than plain 7b instruct?
Yeah, so I left Replete and took my work with me; that's why I renamed the models.
As for how it's different from instruct: the model is an average of the instruct and base weights, which reduces the catastrophic forgetting caused by finetuning. It retains more knowledge from pretraining while also keeping the knowledge from the instruct finetuning.
The main takeaway is that it's just a better version of the instruct model.
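To make the averaging idea concrete, here is a rough sketch of what blending base and instruct weights could look like. It is not the actual merge recipe used for these models: the function name `average_weights`, the `alpha` parameter, the 50/50 default, and the toy parameter names are all assumptions for illustration, and plain Python lists stand in for real tensors.

```python
from typing import Dict, List

# Toy stand-in for a model checkpoint: parameter name -> flat list of weights.
StateDict = Dict[str, List[float]]


def average_weights(base: StateDict, instruct: StateDict, alpha: float = 0.5) -> StateDict:
    """Linearly interpolate two checkpoints with identical parameter names and shapes.

    alpha is the share taken from the instruct model (hypothetical parameter):
    alpha=0.0 returns the base weights, alpha=1.0 returns the instruct weights,
    and alpha=0.5 is a plain average.
    """
    if base.keys() != instruct.keys():
        raise ValueError("checkpoints must have the same parameter names")

    merged: StateDict = {}
    for name, base_w in base.items():
        inst_w = instruct[name]
        if len(base_w) != len(inst_w):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise weighted average of the two parameter tensors.
        merged[name] = [(1 - alpha) * b + alpha * i for b, i in zip(base_w, inst_w)]
    return merged


# Made-up weights, just to show the arithmetic.
base = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
instruct = {"layer.weight": [1.0, 4.0], "layer.bias": [3.0]}

merged = average_weights(base, instruct)
print(merged)
```

With real checkpoints the same loop would run over the tensors in each model's state dict; the interpolation per parameter is the same.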