Cool Idea, I'm gonna try to make it cooler

#1
by rombodawg - opened

I introduced a method for continuously finetuning a model in a recent paper:

https://docs.google.com/document/d/1OjbjU5AOz4Ftn9xHQrX3oFQGhQ6RDUuXQipnQ9gn6tU

I'm going to use it with my adapter here:

https://huggingface.co./Replete-AI/Replete-LLM-Qwen2-7b-Adapter

And make a version of your model that's actually finetuned on top of the Spark model. I won't even have to tune it: just apply the LoRA, then remerge it.

I did a similar merge; feel free to compare them: https://huggingface.co./Nelathan/Qwen2-7B-FocusMix

Fascinating, @rombodawg! I'm making these merges as a build-up to composing a Mixture of Adapters model, and I think your paper makes some important points for this.

@Nelathan, I'm definitely a fan of FocusMix, especially its incorporation of calme with its steering. I'm hoping to make something similar for the Llama architecture as soon as I have evals set up for coding and steering. I already feel good about the prose quality in this merge: https://huggingface.co./listtowardslight/code-replete-storm-della
