🤗 PEFT v0.10.0 release! 🔥🚀✨
Some highlights: 📝
1. FSDP+QLoRA and DeepSpeed Stage-3+QLoRA
2. Layer expansion + LoRA
3. DoRA support for Conv2D layers and quantized bitsandbytes layers
4. New LoftQ utility
5. Batched inference for mixed LoRA adapters.
The Answer.AI team, in collaboration with bitsandbytes and Hugging Face 🤗, open-sourced code enabling the usage of FSDP+QLoRA and explained the whole process in their insightful blog post https://lnkd.in/g6jgfXyv. This is now integrated into the Hugging Face ecosystem.
For an end-to-end example on FSDP+QLoRA, please refer to https://lnkd.in/gT3yY-Rx.
For an end-to-end example on DeepSpeed Stage-3+QLoRA, please refer to https://lnkd.in/gkt-xZRE.
With the PR https://lnkd.in/g5F348MN these changes are now upstreamed in https://lnkd.in/g5_MxYtY thanks to Wing Lian! 🚀
Kudos to the Answer.AI team, Titus von Köller, Younes Belkada, Benjamin Bossan and Zachary Mueller for all the help, without which this wouldn't have been possible. 🤗
For efficient depthwise layer expansion akin to the passthrough method of mergekit, but without using additional memory and attaching LoRAs to it, refer to the details below! 🔥 https://lnkd.in/ge95ztjA
Now DoRA is supported for Conv2D layers as well as bitsandbytes quantized layers ✨. For more details, please refer to the thread below.
https://lnkd.in/gsJbuWPD
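Going back to the layer expansion item: a rough way to picture a passthrough-style expansion is as a list of [start, end) ranges over the original layers that get stacked into a deeper model. The sketch below is only an illustration of that idea, not PEFT's code; the helper names expand_layers and build_stack are made up for this example. As far as I understand, PEFT exposes something along these lines through a layer_replication option on LoraConfig, but treat that as an assumption and check the release notes.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def expand_layers(ranges: Sequence[Tuple[int, int]]) -> List[int]:
    """Hypothetical helper: turn [start, end) ranges into the order in which
    the original layer indices appear in the expanded model."""
    order: List[int] = []
    for start, end in ranges:
        order.extend(range(start, end))
    return order


def build_stack(layers: Sequence[T], ranges: Sequence[Tuple[int, int]]) -> List[T]:
    """Hypothetical helper: build the expanded stack by reusing the original
    layer objects, so a repeated layer shares its base weights (no copy)."""
    return [layers[i] for i in expand_layers(ranges)]


# A 5-layer model expanded to 7 layers by repeating layers 2 and 3.
original = [object() for _ in range(5)]
expanded = build_stack(original, [(0, 4), (2, 5)])
print(expand_layers([(0, 4), (2, 5)]))
```

Because repeated positions point at the same underlying layer, the base weights take no extra memory; only the small LoRA weights attached to each position would be separate.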
Now you can mix different LoRA adapters in a single batch during inference. This speeds up inference because the base model is computed once for the whole batch, instead of once per adapter as would be the case when running each adapter separately with batch_size=1! ⚡️
Details below. https://lnkd.in/gD-pcX_B
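To make the idea concrete, here is a tiny pure-Python sketch of a single linear layer where the base projection is computed for every row of the batch and each row then adds the low-rank update of its own adapter. This is only an illustration under simplifying assumptions (plain lists instead of tensors, no scaling factor), not PEFT's implementation; the function names and the "__base__" marker for rows that use no adapter are choices made for this example.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, x: Sequence[float]) -> Vector:
    """Plain matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in m]


def mixed_batch_forward(
    batch: Sequence[Vector],
    adapter_names: Sequence[str],
    base_weight: Matrix,
    adapters: Dict[str, Tuple[Matrix, Matrix]],
) -> List[Vector]:
    """Sketch: y = W x for every row, plus B (A x) for the row's own adapter."""
    # Shared base projection for the whole batch (a single batched matmul
    # in a real tensor library).
    outputs = [matvec(base_weight, x) for x in batch]
    for i, (x, name) in enumerate(zip(batch, adapter_names)):
        if name == "__base__":
            continue  # this row uses the base model only
        a, b = adapters[name]  # A: (r x in), B: (out x r)
        delta = matvec(b, matvec(a, x))
        outputs[i] = [y + d for y, d in zip(outputs[i], delta)]
    return outputs


base_weight = [[1.0, 0.0], [0.0, 1.0]]
adapters = {
    "a": ([[1.0, 1.0]], [[1.0], [0.0]]),
    "b": ([[1.0, -1.0]], [[0.0], [2.0]]),
}
batch = [[1.0, 2.0], [3.0, 1.0], [5.0, 6.0]]
print(mixed_batch_forward(batch, ["a", "b", "__base__"], base_weight, adapters))
```

The point of the design is that the expensive part, the base weights, is shared across rows, while the per-row work is only the cheap low-rank update.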
LoftQ reduces quantization error by appropriately initializing the LoRA adapter weights. Normally, this is a two-step process. Benjamin Bossan added a new util, replace_lora_weights_loftq, so that LoftQ can be used on the fly with bnb.
For more details, refer to the release notes. 📝
https://lnkd.in/gg7-AmHA. As always, make sure losses go down and be happy to watch your model train!