EXL2 4.5 bpw quant now available - Request to add reference
#14
by DavidCatalano - opened
I’ve created an EXL2 4.5 bits per weight (bpw) quantization of your calme-3.2-instruct-78b model. The quantization was made with ExLlamaV2 0.2.6 and fits a 32K context in ~44 GB of VRAM.
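As a rough sanity check of that figure (my own back-of-envelope estimate, counting weights only; the KV cache for the 32K context and any runtime overhead come on top, and actual usage depends on cache quantization settings):

```python
def quant_weight_gb(params_billion: float, bpw: float) -> float:
    """Estimate weight storage in decimal GB for a quantized model.

    params_billion: parameter count in billions (78 for a 78B model)
    bpw: average bits per weight of the quantization (4.5 for this EXL2 quant)
    """
    return params_billion * 1e9 * bpw / 8 / 1e9


# Weights alone for a 78B model at 4.5 bpw:
print(f"{quant_weight_gb(78, 4.5):.1f} GB")  # ≈ 43.9 GB
```

So the weights by themselves land right around the ~44 GB mark, which is consistent with the quoted VRAM footprint.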
You can find the repository here: DavidCatalano/calme-3.2-instruct-78b-exl2-4.5bpw.
I would be honored if you could include it in the list of quantizations under your model card.
Thank you very much for your contribution! I have added it to the model card; it was my pleasure! https://huggingface.co./MaziyarPanahi/calme-3.2-instruct-78b