EXL2 4.5 bpw quant now available - Request to add reference

#14
by DavidCatalano - opened

I’ve created an EXL2 4.5 bits-per-weight (bpw) quantization of your calme-3.2-instruct-78b model. It was built with ExLlamaV2 0.2.6 and fits a 32K context in ~44 GB of VRAM.

You can find the repository here: DavidCatalano/calme-3.2-instruct-78b-exl2-4.5bpw.

I would be honored if you could include it in the list of quantizations on your model card.

Hi @DavidCatalano

Thank you very much for your contribution! It was my pleasure to add it to the model card: https://huggingface.co./MaziyarPanahi/calme-3.2-instruct-78b
