Quantized by Anon from /lmg/; all credit goes to him.
Command used to quantize:
CUDA_VISIBLE_DEVICES=0 python llama.py /mnt/g/models/conceptofmind_LLongMA-2-13b/ c4 --wbits 4 --true-sequential --act-order --groupsize 32 --save_safetensors /mnt/g/models/LLongMA-2-13b-GPTQ/4bit-32g-tsao.safetensors
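
For readers unfamiliar with the flags: `--wbits 4` stores each weight as a 4-bit integer, and `--groupsize 32` gives every run of 32 consecutive weights its own quantization parameters. The sketch below illustrates only that storage layout, using plain round-to-nearest quantization. It is not the GPTQ algorithm that `llama.py` runs, and it does not model `--act-order` or `--true-sequential`. All function names here are hypothetical and chosen for illustration.

```python
# Illustrative only: round-to-nearest group quantization, NOT GPTQ itself.
# Shows what "4 bits per weight, groups of 32" means for the stored values.

def quantize_group(weights, wbits=4):
    """Map one group of floats to unsigned wbits-bit codes with a shared scale and offset."""
    qmax = (1 << wbits) - 1                       # 15 when wbits == 4
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax if hi > lo else 1.0  # avoid division by zero for constant groups
    codes = [min(qmax, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Recover approximate floats from the integer codes."""
    return [c * scale + lo for c in codes]


def quantize_row(row, wbits=4, groupsize=32):
    """Split a row of weights into consecutive groups and quantize each one independently."""
    return [
        quantize_group(row[start:start + groupsize], wbits)
        for start in range(0, len(row), groupsize)
    ]


def dequantize_row(groups):
    """Concatenate the dequantized groups back into one row."""
    out = []
    for codes, scale, lo in groups:
        out.extend(dequantize_group(codes, scale, lo))
    return out


# Toy data standing in for one row of a weight matrix (not real model weights).
row = [((i * 37) % 101 - 50) / 1000.0 for i in range(128)]
groups = quantize_row(row, wbits=4, groupsize=32)
restored = dequantize_row(groups)
max_err = max(abs(a - b) for a, b in zip(row, restored))
print(f"groups: {len(groups)}, max abs error: {max_err:.6f}")
```

A smaller group size means more scale/offset pairs to store but a tighter fit per group, which is the usual trade-off behind choosing 32 over a larger value such as 128.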