A 4-bit GPTQ quant (group size 128) of https://huggingface.co./chavinlo/gpt4-x-alpaca

There is already a quant at https://huggingface.co./anon8231489123/gpt4-x-alpaca-13b-native-4bit-128g, but neither the Triton nor the CUDA version uploaded there appears to work on older versions of GPTQ-for-LLaMA, such as 0cc4m's fork, which KoboldAI currently uses for 4-bit support.

This model was quantized with the CUDA branch of GPTQ-for-LLaMA, not the Triton branch, using the following command:

```
python llama.py ./gpt4-x-alpaca c4 --wbits 4 --true-sequential --groupsize 128 --save_safetensors gpt-x-alpaca-13b-native-4bit-128g-cuda.safetensors
```
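
To run the quant, a minimal usage sketch along these lines should work, assuming qwopqwop200's GPTQ-for-LLaMA repo (the `cuda` branch name, the `llama_inference.py` script, and its flags are taken from that repo and may differ in other forks; `./gpt4-x-alpaca` is the original fp16 model directory, used for the config and tokenizer):

```
# Hedged usage sketch: repo URL, branch, script, and flags are assumptions
# based on qwopqwop200/GPTQ-for-LLaMa; adjust paths to your setup.
git clone -b cuda https://github.com/qwopqwop200/GPTQ-for-LLaMa
cd GPTQ-for-LLaMa
python llama_inference.py ./gpt4-x-alpaca --wbits 4 --groupsize 128 \
  --load gpt-x-alpaca-13b-native-4bit-128g-cuda.safetensors \
  --text "Tell me about alpacas."
```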
