Information
OpenAssistant-Llama-13B-4-bit, working with the GPTQ versions used in Oobabooga's Text Generation Webui and KoboldAI. This was made using Serpdotai's Open Assistant 13B LoRA, trained for 4 epochs on Open Assistant's dataset.
python llama.py /KoboldAI/repos/gptq/llama13b-oasst-4-epochs-lora c4 --wbits 4 --true-sequential --groupsize 128 --save_safetensors llama13b-oasst-4-epochs-lora-4bit-128g.safetensors
Benchmarks
--true-sequential --groupsize 128
Wikitext2: 5.380471229553223
Ptb-New: 31.921072006225586
C4-New: 7.140256881713867
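Assuming the figures above are perplexities, as printed by the GPTQ evaluation script (the card itself does not label them), lower is better. A minimal sketch of how a perplexity is derived from per-token losses follows. The `perplexity` helper and the sample losses are hypothetical, for illustration only, and are not taken from this model.

```python
import math


def perplexity(token_nlls):
    """Return exp(mean negative log-likelihood), with losses in nats."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Hypothetical per-token losses; NOT measured on this model.
example_nlls = [1.2, 2.0, 1.7, 1.9]
example_ppl = perplexity(example_nlls)
print(f"perplexity: {example_ppl:.3f}")
```

A perplexity of about 5.4 means the model is, on average, roughly as uncertain as if it were choosing uniformly among 5.4 tokens at each step.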
Note: This version uses --groupsize 128, which gives better evaluation results than quantizing without grouping. However, it consumes more VRAM.
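The trade-off behind --groupsize can be illustrated with a small sketch. It uses plain round-to-nearest quantization, not the GPTQ algorithm itself, on a synthetic weight row. The function names, the random data, and the assumed storage format (one 16-bit scale and one 4-bit zero point per group) are illustrative assumptions, not details confirmed by this card.

```python
import random


def quantize_rtn(values, bits=4):
    """Round-to-nearest quantize one group, then return the dequantized values."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels
    if scale == 0.0:
        return list(values)  # constant group: nothing to quantize
    return [round((v - lo) / scale) * scale + lo for v in values]


def quantize_row(row, groupsize=None, bits=4):
    """Use one scale/zero for the whole row, or one per group of `groupsize`."""
    if groupsize is None:
        groupsize = len(row)
    out = []
    for start in range(0, len(row), groupsize):
        out.extend(quantize_rtn(row[start:start + groupsize], bits))
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def bits_per_weight(groupsize, bits=4, scale_bits=16, zero_bits=4):
    """Average storage per weight, assuming one scale and zero point per group."""
    return bits + (scale_bits + zero_bits) / groupsize


random.seed(0)
# Synthetic weights; NOT taken from the actual model.
row = [random.gauss(0.0, 0.02) for _ in range(5120)]

mse_per_row = mse(row, quantize_row(row))
mse_grouped = mse(row, quantize_row(row, groupsize=128))

print(f"MSE, one scale per row:   {mse_per_row:.3e}")
print(f"MSE, groups of 128:       {mse_grouped:.3e}")
print(f"bits/weight, row of 5120: {bits_per_weight(5120):.4f}")
print(f"bits/weight, groups of 128: {bits_per_weight(128):.4f}")
```

Smaller groups span a narrower value range, so each 4-bit step is finer and the rounding error drops. The price is one extra scale and zero point per 128 weights, which is where the additional memory goes under these assumptions.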