Best settings to load this model on oobabooga? In most cases I get errors
What are the best settings to load this model on oobabooga?
In most cases I get errors while loading or during inference.
Model Loader: Transformers
Load error: Found modules on cpu/disk. Using Exllama backend requires all the modules to be on GPU. You can deactivate exllama backend by setting disable_exllama=True in the quantization config object
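If you load the GPTQ model through the Transformers library directly, the error message itself points at the fix: pass disable_exllama=True in the quantization config. A minimal sketch, assuming a local GPTQ checkpoint (the model path below is a placeholder, not a real repo):

```python
# Sketch: disable the exllama kernel when some modules end up on CPU/disk,
# as the Transformers loader error suggests. Model path is a placeholder.
from transformers import AutoModelForCausalLM, GPTQConfig

quant_config = GPTQConfig(bits=4, disable_exllama=True)

model = AutoModelForCausalLM.from_pretrained(
    "path/to/your-gptq-model",        # placeholder path
    quantization_config=quant_config,
    device_map="auto",                # lets accelerate place layers
)
```

In the webui itself, the equivalent is ticking disable_exllama on the Transformers loader tab before loading.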
Model Loader: ExLlamav2_HF
Load error: cannot open shared object file: No such file or directory
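"cannot open shared object file" usually means a native library the extension links against is missing from the loader path (often a CUDA runtime mismatch with the installed exllamav2 wheel). As a rough check you can ask the dynamic linker whether it resolves the library named in your error; the library name below is a stand-in:

```python
import ctypes.util

# Ask the dynamic linker to resolve a shared library by name.
# "c" (libc) is a stand-in that should always resolve on Linux;
# replace it with the library from your error, e.g. "cudart".
path = ctypes.util.find_library("c")
print("resolved" if path else "not found", path)
```

If the library doesn't resolve, reinstalling the exllamav2 wheel built for your exact CUDA/PyTorch version is the usual fix.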
Model Loader: ExLlama_HF
Inference error: NotImplementedError: Cannot copy out of meta tensor; no data!
Model Loader: AutoGPTQ with disable_exllama=true
Load error: NotImplementedError: Cannot copy out of meta tensor; no data!
What is your computer setup? GPU, VRAM, and RAM?
It should be sufficient to load everything on a 24GB card. I mostly use ExLlamav2_HF for GPTQ models, as it is the fastest.
Would you also check the integrity of the downloaded model files?
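For the integrity check, comparing a SHA-256 of each downloaded shard against the checksums published on the model page would rule out a corrupt download. A minimal sketch (the model directory below is a placeholder):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so multi-GB shards don't fill RAM."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

# Placeholder directory; compare each digest against the model card's values.
for f in sorted(Path("models/your-model").glob("*.safetensors")):
    print(f.name, sha256_of(f))
```

A mismatched digest on any shard means re-downloading that file before trying other loader settings.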
I have an NVIDIA H100 with 40GB of RAM in a 32-core server. I don't think it's a hardware issue.