YiTokenizer does not exist or is not currently imported.
I have a functional oobabooga install, with GPTQ working great.
I tried to run this model, installed from the Model tab, and I am getting this error:
2023-11-14 12:27:30 INFO:Loading TheBloke_dolphin-2_2-yi-34b-AWQ...
Replacing layers...: 100%|██████████| 60/60 [00:05<00:00, 10.25it/s]
Fusing layers...: 100%|██████████| 60/60 [00:07<00:00, 7.81it/s]
2023-11-14 12:27:51 ERROR:Failed to load the model.
Traceback (most recent call last):
File "D:\TextGen\modules\ui_model_menu.py", line 210, in load_model_wrapper
shared.model, shared.tokenizer = load_model(shared.model_name, loader)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\TextGen\modules\models.py", line 93, in load_model
tokenizer = load_tokenizer(model_name, model)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\TextGen\modules\models.py", line 113, in load_tokenizer
tokenizer = AutoTokenizer.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\TextGen\installer_files\env\Lib\site-packages\transformers\models\auto\tokenization_auto.py", line 765, in from_pretrained
raise ValueError(
ValueError: Tokenizer class YiTokenizer does not exist or is not currently imported.
I also tried taking the original tokenizer from https://huggingface.co./ehartford/dolphin-2_2-yi-34b/resolve/main/tokenization_yi.py
I also tried the tokenization_yi.py file from the GPTQ version, and it didn't help (GPTQ works fine, though).
I made a demo with the original model and hit the same problem, which I solved. You can find it here: https://huggingface.co./spaces/Tonic1/YiTonic/tree/main. Check how the tokenizer issue is handled there if you like.
Thanks for the help!
Should I also run pip install -r with your requirements file? Will that downgrade to cu113?
Tried to copy your files and got this error:
Traceback (most recent call last):
File "D:\TextGen\modules\ui_model_menu.py", line 210, in load_model_wrapper
shared.model, shared.tokenizer = load_model(shared.model_name, loader)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\TextGen\modules\models.py", line 85, in load_model
output = load_func_map[loader](model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\TextGen\modules\models.py", line 299, in AutoAWQ_loader
from awq import AutoAWQForCausalLM
File "D:\TextGen\installer_files\env\Lib\site-packages\awq\__init__.py", line 2, in <module>
from awq.models.auto import AutoAWQForCausalLM
File "D:\TextGen\installer_files\env\Lib\site-packages\awq\models\__init__.py", line 1, in <module>
from .mpt import MptAWQForCausalLM
File "D:\TextGen\installer_files\env\Lib\site-packages\awq\models\mpt.py", line 1, in <module>
from .base import BaseAWQForCausalLM
File "D:\TextGen\installer_files\env\Lib\site-packages\awq\models\base.py", line 12, in <module>
from awq.quantize.quantizer import AwqQuantizer
File "D:\TextGen\installer_files\env\Lib\site-packages\awq\quantize\quantizer.py", line 11, in <module>
from awq.modules.linear import WQLinear_GEMM, WQLinear_GEMV
File "D:\TextGen\installer_files\env\Lib\site-packages\awq\modules\linear.py", line 4, in <module>
import awq_inference_engine # with CUDA kernels
^^^^^^^^^^^^^^^^^^^^^^^^^^^
ImportError: DLL load failed while importing awq_inference_engine: The specified module could not be found.
Make sure you load with trust_remote_code=True
I get this error when I try to load the model:
File "/Downloads/text-generation-webui/modules/models.py", line 85, in load_model
output = load_func_map[loader](model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Downloads/text-generation-webui/modules/models.py", line 299, in AutoAWQ_loader
from awq import AutoAWQForCausalLM
ModuleNotFoundError: No module named 'awq'
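For anyone comparing the two import failures in this thread: "No module named 'awq'" usually means the package is not installed in the environment the web UI runs in, while "DLL load failed while importing awq_inference_engine" means the package was found but its compiled extension could not be loaded. A minimal sketch for telling the two cases apart, run with the same Python interpreter as the web UI, might look like this. The helper name probe_module is made up for illustration and is not part of AutoAWQ or text-generation-webui.

```python
import importlib
import importlib.util


def probe_module(name):
    """Classify a module as 'ok', 'missing', or 'broken: <reason>'.

    Hypothetical helper for illustration only.
    """
    try:
        # find_spec locates the module without importing it.
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        spec = None
    if spec is None:
        # Nothing importable under this name: the package is not installed here.
        return "missing"
    try:
        # The module exists on disk; now check that it actually loads.
        importlib.import_module(name)
    except Exception as exc:
        # Found but failed to load, e.g. a native extension missing a DLL.
        return f"broken: {type(exc).__name__}: {exc}"
    return "ok"


if __name__ == "__main__":
    # Module names taken from the tracebacks in this thread.
    for name in ("awq", "awq_inference_engine"):
        print(f"{name}: {probe_module(name)}")
```

A "missing" result points to an installation problem in that environment. A "broken" result points to the compiled kernels, which often means a mismatch between the installed wheel and the local CUDA or PyTorch build, though that is an assumption and not something confirmed in this thread.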
Thanks, it helped.
But for some reason the GPTQ version of this model gives much better results, while this AWQ version outputs random words on repeat.
Same ChatML template, same context.
Also, the GPTQ version seems slightly faster.
I have had this experience as well, with multiple models... Maybe AutoAWQ needs some tweaking?
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True)
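Some background on why this flag matters: YiTokenizer is not built into transformers. It is defined in the tokenization_yi.py file shipped with the model, so AutoTokenizer can only find it when remote code is allowed. The sketch below assumes the model folder has a tokenizer_config.json whose auto_map entry points AutoTokenizer at that file, which is the usual convention for custom tokenizers on the Hub, but check your own copy. The function names uses_custom_tokenizer and load_tokenizer_for are hypothetical and are not the web UI's own code.

```python
import json
from pathlib import Path


def uses_custom_tokenizer(model_dir):
    """Return True if tokenizer_config.json maps AutoTokenizer to repo code.

    Hypothetical helper; assumes the usual "auto_map" convention.
    """
    config_path = Path(model_dir) / "tokenizer_config.json"
    if not config_path.is_file():
        return False
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # e.g. {"auto_map": {"AutoTokenizer": ["tokenization_yi.YiTokenizer", null]}}
    return "AutoTokenizer" in config.get("auto_map", {})


def load_tokenizer_for(model_dir):
    """Load the tokenizer, allowing repo code only when the config asks for it."""
    # Imported lazily so the check above works without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(
        model_dir,
        trust_remote_code=uses_custom_tokenizer(model_dir),
    )
```

In text-generation-webui the equivalent is starting the server with the --trust-remote-code flag. Only enable it for repositories you trust, since it runs Python code shipped with the model.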