eos_token_id error in oobabooga
Traceback (most recent call last):
  File "M:\oobabooga_TGWUI\text-generation-webui-1.6\modules\ui_model_menu.py", line 232, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
  File "M:\oobabooga_TGWUI\text-generation-webui-1.6\modules\models.py", line 94, in load_model
    output = load_func_map[loader](model_name)
  File "M:\oobabooga_TGWUI\text-generation-webui-1.6\modules\models.py", line 329, in ExLlamav2_HF_loader
    return Exllamav2HF.from_pretrained(model_name)
  File "M:\oobabooga_TGWUI\text-generation-webui-1.6\modules\exllamav2_hf.py", line 173, in from_pretrained
    config.prepare()
  File "M:\oobabooga_TGWUI\text-generation-webui-1.6\installer_files\env\lib\site-packages\exllamav2\config.py", line 192, in prepare
    self.eos_token_id = read(read_config, int, "eos_token_id", None)  # 2
  File "M:\oobabooga_TGWUI\text-generation-webui-1.6\installer_files\env\lib\site-packages\exllamav2\config.py", line 40, in read
    raise TypeError(f"Value for {key} is not of expected type {expected_type}")
TypeError: Value for eos_token_id is not of expected type <class 'int'>
I'm getting the same error
You need the dev branch of ExLlamaV2 to run it correctly, because they changed a few things with the RoPE implementation.
The error you're getting is because the config defines multiple EOS tokens, which is also handled in the dev branch. You could just change the eos_token_id key to a single value of 128001, but you might get incorrect inference at longer sequence lengths if you don't update ExLlama. There will be a new release tomorrow, but it needs a little more testing first.
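For the quick workaround, the edit goes in the model folder's config.json, where newer configs store a list like [128001, 128009] under eos_token_id instead of a single int. A minimal sketch in Python of that edit; the path is a placeholder, and 128001 is the value suggested above, not something you should assume for every model:

import json
from pathlib import Path

# Placeholder path: point this at the model folder you load in the webui.
config_path = Path("M:/models/your-model/config.json")

config = json.loads(config_path.read_text(encoding="utf-8"))

# Older ExLlamaV2 builds expect a single int here, but newer model
# configs may store a list of EOS token ids.
if isinstance(config.get("eos_token_id"), list):
    config["eos_token_id"] = 128001  # collapse the list to one token id
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")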
Thanks for the fast response. I thought they wouldn't change anything if it's just an upgrade, but oh well.
I replaced the exllamav2 files in the oobabooga webui with the ones from the dev branch of ExLlamaV2, and now it runs perfectly. Thank you!
pip install -U git+https://github.com/turboderp/exllamav2.git@dev
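Note that with the one-click installer you'll want to run that from the environment's own shell (e.g. cmd_windows.bat) so it updates installer_files\env rather than a system Python. To confirm the dev build was picked up afterwards, a quick check, assuming the package exposes __version__ as recent ExLlamaV2 releases do:

python -c "import exllamav2; print(exllamav2.__version__)"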
Nvm