Aug 8, 2024

Hello. I'm a graduate student at Sungkyunkwan University.
Thank you for releasing this model.
I get an error when I run the code below.

Could you please help me get it running?

Env:

WSL (python 3.10.12)
transformers v4.41

Code:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct")

# Choose your prompt
prompt = "Explain who you are"  # English example
prompt = "너의 소원을 말해봐"  # Korean example

messages = [
    {"role": "system",
     "content": "You are EXAONE model from LG AI Research, a helpful assistant."},
    {"role": "user", "content": prompt}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
)

output = model.generate(
    input_ids.to("cuda"),
    eos_token_id=tokenizer.eos_token_id,
    max_new_tokens=128
)
print(tokenizer.decode(output[0]))

Error:

ValueError Traceback (most recent call last)
Cell In[20], line 15
6 model = AutoModelForCausalLM.from_pretrained(
7 "LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct",
8 token = access_token,
(...)
11 device_map="auto"
12 )
14 # Use the custom tokenizer if available
---> 15 tokenizer = AutoTokenizer.from_pretrained("LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct")
17 # Choose your prompt
18 prompt = "Explain who you are" # English example

File ~/.venv/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py:926, in AutoTokenizer.from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
920 else:
921 raise ValueError(
922 "This tokenizer cannot be instantiated. Please make sure you have sentencepiece installed "
923 "in order to use this tokenizer."
924 )
--> 926 raise ValueError(
927 f"Unrecognized configuration class {config.__class__} to build an AutoTokenizer.\n"
928 f"Model type should be one of {', '.join(c.__name__ for c in TOKENIZER_MAPPING.keys())}."
929 )

ValueError: Unrecognized configuration class <class 'transformers_modules.LGAI-EXAONE.EXAONE-3.0-7.8B-Instruct.7f15baedd46858153d817445aff032f4d6cf4939.configuration_exaone.ExaoneConfig'> to build an AutoTokenizer.
Model type should be one of AlbertConfig, AlignConfig, BarkConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BlipConfig, Blip2Config, BloomConfig, BridgeTowerConfig, BrosConfig, CamembertConfig, CanineConfig, ChameleonConfig, ChineseCLIPConfig, ClapConfig, CLIPConfig, CLIPSegConfig, ClvpConfig, LlamaConfig, CodeGenConfig, CohereConfig, ConvBertConfig, CpmAntConfig, CTRLConfig, Data2VecAudioConfig, Data2VecTextConfig, DbrxConfig, DebertaConfig, DebertaV2Config, DistilBertConfig, DPRConfig, ElectraConfig, ErnieConfig, ErnieMConfig, EsmConfig, FalconConfig, FastSpeech2ConformerConfig, FlaubertConfig, FNetConfig, FSMTConfig, FunnelConfig, GemmaConfig, Gemma2Config, GitConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, GPTSanJapaneseConfig, GroundingDinoConfig, GroupViTConfig, HubertConfig, IBertConfig, IdeficsConfig, Idefics2Config, InstructBlipConfig, InstructBlipVideoConfig, JambaConfig, JetMoeConfig, JukeboxConfig, Kosmos2Config, LayoutLMConfig, LayoutLMv2Config, LayoutLMv3Config, LEDConfig, LiltConfig, LlamaConfig, LlavaConfig, LlavaNextVideoConfig, LlavaNextConfig, LongformerConfig, LongT5Config, LukeConfig, LxmertConfig, M2M100Config, MambaConfig, Mamba2Config, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MgpstrConfig, MistralConfig, MixtralConfig, MobileBertConfig, MPNetConfig, MptConfig, MraConfig, MT5Config, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NezhaConfig, NllbMoeConfig, NystromformerConfig, OlmoConfig, OneFormerConfig, OpenAIGPTConfig, OPTConfig, Owlv2Config, OwlViTConfig, PaliGemmaConfig, PegasusConfig, PegasusXConfig, PerceiverConfig, PersimmonConfig, PhiConfig, Phi3Config, Pix2StructConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, Qwen2MoeConfig, RagConfig, RealmConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RetriBertConfig, RobertaConfig, 
RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, SeamlessM4TConfig, SeamlessM4Tv2Config, SiglipConfig, Speech2TextConfig, Speech2Text2Config, SpeechT5Config, SplinterConfig, SqueezeBertConfig, StableLmConfig, Starcoder2Config, SwitchTransformersConfig, T5Config, TapasConfig, TransfoXLConfig, TvpConfig, UdopConfig, UMT5Config, VideoLlavaConfig, ViltConfig, VipLlavaConfig, VisualBertConfig, VitsConfig, Wav2Vec2Config, Wav2Vec2BertConfig, Wav2Vec2ConformerConfig, WhisperConfig, XCLIPConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig, YosoConfig.

ympaik

Aug 8, 2024

Your traceback mentions "This tokenizer cannot be instantiated. Please make sure you have sentencepiece installed in order to use this tokenizer."
Have you checked whether sentencepiece is installed in your environment?
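
One way to run that check from inside the same notebook or script is sketched below. It only uses the standard library; the helper name `package_status` is made up for this example. The point is to confirm that the package is visible to the interpreter that actually runs the code, which can differ from the one `pip` installed into.

```python
import importlib.metadata
import importlib.util
import sys


def package_status(name: str) -> str:
    """Describe whether `name` is importable from the running interpreter."""
    # find_spec returns None when the top-level package cannot be found.
    if importlib.util.find_spec(name) is None:
        return f"{name}: not importable from {sys.executable}"
    try:
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        # Importable, but no installed distribution metadata under this name.
        version = "unknown version"
    return f"{name}: {version}"


print(package_status("sentencepiece"))
```

If this reports "not importable", the notebook kernel is probably using a different virtual environment than the one where the package was installed.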

Jinmyoung

Aug 8, 2024

I ran into the same ValueError and resolved it by adding trust_remote_code=True to the AutoTokenizer call. Try it as shown below.

tokenizer = AutoTokenizer.from_pretrained("LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct", trust_remote_code=True)

jasonskku

Aug 9, 2024

I also encountered the same Value Error, but I resolved it by adding trust_remote_code=True to the AutoTokenizer part. Try using it as shown below.
tokenizer = AutoTokenizer.from_pretrained("LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct", trust_remote_code=True)

Oh, thank you. Unfortunately, the problem persists...
I can't run this code until the tokenizer loads :(

jasonskku

Aug 9, 2024

In your error, it says This tokenizer cannot be instantiated. Please make sure you have sentencepiece installed.
Have you checked if sentencepiece is installed in your environment?

Yes, I already have sentencepiece 0.2.0 installed in my environment.

jasonskku changed discussion status to closed Aug 9, 2024

jasonskku changed discussion status to open Aug 9, 2024

Jinmyoung

Aug 9, 2024

edited Aug 9, 2024

Hmm, if the problem persists even after upgrading transformers and installing sentencepiece, passing an access token when loading both the model and the tokenizer might help.

model = AutoModelForCausalLM.from_pretrained(
    "LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
    token="your_huggingface_token",  # placeholder: pass your token as a plain string
)

tokenizer = AutoTokenizer.from_pretrained(
    "LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct",
    trust_remote_code=True,
    token="your_huggingface_token",  # same token as for the model
)

I hope this helps :)

yireun

LG AI Research org Aug 9, 2024

It seems that you passed the access_token as a parameter to AutoModelForCausalLM.from_pretrained(). If so, you have to pass it to AutoTokenizer.from_pretrained() as well.
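
A minimal sketch of one way to keep the two calls in sync: build the shared keyword arguments once and unpack them into both from_pretrained() calls. The helper name `shared_hub_kwargs` and the token value are placeholders made up for this example, and falling back to the HF_TOKEN environment variable is an assumption about how the token is stored. The loading calls are left as comments so the snippet runs without downloading the model.

```python
import os
from typing import Optional


def shared_hub_kwargs(token: Optional[str] = None) -> dict:
    """Build keyword arguments meant for both from_pretrained() calls."""
    # Assumption: fall back to the HF_TOKEN environment variable if unset.
    token = token or os.environ.get("HF_TOKEN")
    # Catch mistakes such as token={"..."} (a set) instead of a string.
    if token is not None and not isinstance(token, str):
        raise TypeError(f"token must be a str, got {type(token).__name__}")
    kwargs = {"trust_remote_code": True}
    if token:
        kwargs["token"] = token
    return kwargs


kwargs = shared_hub_kwargs("hf_placeholder_token")  # placeholder, not a real token
print(sorted(kwargs))

# MODEL_ID = "LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct"
# model = AutoModelForCausalLM.from_pretrained(
#     MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto", **kwargs)
# tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, **kwargs)
```

Because both calls unpack the same dictionary, the tokenizer cannot end up without the token or without trust_remote_code=True.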

jasonskku

Aug 9, 2024

It seems that you passed the access_token as a parameter to AutoModelForCausalLM.from_pretrained(). If so, you have to pass it to AutoTokenizer.from_pretrained() as well.

Hello, yireun.

Here is my code. Could you please check it?

Thank you for your help.

Code:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
    token="my token"  # placeholder; must be a plain string, not a set like {"my token"}
)
tokenizer = AutoTokenizer.from_pretrained(
    "LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct",
    trust_remote_code=True,
    token="my token"  # same token as for the model
)

prompt = "Explain who you are"  # English example
prompt = "너의 소원을 말해봐"  # Korean example

messages = [
    {"role": "system",
     "content": "You are EXAONE model from LG AI Research, a helpful assistant."},
    {"role": "user", "content": prompt}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
)

output = model.generate(
    input_ids.to("cuda"),
    eos_token_id=tokenizer.eos_token_id,
    max_new_tokens=128
)
print(tokenizer.decode(output[0]))

yireun

LG AI Research org Aug 9, 2024

edited Aug 9, 2024

Hi, jasonskku. Your code works fine in the following environments:
- Ubuntu 22.04 with an A100 GPU
- Python 3.10.8
- torch==2.4.0, transformers==4.44.0, accelerate==0.33.0
If you still have problems, it may be a problem with your WSL configuration.
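
To compare a local setup against the versions listed above, something like the following sketch could be used. It relies only on the standard library; the names `REFERENCE`, `installed_version`, and `compare_environment` are made up for this example, and the reference numbers are simply the ones from this reply.

```python
from importlib import metadata

# Versions reported as working in the reply above.
REFERENCE = {"torch": "2.4.0", "transformers": "4.44.0", "accelerate": "0.33.0"}


def installed_version(name):
    """Return the installed version of `name`, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def compare_environment(reference):
    """Return (package, reference version, installed version) rows."""
    rows = []
    for name, wanted in reference.items():
        found = installed_version(name)
        rows.append((name, wanted, found or "not installed"))
    return rows


for name, wanted, found in compare_environment(REFERENCE):
    print(f"{name:<14} reference={wanted:<8} installed={found}")
```

A mismatch does not prove that the version is the cause, but it narrows down what differs between the two environments.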

yireun

LG AI Research org Aug 9, 2024

When I remove the access token from AutoTokenizer.from_pretrained(), an error similar to yours occurs:

tokenizer = AutoTokenizer.from_pretrained("LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct", trust_remote_code=True)

warnings.warn(
Traceback (most recent call last):
File "/.../sample_test.py", line 13, in <module>
tokenizer = AutoTokenizer.from_pretrained("LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct",
File "/.../python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 909, in from_pretrained
raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers_modules.LGAI-EXAONE.EXAONE-3.0-7.8B-Instruct.7f15baedd46858153d817445aff032f4d6cf4939.configuration_exaone.ExaoneConfig'> to build an AutoTokenizer.
Model type should be one of AlbertConfig, AlignConfig, BarkConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BlipConfig, Blip2Config, BloomConfig, BridgeTowerConfig, BrosConfig, CamembertConfig, CanineConfig, ChineseCLIPConfig, ClapConfig, CLIPConfig, CLIPSegConfig, ClvpConfig, LlamaConfig, CodeGenConfig, CohereConfig, ConvBertConfig, CpmAntConfig, CTRLConfig, Data2VecAudioConfig, Data2VecTextConfig, DbrxConfig, DebertaConfig, DebertaV2Config, DistilBertConfig, DPRConfig, ElectraConfig, ErnieConfig, ErnieMConfig, EsmConfig, FalconConfig, FastSpeech2ConformerConfig, FlaubertConfig, FNetConfig, FSMTConfig, FunnelConfig, GemmaConfig, GitConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, GPTSanJapaneseConfig, GroundingDinoConfig, GroupViTConfig, HubertConfig, IBertConfig, IdeficsConfig, Idefics2Config, InstructBlipConfig, JambaConfig, JetMoeConfig, JukeboxConfig, Kosmos2Config, LayoutLMConfig, LayoutLMv2Config, LayoutLMv3Config, LEDConfig, LiltConfig, LlamaConfig, LlavaConfig, LlavaNextConfig, LongformerConfig, LongT5Config, LukeConfig, LxmertConfig, M2M100Config, MambaConfig, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MgpstrConfig, MistralConfig, MixtralConfig, MobileBertConfig, MPNetConfig, MptConfig, MraConfig, MT5Config, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NezhaConfig, NllbMoeConfig, NystromformerConfig, OlmoConfig, OneFormerConfig, OpenAIGPTConfig, OPTConfig, Owlv2Config, OwlViTConfig, PaliGemmaConfig, PegasusConfig, PegasusXConfig, PerceiverConfig, PersimmonConfig, PhiConfig, Phi3Config, Pix2StructConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, Qwen2MoeConfig, RagConfig, RealmConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RetriBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, SeamlessM4TConfig, 
SeamlessM4Tv2Config, SiglipConfig, Speech2TextConfig, Speech2Text2Config, SpeechT5Config, SplinterConfig, SqueezeBertConfig, StableLmConfig, Starcoder2Config, SwitchTransformersConfig, T5Config, TapasConfig, TransfoXLConfig, TvpConfig, UdopConfig, UMT5Config, VideoLlavaConfig, ViltConfig, VipLlavaConfig, VisualBertConfig, VitsConfig, Wav2Vec2Config, Wav2Vec2BertConfig, Wav2Vec2ConformerConfig, WhisperConfig, XCLIPConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig, YosoConfig.

Value Error: Unrecognized configuration class <> to build an AutoTokenizer
