How do I train snowflake-arctic-embed-m-long for classification?
# my dataset:
## {'label':xxx, 'text':xxx}
# my code:
pretrained_model_name_or_path = 'Snowflake/snowflake-arctic-embed-m-long'
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path,trust_remote_code=True)
def preprocess_function(examples):
    return tokenizer(examples['text'], truncation=True, max_length=8192)
tokenized_datasets = my_datasets.map(preprocess_function, batched=True)
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
model = AutoModel.from_pretrained(pretrained_model_name_or_path, num_labels=3,trust_remote_code=True, add_pooling_layer=False, safe_serialization=True, rotary_scaling_factor=2)
But I get this error:
Traceback (most recent call last):
  File "train_clean.py", line 338, in <module>
    main()
  File "train_clean.py", line 285, in main
    trainer.train()
  File "/cache/anaconda3/envs/bert/lib/python3.8/site-packages/transformers/trainer.py", line 1553, in train
    return inner_training_loop(
  File "/cache/anaconda3/envs/bert/lib/python3.8/site-packages/transformers/trainer.py", line 1835, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/cache/anaconda3/envs/bert/lib/python3.8/site-packages/transformers/trainer.py", line 2679, in training_step
    loss = self.compute_loss(model, inputs)
  File "/cache/anaconda3/envs/bert/lib/python3.8/site-packages/transformers/trainer.py", line 2704, in compute_loss
    outputs = model(**inputs)
  File "/cache/anaconda3/envs/bert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/cache/anaconda3/envs/bert/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1040, in forward
    output = self._run_ddp_forward(*inputs, **kwargs)
  File "/cache/anaconda3/envs/bert/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1000, in _run_ddp_forward
    return module_to_run(*inputs[0], **kwargs[0])
  File "/cache/anaconda3/envs/bert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/cache/anaconda3/envs/bert/lib/python3.8/site-packages/accelerate/utils/operations.py", line 632, in forward
    return model_forward(*args, **kwargs)
  File "/cache/anaconda3/envs/bert/lib/python3.8/site-packages/accelerate/utils/operations.py", line 620, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "/cache/anaconda3/envs/bert/lib/python3.8/site-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
    return func(*args, **kwargs)
TypeError: forward() got an unexpected keyword argument 'labels'
It seems to be calling "NomicBertModel.forward", which does not accept a 'labels' keyword argument. How can I solve this problem?
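For anyone hitting the same traceback: the `outputs = model(**inputs)` frame suggests the whole batch, including `labels`, is passed to the model as keyword arguments, so a base encoder whose `forward` has no `labels` parameter will reject it. The sketch below reproduces that mechanism with a toy class and shows one way to check a model's `forward` signature before training. `ToyEncoder` and `accepts_labels` are hypothetical names made up for illustration, not part of transformers or the Nomic code.

```python
import inspect


class ToyEncoder:
    """Hypothetical stand-in for a base encoder such as NomicBertModel."""

    def forward(self, input_ids, attention_mask=None):
        # A base encoder only returns hidden states; it has no loss head.
        return {"last_hidden_state": [[0.0] * 4 for _ in input_ids]}

    def __call__(self, *args, **kwargs):
        return self.forward(*args, **kwargs)


def accepts_labels(model):
    """Return True if model.forward declares `labels` or a **kwargs catch-all."""
    params = inspect.signature(model.forward).parameters
    has_var_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    return "labels" in params or has_var_kwargs


model = ToyEncoder()
batch = {
    "input_ids": [[101, 2023, 102]],
    "attention_mask": [[1, 1, 1]],
    "labels": [2],
}

print(accepts_labels(model))  # False for this toy encoder

try:
    # Same call shape as `outputs = model(**inputs)` in the traceback.
    model(**batch)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Running the same `accepts_labels` check on the real loaded model should tell you whether the class you loaded can take `labels` at all; if it cannot, changing `num_labels` alone is unlikely to help.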
Need to remove num_labels=3 from your code.
I still get the same error.
I am confused about which model class to use. If I want to do a classification task, should I use "AutoModelForSequenceClassification" instead of "AutoModel"? But when I use AutoModelForSequenceClassification, the configuration class is not recognized. So which one is right, and how do I use it for classification?
Traceback (most recent call last):
  File "/mnt/code_github/bert_class/data_quality_clean/train_clean.py", line 341, in <module>
    main()
  File "/mnt/code_github/bert_class/data_quality_clean/train_clean.py", line 221, in main
    model = AutoModelForSequenceClassification.from_pretrained(pretrained_model_name_or_path, trust_remote_code=True, add_pooling_layer=False, safe_serialization=True, rotary_scaling_factor=2)
  File "/mnt/anaconda3/envs/prod-torch1.13/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 566, in from_pretrained
    raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers_modules.Snowflake.snowflake-arctic-embed-m-long.08e7a4449e3f07709fb9387bc3172d393a6cc5e2.configuration_hf_nomic_bert.NomicBertConfig'> for this kind of AutoModel: AutoModelForSequenceClassification.
Model type should be one of AlbertConfig, BartConfig, BertConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BloomConfig, CamembertConfig, CanineConfig, LlamaConfig, ConvBertConfig, CTRLConfig, Data2VecTextConfig, DebertaConfig, DebertaV2Config, DistilBertConfig, ElectraConfig, ErnieConfig, ErnieMConfig, EsmConfig, FalconConfig, FlaubertConfig, FNetConfig, FunnelConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTJConfig, IBertConfig, LayoutLMConfig, LayoutLMv2Config, LayoutLMv3Config, LEDConfig, LiltConfig, LlamaConfig, LongformerConfig, LukeConfig, MarkupLMConfig, MBartConfig, MegaConfig, MegatronBertConfig, MobileBertConfig, MPNetConfig, MptConfig, MraConfig, MT5Config, MvpConfig, NezhaConfig, NystromformerConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PerceiverConfig, PLBartConfig, QDQBertConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, SqueezeBertConfig, T5Config, TapasConfig, TransfoXLConfig, UMT5Config, XLMConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig, YosoConfig.
The model architecture does implement sequence classification. As for how to tell HuggingFace to automatically select that class while also loading these model weights, I'm not personally sure.
Digging in a bit, it seems that the upstream architecture's config doesn't configure the auto_map for models besides AutoModel (i.e. it doesn't configure a mapping for AutoModelForSequenceClassification). If you create the config and then add this mapping, you can use AutoModelForSequenceClassification without the above error.
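from transformers import AutoConfig, AutoModelForSequenceClassification

# Repo id matches the one used in the question above.
config = AutoConfig.from_pretrained(
    "Snowflake/snowflake-arctic-embed-m-long",
    trust_remote_code=True,
    num_labels=3,
    rotary_scaling_factor=2,
    # Tell the Auto class which remote-code class implements sequence classification.
    auto_map={
        "AutoModelForSequenceClassification": "Snowflake/snowflake-arctic-embed-m-long--modeling_hf_nomic_bert.NomicBertForSequenceClassification"
    },
)
# Note: from_config builds the architecture only; it does not load the pretrained weights.
model = AutoModelForSequenceClassification.from_config(config, trust_remote_code=True)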
Looking at the Hugging Face docs tutorial for sequence classification, it seems you want AutoModelForSequenceClassification. I hope this helps!
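In case the auto_map string format above looks opaque: as far as I understand it, the value has the shape "repo_id--module_file.ClassName", where the part before "--" names the Hub repo that hosts the remote code and the part after it names the Python module and class. Here is a minimal sketch of that split. `split_auto_map_reference` is a hypothetical helper written for illustration (it is not a transformers function), and the repo id is taken from the question.

```python
def split_auto_map_reference(class_reference):
    """Split 'repo--module.Class' into (repo_id, module_name, class_name).

    repo_id is None when the reference has no 'repo--' prefix.
    """
    repo_id = None
    if "--" in class_reference:
        repo_id, class_reference = class_reference.split("--", 1)
    module_name, class_name = class_reference.rsplit(".", 1)
    return repo_id, module_name, class_name


# Illustrative mapping, mirroring the auto_map entry discussed above.
auto_map = {
    "AutoModelForSequenceClassification": (
        "Snowflake/snowflake-arctic-embed-m-long"
        "--modeling_hf_nomic_bert.NomicBertForSequenceClassification"
    )
}

repo_id, module_name, class_name = split_auto_map_reference(
    auto_map["AutoModelForSequenceClassification"]
)
print(repo_id)
print(module_name)
print(class_name)
```

If the class name printed here does not exist in that module of the repo, the Auto class will not be able to resolve it, so it is worth checking the repo's modeling file before relying on the mapping.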