Unable to load finetuned model after saving
Hi,
I have been able to finetune this model and subsequently save the finetuned model with:
moondream.save_pretrained("checkpoints/moondream-ft")
But, when I later tried to load the finetuned model for evaluation with:
config = AutoConfig.from_pretrained("checkpoints/moondream-ft", trust_remote_code=True)
moondream = AutoModelForCausalLM.from_pretrained("checkpoints/moondream-ft", config=config, trust_remote_code=True, device_map={"": DEVICE})
I got the following error:
AttributeError: module 'transformers_modules.vikhyatk.moondream2.fb2293ab2450beb1dae536c056f5976becd58e4c.moondream' has no attribute 'Moondream'
I am not sure how to go about loading and using the finetuned model from here.
Any ideas?
For anyone else who encounters this same issue, I have found a solution. It is based on this conversation:
https://stackoverflow.com/questions/79354534/how-to-load-a-finetuned-vision-llm-model-moondream-model-case
I have written myself a guide for how to finetune Moondream2 and later load the finetuned model. My guide is as follows:
Python version: 3.10
CUDA version: 12.4
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
pip install pillow transformers bitsandbytes accelerate wandb einops pyvips
Go into the saved model's config.json. Under "auto_map", set "AutoConfig" and "AutoModelForCausalLM" to the following:
"AutoConfig": "configuration_moondream.MoondreamConfig",
"AutoModelForCausalLM": "moondream.Moondream"
For the next step, go to the repo revision on Hugging Face that matches the one used to finetune (or slightly earlier), then copy the necessary files:
- "configuration_moondream.py"
- "moondream.py"
- "modeling_phi.py"
- "vision_encoder.py"
into the finetuned model's folder (mine is in "checkpoints\moondream-ft", so I put the files in the "moondream-ft" subfolder).
In this case, I finetuned my model with MD_REVISION = "2024-05-20" and downloaded the above-mentioned files from
https://huggingface.co./vikhyatk/moondream2/tree/48be9138e0faaec8802519b1b828350e33525d46
- Now download and place "configuration_moondream.py" into the model checkpoint's folder.
- Now download and place "moondream.py" into the model checkpoint's folder.
- Now download and place "modeling_phi.py" into the model checkpoint's folder.
- Now download and place "vision_encoder.py" into the model checkpoint's folder.
You should now be able to load and run the model with AutoModelForCausalLM.from_pretrained from the transformers library.
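For reference, here is the load-and-query pattern I'd expect to work at this point. This is a sketch: encode_image/answer_question match the 2024-05-20 API (adjust if your revision differs), and "example.jpg" is just a placeholder path.

from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
from PIL import Image

DEVICE = "cuda"
MD_REVISION = "2024-05-20"

config = AutoConfig.from_pretrained("checkpoints/moondream-ft", trust_remote_code=True)
moondream = AutoModelForCausalLM.from_pretrained(
    "checkpoints/moondream-ft",
    config=config,
    trust_remote_code=True,
    device_map={"": DEVICE},
)

# The tokenizer isn't saved by model.save_pretrained, so pull it from the base repo.
tokenizer = AutoTokenizer.from_pretrained("vikhyatk/moondream2", revision=MD_REVISION)

image = Image.open("example.jpg")  # placeholder test image
enc_image = moondream.encode_image(image)
print(moondream.answer_question(enc_image, "Describe this image.", tokenizer))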
@Charlington thanks for this.
BTW, how did you do the data collation for the model? Did you find/develop scripts for that? Thanks!
I think the data collation function was something I had help from ChatGPT for.
Here's the function I used that worked with my custom Dataset class.
def collate_fn(batch):
    # Preprocess every image in the batch with Moondream's vision encoder.
    images = [sample['image'] for sample in batch]
    images = [moondream.vision_encoder.preprocess(image) for image in images]

    labels_acc = []
    tokens_acc = []

    for sample in batch:
        # Start each sequence with BOS; mask the image tokens (plus BOS)
        # with -100 so they are ignored by the loss.
        toks = [tokenizer.bos_token_id]
        labs = [-100] * (IMG_TOKENS + 1)

        for qa in sample['qa']:
            # Question tokens are masked (-100): we only train on the answers.
            q_t = tokenizer(
                f"\n\nQuestion: {qa['question']}\n\nAnswer:",
                add_special_tokens=False
            ).input_ids
            toks.extend(q_t)
            labs.extend([-100] * len(q_t))

            # Answer tokens (plus the EOS marker) are the training targets.
            a_t = tokenizer(
                f" {qa['answer']}{ANSWER_EOS}",
                add_special_tokens=False
            ).input_ids
            toks.extend(a_t)
            labs.extend(a_t)

        tokens_acc.append(toks)
        labels_acc.append(labs)

    # Pad every sequence to the length of the longest one in the batch.
    max_len = -1
    for labels in labels_acc:
        max_len = max(max_len, len(labels))

    attn_mask_acc = []
    for i in range(len(batch)):
        len_i = len(labels_acc[i])
        pad_i = max_len - len_i

        labels_acc[i].extend([-100] * pad_i)
        tokens_acc[i].extend([tokenizer.eos_token_id] * pad_i)
        attn_mask_acc.append([1] * len_i + [0] * pad_i)

    return (
        images,
        torch.stack([torch.tensor(t, dtype=torch.long) for t in tokens_acc]),
        torch.stack([torch.tensor(l, dtype=torch.long) for l in labels_acc]),
        torch.stack([torch.tensor(a, dtype=torch.bool) for a in attn_mask_acc]),
    )
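A note on what the function assumes: moondream, tokenizer, IMG_TOKENS, and ANSWER_EOS are globals from the finetuning script (in the 2024-05-20 finetuning notebook they are 729 and "<|endoftext|>", but verify against your version), and each dataset sample is a dict with an 'image' (PIL) and a 'qa' list of question/answer dicts. Hooking it into a DataLoader then looks like this, where train_dataset is your own Dataset instance:

import torch
from torch.utils.data import DataLoader

IMG_TOKENS = 729              # number of image embedding tokens (check your revision)
ANSWER_EOS = "<|endoftext|>"  # marks the end of each answer during training

train_loader = DataLoader(
    train_dataset,            # custom Dataset yielding {'image': ..., 'qa': [...]}
    batch_size=8,
    shuffle=True,
    collate_fn=collate_fn,
)

# Each batch unpacks into the four values returned by collate_fn.
images, tokens, labels, attn_mask = next(iter(train_loader))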