custom_text_proj initialization

by edesalve - opened 17 days ago

Hi all, I've been working with the model and noticed that the weights for custom_text_proj were not being loaded from the checkpoint. Instead, they were initialized randomly, which led to very different results at each model instantiation.

To work around this, I used the base model provided by the Vidore team (https://huggingface.co./vidore/colqwen2.5-base).

Do you recommend this solution, or is there a better alternative or a forthcoming update to ensure proper initialization for the projection layer within your model?

Thank you!
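For readers who want to confirm which parameters a checkpoint does not cover, comparing parameter names is one option. Below is a minimal sketch in plain Python. The key names are hypothetical placeholders (the real names depend on the model), and `find_uninitialized` is an illustrative helper, not part of any library.

```python
def find_uninitialized(model_keys, checkpoint_keys, prefix="custom_text_proj"):
    """Return model parameter names under `prefix` that the checkpoint lacks."""
    checkpoint = set(checkpoint_keys)
    return sorted(
        key for key in model_keys
        if key.startswith(prefix) and key not in checkpoint
    )


# Hypothetical parameter names, for illustration only.
model_keys = [
    "model.embed_tokens.weight",
    "custom_text_proj.weight",
    "custom_text_proj.bias",
]
checkpoint_keys = ["model.embed_tokens.weight"]

missing = find_uninitialized(model_keys, checkpoint_keys)
print(missing)
```

Any name this returns would be a parameter that has to be initialized some other way, which is consistent with the run-to-run differences described above.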

Markgazol

Metric org 16 days ago • edited 15 days ago

Hi @edesalve , thanks for the question.

We ran multiple evaluations previously, and the results were consistent across runs. I checked the initialization of the custom_text_proj layer, and for various runs, the standard deviation, max, and min values remained the same. The only difference was in the mean, which had a very small variation close to zero (e.g., 4.3869e-05, 6.5327e-05). This minor variation in the mean does not adversely affect model performance.
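The kind of check described above can be sketched as follows. This is a simulation in plain Python, not the model's actual layer: the layer size, the uniform range, and the seeds are assumptions chosen for illustration, and `init_weights` and `summarize` are hypothetical helpers.

```python
import math
import random
import statistics


def init_weights(seed, in_features=64, out_features=16):
    """Draw weights uniformly from [-1/sqrt(in_features), 1/sqrt(in_features)].

    The range is an assumption for this sketch, not taken from the model.
    """
    rng = random.Random(seed)
    bound = 1.0 / math.sqrt(in_features)
    return [rng.uniform(-bound, bound) for _ in range(in_features * out_features)]


def summarize(weights):
    """Collect the summary statistics mentioned in the post."""
    return {
        "mean": statistics.fmean(weights),
        "std": statistics.pstdev(weights),
        "min": min(weights),
        "max": max(weights),
    }


# Two independent random initializations (seeds are arbitrary).
run_a = init_weights(seed=0)
run_b = init_weights(seed=1)
stats_a, stats_b = summarize(run_a), summarize(run_b)
for name in stats_a:
    print(f"{name}: {stats_a[name]:+.6f} vs {stats_b[name]:+.6f}")

# Largest elementwise gap between the two runs.
max_abs_diff = max(abs(a - b) for a, b in zip(run_a, run_b))
print("max abs elementwise difference:", max_abs_diff)
```

The summary statistics describe the distribution the weights were drawn from, while the elementwise difference shows whether two runs hold the same values.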

Markgazol

Metric org 15 days ago

@edesalve hi again,
I have updated the configs and added https://huggingface.co./Metric-AI/colqwen2.5-base with the corresponding weights for custom_text_proj. You can now use the model and always get deterministic scores :)

Markgazol changed discussion status to closed 13 days ago
