custom_text_proj initialization
Hi all, I've been working with this model and noticed that the weights for custom_text_proj were not being loaded from the checkpoint and were instead initialized randomly, leading to very different results at each model instantiation.
To work around this, I used the base model provided by the Vidore team (https://huggingface.co./vidore/colqwen2.5-base).
Do you recommend this solution, or is there a better alternative or a forthcoming update to ensure proper initialization for the projection layer within your model?
Thank you!
Hi @edesalve, thanks for the question.
We ran multiple evaluations previously, and the results were consistent across runs. I checked the initialization of the custom_text_proj layer, and across various runs the standard deviation, max, and min values remained the same. The only difference was in the mean, which had a very small variation close to zero (e.g., 4.3869e-05, 6.5327e-05). This minor variation in the mean does not adversely affect model performance.
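For anyone who wants to run a similar check themselves, here is a minimal sketch of comparing two initializations by summary statistics and element by element. It uses only the Python standard library as a stand-in for the real layer: `init_weights`, `summarize`, `same_weights`, the 0.02 standard deviation, and the seeds are all hypothetical and are not taken from the model code.

```python
import random
import statistics


def init_weights(n, seed=None, std=0.02):
    """Stand-in for a randomly initialized projection layer (hypothetical):
    returns n weights drawn from a zero-mean Gaussian."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, std) for _ in range(n)]


def summarize(weights):
    """Summary statistics of the kind discussed above."""
    return {
        "mean": statistics.fmean(weights),
        "std": statistics.pstdev(weights),
        "min": min(weights),
        "max": max(weights),
    }


def same_weights(a, b):
    """Element-wise equality: True only if every weight matches exactly."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


# Two instantiations with different seeds, and one that repeats the first seed.
run_a = init_weights(10_000, seed=1)
run_b = init_weights(10_000, seed=2)
run_c = init_weights(10_000, seed=1)

print(summarize(run_a))
print(summarize(run_b))
print("a vs b identical:", same_weights(run_a, run_b))
print("a vs c identical:", same_weights(run_a, run_c))
```

In this toy setup the summary statistics of `run_a` and `run_b` come out close to each other even though the individual weights differ, while `run_a` and `run_c` match exactly because they share a seed. With the real model, the equivalent check would compare the actual custom_text_proj tensors from two separate instantiations.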
Hi again @edesalve,
I have updated the configs and added https://huggingface.co./Metric-AI/colqwen2.5-base with the corresponding weights for custom_text_proj. Now you can use the model and always get deterministic scores :)