truncate_dim=N returns non-normalized embeddings

#60
by ivanstepanovftw - opened

Please, help. I am using jina-embeddings-v3 with the pgvector extension for Postgres, so I can search for passages using cosine similarity. I noticed that model.encode(..., truncate_dim=768) returns non-normalized embeddings. Should I normalize the embeddings before adding them to the database so I can later perform a cosine-distance search? Or, when I use truncate_dim, should I avoid cosine distance and use inner product/L2/etc. instead?
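For what it's worth, once both sides are L2-normalized, cosine similarity and inner product give identical scores, so either metric works after normalizing. A minimal sketch with random tensors standing in for real (truncated, non-normalized) embeddings:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
emb = torch.randn(4, 768)  # stand-in for truncated, non-normalized passage embeddings
q = torch.randn(1, 768)    # stand-in for a query embedding

# Cosine similarity on the raw vectors
cosine = F.cosine_similarity(emb, q, dim=1)

# Inner product after L2-normalizing both sides
inner = F.normalize(emb, p=2, dim=1) @ F.normalize(q, p=2, dim=1).T

# Same scores, so normalized vectors can be stored and searched with either operator
assert torch.allclose(cosine, inner.squeeze(1), atol=1e-6)
```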

Jina AI org

Seems like a bug in our encode function, checking

  • Maybe it would be better to normalize only once, when the embeddings are calculated
  • We could also add another option for backwards compatibility? (not sure how to handle it in the API)

Hey, I'm experiencing the same issue. There's of course an easy workaround. @ivanstepanovftw you could just:

import torch.nn.functional as F

embeddings = F.normalize(embeddings, p=2, dim=1)

Yet, the direct model interface does not allow setting truncate_dim (tested with torch 2.3.1+cu121 and flash-attn 2.6.3).
Following the example in the model card within the average pooling section

[...]
adapter_mask = ...
truncate_dim = 32
model_output = model(
    input_ids=input_ids,
    attention_mask=attention_mask,
    adapter_mask=adapter_mask, # This one works as in the example
    truncate_dim=truncate_dim # This one raises "Flash attention implementation does not support kwargs: truncate_dim"
)
embeddings_1 = mean_pooling(model_output, attention_mask)  # Normalization happens inside mean_pooling
embeddings_1 = embeddings_1.detach().cpu().float().numpy()  # 1024-dimensional but normalized
embeddings_2 = model.encode(texts, task=task, truncate_dim=truncate_dim)  # 32-dim, not normalized

One could normalize embeddings_2, but I wonder how truncate_dim can be passed to a direct model call?

Would something like the following work?

# LayerNorm over the full hidden dimension, then truncate
embeddings = torch.nn.functional.layer_norm(
    embeddings,
    normalized_shape=(embeddings.shape[1],)
)
embeddings = embeddings[:, :truncate_dim]

It is not very clean, but I wondered if this is how it is done under the hood in your system as well... and yes, it is.
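Putting the pieces together, the full post-processing would then be layer_norm → truncate → L2-normalize. A hedged sketch of that pipeline, with a random tensor standing in for mean-pooled model output (the exact ordering inside encode is an assumption based on this thread):

```python
import torch
import torch.nn.functional as F

truncate_dim = 32
pooled = torch.randn(4, 1024)  # stand-in for mean-pooled model output

# 1. LayerNorm over the full hidden dimension, before truncation
emb = F.layer_norm(pooled, normalized_shape=(pooled.shape[1],))
# 2. Truncate to the requested Matryoshka dimension
emb = emb[:, :truncate_dim]
# 3. Re-normalize so cosine distance behaves as expected
emb = F.normalize(emb, p=2, dim=1)

print(emb.shape)  # torch.Size([4, 32])
```

Step 3 is the piece model.encode currently skips after truncation, which is what causes the non-unit norms reported above.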

Thank you for releasing the model and giving community support.
