Add fields to `config.json` needed to be supported by HF TEI server

#7
by k4rth33k - opened

HuggingFace's text-embeddings-inference server expects a certain format for the config.json.
This PR adds the 2 fields missing from the expected config:

  • pad_token_id (0 because of bert tokenizer)
  • max_position_embeddings (768)

Any suggestions and corrections are welcome :)
Nomic AI org

Correct me if I'm wrong but I think max_position_embeddings should be 8192 as it's the same parameter as n_positions
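
To make the suggestion concrete, here is a minimal sketch of how the two fields could be derived. The helper name `add_tei_fields` and the one-key example dict are hypothetical and used only for illustration. The sketch assumes the config already carries `n_positions`, and it does not claim that adding these fields is enough for TEI to load the model.

```python
import json


def add_tei_fields(config: dict) -> dict:
    """Return a copy of `config` with the two fields from this PR filled in.

    Hypothetical helper for illustration; not part of any library.
    """
    patched = dict(config)
    # pad_token_id is 0, as stated in the PR description (BERT tokenizer).
    patched.setdefault("pad_token_id", 0)
    # max_position_embeddings mirrors n_positions, as suggested above.
    if "n_positions" in patched:
        patched.setdefault("max_position_embeddings", patched["n_positions"])
    return patched


# Illustrative subset only, not the real config.json of the model.
example = {"n_positions": 8192}
patched = add_tei_fields(example)
print(json.dumps(patched, indent=2))
```

Copying the value from `n_positions` rather than hard-coding it keeps the two keys from drifting apart. Note that `setdefault` leaves any value that is already present untouched.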

Ah, yes. You are right.
Also, I realised that a nomic-embed-specific implementation has to be made on the TEI side, so I'm closing the PR. Thanks for your time :)

k4rth33k changed pull request status to closed
