Integrate with Sentence Transformers (+ third parties like LangChain/Haystack/LlamaIndex, etc.)

#1
by tomaarsen - opened

Hello!

Pull Request overview

  • Integrate with Sentence Transformers
  • Add "how to run" snippet
  • Add transformers and sentence-transformers tags

Details

First of all, congratulations on your model release! Always good to see more large embedding models, and I'm looking forward to the training dataset and recipe.

With this PR I'm proposing to add the configuration files required for Sentence Transformers, and therefore also for the related projects that rely on ST (LangChain, Haystack, LlamaIndex, etc.). In particular, this involves adding a few configuration files, e.g. ones specifying which pooling method to use (last token in this case), what maximum sequence length to use, etc. With these in place, the new snippet in the README becomes a very simple way to use this model.
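For reference, here is a rough sketch of what that configuration corresponds to if you were to assemble the model programmatically. The repo ID and max sequence length below are placeholders; the actual values live in the config files added in this PR:

from sentence_transformers import SentenceTransformer, models

# Placeholder repo ID and sequence length; the real values come from the added config files
word_embedding_model = models.Transformer("<this-repo-id>", max_seq_length=512)

# Last-token pooling, as configured in this PR
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),
    pooling_mode="lasttoken",
)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])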
You can also rely on an instruction prompt for the queries, e.g.:

task_description = "Given a claim about climate change, retrieve documents that support or refute the claim"
prompt = f'Instruct: {task_description}\nQuery:'

queries = [
    "In Alaska, brown bears are changing their feeding habits to eat elderberries that ripen earlier.",
    "Local and regional sea levels continue to exhibit typical natural variability—in some places rising and in others falling."
]
query_embeddings = model.encode(queries, prompt=prompt)
passage_embeddings = model.encode(passages)

scores = model.similarity(query_embeddings, passage_embeddings)

But I stuck with the current snippet as it resembled the transformers one more closely (i.e. with just one inference call).
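For completeness, that single-call style looks roughly like this with Sentence Transformers (a sketch, reusing model, prompt, queries and passages from the snippet above):

# Encode queries (with the instruction prepended) and passages in one call
input_texts = [prompt + query for query in queries] + passages
embeddings = model.encode(input_texts)

# Split back into query and passage embeddings, then score
query_embeddings = embeddings[:len(queries)]
passage_embeddings = embeddings[len(queries):]
scores = model.similarity(query_embeddings, passage_embeddings)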

  • Tom Aarsen
tomaarsen changed pull request status to open
Zeta Alpha org

Thanks, Tom!

ArthurCamara changed pull request status to merged
