Add openvino converted tokenizers

#96

by rhecker - opened 3 days ago

base: refs/heads/main

←

from: refs/pr/96


+1750

-0

rhecker

3 days ago

To use the OpenVINO models, you need to tokenize the sentence first. There is currently no OpenVINO-converted tokenizer for MiniLM-L6-H384-uncased on Hugging Face.

Perhaps we can include it here for the complete flow?

Add openvino converted tokenizers (9005f9eb)

tomaarsen

Sentence Transformers org 2 days ago

Hello!

Is the tokenizer in the root not sufficient?
E.g.:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2", backend="openvino", device="cpu")

sentences = [
    "That is a happy person",
    "That is a happy dog",
    "That is a very happy person",
    "Today is a sunny day"
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([4, 4])

Tom Aarsen

rhecker

2 days ago

I agree it's a bit of an edge case. But that approach downloads the model from within the script itself. I would prefer to clone the repo as is and have every required model already downloaded before execution.
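
To illustrate the "everything is already downloaded before execution" workflow, here is a minimal sketch of a pre-flight check on a local clone. It uses only the standard library. The file names in `REQUIRED_FILES` and the helper names `missing_files` and `require_local_clone` are assumptions made for illustration; they are not a confirmed listing of what this PR adds.

```python
from pathlib import Path

# Hypothetical file list: these names are assumptions for illustration,
# not a confirmed listing of the files in this repository or PR.
REQUIRED_FILES = [
    "openvino/openvino_model.xml",
    "openvino/openvino_model.bin",
    "openvino/openvino_tokenizer.xml",
    "openvino/openvino_tokenizer.bin",
]


def missing_files(repo_dir, required=REQUIRED_FILES):
    """Return the required relative paths that do not exist under repo_dir."""
    root = Path(repo_dir)
    return [rel for rel in required if not (root / rel).is_file()]


def require_local_clone(repo_dir, required=REQUIRED_FILES):
    """Raise before any inference starts if the local clone is incomplete."""
    missing = missing_files(repo_dir, required)
    if missing:
        raise FileNotFoundError(
            f"{repo_dir} is missing: {', '.join(missing)}"
        )
    return Path(repo_dir)
```

A script could call `require_local_clone("path/to/clone")` first and then pass the returned directory to whatever loads the model and tokenizer, so a missing file fails fast instead of triggering a download at run time.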


Ready to merge

This branch is ready to get merged automatically.
