---
license: mit
tags:
  - sentence-embeddings
  - endpoints-template
  - optimum
library_name: generic
---

This repository is a fork of philschmid/all-MiniLM-L6-v2-optimum-embeddings. My own ONNX conversion runs about 4x slower, with no discernible reason why: the quantized models appear roughly the same. The idea is that by forking we can, for example, upgrade the Optimum library used as well.
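As background, all-MiniLM-L6-v2 sentence embeddings are typically produced by mean-pooling the model's token embeddings over the attention mask, then normalizing. A minimal sketch of that pooling step in NumPy (the function name `mean_pool` is illustrative, not part of the upstream repo):

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings, ignoring padding positions.

    token_embeddings: (batch, seq_len, hidden) model output
    attention_mask:   (batch, seq_len) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid divide-by-zero
    return summed / counts
```

This is the standard pooling used with this model family; the ONNX/Optimum runtime only changes how the token embeddings are computed, not this step.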