Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
Paper: arXiv 2412.13663
nlp, few-shot learning, sentence transformers
Use the `export_static_quantized_openvino_model` method to quantize a model for faster OpenVINO inference.
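A minimal sketch of how that can look, assuming the default `OVQuantizationConfig` from `optimum.intel` and an illustrative output path:

```python
from sentence_transformers import SentenceTransformer, export_static_quantized_openvino_model
from optimum.intel import OVQuantizationConfig

# The model must be loaded with the OpenVINO backend before quantizing
model = SentenceTransformer("all-MiniLM-L6-v2", backend="openvino")

# Static int8 quantization with the default calibration settings;
# the save location is illustrative
export_static_quantized_openvino_model(
    model,
    quantization_config=OVQuantizationConfig(),
    model_name_or_path="all-MiniLM-L6-v2-int8",
)
```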
Use the `prompts` argument in `SentenceTransformerTrainingArguments` to train with prompts. Our experiments show that you can easily reach a 0.66% to 0.90% relative performance improvement on NDCG@10 at no extra cost by adding "query: " before each training query and "document: " before each training answer.
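For example (a sketch; the column names "query" and "answer" are assumptions that must match the columns of your training dataset):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="models/my-prompted-model",
    # Map dataset column names to the prompt prepended to that column's texts
    prompts={
        "query": "query: ",
        "answer": "document: ",
    },
)
```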
Load a model with `SentenceTransformer("all-MiniLM-L6-v2", backend="onnx")` to run it on the ONNX backend. Does your model not have an ONNX or OpenVINO file yet? No worries - it'll be auto-exported for you. Thank me later 😉
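For instance:

```python
from sentence_transformers import SentenceTransformer

# Load with the ONNX backend; if the repository has no ONNX file yet,
# one is exported automatically on load
model = SentenceTransformer("all-MiniLM-L6-v2", backend="onnx")

embeddings = model.encode(["ONNX makes inference faster.", "The embeddings stay the same."])
print(embeddings.shape)
```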
Create static embedding models with `from_model2vec`, or with `from_distillation` where you do the distillation yourself. It'll only take 5 seconds on a GPU or 2 minutes on a CPU, and no dataset is needed.
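A sketch of both routes via the `StaticEmbedding` module (the checkpoint names are illustrative):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.models import StaticEmbedding

# Option 1: load a pre-distilled Model2Vec model from the Hub
static = StaticEmbedding.from_model2vec("minishlab/M2V_base_output")

# Option 2: distill any Sentence Transformer yourself, no dataset needed
# static = StaticEmbedding.from_distillation("BAAI/bge-base-en-v1.5", device="cuda")

model = SentenceTransformer(modules=[static])
embeddings = model.encode(["Static embeddings trade a little accuracy for a lot of speed."])
```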
Mine hard negatives for your training data with `mine_hard_negatives`; docs: https://sbert.net/docs/package_reference/util.html#sentence_transformers.util.mine_hard_negatives
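A sketch of typical usage, assuming a (query, answer) pair dataset; the dataset and parameter values are illustrative, so check the docs above for the full signature:

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import mine_hard_negatives

# Embedding model used to score candidate negatives
model = SentenceTransformer("all-MiniLM-L6-v2")

# Any dataset of (anchor, positive) pairs works; this one is illustrative
dataset = load_dataset("sentence-transformers/natural-questions", split="train")

hard_dataset = mine_hard_negatives(
    dataset,
    model,
    num_negatives=5,          # negatives to mine per (anchor, positive) pair
    range_min=10,             # skip the 10 most similar candidates (likely true positives)
    range_max=50,             # only consider the top 50 candidates
    sampling_strategy="top",  # take the hardest remaining candidates
)
```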
Set `args.push_to_hub=True` and `args.hub_model_id` to upload your model checkpoints to Hugging Face while training. It also uploads your emissions (if codecarbon is installed) and your TensorBoard logs (if tensorboard is installed).
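For example (the output directory and repository id are placeholders):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="models/my-model",
    # Upload checkpoints (plus emissions and TensorBoard logs, if those
    # libraries are installed) to the Hub during training
    push_to_hub=True,
    hub_model_id="my-username/my-model",
)
```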
Call `model.similarity(embeddings1, embeddings2)` and you'll get your similarity scores immediately. Model authors can specify their preferred similarity function, so you don't have to worry about picking one anymore!
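For example:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

embeddings1 = model.encode(["The cat sits outside", "A man is playing guitar"])
embeddings2 = model.encode(["The dog plays in the garden", "A woman watches TV"])

# Uses whichever similarity function the model author configured
# (cosine similarity by default)
scores = model.similarity(embeddings1, embeddings2)
print(scores)  # a 2x2 tensor of pairwise similarity scores
```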