thenlper committed
Commit 04b85dc · verified · 1 parent: f687ac3

Update README.md

Files changed (1)
  1. README.md +1 -27
README.md CHANGED
@@ -8,7 +8,7 @@ pipeline_tag: sentence-similarity
  library_name: transformers
  ---
 
- # gte-modernbert-base
+ # gte-reranker-modernbert-base
 
  We are excited to introduce the `gte-modernbert` series of models, which are built upon the latest modernBERT pre-trained encoder-only foundation models. The `gte-modernbert` series models include both text embedding models and rerank models.
 
@@ -79,32 +79,6 @@ embeddings = model.encode(sentences)
  print(cos_sim(embeddings[0], embeddings[1]))
  ```
 
- Use with `transformers.js`:
-
- ```js
- // npm i @xenova/transformers
- import { pipeline, dot } from '@xenova/transformers';
-
- // Create feature extraction pipeline
- const extractor = await pipeline('feature-extraction', 'Alibaba-NLP/gte-modernbert-base', {
-   quantized: false, // Comment out this line to use the quantized version
- });
-
- // Generate sentence embeddings
- const sentences = [
-   "what is the capital of China?",
-   "how to implement quick sort in python?",
-   "Beijing",
-   "sorting algorithms"
- ];
- const output = await extractor(sentences, { normalize: true, pooling: 'cls' });
-
- // Compute similarity scores
- const [source_embeddings, ...document_embeddings] = output.tolist();
- const similarities = document_embeddings.map(x => 100 * dot(source_embeddings, x));
- console.log(similarities);
- ```
-
  ## Training Details
 
  The `gte-modernbert` series of models follows the training scheme of the previous [GTE models](https://huggingface.co/collections/Alibaba-NLP/gte-models-6680f0b13f885cb431e6d469), with the only difference being that the pre-training language model base has been replaced from [GTE-MLM](https://huggingface.co/Alibaba-NLP/gte-en-mlm-base) to [ModernBert](https://huggingface.co/answerdotai/ModernBERT-base). For more training details, please refer to our paper: [mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval](https://aclanthology.org/2024.emnlp-industry.103/)
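
The retitled README describes the reranker variant of the series. Unlike the embedding model (whose `transformers.js` feature-extraction example is dropped above), a reranker is a cross-encoder that scores a query-document pair directly rather than producing embeddings. Below is a minimal sketch of that usage pattern, assuming the standard `AutoModelForSequenceClassification` interface that other GTE rerankers expose; the model id, the pair format, and the `max_length` value are assumptions for illustration, not taken from this commit.

```python
# Minimal sketch: scoring query-document pairs with a cross-encoder reranker.
# Assumes the model loads with a standard sequence-classification head, as
# other GTE rerankers do; model id and max_length are illustrative assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Alibaba-NLP/gte-reranker-modernbert-base"  # assumed from the new title
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Query-document pairs, reusing the sentences from the removed embedding example.
pairs = [
    ["what is the capital of China?", "Beijing"],
    ["how to implement quick sort in python?", "sorting algorithms"],
]

with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True,
                       return_tensors="pt", max_length=8192)
    scores = model(**inputs).logits.view(-1).float()

print(scores)  # one relevance score per pair; higher means more relevant
```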