pxyu committed
Commit 1c61c67 (verified) · 1 Parent(s): 0fcceba

Update README.md

Files changed (1)
  1. README.md +1 -4
README.md CHANGED
@@ -131,9 +131,6 @@ You no longer need to support models to empower high-quality English and multili
 | me5 base | 560M | 303M | 1024 | 51.4 | 54.0 | 43.0 | 34.6 |
 | bge-m3 (BAAI) | 568M | 303M | 1024 | 48.8 | **56.8** | 40.8 | 41.3 |
 | gte (Alibaba) | 305M | 113M | 768 | 51.1 | 52.3 | 47.7 | 53.1 |
-| me5 base | 560M | 303M | 1024 | 51.4 | 54.0 | 43.0 | 34.6 |
-| bge-m3 (BAAI) | 568M | 303M | 1024 | 48.8 | 56.8 | 40.8 | 41.3 |
-| gte (Alibaba) | 305M | 113M | 768 | 51.1 | 52.3 | 47.7 | 53.1 |

 Aside from high-quality retrieval, arctic delivers embeddings that are easily compressible. By leveraging vector truncation via MRL to decrease vector size by 3x with about 3% degradation in quality.
 Combine MRLed vectors with vector compression (Int4) to power retrieval in 128 bytes per doc.
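The MRL-plus-Int4 claim in the context lines above can be illustrated concretely: keeping 256 of the 1024 dimensions and storing each at 4 bits yields exactly 128 bytes per vector. A minimal sketch, assuming a 256-dim cut point and a simple symmetric quantizer (both are illustrative choices, not the model authors' exact recipe):

```python
import numpy as np

def truncate_mrl(embedding: np.ndarray, dims: int = 256) -> np.ndarray:
    """Keep the leading MRL dimensions and re-normalize to unit length."""
    v = embedding[:dims]
    return v / np.linalg.norm(v)

def quantize_int4(v: np.ndarray) -> bytes:
    """Map each value to a 4-bit code and pack two codes per byte."""
    scale = np.abs(v).max() / 7.0                       # symmetric range, codes in [-7, 7]
    codes = (np.clip(np.round(v / scale), -7, 7) + 8).astype(np.uint8)
    packed = ((codes[0::2] << 4) | codes[1::2]).astype(np.uint8)  # two codes per byte
    return packed.tobytes()

rng = np.random.default_rng(0)
full = rng.standard_normal(1024).astype(np.float32)
full /= np.linalg.norm(full)

small = truncate_mrl(full)    # 1024 -> 256 dims, re-normalized
blob = quantize_int4(small)   # 256 dims * 4 bits = 128 bytes
```

Truncation works only because MRL training front-loads information into the leading dimensions; slicing a non-MRL embedding this way would degrade quality far more.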
@@ -188,7 +185,7 @@ tokenizer = AutoTokenizer.from_pretrained(model_name)
 model = AutoModel.from_pretrained(model_name, add_pooling_layer=False, trust_remote_code=True)
 model.eval()

-query_prefix = 'Query: '
+query_prefix = 'query: '
 queries = ['what is snowflake?', 'Where can I get the best tacos?']
 queries_with_prefix = ["{}{}".format(query_prefix, i) for i in queries]
 query_tokens = tokenizer(queries_with_prefix, padding=True, truncation=True, return_tensors='pt', max_length=8192)
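Downstream of the tokenization step shown in the hunk above, READMEs for this family of embedding models typically pool the first (CLS) token and L2-normalize before scoring queries against documents. A sketch of that pooling step using random stand-in activations (the real call would be along the lines of `model(**query_tokens)[0]`, elided here since it requires downloading the checkpoint):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for the model's last hidden state: 2 queries, 5 tokens, hidden size 768.
last_hidden_state = torch.randn(2, 5, 768)

# CLS pooling: take the first token's hidden state as the query embedding.
query_embeddings = F.normalize(last_hidden_state[:, 0], p=2, dim=1)

# Score against (here, random) document embeddings with a dot product,
# which equals cosine similarity once both sides are unit-normalized.
doc_embeddings = F.normalize(torch.randn(3, 768), p=2, dim=1)
scores = query_embeddings @ doc_embeddings.T  # shape (2, 3)
```

Note that the lowercase `'query: '` prefix introduced by this commit is applied only on the query side; documents are embedded without a prefix in this style of asymmetric retrieval setup.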