News

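[2024-04-06] Open-sourced the puff series of models, built specifically for retrieval and semantic matching tasks, with more emphasis on generalization and on results on private general-purpose test sets. Variable vector dimension, Chinese-English bilingual.

[2024-02-27] Open-sourced the stella-mrl-large-zh-v3.5-1792d model, which supports a variable vector dimension.

[2024-02-17] Open-sourced the stella v3 series, the dialogue encoding model, and the related training data.

[2023-10-19] Open-sourced stella-base-en-v2. Simple to use, no prefix text required.

[2023-10-12] Open-sourced stella-base-zh-v2 and stella-large-zh-v2. Better results and simple to use, no prefix text required.

[2023-09-11] Open-sourced stella-base-zh and stella-large-zh.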

You are welcome to visit my homepage for the latest models and to share your feedback!

1 Open-source model

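This release open-sources the stella-mrl-large-zh-v3.5-1792d model, which was trained with the MRL (Matryoshka Representation Learning) method on top of stella-large-zh-v3-1792d. Its main feature is a variable vector dimension: the first n dimensions of the 1792-dimensional output can be used on their own as a shorter embedding.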

2 Usage

from sentence_transformers import SentenceTransformer
from sklearn.preprocessing import normalize

model = SentenceTransformer("infgrad/stella-mrl-large-zh-v3.5-1792d")
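# Do NOT normalize yet: take the first n dimensions, then normalize.
vectors = model.encode(["text1", "text2"], normalize_embeddings=False)
print(vectors.shape)  # (2, 1792)
# Larger n_dims gives better results but costs more time and memory.
# Multiples of 128 are recommended, because that is how the model was trained.
n_dims = 768
cut_vecs = normalize(vectors[:, :n_dims])  # L2-normalize each truncated row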
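The truncate-then-normalize order used above can be checked without downloading the model. The following is a minimal NumPy sketch under stated assumptions: random vectors stand in for the output of model.encode, and `truncate_and_normalize` is a hypothetical helper name, not part of sentence_transformers or sklearn.

```python
import numpy as np


def truncate_and_normalize(vectors: np.ndarray, n_dims: int) -> np.ndarray:
    """Keep the first n_dims columns, then L2-normalize each row."""
    if not 0 < n_dims <= vectors.shape[1]:
        raise ValueError(f"n_dims must be in 1..{vectors.shape[1]}, got {n_dims}")
    cut = vectors[:, :n_dims].astype(np.float64)
    norms = np.linalg.norm(cut, axis=1, keepdims=True)
    # Guard against all-zero rows so the division never produces NaN.
    norms[norms == 0.0] = 1.0
    return cut / norms


# Assumption: random data standing in for un-normalized 1792-d embeddings.
rng = np.random.default_rng(0)
raw = rng.normal(size=(2, 1792))

cut_vecs = truncate_and_normalize(raw, 768)
# On unit-length rows the dot product equals the cosine similarity.
similarity = float(cut_vecs[0] @ cut_vecs[1])

# Wrong order: normalizing the full vector first leaves the truncated
# rows shorter than unit length, so dot products are no longer cosines.
full_unit = truncate_and_normalize(raw, 1792)
wrong_norms = np.linalg.norm(full_unit[:, :768], axis=1)
print(cut_vecs.shape, similarity, wrong_norms)
```

If the vectors are normalized before truncation, the truncated rows have to be normalized again before they are compared with a dot product.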

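3 CMTEB scores at different vector dimensions

stella-mrl-large-zh-v3.5-1792d_1024 means that the first 1024 dimensions are used. The overall trend is that a larger dimension gives better results, although the gain is not strictly monotonic and flattens beyond roughly 768 dimensions (68.47 at 768 versus 68.56 at 1792).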

Model                                Retrieval  STS    PairClassification  Classification  Reranking  Clustering  CMTEB-Score
stella-mrl-large-zh-v3.5-1792d_128   70.01      62.17  87.99               70.67           66.77      53.55       67.16
stella-mrl-large-zh-v3.5-1792d_256   72.19      62.41  88.09               71.22           68.32      53.38       68.02
stella-mrl-large-zh-v3.5-1792d_384   72.77      62.43  88.26               71.34           68.31      53.87       68.25
stella-mrl-large-zh-v3.5-1792d_512   73.11      62.45  88.16               71.46           68.32      53.28       68.29
stella-mrl-large-zh-v3.5-1792d_640   73.27      62.49  88.21               71.46           68.69      53.63       68.42
stella-mrl-large-zh-v3.5-1792d_768   73.38      62.50  88.19               71.49           68.64      53.77       68.47
stella-mrl-large-zh-v3.5-1792d_896   73.37      62.50  88.14               71.51           68.44      54.13       68.49
stella-mrl-large-zh-v3.5-1792d_1024  73.43      62.51  88.16               71.52           68.59      53.43       68.44
stella-mrl-large-zh-v3.5-1792d_1152  73.46      62.49  88.16               71.57           68.55      53.67       68.49
stella-mrl-large-zh-v3.5-1792d_1280  73.48      62.51  88.12               71.55           68.44      53.74       68.48
stella-mrl-large-zh-v3.5-1792d_1408  73.48      62.51  88.14               71.58           68.46      53.69       68.48
stella-mrl-large-zh-v3.5-1792d_1536  73.49      62.50  88.11               71.55           68.50      54.06       68.52
stella-mrl-large-zh-v3.5-1792d_1664  73.56      62.49  88.06               71.56           68.47      54.28       68.56
stella-mrl-large-zh-v3.5-1792d_1792  73.51      62.48  88.09               71.56           68.45      54.39       68.56

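In the table above, stella-mrl-large-zh-v3.5-1792d_1792 scores 68.56, which differs from the leaderboard score of 68.55. The cause is related to the weight data type, and this small difference can be ignored.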
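To weigh the score gain against the storage cost, the CMTEB-Score column can be set next to the raw size of the stored vectors. This is a rough sketch under stated assumptions: embeddings are stored as raw float32 (4 bytes per value) with no index overhead, the corpus size of one million vectors is hypothetical, and the function names are illustrative only.

```python
# CMTEB-Score column copied from the table above, keyed by dimensions kept.
CMTEB_SCORE = {
    128: 67.16, 256: 68.02, 384: 68.25, 512: 68.29, 640: 68.42,
    768: 68.47, 896: 68.49, 1024: 68.44, 1152: 68.49, 1280: 68.48,
    1408: 68.48, 1536: 68.52, 1664: 68.56, 1792: 68.56,
}

BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32


def index_size_mib(n_vectors: int, n_dims: int) -> float:
    """Raw size of n_vectors float32 embeddings in MiB (no index overhead)."""
    return n_vectors * n_dims * BYTES_PER_FLOAT32 / 2**20


def smallest_dim_within(tolerance: float) -> int:
    """Smallest dimension whose score is within `tolerance` of the 1792-d score."""
    target = CMTEB_SCORE[1792] - tolerance
    return min(d for d, s in CMTEB_SCORE.items() if s >= target)


n_vectors = 1_000_000  # hypothetical corpus size
for dims in (256, 768, 1792):
    size = round(index_size_mib(n_vectors, dims), 1)
    print(f"{dims:>5} dims  score={CMTEB_SCORE[dims]:.2f}  size={size} MiB")

print("smallest dimension within 0.1 of the full score:", smallest_dim_within(0.1))
```

Storage grows linearly with the dimension while the score barely moves past 768 dimensions, so a mid-range multiple of 128 is a reasonable starting point when memory is tight.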

Model size: 326M params (Safetensors); tensor type: F32