---
base_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:64000
- loss:DenoisingAutoEncoderLoss
widget:
- source_sentence: 𑀟ā¤šā¤¨đ‘€™đ‘€ĸ𑀟 𑀞ā¤šđ‘€Ēā¤šđ‘€ ā¤š đ‘€Ģđ‘Ŗā¤Ēđ‘Ŗ 𑀞ā¤¨đ‘€ ā¤š 𑀞đ‘Ŗ𑀱ā¤š ā¤Ŧđ‘€ĸđ‘€Ē𑀠ā¤šđ‘€¯
  sentences:
  - ' ā¤Ŗā¤š ā¤Ŧđ‘€ĸđ‘€Ē𑀠ā¤š ā¤Ēā¤šđ‘€Ēđ‘Ļ đ‘€Ŗā¤š 𑀠ā¤šđ‘€Ģā¤šđ‘€ĸ𑀲đ‘€ĸā¤Ŗā¤šđ‘€Ēđ‘€ŗā¤š đ‘€Ŗā¤š ā¤ā¤šđ‘€Ÿđ‘Ļ𑀟đ‘€ŗā¤š ā¤žā¤šā¤Ŗā¤šđ‘€Ļ 𑀞ā¤šđ‘€ ā¤šđ‘€Ē ā¤Ŗā¤šđ‘€Ŗđ‘€Ŗā¤š 𑀠ā¤šđ‘€Ģā¤šđ‘€ĸ𑀲đ‘€ĸ𑀟đ‘€ŗā¤š ā¤Ŗā¤š ā¤ĸā¤šđ‘€Ē đ‘€ĸā¤Ŗā¤šā¤˛đ‘€ĸđ‘€¯'
  - ' đ‘€Ŗā¤šđ‘€Ÿā¤Ŧā¤šđ‘€Ÿđ‘Ļ đ‘€Ŗā¤š 𑀟ā¤šā¤¨đ‘€™đ‘€ĸ𑀟 𑀠đ‘Ŗā¤Ēā¤šđ‘€Ēđ‘€Ļ ā¤Ēā¤šđ‘€Ÿā¤š đ‘€ĸā¤Ŗā¤š 𑀤ā¤šđ‘€ ā¤š ā¤ĸā¤šā¤ĸā¤ĸā¤š 𑀞đ‘Ŗ 𑀞ā¤šđ‘€Ēā¤šđ‘€ ā¤š đ‘€ĸđ‘€Ŗā¤šđ‘€Ÿ ā¤šđ‘€žā¤š 𑀞𑀱ā¤šā¤Ēā¤šđ‘€Ÿā¤Ēā¤š đ‘€Ŗā¤š 𑀠đ‘Ŗā¤Ēā¤šđ‘€Ē đ‘€Ŗā¤šā¤¨đ‘€žā¤šđ‘€Ē đ‘€Ģđ‘Ŗā¤Ēđ‘Ŗ đ‘€Ŗā¤š đ‘€ŗā¤¨ā¤–đ‘€Ļ 𑀞ā¤¨đ‘€ ā¤š ā¤Ŗā¤š 𑀲đ‘€ĸ 𑀟ā¤š 𑀞đ‘Ŗ𑀱ā¤š ā¤Ŧđ‘€ĸđ‘€Ē𑀠ā¤šđ‘€¯'
  - ā¤Ēā¤šđ‘€Ēđ‘Ļ𑀠đ‘€ĸ ā¤Ŗā¤š ā¤ĸā¤¨ā¤Ŧā¤š 𑀱ā¤š ā¤ā¤¨đ‘€Ÿā¤Ŧđ‘€ĸā¤Ŗā¤šđ‘€Ē ā¤đ‘€ąā¤šā¤˛ā¤˛đ‘Ŗ𑀟 ā¤ā¤šđ‘€˛ā¤š ā¤Ēā¤š ā¤žā¤šā¤˛đ‘€ĸā¤ĸđ‘€ĸ𑀟 ā¤ā¤šđ‘€ŗā¤šđ‘€Ē đ‘€ĸđ‘€Ēā¤šđ‘€Ÿ ā¤š ā¤Ŧā¤šđ‘€ŗā¤šđ‘€Ē ā¤Ēā¤¨đ‘€Ē𑀞đ‘€ĸā¤Ŗā¤Ŗā¤š 𑀞ā¤¨đ‘€ ā¤š ā¤Ŗā¤š ā¤¤đ‘€ĸ 𑀱ā¤š ā¤ā¤¨đ‘€Ÿā¤Ŧđ‘€ĸā¤Ŗā¤šđ‘€Ē 𑀞𑀱ā¤šā¤˛ā¤˛ā¤šā¤Ŗđ‘Ļ ā¤Ĩđ‘€¯
- source_sentence: ā¤Ŗā¤šđ‘€Ÿā¤š ā¤Ŧā¤šā¤ĸā¤š đ‘€Ŗā¤š ā¤˛ā¤¨đ‘€Ēā¤š đ‘€Ŗā¤š đ‘€Ŗā¤š ā¤Ēā¤š 𑀲đ‘€ĸ đ‘€Ŗā¤š
  sentences:
  - 𑀘đ‘Ŗđ‘€Ģ𑀟 𑀠đ‘€ĸā¤¤đ‘€Ģā¤šđ‘Ļā¤˛ đ‘Ŗā¤Ŧđ‘€ĸđ‘€Ŗđ‘€ĸ 𑀝ā¤šđ‘€Ÿ đ‘€Ģā¤šđ‘€ĸ𑀲đ‘Ļđ‘€ŗđ‘€Ģđ‘€ĸ đ‘€Ēā¤šđ‘€Ÿā¤šđ‘€Ē 𑀗 ā¤Ŧā¤š 𑀱ā¤šā¤Ēā¤šđ‘€Ÿ đ‘€Ŗđ‘€ĸđ‘€ŗā¤šđ‘€ ā¤ĸā¤šđ‘€Ļ 𑀭ā¤Ĩ𑀖ā¤Ĩđ‘€Žđ‘€¯
  - ' 𑀱ā¤šđ‘€Ÿđ‘€Ÿā¤šđ‘€Ÿ ā¤Ŗā¤šđ‘€Ÿā¤š ā¤Ēā¤šđ‘€ĸ𑀠ā¤šđ‘€žā¤š 𑀱ā¤š ā¤đ‘€ąā¤šđ‘€Ēā¤šđ‘€Ēđ‘€Ēā¤¨đ‘€Ÿ đ‘€Ģđ‘€Ē đ‘€ŗā¤¨ ā¤¤đ‘€ĸ ā¤Ŧā¤šā¤ĸā¤š đ‘€Ŗā¤š ā¤˛ā¤¨đ‘€Ēā¤š đ‘€Ŗā¤š đ‘€Ŗā¤¨đ‘€ž ā¤ĸā¤¨ā¤žā¤šā¤žā¤žđ‘Ļ𑀟 ā¤šā¤Ŗā¤Ŗā¤¨đ‘€žā¤šđ‘€Ÿđ‘€ŗā¤¨ đ‘€Ŗā¤š 𑀠ā¤šđ‘€ŗā¤¨ 𑀟đ‘Ļ𑀠ā¤š ā¤Ēā¤š đ‘€Ģā¤šđ‘€Ÿā¤Ŗā¤šđ‘€Ē đ‘€Ŗā¤š ā¤Ēā¤š 𑀲đ‘€ĸ đ‘€ŗā¤šā¤¨đ‘€Ēđ‘€ĸ đ‘€Ŗā¤š đ‘€ŗā¤šā¤¨ā¤đ‘€ĸ 𑀲đ‘€ĸā¤Ŗđ‘Ļ đ‘€Ŗā¤š đ‘€Ŗā¤šđ‘€¯'
  - ' ā¤š 𑀞ā¤šđ‘€Ē𑀞ā¤šđ‘€ŗđ‘€Ģđ‘€ĸ𑀟 đ‘€Ŗđ‘Ŗ𑀞ā¤šđ‘€Ēđ‘€Ļ 𑀠ā¤šđ‘€˜ā¤šā¤˛đ‘€ĸđ‘€ŗā¤šđ‘€Ē ā¤˛ā¤šā¤¨ā¤Ŗđ‘Ŗā¤Ŗđ‘€ĸ𑀟 đ‘€ĸ𑀟đ‘€Ŗđ‘€ĸā¤Ŗā¤š đ‘€ĸā¤Ēā¤š ā¤¤đ‘Ļ ā¤ĸā¤šā¤ĸā¤ĸā¤šđ‘€Ē đ‘€Ģā¤¨đ‘€žā¤¨đ‘€ ā¤šđ‘€Ē 𑀞ā¤¨ā¤˛ā¤š đ‘€Ŗā¤š đ‘€Ģā¤šđ‘€Ē𑀞đ‘Ŗ𑀞đ‘€ĸ𑀟 đ‘€ŗđ‘€Ģā¤šđ‘€Ēđ‘€ĸ𑀙ā¤š ā¤š đ‘€ĸ𑀟đ‘€Ŗđ‘€ĸā¤Ŗā¤š đ‘€Ŗā¤š 𑀞ā¤¨đ‘€ ā¤š ā¤Ēā¤šā¤ĸā¤ĸā¤šā¤Ēā¤šđ‘€Ē đ‘€Ŗā¤š ā¤ĸđ‘€ĸ𑀟 đ‘€Ŗđ‘Ŗ𑀞ā¤š đ‘€Ŗā¤š 𑀞đ‘€ĸā¤Ŗā¤šā¤Ŗđ‘Ļ 𑀞ā¤šđ‘€™đ‘€ĸđ‘€Ŗđ‘Ŗ𑀘đ‘€ĸ𑀟 𑀞𑀱ā¤šđ‘€Ēā¤šđ‘€Ēđ‘€Ēā¤¨ ā¤Ēā¤š đ‘€Ģā¤šđ‘€Ÿā¤Ŗā¤šđ‘€Ē 𑀞𑀱ā¤šđ‘€Ēā¤šđ‘€Ēđ‘€Ēā¤¨đ‘€Ÿ ā¤˛ā¤šā¤¨ā¤Ŗā¤š ā¤š 𑀞ā¤šđ‘€ŗā¤šđ‘€Ēđ‘€¯'
- source_sentence: đ‘€Ŗā¤¨ā¤ĸā¤š ā¤ĸā¤ĸā¤¤đ‘€• 𑀠ā¤šđ‘€ ā¤šđ‘€Ē ā¤šā¤˛ā¤šā¤Ēđ‘Ŗā¤¨đ‘€ đ‘€ĸ
  sentences:
  - đ‘€Ŗā¤¨ā¤ĸā¤š 𑀞ā¤¨đ‘€ ā¤š đ‘€Ŗđ‘Ļ𑀟𑀞ā¤ˇđ‘€Ŗđ‘Ļ𑀟𑀞𑀠ā¤šđ‘€Ÿā¤šđ‘€¤ā¤šđ‘€Ēā¤Ēā¤š ā¤ĸā¤ĸā¤¤đ‘€• 𑀠ā¤šđ‘€ ā¤šđ‘€Ē 𑀞ā¤šđ‘€ŗđ‘€ŗđ‘Ļā¤Ŗ ā¤šā¤˛ā¤šā¤Ēđ‘Ŗā¤¨đ‘€ đ‘€ĸ đ‘€¯
  - ' ā¤šđ‘€Ÿ 𑀲ā¤šđ‘€Ēā¤š đ‘€ŗā¤šđ‘€ ā¤šđ‘€Ē𑀱ā¤š 𑀞ā¤¨đ‘€ ā¤š đ‘€Ŗā¤šā¤Ŧā¤š ā¤ĸā¤šā¤Ŗā¤š ā¤šđ‘€Ÿ 𑀲ā¤šđ‘€Ŗā¤šđ‘€Ŗā¤š ā¤šā¤Ŗā¤Ŗā¤¨đ‘€žā¤šđ‘€Ÿ ā¤Ŧā¤š đ‘€ŗā¤šā¤¨đ‘€Ēā¤šđ‘€Ÿ đ‘€ĸā¤Ŗā¤šā¤˛ā¤šđ‘€ĸ 𑀟ā¤š 𑀟ā¤šđ‘€˜đ‘Ļđ‘€Ēđ‘€ĸā¤Ŗā¤š 𑀠ā¤šđ‘€ŗā¤¨ ā¤Ŗā¤šđ‘€Ēā¤šđ‘€¯'
  - ' đ‘€Ģā¤šđ‘Ĩā¤šđ‘€žā¤š ā¤šđ‘€¤ā¤šā¤ĸā¤Ēā¤šđ‘€Ē𑀱ā¤š ā¤Ŗā¤šđ‘€Ÿā¤š đ‘€Ŗā¤š 𑀱ā¤šđ‘€Ģā¤šā¤˛ā¤š 𑀠ā¤¨đ‘€ŗā¤šđ‘€ đ‘€ ā¤šđ‘€Ÿ ā¤š ā¤¤đ‘€ĸ𑀞đ‘€ĸ𑀟 ā¤šā¤Ŗā¤Ŗā¤¨đ‘€žā¤šđ‘€Ÿ ā¤Ŗā¤šā¤đ‘€ĸ đ‘€Ŗā¤š ā¤Ēā¤šđ‘€ąā¤šā¤Ŧā¤šđ‘€Ēđ‘€¯'
- source_sentence: ā¤šđ‘€Ÿ
  sentences:
  - 𑀠ā¤¨ā¤Ēā¤¨đ‘€ąā¤š ā¤š đ‘€Ēā¤šđ‘€Ÿā¤šđ‘€Ē ā¤° ā¤Ŧā¤š 𑀱ā¤šā¤Ēā¤šđ‘€Ÿ 𑀠ā¤šā¤Ŗā¤¨đ‘€Ÿ ā¤ đ‘€§đ‘€§ā¤ đ‘€Ļ ā¤šđ‘€žā¤¨ 𑀟ā¤š ā¤¤đ‘€ĸ𑀞đ‘€ĸ𑀟 𑀲ā¤šđ‘€ŗđ‘€ĸ𑀟𑀘đ‘Ŗ𑀘đ‘€ĸ đ‘€Ŧ𑀧 đ‘€Ŗā¤š 𑀞đ‘Ļ ā¤¤đ‘€ĸ𑀞đ‘€ĸ𑀟 𑀱ā¤šđ‘€Ÿđ‘€ĸ 𑀘đ‘€ĸđ‘€Ēā¤Ŧđ‘€ĸ𑀟 đ‘€Ŗā¤š ā¤Ŗā¤š ā¤Ŗđ‘€ĸ đ‘€Ģā¤šā¤Ēđ‘€ŗā¤šđ‘€Ēđ‘€ĸ𑀟 𑀠đ‘€ĸ𑀟ā¤Ēā¤¨đ‘€Ÿā¤š 𑀞ā¤šā¤žā¤šđ‘€Ÿ ā¤ĸā¤šā¤Ŗā¤šđ‘€Ÿ ā¤Ēā¤šđ‘€ŗđ‘€Ģđ‘€ĸ𑀟đ‘€ŗā¤š ā¤š 𑀞ā¤šđ‘€Ÿđ‘Ŗđ‘€¯
  - ' ā¤šđ‘€Ÿ ā¤Ŗđ‘€ĸ đ‘€ĸ𑀠ā¤šđ‘€Ÿđ‘€ĸ𑀟 đ‘€ŗđ‘€¯'
  - ' 𑀲ā¤šđ‘€Ģā¤šđ‘€Ŗ ā¤Ŗā¤š 𑀞ā¤šđ‘€ đ‘€ ā¤šā¤˛ā¤š 𑀞ā¤šđ‘€žā¤šđ‘€Ē ā¤ đ‘€§đ‘€­ā¤ ā¤Ÿđ‘€­đ‘€° đ‘€Ŗā¤š 𑀞𑀱ā¤šā¤˛ā¤˛ā¤šā¤Ŗđ‘Ļ 𑀭𑀧 𑀠ā¤šđ‘€ŗā¤¨ ā¤ĸā¤šđ‘€Ÿ đ‘€ŗđ‘€Ģā¤šđ‘€™ā¤šđ‘€ąā¤š ā¤š 𑀱ā¤šđ‘€ŗā¤šđ‘€Ÿđ‘€Ÿđ‘€ĸ ā¤ đ‘ĸ ā¤š đ‘€Ŗā¤¨đ‘€ž ā¤Ŧā¤šđ‘€ŗā¤šđ‘€¯'
- source_sentence: ā¤Ŧđ‘€Ģđ‘Ŗđ‘€ŗā¤Ē đ‘€ĸđ‘€ĸ đ‘€ŗđ‘€Ģđ‘€ĸ𑀟đ‘Ļ 𑀠ā¤šđ‘€˛đ‘€ĸ 𑀠ā¤šđ‘€Ģđ‘€ĸ𑀠𑀠ā¤šđ‘€Ÿā¤¤đ‘€ĸđ‘€Ļ ā¤Ēā¤šđ‘€ĸ𑀠ā¤šđ‘€žđ‘Ŗ𑀟 đ‘€Ŗā¤š 𑀲ā¤šđ‘€ŗā¤šā¤˛ā¤¨ā¤˛ā¤˛ā¤¨đ‘€žā¤š ā¤Ŗā¤šđ‘€Ÿā¤š ā¤ĸā¤š 𑀠ā¤šđ‘€¤ā¤šā¤¨đ‘€Ÿā¤š ā¤¤đ‘€ĸ𑀞đ‘€ĸ𑀟 đ‘€Ģā¤šđ‘€Ÿđ‘€žā¤šā¤˛đ‘€ĸ ā¤Ŗā¤šā¤Ŗđ‘€ĸ𑀟
  sentences:
  - ā¤šđ‘€ đ‘€ĸ𑀟ā¤Ēā¤šā¤¤ā¤¤đ‘€ĸā¤Ŗā¤š ā¤š ā¤¤đ‘€ĸ𑀞đ‘€ĸ𑀟 ā¤Ŧđ‘€Ģđ‘Ŗđ‘€ŗā¤Ē đ‘€ŗđ‘Ļđ‘€Ēđ‘€ĸđ‘Ļđ‘€ŗ đ‘€ĸđ‘€ĸ đ‘€ŗđ‘€Ģđ‘€ĸ𑀟đ‘Ļ 𑀠ā¤šđ‘€˛đ‘€ĸ 𑀠ā¤šđ‘€Ģđ‘€ĸ𑀠𑀠ā¤šđ‘€Ÿā¤¤đ‘€ĸđ‘€Ļ ā¤Ēā¤šđ‘€Ēđ‘Ļ đ‘€Ŗā¤š ā¤žđ‘€ĸ𑀠ā¤ĸđ‘€ĸ𑀟 đ‘€ĸ𑀟ā¤Ŧā¤šđ‘€Ÿā¤Ēā¤šā¤Ēā¤Ēā¤¨đ‘€Ÿ ā¤Ēđ‘€ŗā¤šđ‘€Ēđ‘€ĸ𑀟 ā¤Ēā¤šđ‘€ĸ𑀠ā¤šđ‘€žđ‘Ŗ𑀟 đ‘€Ŗđ‘€ĸđ‘€Ēđ‘Ļā¤ĸā¤š đ‘€Ŗā¤š 𑀲ā¤šđ‘€ŗā¤šā¤˛ā¤¨ā¤˛ā¤˛ā¤¨đ‘€žā¤š 𑀟ā¤š ā¤šđ‘€ đ‘€ĸ𑀟ā¤¤đ‘€ĸđ‘€Ļ ā¤Ŗā¤šđ‘€Ÿā¤š ā¤ĸā¤š 𑀠ā¤šđ‘€¤ā¤šā¤¨đ‘€Ÿā¤š ā¤¤đ‘€ĸ𑀞đ‘€ĸ𑀟 𑀞𑀱ā¤šđ‘€Ÿā¤¤đ‘€ĸā¤Ŗā¤šđ‘€Ē đ‘€Ģā¤šđ‘€Ÿđ‘€žā¤šā¤˛đ‘€ĸ ā¤Ŗā¤šā¤Ŗđ‘€ĸ𑀟 ā¤Ēā¤šđ‘€˛đ‘€ĸā¤Ŗā¤šđ‘€Ēđ‘€ŗā¤¨đ‘€¯
  - ā¤Ēđ‘Ŗā¤§đ‘€ŗā¤Ŗ ā¤§đ‘€Ģđ‘€ĸđ‘€Ēđ‘€ĸ 𑀝ā¤šđ‘€Ÿ đ‘€Ģā¤šđ‘€ĸ𑀲đ‘Ļ đ‘€ŗđ‘€Ģđ‘€ĸ ā¤š đ‘€Ēā¤šđ‘€Ÿā¤šđ‘€Ē 𑀭𑀭 ā¤Ŧā¤š 𑀱ā¤šā¤Ēā¤šđ‘€Ÿ ā¤šā¤Ŧā¤¨đ‘€ŗā¤Ēā¤š 𑀭ā¤Ĩ𑀗𑀧𑀮 ā¤žā¤šđ‘€Ÿ 𑀱ā¤šđ‘€ŗā¤šđ‘€Ÿ ā¤ĸā¤šđ‘€Ŗ𑀠đ‘€ĸ𑀟ā¤Ēđ‘Ŗ𑀟 ā¤žā¤šđ‘€Ÿ 𑀤ā¤šđ‘€ ā¤ĸđ‘€ĸā¤š 𑀟đ‘Ļđ‘€¯
  - ā¤Ēā¤šā¤Ŧā¤Ŧā¤šđ‘€˛ā¤šđ‘€Ŗđ‘€ĸ 𑀠ā¤šā¤Ēđ‘€ŗā¤¨ā¤Ŧā¤¨đ‘€Ÿđ‘€ĸ𑀟 𑀠ā¤¨ā¤Ēā¤šđ‘€Ÿđ‘Ļ 𑀟đ‘Ļ ā¤š đ‘€ŗā¤šđ‘€ŗđ‘€Ģđ‘Ļ𑀟 ā¤šđ‘€Ēā¤˛đ‘€ĸā¤Ē đ‘€Ŗā¤šđ‘€žđ‘Ļ ā¤Ŗā¤šđ‘€Ÿđ‘€žđ‘€ĸ𑀟 ā¤šā¤Ŧā¤šđ‘€Ŗđ‘Ļ𑀤 ā¤š ā¤šđ‘€Ēđ‘Ļ𑀱ā¤š ā¤Ēā¤š ā¤Ēđ‘€ŗā¤šđ‘€žđ‘€ĸā¤Ŗā¤šđ‘€Ē 𑀟đ‘€ĸ𑀘ā¤šđ‘€Ēđ‘€¯
---

# SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co./sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2). It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co./sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co./models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("T-Blue/tsdae_pro_MiniLM_L12_2")
# Run inference
sentences = [
    'ā¤Ŧđ‘€Ģđ‘Ŗđ‘€ŗā¤Ē đ‘€ĸđ‘€ĸ đ‘€ŗđ‘€Ģđ‘€ĸ𑀟đ‘Ļ 𑀠ā¤šđ‘€˛đ‘€ĸ 𑀠ā¤šđ‘€Ģđ‘€ĸ𑀠𑀠ā¤šđ‘€Ÿā¤¤đ‘€ĸđ‘€Ļ ā¤Ēā¤šđ‘€ĸ𑀠ā¤šđ‘€žđ‘Ŗ𑀟 đ‘€Ŗā¤š 𑀲ā¤šđ‘€ŗā¤šā¤˛ā¤¨ā¤˛ā¤˛ā¤¨đ‘€žā¤š ā¤Ŗā¤šđ‘€Ÿā¤š ā¤ĸā¤š 𑀠ā¤šđ‘€¤ā¤šā¤¨đ‘€Ÿā¤š ā¤¤đ‘€ĸ𑀞đ‘€ĸ𑀟 đ‘€Ģā¤šđ‘€Ÿđ‘€žā¤šā¤˛đ‘€ĸ ā¤Ŗā¤šā¤Ŗđ‘€ĸ𑀟',
    'ā¤šđ‘€ đ‘€ĸ𑀟ā¤Ēā¤šā¤¤ā¤¤đ‘€ĸā¤Ŗā¤š ā¤š ā¤¤đ‘€ĸ𑀞đ‘€ĸ𑀟 ā¤Ŧđ‘€Ģđ‘Ŗđ‘€ŗā¤Ē đ‘€ŗđ‘Ļđ‘€Ēđ‘€ĸđ‘Ļđ‘€ŗ đ‘€ĸđ‘€ĸ đ‘€ŗđ‘€Ģđ‘€ĸ𑀟đ‘Ļ 𑀠ā¤šđ‘€˛đ‘€ĸ 𑀠ā¤šđ‘€Ģđ‘€ĸ𑀠𑀠ā¤šđ‘€Ÿā¤¤đ‘€ĸđ‘€Ļ ā¤Ēā¤šđ‘€Ēđ‘Ļ đ‘€Ŗā¤š ā¤žđ‘€ĸ𑀠ā¤ĸđ‘€ĸ𑀟 đ‘€ĸ𑀟ā¤Ŧā¤šđ‘€Ÿā¤Ēā¤šā¤Ēā¤Ēā¤¨đ‘€Ÿ ā¤Ēđ‘€ŗā¤šđ‘€Ēđ‘€ĸ𑀟 ā¤Ēā¤šđ‘€ĸ𑀠ā¤šđ‘€žđ‘Ŗ𑀟 đ‘€Ŗđ‘€ĸđ‘€Ēđ‘Ļā¤ĸā¤š đ‘€Ŗā¤š 𑀲ā¤šđ‘€ŗā¤šā¤˛ā¤¨ā¤˛ā¤˛ā¤¨đ‘€žā¤š 𑀟ā¤š ā¤šđ‘€ đ‘€ĸ𑀟ā¤¤đ‘€ĸđ‘€Ļ ā¤Ŗā¤šđ‘€Ÿā¤š ā¤ĸā¤š 𑀠ā¤šđ‘€¤ā¤šā¤¨đ‘€Ÿā¤š ā¤¤đ‘€ĸ𑀞đ‘€ĸ𑀟 𑀞𑀱ā¤šđ‘€Ÿā¤¤đ‘€ĸā¤Ŗā¤šđ‘€Ē đ‘€Ģā¤šđ‘€Ÿđ‘€žā¤šā¤˛đ‘€ĸ ā¤Ŗā¤šā¤Ŗđ‘€ĸ𑀟 ā¤Ēā¤šđ‘€˛đ‘€ĸā¤Ŗā¤šđ‘€Ēđ‘€ŗā¤¨đ‘€¯',
    'ā¤Ēđ‘Ŗā¤§đ‘€ŗā¤Ŗ ā¤§đ‘€Ģđ‘€ĸđ‘€Ēđ‘€ĸ 𑀝ā¤šđ‘€Ÿ đ‘€Ģā¤šđ‘€ĸ𑀲đ‘Ļ đ‘€ŗđ‘€Ģđ‘€ĸ ā¤š đ‘€Ēā¤šđ‘€Ÿā¤šđ‘€Ē 𑀭𑀭 ā¤Ŧā¤š 𑀱ā¤šā¤Ēā¤šđ‘€Ÿ ā¤šā¤Ŧā¤¨đ‘€ŗā¤Ēā¤š 𑀭ā¤Ĩ𑀗𑀧𑀮 ā¤žā¤šđ‘€Ÿ 𑀱ā¤šđ‘€ŗā¤šđ‘€Ÿ ā¤ĸā¤šđ‘€Ŗ𑀠đ‘€ĸ𑀟ā¤Ēđ‘Ŗ𑀟 ā¤žā¤šđ‘€Ÿ 𑀤ā¤šđ‘€ ā¤ĸđ‘€ĸā¤š 𑀟đ‘Ļđ‘€¯',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
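The same embeddings also drive the retrieval-style use cases listed above. Below is a minimal semantic-search sketch using the library's `util.semantic_search` helper; the corpus and query strings are placeholders, not examples from the training data:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("T-Blue/tsdae_pro_MiniLM_L12_2")

# Placeholder corpus and query; substitute real sentences in the model's script.
corpus = ["first document ...", "second document ...", "third document ..."]
query = "a query sentence ..."

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank all corpus entries by cosine similarity to the query; keep the top 2.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(corpus[hit["corpus_id"]], hit["score"])
```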
## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 64,000 training samples
* Columns: sentence_0 and sentence_1
* Approximate statistics based on the first 1000 samples:
  |         | sentence_0 | sentence_1 |
  |:--------|:-----------|:-----------|
  | type    | string     | string     |
  | details |            |            |
* Samples:
  | sentence_0 | sentence_1 |
  |:-----------|:-----------|
  | 𑀞ā¤¨đ‘€Ŗā¤¨ ā¤ĸđ‘€ĸđ‘€Ē𑀟đ‘€ĸ𑀟đ‘€Ļ𑀞ā¤¨đ‘€ŗā¤š ā¤Ēđ‘Ļ𑀞ā¤¨đ‘€Ÿ | ā¤Ēđ‘Ļ𑀞ā¤¨đ‘€Ÿ ā¤Ēā¤šā¤Ŧā¤š ā¤Ŗā¤šđ‘€Ÿā¤š 𑀞ā¤¨đ‘€Ŗā¤¨ đ‘€Ŗā¤š ā¤ĸđ‘€ĸđ‘€Ē𑀟đ‘€ĸ𑀟đ‘€Ļ𑀞ā¤¨đ‘€ŗā¤š đ‘€Ŗā¤š ā¤Ēđ‘Ļ𑀞ā¤¨đ‘€Ÿ ā¤Ēā¤šā¤¤đ‘€Ģđ‘Ŗā¤Ŧā¤šđ‘€¯ |
  | ā¤š ā¤¤đ‘€ĸā¤ĸđ‘€ĸā¤Ŗđ‘Ŗā¤Ŗđ‘€ĸ𑀟 đ‘€ŗā¤šđ‘€Ŗā¤šđ‘€Ē𑀱ā¤šđ‘€Ē đ‘€ŗā¤¨ ā¤ā¤šđ‘€Ēā¤š 𑀠ā¤šā¤Ēđ‘€ŗā¤šā¤Ŗđ‘€ĸ𑀟 | ā¤šā¤ĸđ‘Ŗ𑀞ā¤šđ‘€ĸ𑀞ā¤šđ‘€ ā¤šđ‘€Ē ā¤š ā¤Ŗā¤šđ‘€ąā¤šđ‘€Ÿā¤¤đ‘€ĸ𑀟 ā¤¤đ‘€ĸā¤ĸđ‘€ĸā¤Ŗđ‘Ŗā¤Ŗđ‘€ĸ𑀟 đ‘€ŗā¤šđ‘€Ŗā¤šđ‘€Ē𑀱ā¤šđ‘€Ē 𑀘ā¤šđ‘€ ā¤šđ‘€™ā¤šđ‘€Ļ 𑀠ā¤šđ‘€ŗā¤¨ ā¤šđ‘€ đ‘€˛ā¤šđ‘€Ÿđ‘€ĸ 𑀤ā¤š đ‘€ŗā¤¨ đ‘€ĸā¤Ŗā¤š ā¤ā¤šđ‘€Ēā¤š 𑀠ā¤¨ā¤Ēā¤šđ‘€Ÿđ‘Ļ ā¤š 𑀠ā¤šā¤Ēđ‘€ŗā¤šā¤Ŗđ‘€ĸ𑀟 ā¤šā¤ĸđ‘Ŗ𑀞ā¤šđ‘€Ÿđ‘€ŗā¤¨đ‘€¯ |
  | đ‘€Ŗā¤š ā¤Ŧā¤¨đ‘€Ŗā¤¨đ‘€ đ‘€ ā¤šđ‘€ąā¤š 𑀘ā¤šđ‘€Ēđ‘€ĸđ‘€Ŗā¤¨đ‘€Ÿ 𑀠ā¤¨đ‘€˜ā¤šā¤˛ā¤˛ā¤¨ ā¤Ēā¤š đ‘€¯ | ā¤Ēā¤š ā¤ĸā¤š đ‘€Ŗā¤š ā¤Ŧā¤¨đ‘€Ŗā¤¨đ‘€ đ‘€ ā¤šđ‘€ąā¤š ā¤Ŧā¤š 𑀘ā¤šđ‘€Ēđ‘€ĸđ‘€Ŗā¤¨đ‘€Ÿ ā¤šđ‘€Ÿā¤šđ‘€Ēā¤¤đ‘€Ģđ‘€ĸđ‘€ŗā¤Ē đ‘€Ŗā¤šā¤ĸā¤šđ‘€Ÿā¤ˇđ‘€Ŗā¤šā¤ĸā¤šđ‘€Ÿ đ‘€Ŗā¤š 𑀠ā¤¨đ‘€˜ā¤šā¤˛ā¤˛ā¤¨ 𑀠ā¤šđ‘€ŗā¤¨ ā¤šā¤˛ā¤šā¤ā¤š đ‘€Ŗā¤š ā¤ā¤¨đ‘€Ÿā¤Ŧđ‘€ĸā¤Ŗā¤šđ‘€Ē 𑀠ā¤šđ‘€™ā¤šđ‘€ĸ𑀞ā¤šā¤Ēā¤š 𑀙ā¤Ŗā¤šđ‘€Ÿā¤¤đ‘€ĸ ā¤Ēā¤š 𑀘ā¤šđ‘€ ā¤¨đ‘€ŗ đ‘€¯ |
* Loss: [DenoisingAutoEncoderLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#denoisingautoencoderloss)
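In each pair above, sentence_0 is a corrupted copy of the original text in sentence_1, which is the input format `DenoisingAutoEncoderLoss` expects: the model encodes the corrupted sentence into a single vector, and a decoder tied to the encoder weights is trained to reconstruct the original from that vector. A sketch of the typical setup follows; the 0.6 deletion ratio is the library default, not a confirmed setting of this particular run, and `DenoisingAutoEncoderDataset.delete` additionally requires `nltk`:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.datasets import DenoisingAutoEncoderDataset
from sentence_transformers.losses import DenoisingAutoEncoderLoss

# Deletion noise: randomly drop ~60% of the tokens to produce the corrupted
# sentence_0 from an intact sentence_1 (uses nltk for tokenization).
corrupted = DenoisingAutoEncoderDataset.delete("an example sentence for TSDAE", del_ratio=0.6)

model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
# With tie_encoder_decoder=True the decoder shares weights with the encoder and
# learns to reconstruct the original sentence from the pooled embedding.
loss = DenoisingAutoEncoderLoss(model, tie_encoder_decoder=True)
```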
### Training Hyperparameters

#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters

<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

</details>
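For context, a comparable run can be assembled with the `SentenceTransformerTrainer` API from the framework versions listed below. This is a sketch under the hyperparameters above, with an illustrative stand-in dataset; the actual 64,000-pair corpus is not published with this card:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import DenoisingAutoEncoderLoss

model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")

# Stand-in for the real (corrupted, original) pair dataset.
train_dataset = Dataset.from_dict({
    "sentence_0": ["corrupted text ..."],
    "sentence_1": ["original text ..."],
})

args = SentenceTransformerTrainingArguments(
    output_dir="tsdae_pro_MiniLM_L12_2",
    num_train_epochs=3,              # 64,000 pairs / batch 16 = 4,000 steps per epoch, matching the logs
    per_device_train_batch_size=16,  # non-default value listed above
    learning_rate=5e-5,              # default, as listed above
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=DenoisingAutoEncoderLoss(model, tie_encoder_decoder=True),
)
trainer.train()
```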
### Training Logs

| Epoch | Step  | Training Loss |
|:-----:|:-----:|:-------------:|
| 0.125 | 500   | 2.5392        |
| 0.25  | 1000  | 1.4129        |
| 0.375 | 1500  | 1.3383        |
| 0.5   | 2000  | 1.288         |
| 0.625 | 2500  | 1.2627        |
| 0.75  | 3000  | 1.239         |
| 0.875 | 3500  | 1.2208        |
| 1.0   | 4000  | 1.2041        |
| 1.125 | 4500  | 1.1743        |
| 1.25  | 5000  | 1.1633        |
| 1.375 | 5500  | 1.1526        |
| 1.5   | 6000  | 1.1375        |
| 1.625 | 6500  | 1.1313        |
| 1.75  | 7000  | 1.1246        |
| 1.875 | 7500  | 1.1162        |
| 2.0   | 8000  | 1.1096        |
| 2.125 | 8500  | 1.0876        |
| 2.25  | 9000  | 1.0839        |
| 2.375 | 9500  | 1.0791        |
| 2.5   | 10000 | 1.0697        |
| 2.625 | 10500 | 1.0671        |
| 2.75  | 11000 | 1.0644        |
| 2.875 | 11500 | 1.0579        |
| 3.0   | 12000 | 1.0528        |

### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.42.4
- PyTorch: 2.3.1+cu121
- Accelerate: 0.33.0
- Datasets: 2.18.0
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### DenoisingAutoEncoderLoss
```bibtex
@inproceedings{wang-2021-TSDAE,
    title = "TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning",
    author = "Wang, Kexin and Reimers, Nils and Gurevych, Iryna",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    pages = "671--688",
    url = "https://arxiv.org/abs/2104.06979",
}
```