SentenceTransformer based on cl-nagoya/sup-simcse-ja-base

This is a sentence-transformers model finetuned from cl-nagoya/sup-simcse-ja-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: cl-nagoya/sup-simcse-ja-base
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
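
The module list above shows a BERT encoder followed by CLS-token pooling (mean, max, and last-token pooling are all disabled, and no normalization layer is applied). As a rough illustration only, the sketch below reproduces that behaviour with plain Hugging Face transformers; it assumes the repository also exposes the underlying BertModel and tokenizer weights, which is typical for sentence-transformers repositories, so the result should match model.encode() up to numerical precision.

import torch
from transformers import AutoModel, AutoTokenizer

repo = "Detomo/cl-nagoya-sup-simcse-ja-for-standard-name-v1_0"
tokenizer = AutoTokenizer.from_pretrained(repo)
encoder = AutoModel.from_pretrained(repo)

batch = tokenizer(
    ["科目:土工。名称:水替。"],
    padding=True, truncation=True, max_length=512, return_tensors="pt",
)
with torch.no_grad():
    hidden = encoder(**batch).last_hidden_state  # (batch, seq_len, 768)

# pooling_mode_cls_token=True: the sentence embedding is the [CLS] vector
embedding = hidden[:, 0]  # (batch, 768)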

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-for-standard-name-v1_0")
# Run inference
sentences = [
    '科目:土工。名称:水替。',
    '科目:既製コンクリート。名称:押出成形セメント板水抜パイプ。',
    '科目:既製コンクリート。名称:地下二重壁押出成型セメントパネル足元金物。',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
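
The same embeddings support a small semantic-search loop. Continuing the snippet above (the model is already loaded), the sketch below ranks a toy corpus against a query with model.similarity; the query string is a hypothetical example in the same 「科目:…。名称:…。」 format, not taken from the training data.

corpus = [
    "科目:土工。名称:水替。",
    "科目:既製コンクリート。名称:押出成形セメント板水抜パイプ。",
    "科目:共通仮設費。名称:仮囲い。",
]
query = "科目:土工。名称:排水。"  # hypothetical query string

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode([query])

# model.similarity applies the model's similarity function (cosine)
scores = model.similarity(query_embedding, corpus_embeddings)  # shape [1, 3]
best = int(scores.argmax(dim=1))
print(corpus[best], float(scores[0, best]))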

Training Details

Training Dataset

Unnamed Dataset

  • Size: 5,777 training samples
  • Columns: sentence and label
  • Approximate statistics based on the first 1000 samples:
    • sentence: string; min: 11 tokens, mean: 17.53 tokens, max: 29 tokens
    • label: int; 695 classes (0–694). Most classes each account for roughly 0.10–0.30% of the samples, a handful reach 0.40–1.40%, and the most frequent is label 213 at ~2.30%.
  • Samples:
    • 科目:共通仮設費。名称:仮囲い。 (label: 0)
    • 科目:共通仮設費。名称:電動パネルゲート。 (label: 1)
    • 科目:共通仮設費。名称:タワークレーン。 (label: 2)
  • Loss: BatchAllTripletLoss
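
BatchAllTripletLoss treats every pair of sentences that share a label as anchor/positive and every sentence with a different label as a negative, mining all valid triplets within each batch. The following is a minimal sketch of how such a dataset and loss are wired together, not the exact training script; the two example rows are taken from the samples above.

from datasets import Dataset
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("cl-nagoya/sup-simcse-ja-base")

# One "sentence" column plus an integer "label" column, as described above
train_dataset = Dataset.from_dict({
    "sentence": [
        "科目:共通仮設費。名称:仮囲い。",
        "科目:共通仮設費。名称:タワークレーン。",
    ],
    "label": [0, 2],
})

loss = losses.BatchAllTripletLoss(model)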

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 200
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: group_by_label
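
Continuing the sketch from the Training Dataset section, the non-default values above map directly onto SentenceTransformerTrainingArguments. The output directory below is illustrative, and group_by_label corresponds to the GroupByLabel batch sampler, which packs several same-label sentences into each batch so the triplet loss has in-batch positives.

from sentence_transformers import SentenceTransformerTrainer, SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # illustrative path
    per_device_train_batch_size=256,
    per_device_eval_batch_size=256,
    learning_rate=1e-5,
    weight_decay=0.01,
    num_train_epochs=200,
    warmup_ratio=0.1,
    fp16=True,
    batch_sampler=BatchSamplers.GROUP_BY_LABEL,
)

trainer = SentenceTransformerTrainer(
    model=model,  # model, train_dataset and loss from the sketch above
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()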

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 200
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: group_by_label
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
3.0870 20 0.8892
6.1739 40 0.8935
9.2609 60 0.862
13.0870 80 0.803
16.1739 100 0.8154
19.2609 120 0.7741
23.0870 140 0.7383
26.1739 160 0.7381
29.2609 180 0.7082
33.0870 200 0.6593
36.1739 220 0.6816
39.2609 240 0.6507
43.0870 260 0.6357
46.1739 280 0.643
49.2609 300 0.6336
53.0870 320 0.6392
56.1739 340 0.6153
59.2609 360 0.6385
63.0870 380 0.6034
66.1739 400 0.6194
69.2609 420 0.6334
73.0870 440 0.5934
76.1739 460 0.6216
79.2609 480 0.6211
83.0870 500 0.5974
86.1739 520 0.6612
89.2609 540 0.5143
93.0870 560 0.5871
96.1739 580 0.5752
99.2609 600 0.5661
103.0870 620 0.5879
106.1739 640 0.5866
109.2609 660 0.5677
113.0870 680 0.4864
116.1739 700 0.5891
119.2609 720 0.617
123.0870 740 0.5785
126.1739 760 0.534
129.2609 780 0.5854
133.0870 800 0.5971
136.1739 820 0.5309
139.2609 840 0.5514
143.0870 860 0.5656
146.1739 880 0.5106
149.2609 900 0.4831
153.0870 920 0.497
156.1739 940 0.4606
159.2609 960 0.4699
163.0870 980 0.5007
166.1739 1000 0.5483
169.2609 1020 0.4527
173.0870 1040 0.448
176.1739 1060 0.4639
179.2609 1080 0.6067
183.0870 1100 0.4516
186.1739 1120 0.4747
189.2609 1140 0.4732
193.0870 1160 0.5844
196.1739 1180 0.4461
199.2609 1200 0.4609

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.3.1
  • Transformers: 4.48.3
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

BatchAllTripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}