base_model: BAAI/bge-base-en-v1.5
datasets:
  - sentence-transformers/hotpotqa
language:
  - en
library_name: sentence-transformers
license: apache-2.0
metrics:
  - cosine_accuracy
  - dot_accuracy
  - manhattan_accuracy
  - euclidean_accuracy
  - max_accuracy
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:76064
  - loss:MatryoshkaLoss
  - loss:TripletLoss
widget:
  - source_sentence: >-
      Which survey in 2010 recommended the The Wide Field Infrared Survey
      Telescope as the top priority for astronomy?
    sentences:
      - >-
        High Energy Astronomy Observatory 1 HEAO-1 was an X-ray telescope
        launched in 1977. HEAO-1 surveyed the sky in the X-ray portion of the
        electromagnetic spectrum (0.2 keV - 10 MeV), providing nearly constant
        monitoring of X-ray sources near the ecliptic poles and more detailed
        studies of a number of objects by observations lasting 3-6 hours. It was
        the first of NASA's three High Energy Astronomy Observatories, HEAO 1,
        launched August 12, 1977 aboard an Atlas rocket with a Centaur upper
        stage, operated until 9 January 1979. During that time, it scanned the
        X-ray sky almost three times
      - >-
        Wide Field Infrared Survey Telescope The Wide Field Infrared Survey
        Telescope (WFIRST) is a future infrared space observatory that was
        recommended in 2010 by United States National Research Council Decadal
        Survey committee as the top priority for the next decade of astronomy.
        On February 17, 2016, WFIRST was formally designated as a mission by
        NASA.
      - >-
        The Bluebells The Bluebells were a Scottish indie rock band, active
        between 1981 and 1986 (later briefly reforming in 1993, 2008–2009 and
        2011).
  - source_sentence: >-
      Near what river is the library that contains the Aberdeen Bestiary
      located?
    sentences:
      - >-
        Joseph Roth Joseph Roth, born Moses Joseph Roth (2 September 1894 – 27
        May 1939), was an Austrian-Jewish journalist and novelist, best known
        for his family saga "Radetzky March" (1932), about the decline and fall
        of the Austro-Hungarian Empire, his novel of Jewish life, "Job" (1930),
        and his seminal essay "Juden auf Wanderschaft" (1927; translated into
        English in "The Wandering Jews"), a fragmented account of the Jewish
        migrations from eastern to western Europe in the aftermath of World War
        I and the Russian Revolution. In the 21st century, publications in
        English of "Radetzky March" and of collections of his journalism from
        Berlin and Paris created a revival of interest in Roth.
      - >-
        Aberdeen Bestiary The Aberdeen Bestiary (Aberdeen University Library,
        Univ Lib. MS 24) is a 12th-century English illuminated manuscript
        bestiary that was first listed in 1542 in the inventory of the Old Royal
        Library at the Palace of Westminster.
      - >-
        House of Monymusk The House of Monymusk is located on the outskirts of
        the Scottish village of Monymusk, in the Marr region of Aberdeenshire.
        The house is located near the "river Don", which is known for its
        spectacular trout-fishing. The village, which history dates back to
        1170, was bought by the Forbses in the 1560s, who later built the House
        of Monymusk. The Forbses claim they built the present House of Monymusk
        from the blackened stones of the old Priory.
  - source_sentence: >-
      The Stage" is a song by Avenged Sevenfold and the first single from their
      seventh studio album of the same name, which was released on which date?
    sentences:
      - >-
        Allegra Stratton Allegra Stratton (born 25 November 1980) is a British
        journalist and writer. Since January 2016, she has been the National
        Editor of ITV News after four years as political editor on BBC Two's
        "Newsnight". She has also co-presented "Peston on Sunday" with Robert
        Peston since May 2016.
      - >-
        Appetite for Destruction Appetite for Destruction is the debut studio
        album by American hard rock band Guns N' Roses. It was released on July
        21, 1987, by Geffen Records to massive commercial success. It topped the
        "Billboard" 200 and became the best-selling debut album as well as the
        11th best-selling album in the United States. With about 30 million
        copies sold worldwide, it is also one of the best-selling records ever.
        Although critics were ambivalent toward the album when it was first
        released, "Appetite for Destruction" has since received retrospective
        acclaim and been viewed as one of the greatest albums of all time.
      - >-
        The Stage (Avenged Sevenfold song) "The Stage" is a song by Avenged
        Sevenfold and the first single from their seventh studio album of the
        same name, which was released on October 28, 2016.
  - source_sentence: >-
      Union County Speedway is home to what type of motorsports that are usually
      held at county fairs and festivals?
    sentences:
      - >-
        Long Beach, New York Long Beach is a city in Nassau County, New York,
        United States. Just south of Long Island, it is located on Long Beach
        Barrier Island, which is the westernmost of the outer barrier islands
        off Long Island's South Shore. As of the United States 2010 Census, the
        city population was 33,275. It was incorporated in 1922, and is
        nicknamed "The City By the Sea" (as seen in Latin on its official seal).
      - >-
        Union County Speedway Union County Speedway is a dirt racetrack in
        Liberty, Indiana, United States. It features races with cars such as,
        late models, Modifieds, Sidestroke, Bombers, Road Hogs, and Street
        Stocks. UCS is also host to dirtbike, quad, Mini-Sprint, and Demolition
        Derbies.
      - >-
        Mercer County Fairgrounds The Mercer County Fairgrounds, located on 12th
        Avenue SW in Aledo, are the home of the annual county fair in Mercer
        County, Illinois. The fairgrounds were established in 1869 when the fair
        moved to Aledo; from its creation in 1853 until then, it had taken place
        in Millersburg. The early fairs mainly focused on agricultural
        exhibitions, and the first two buildings were used for horticulture
        exhibits and household floral shows; these fairs also included
        entertainment such as baseball games and band concerts. By the end of
        the century, the fair had grown to host 8,000 visitors, many who came
        from neighboring counties by train, and show 3,000 entries in its
        various agricultural competitions. The fair added traveling
        entertainment and grew to host over 20,000 visitors in the 20th century;
        it is still held annually at the fairgrounds. In addition to the county
        fair, the fairgrounds have also held horse races, political events,
        picnics, and other community events.
  - source_sentence: >-
      James D. Farley, Jr. had an early interest in automobiles because of his
      grandfather who worked for what company?
    sentences:
      - >-
        Continental Motors Company Continental Motors Company was an American
        manufacturer of internal combustion engines. The company produced
        engines as a supplier to many independent manufacturers of automobiles,
        tractors, trucks, and stationary equipment (such as pumps, generators,
        and industrial machinery drives) from the 1900s through the 1960s.
        Continental Motors also produced Continental-branded automobiles in
        1932–1933. The Continental Aircraft Engine Company was formed in 1929 to
        develop and produce its aircraft engines, and would become the core
        business of Continental Motors, Inc.
      - >-
        The Atlantic The Atlantic is an American magazine and multi-platform
        publisher, founded in 1857 as The Atlantic Monthly in Boston,
        Massachusetts.
      - >-
        Jim Farley (businessman) James D. Farley, Jr. (born June 1962) is an
        American automobile executive that currently serves as Ford Motor
        Company's Executive Vice President and president, Global Markets since
        June 2017. From 2015 to 2017, he was CEO and Chairman of Ford Europe. He
        had an early interest in automobiles, primarily spurred from his
        grandfather who worked at Henry Ford's River Rouge Plant starting in
        1914.
model-index:
  - name: BGE-base-en-v1.5-Hotpotqa
    results:
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: dim 768
          type: dim_768
        metrics:
          - type: cosine_accuracy
            value: 0.9068859441552295
            name: Cosine Accuracy
          - type: dot_accuracy
            value: 0.09311405584477046
            name: Dot Accuracy
          - type: manhattan_accuracy
            value: 0.9066493137718883
            name: Manhattan Accuracy
          - type: euclidean_accuracy
            value: 0.9068859441552295
            name: Euclidean Accuracy
          - type: max_accuracy
            value: 0.9068859441552295
            name: Max Accuracy
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: dim 512
          type: dim_512
        metrics:
          - type: cosine_accuracy
            value: 0.9074775201135826
            name: Cosine Accuracy
          - type: dot_accuracy
            value: 0.09311405584477046
            name: Dot Accuracy
          - type: manhattan_accuracy
            value: 0.9055844770468529
            name: Manhattan Accuracy
          - type: euclidean_accuracy
            value: 0.9064126833885471
            name: Euclidean Accuracy
          - type: max_accuracy
            value: 0.9074775201135826
            name: Max Accuracy
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: dim 256
          type: dim_256
        metrics:
          - type: cosine_accuracy
            value: 0.907359204921912
            name: Cosine Accuracy
          - type: dot_accuracy
            value: 0.09311405584477046
            name: Dot Accuracy
          - type: manhattan_accuracy
            value: 0.9062943681968765
            name: Manhattan Accuracy
          - type: euclidean_accuracy
            value: 0.9062943681968765
            name: Euclidean Accuracy
          - type: max_accuracy
            value: 0.907359204921912
            name: Max Accuracy
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: dim 128
          type: dim_128
        metrics:
          - type: cosine_accuracy
            value: 0.9060577378135353
            name: Cosine Accuracy
          - type: dot_accuracy
            value: 0.09488878371982963
            name: Dot Accuracy
          - type: manhattan_accuracy
            value: 0.9014434453383815
            name: Manhattan Accuracy
          - type: euclidean_accuracy
            value: 0.9035731187884525
            name: Euclidean Accuracy
          - type: max_accuracy
            value: 0.9060577378135353
            name: Max Accuracy
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: dim 64
          type: dim_64
        metrics:
          - type: cosine_accuracy
            value: 0.9054661618551822
            name: Cosine Accuracy
          - type: dot_accuracy
            value: 0.09808329389493611
            name: Dot Accuracy
          - type: manhattan_accuracy
            value: 0.8983672503549456
            name: Manhattan Accuracy
          - type: euclidean_accuracy
            value: 0.9013251301467108
            name: Euclidean Accuracy
          - type: max_accuracy
            value: 0.9054661618551822
            name: Max Accuracy

BGE-base-en-v1.5-Hotpotqa

This is a sentence-transformers model fine-tuned from BAAI/bge-base-en-v1.5 on the sentence-transformers/hotpotqa dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. It was trained with MatryoshkaLoss and is also evaluated below with embeddings truncated to 512, 256, 128, and 64 dimensions.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: sentence-transformers/hotpotqa
  • Language: en
  • License: apache-2.0

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
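
As a rough illustration of the last two modules, the sketch below applies CLS pooling (keep only the first token's vector) followed by L2 normalization. The `token_embeddings` values and the helper names `cls_pool` and `l2_normalize` are hypothetical and exist only for this example; the real model works on 768-dimensional BERT outputs.

```python
import math

def cls_pool(token_embeddings):
    """CLS pooling: use the first token's vector as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit length, mirroring what a Normalize() step does."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Hypothetical token embeddings for one sentence: 3 tokens, 4 dimensions each.
token_embeddings = [
    [3.0, 4.0, 0.0, 0.0],  # first ([CLS]) token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.2, 0.1, 0.9],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity.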

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

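# Download from the 🤗 Hub.
# "sentence_transformers_model_id" is a placeholder: replace it with this model's Hub repository ID.
model = SentenceTransformer("sentence_transformers_model_id")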
# Run inference
sentences = [
    'James D. Farley, Jr. had an early interest in automobiles because of his grandfather who worked for what company?',
    "Jim Farley (businessman) James D. Farley, Jr. (born June 1962) is an American automobile executive that currently serves as Ford Motor Company's Executive Vice President and president, Global Markets since June 2017. From 2015 to 2017, he was CEO and Chairman of Ford Europe. He had an early interest in automobiles, primarily spurred from his grandfather who worked at Henry Ford's River Rouge Plant starting in 1914.",
    'Continental Motors Company Continental Motors Company was an American manufacturer of internal combustion engines. The company produced engines as a supplier to many independent manufacturers of automobiles, tractors, trucks, and stationary equipment (such as pumps, generators, and industrial machinery drives) from the 1900s through the 1960s. Continental Motors also produced Continental-branded automobiles in 1932–1933. The Continental Aircraft Engine Company was formed in 1929 to develop and produce its aircraft engines, and would become the core business of Continental Motors, Inc.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
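
Since the model was trained with MatryoshkaLoss on dimensions 768, 512, 256, 128, and 64, shorter embeddings can be obtained by keeping the leading components and re-normalizing. The sketch below shows that idea on toy 8-dimensional vectors in plain Python; the vectors and helper names are made up for illustration. In practice, recent sentence-transformers releases expose a `truncate_dim` argument on `SentenceTransformer` that should achieve the same effect, but check the documentation for the version you have installed.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else list(head)

def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 8-dimensional vectors standing in for the model's 768-dimensional output.
query = [0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
document = [0.5, 0.5, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0]

# Compare the similarity at the full size and at two truncated sizes.
for dim in (8, 4, 2):
    q = truncate_and_normalize(query, dim)
    d = truncate_and_normalize(document, dim)
    print(dim, round(cosine_similarity(q, d), 4))
```

Smaller dimensions reduce storage and search cost; the evaluation figures in this card suggest the accuracy loss down to 64 dimensions is small for this model.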

Evaluation

Metrics

Triplet

  • Dataset: dim_768

Metric Value
cosine_accuracy 0.9069
dot_accuracy 0.0931
manhattan_accuracy 0.9066
euclidean_accuracy 0.9069
max_accuracy 0.9069

Triplet

  • Dataset: dim_512

Metric Value
cosine_accuracy 0.9075
dot_accuracy 0.0931
manhattan_accuracy 0.9056
euclidean_accuracy 0.9064
max_accuracy 0.9075

Triplet

  • Dataset: dim_256

Metric Value
cosine_accuracy 0.9074
dot_accuracy 0.0931
manhattan_accuracy 0.9063
euclidean_accuracy 0.9063
max_accuracy 0.9074

Triplet

  • Dataset: dim_128

Metric Value
cosine_accuracy 0.9061
dot_accuracy 0.0949
manhattan_accuracy 0.9014
euclidean_accuracy 0.9036
max_accuracy 0.9061

Triplet

  • Dataset: dim_64

Metric Value
cosine_accuracy 0.9055
dot_accuracy 0.0981
manhattan_accuracy 0.8984
euclidean_accuracy 0.9013
max_accuracy 0.9055
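
The cosine_accuracy figures above are best read as the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative. The following is a minimal sketch of that computation on made-up 2-dimensional vectors; it is not the evaluator's actual code, and the function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cosine_triplet_accuracy(triplets):
    """Fraction of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative)
    )
    return correct / len(triplets)

# Made-up triplets of (anchor, positive, negative) embeddings.
triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),   # positive is closer: correct
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),   # positive is closer: correct
    ([1.0, 1.0], [1.0, -1.0], [1.0, 0.9]),  # negative is closer: wrong
    ([0.5, 0.5], [0.4, 0.6], [-0.5, 0.5]),  # positive is closer: correct
]

accuracy = cosine_triplet_accuracy(triplets)
print(accuracy)
```

If the evaluator was run on the full 8,452-sample evaluation split, a cosine accuracy of 0.9069 at 768 dimensions corresponds to roughly 7,665 correctly ranked triplets.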

Training Details

Training Dataset

sentence-transformers/hotpotqa

  • Dataset: sentence-transformers/hotpotqa at f07d3cd
  • Size: 76,064 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    column     type     min         mean            max
    anchor     string   8 tokens    24.49 tokens    108 tokens
    positive   string   21 tokens   101.27 tokens   512 tokens
    negative   string   14 tokens   87.44 tokens    407 tokens
  • Samples:
    Sample 1
    • anchor: What historical geographic region in Central-Eastern Europe was the birthplace of a soldier of the Austro-Hungarian Army?
    • positive: Bruno Olbrycht Bruno Olbrycht (nom de guerre: Olza; 6 October 1895 – 23 March 1951) was a soldier of the Austro-Hungarian Army and officer (later general) of the Polish Army both in the Second Polish Republic and postwar Poland. Born on 6 October 1895 in Sanok, Austrian Galicia, Olbrycht fought in Polish Legions in World War I, Polish–Ukrainian War, Polish–Soviet War and the Invasion of Poland. He died on 23 March 1951 in Kraków.
    • negative: Padáň The village was first recorded in 1254 as "Padan", an old Pecheneg settlement. On the territory of the village, there used to be "Petény" village as well, which was mentioned in 1298 as the appurtenance of Pressburg Castle. Until the end of World War I, it was part of Hungary and fell within the Dunaszerdahely district of Pozsony County. After the Austro-Hungarian army disintegrated in November 1918, Czechoslovakian troops occupied the area. After the Treaty of Trianon of 1920, the village became officially part of Czechoslovakia. In November 1938, the First Vienna Award granted the area to Hungary and it was held by Hungary until 1945. After Soviet occupation in 1945, Czechoslovakian administration returned and the village became officially part of Czechoslovakia in 1947.

    Sample 2
    • anchor: Full Scale Assault is the fourth studio album by Dutch punk hardcore band Vitamin X, the album was recorded at Electrical Audio in Chicago by Steve Albini who previously recorded The Stooges, also known as Iggy and the Stooges, were an American rock band formed in Ann Arbor, Michigan in what year?
    • positive: Full Scale Assault Full Scale Assault is the fourth studio album by Dutch punk hardcore band Vitamin X. Released through Tankcrimes on October 10, 2008 in the US, and Agipunk in Europe. The album was recorded at Electrical Audio in Chicago by Steve Albini who previously recorded Nirvana, Neurosis, PJ Harvey, High on Fire, Iggy Pop & The Stooges. It features guest vocals from Negative Approach's singer John Brannon. Art is by John Dyer Baizley.
    • negative: The Dogs (US punk band) The Dogs are a three-piece proto-punk band formed in Lansing, Michigan, United States in 1969. They are noted for presaging the energy and sound of the later punk and hardcore genres.

    Sample 3
    • anchor: Which popular music style was a modification of the marches from "The March King" with heavy influences from African American communities?
    • positive: Ragtime Ragtime – also spelled rag-time or rag time – is a musical style that enjoyed its peak popularity between 1895 and 1918. Its cardinal trait is its syncopated, or "ragged", rhythm. The style has its origins in African-American communities in cities such as St. Louis. Ernest Hogan (1865–1909) was a pioneer of ragtime and was the first composer to have his ragtime pieces (or "rags") published as sheet music, beginning with the song "LA Pas Ma LA," published in 1895. Hogan has also been credited for coining the term "ragtime". The term is actually derived from his hometown "Shake Rag" in Bowling Green, Kentucky. Ben Harney, another Kentucky native, has often been credited for introducing the music to the mainstream public. His first ragtime composition, "You've Been a Good Old Wagon But You Done Broke", helped popularize the style. The composition was published in 1895, a few months after Ernest Hogan's "LA Pas Ma LA." Ragtime was also a modification of the march style popularized by John Philip Sousa, with additional polyrhythms coming from African music. Ragtime composer Scott Joplin ("ca." 1868–1917) became famous through the publication of the "Maple Leaf Rag" (1899) and a string of ragtime hits such as "The Entertainer" (1902), although he was later forgotten by all but a small, dedicated community of ragtime aficionados until the major ragtime revival in the early 1970s. For at least 12 years after its publication, "Maple Leaf Rag" heavily influenced subsequent ragtime composers with its melody lines, harmonic progressions or metric patterns.
    • negative: Joropo The Joropo is a musical style resembling the fandango, and an accompanying dance. It has African, Native South American and European influences and originated in the plains called "Los Llanos" of what is now Colombia and Venezuela. It is a fundamental genre of "música criolla" (creole music). It is also the most popular "folk rhythm": the well-known song "Alma Llanera" is a joropo, considered the unofficial national anthem of Venezuela.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "TripletLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
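
To make these parameters concrete, here is a minimal plain-Python sketch of how a Matryoshka-style objective can wrap a triplet loss: the triplet loss is computed on each embedding prefix listed in `matryoshka_dims` and the results are combined using `matryoshka_weights`. The Euclidean distance, the `margin` value, and the toy vectors are assumptions for illustration (the card states none of them), and the sketch omits any re-normalization of truncated vectors that a real implementation may perform.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin):
    """Hinge on the distance gap: max(d(a, p) - d(a, n) + margin, 0)."""
    gap = euclidean_distance(anchor, positive) - euclidean_distance(anchor, negative)
    return max(gap + margin, 0.0)

def matryoshka_triplet_loss(anchor, positive, negative, dims, weights, margin=1.0):
    """Weighted sum of triplet losses over embedding prefixes of each size in `dims`.

    `margin=1.0` is an assumed value, not taken from the model card.
    """
    total = 0.0
    for dim, weight in zip(dims, weights):
        total += weight * triplet_loss(anchor[:dim], positive[:dim], negative[:dim], margin)
    return total

# Toy 4-dimensional embeddings; the negative differs from the anchor only in the last component.
anchor = [1.0, 0.0, 0.0, 0.0]
positive = [1.0, 0.0, 0.0, 0.0]
negative = [1.0, 0.0, 0.0, 1.0]

loss = matryoshka_triplet_loss(anchor, positive, negative, dims=(4, 2), weights=(1, 1))
print(loss)
```

In this toy case the full-size prefix separates the negative while the 2-dimensional prefix does not, so only the shorter prefix contributes to the loss. That pressure is what pushes useful information toward the leading dimensions.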
    

Evaluation Dataset

sentence-transformers/hotpotqa

  • Dataset: sentence-transformers/hotpotqa at f07d3cd
  • Size: 8,452 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    column     type     min         mean            max
    anchor     string   8 tokens    23.94 tokens    87 tokens
    positive   string   16 tokens   101.15 tokens   447 tokens
    negative   string   12 tokens   86.87 tokens    407 tokens
  • Samples:
    Sample 1
    • anchor: What is the birthdate of this American dancer and choreographer of modern dance, who helped found the Joseph Campbell Foundation with Robert Walter?
    • positive: Robert Walter (editor) Robert Walter is an editor and an executive with several not-for-profit organizations. Most notably, he is the executive director and board president of the Joseph Campbell Foundation (JCF), an organization that he helped found in 1990 with choreographer Jean Erdman, Joseph Campbell's widow.
    • negative: Miguel Terekhov Miguel Terekhov (August 22, 1928 – January 3, 2012) was a Uruguayan-born American ballet dancer and ballet instructor. Terekhov and his wife, Yvonne Chouteau, one of the Five Moons, a group of Native American ballet dancers, founded the School of Dance at the University of Oklahoma in 1961.

    Sample 2
    • anchor: What is the difference between Konstantin Orbelyan and Haig P. Manoogian
    • positive: Konstantin Orbelyan Konstantin Aghaparoni Orbelyan (Armenian: Կոնստանտին Աղապարոնի Օրբելյան ; Russian: Константин Агапаронович Орбелян , July 29, 1928 – April 24, 2014) was an Armenian pianist, composer, head of the State Estrada Orchestra of Armenia.
    • negative: Mitrofan Lodyzhensky Mitrofan Vasilyevich Lodyzhensky (Russian: Митрофа́н Васи́льевич Лоды́женский , in some sources Лады́женский (Ladyzhensky ); February 27 [O.S. February 15] 1852 – May 31 [O.S. May 18] 1917 ) was a Russian religious philosopher, playwright, and statesman, best known for his "Mystical Trilogy" comprising "Super-consciousness and the Ways to Achieve It", "Light Invisible", and "Dark Force".

    Sample 3
    • anchor: Which movie has more producers, Laura's Star or 9?
    • positive: Laura's Star Laura's Star (German: Lauras Stern ) is a 2004 German animated feature film produced and directed by Thilo Rothkirch. It is based on the children's book "Lauras Stern" by Klaus Baumgart. It was released by Warner Bros. Family Entertainment.
    • negative: Laura Mañá Laura Mañá (born January 12, 1968 in Barcelona, Catalonia, Spain) is an actress, film director and screenwriter.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "TripletLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • gradient_accumulation_steps: 16
  • learning_rate: 2e-05
  • num_train_epochs: 5
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • tf32: True
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • resume_from_checkpoint: bge-base-hotpotwa-matryoshka
  • batch_sampler: no_duplicates
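
The combination of `lr_scheduler_type: cosine`, `warmup_ratio: 0.1`, and `learning_rate: 2e-05` describes a learning rate that ramps up linearly and then decays along a cosine curve. The sketch below shows the commonly used form of such a schedule; the function name is made up, `total_steps = 1000` is a hypothetical value (the card does not state the total number of optimizer steps), and the trainer's exact rounding of the warmup length may differ.

```python
import math

def cosine_lr_with_warmup(step, total_steps, base_lr=2e-05, warmup_ratio=0.1):
    """Linear warmup to `base_lr`, then cosine decay to zero at `total_steps`."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: progress runs from 0.0 (end of warmup) to 1.0 (last step).
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total_steps = 1000  # hypothetical; not taken from the model card
for step in (0, 50, 100, 550, 1000):
    print(step, cosine_lr_with_warmup(step, total_steps))
```

With these assumed numbers the rate peaks at 2e-05 after 100 steps and falls to about half of that midway through the decay phase.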

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: True
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: bge-base-hotpotwa-matryoshka
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss loss dim_128_cosine_accuracy dim_256_cosine_accuracy dim_512_cosine_accuracy dim_64_cosine_accuracy dim_768_cosine_accuracy
0.3366 50 23.6925 21.8521 0.9285 0.9288 0.9334 0.9226 0.9365
0.6731 100 22.4254 20.8726 0.9102 0.9110 0.9156 0.9063 0.9168
1.0097 150 22.046 20.7027 0.9142 0.9162 0.9188 0.9098 0.9200
1.3462 200 21.871 20.6600 0.9227 0.9198 0.9233 0.9159 0.9232
1.6828 250 21.7 20.6425 0.9193 0.9192 0.9203 0.9148 0.9217
2.0194 300 21.5785 20.6416 0.9113 0.9133 0.9149 0.9082 0.9142
2.3559 350 21.4963 20.5366 0.9141 0.9139 0.9162 0.9107 0.9177
2.6925 400 21.4012 20.5315 0.9103 0.9114 0.9135 0.9081 0.9136
3.0290 450 21.3447 20.5096 0.9093 0.9089 0.9102 0.9057 0.9106
3.3656 500 21.3029 20.5548 0.9061 0.9074 0.9075 0.9055 0.9069
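
The Epoch column can be related to the Step column with a little arithmetic. Assuming training ran on a single device (the card does not say), each optimizer step consumes per_device_train_batch_size × gradient_accumulation_steps = 32 × 16 = 512 samples, and the epoch value is the number of samples consumed divided by the 76,064 training samples. The variable and function names below are illustrative only.

```python
train_samples = 76_064          # training set size from this card
per_device_batch_size = 32      # per_device_train_batch_size
gradient_accumulation = 16      # gradient_accumulation_steps
num_devices = 1                 # assumption: single-device training

# Number of samples consumed per optimizer step.
effective_batch_size = per_device_batch_size * gradient_accumulation * num_devices

def epoch_at_step(step):
    """Fraction of the training set seen after `step` optimizer steps."""
    return step * effective_batch_size / train_samples

for step in (50, 150, 500):
    print(step, round(epoch_at_step(step), 4))
# prints:
# 50 0.3366
# 150 1.0097
# 500 3.3656
```

These values match the logged epochs for steps 50, 150, and 500, which is consistent with an effective batch size of 512.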

Framework Versions

  • Python: 3.10.10
  • Sentence Transformers: 3.0.1
  • Transformers: 4.41.2
  • PyTorch: 2.1.2+cu121
  • Accelerate: 0.31.0
  • Datasets: 2.19.1
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning}, 
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification}, 
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}