metadata
base_model: BAAI/bge-base-en-v1.5
datasets:
- sentence-transformers/hotpotqa
language:
- en
library_name: sentence-transformers
license: apache-2.0
metrics:
- cosine_accuracy
- dot_accuracy
- manhattan_accuracy
- euclidean_accuracy
- max_accuracy
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:76064
- loss:MatryoshkaLoss
- loss:TripletLoss
widget:
- source_sentence: >-
Which survey in 2010 recommended the The Wide Field Infrared Survey
Telescope as the top priority for astronomy?
sentences:
- >-
High Energy Astronomy Observatory 1 HEAO-1 was an X-ray telescope
launched in 1977. HEAO-1 surveyed the sky in the X-ray portion of the
electromagnetic spectrum (0.2 keV - 10 MeV), providing nearly constant
monitoring of X-ray sources near the ecliptic poles and more detailed
studies of a number of objects by observations lasting 3-6 hours. It was
the first of NASA's three High Energy Astronomy Observatories, HEAO 1,
launched August 12, 1977 aboard an Atlas rocket with a Centaur upper
stage, operated until 9 January 1979. During that time, it scanned the
X-ray sky almost three times
- >-
Wide Field Infrared Survey Telescope The Wide Field Infrared Survey
Telescope (WFIRST) is a future infrared space observatory that was
recommended in 2010 by United States National Research Council Decadal
Survey committee as the top priority for the next decade of astronomy.
On February 17, 2016, WFIRST was formally designated as a mission by
NASA.
- >-
The Bluebells The Bluebells were a Scottish indie rock band, active
between 1981 and 1986 (later briefly reforming in 1993, 2008–2009 and
2011).
- source_sentence: >-
Near what river is the library that contains the Aberdeen Bestiary
located?
sentences:
- >-
Joseph Roth Joseph Roth, born Moses Joseph Roth (2 September 1894 – 27
May 1939), was an Austrian-Jewish journalist and novelist, best known
for his family saga "Radetzky March" (1932), about the decline and fall
of the Austro-Hungarian Empire, his novel of Jewish life, "Job" (1930),
and his seminal essay "Juden auf Wanderschaft" (1927; translated into
English in "The Wandering Jews"), a fragmented account of the Jewish
migrations from eastern to western Europe in the aftermath of World War
I and the Russian Revolution. In the 21st century, publications in
English of "Radetzky March" and of collections of his journalism from
Berlin and Paris created a revival of interest in Roth.
- >-
Aberdeen Bestiary The Aberdeen Bestiary (Aberdeen University Library,
Univ Lib. MS 24) is a 12th-century English illuminated manuscript
bestiary that was first listed in 1542 in the inventory of the Old Royal
Library at the Palace of Westminster.
- >-
House of Monymusk The House of Monymusk is located on the outskirts of
the Scottish village of Monymusk, in the Marr region of Aberdeenshire.
The house is located near the "river Don", which is known for its
spectacular trout-fishing. The village, which history dates back to
1170, was bought by the Forbses in the 1560s, who later built the House
of Monymusk. The Forbses claim they built the present House of Monymusk
from the blackened stones of the old Priory.
- source_sentence: >-
The Stage" is a song by Avenged Sevenfold and the first single from their
seventh studio album of the same name, which was released on which date?
sentences:
- >-
Allegra Stratton Allegra Stratton (born 25 November 1980) is a British
journalist and writer. Since January 2016, she has been the National
Editor of ITV News after four years as political editor on BBC Two's
"Newsnight". She has also co-presented "Peston on Sunday" with Robert
Peston since May 2016.
- >-
Appetite for Destruction Appetite for Destruction is the debut studio
album by American hard rock band Guns N' Roses. It was released on July
21, 1987, by Geffen Records to massive commercial success. It topped the
"Billboard" 200 and became the best-selling debut album as well as the
11th best-selling album in the United States. With about 30 million
copies sold worldwide, it is also one of the best-selling records ever.
Although critics were ambivalent toward the album when it was first
released, "Appetite for Destruction" has since received retrospective
acclaim and been viewed as one of the greatest albums of all time.
- >-
The Stage (Avenged Sevenfold song) "The Stage" is a song by Avenged
Sevenfold and the first single from their seventh studio album of the
same name, which was released on October 28, 2016.
- source_sentence: >-
Union County Speedway is home to what type of motorsports that are usually
held at county fairs and festivals?
sentences:
- >-
Long Beach, New York Long Beach is a city in Nassau County, New York,
United States. Just south of Long Island, it is located on Long Beach
Barrier Island, which is the westernmost of the outer barrier islands
off Long Island's South Shore. As of the United States 2010 Census, the
city population was 33,275. It was incorporated in 1922, and is
nicknamed "The City By the Sea" (as seen in Latin on its official seal).
- >-
Union County Speedway Union County Speedway is a dirt racetrack in
Liberty, Indiana, United States. It features races with cars such as,
late models, Modifieds, Sidestroke, Bombers, Road Hogs, and Street
Stocks. UCS is also host to dirtbike, quad, Mini-Sprint, and Demolition
Derbies.
- >-
Mercer County Fairgrounds The Mercer County Fairgrounds, located on 12th
Avenue SW in Aledo, are the home of the annual county fair in Mercer
County, Illinois. The fairgrounds were established in 1869 when the fair
moved to Aledo; from its creation in 1853 until then, it had taken place
in Millersburg. The early fairs mainly focused on agricultural
exhibitions, and the first two buildings were used for horticulture
exhibits and household floral shows; these fairs also included
entertainment such as baseball games and band concerts. By the end of
the century, the fair had grown to host 8,000 visitors, many who came
from neighboring counties by train, and show 3,000 entries in its
various agricultural competitions. The fair added traveling
entertainment and grew to host over 20,000 visitors in the 20th century;
it is still held annually at the fairgrounds. In addition to the county
fair, the fairgrounds have also held horse races, political events,
picnics, and other community events.
- source_sentence: >-
James D. Farley, Jr. had an early interest in automobiles because of his
grandfather who worked for what company?
sentences:
- >-
Continental Motors Company Continental Motors Company was an American
manufacturer of internal combustion engines. The company produced
engines as a supplier to many independent manufacturers of automobiles,
tractors, trucks, and stationary equipment (such as pumps, generators,
and industrial machinery drives) from the 1900s through the 1960s.
Continental Motors also produced Continental-branded automobiles in
1932–1933. The Continental Aircraft Engine Company was formed in 1929 to
develop and produce its aircraft engines, and would become the core
business of Continental Motors, Inc.
- >-
The Atlantic The Atlantic is an American magazine and multi-platform
publisher, founded in 1857 as The Atlantic Monthly in Boston,
Massachusetts.
- >-
Jim Farley (businessman) James D. Farley, Jr. (born June 1962) is an
American automobile executive that currently serves as Ford Motor
Company's Executive Vice President and president, Global Markets since
June 2017. From 2015 to 2017, he was CEO and Chairman of Ford Europe. He
had an early interest in automobiles, primarily spurred from his
grandfather who worked at Henry Ford's River Rouge Plant starting in
1914.
model-index:
- name: BGE-base-en-v1.5-Hotpotqa
results:
- task:
type: triplet
name: Triplet
dataset:
name: dim 768
type: dim_768
metrics:
- type: cosine_accuracy
value: 0.9068859441552295
name: Cosine Accuracy
- type: dot_accuracy
value: 0.09311405584477046
name: Dot Accuracy
- type: manhattan_accuracy
value: 0.9066493137718883
name: Manhattan Accuracy
- type: euclidean_accuracy
value: 0.9068859441552295
name: Euclidean Accuracy
- type: max_accuracy
value: 0.9068859441552295
name: Max Accuracy
- task:
type: triplet
name: Triplet
dataset:
name: dim 512
type: dim_512
metrics:
- type: cosine_accuracy
value: 0.9074775201135826
name: Cosine Accuracy
- type: dot_accuracy
value: 0.09311405584477046
name: Dot Accuracy
- type: manhattan_accuracy
value: 0.9055844770468529
name: Manhattan Accuracy
- type: euclidean_accuracy
value: 0.9064126833885471
name: Euclidean Accuracy
- type: max_accuracy
value: 0.9074775201135826
name: Max Accuracy
- task:
type: triplet
name: Triplet
dataset:
name: dim 256
type: dim_256
metrics:
- type: cosine_accuracy
value: 0.907359204921912
name: Cosine Accuracy
- type: dot_accuracy
value: 0.09311405584477046
name: Dot Accuracy
- type: manhattan_accuracy
value: 0.9062943681968765
name: Manhattan Accuracy
- type: euclidean_accuracy
value: 0.9062943681968765
name: Euclidean Accuracy
- type: max_accuracy
value: 0.907359204921912
name: Max Accuracy
- task:
type: triplet
name: Triplet
dataset:
name: dim 128
type: dim_128
metrics:
- type: cosine_accuracy
value: 0.9060577378135353
name: Cosine Accuracy
- type: dot_accuracy
value: 0.09488878371982963
name: Dot Accuracy
- type: manhattan_accuracy
value: 0.9014434453383815
name: Manhattan Accuracy
- type: euclidean_accuracy
value: 0.9035731187884525
name: Euclidean Accuracy
- type: max_accuracy
value: 0.9060577378135353
name: Max Accuracy
- task:
type: triplet
name: Triplet
dataset:
name: dim 64
type: dim_64
metrics:
- type: cosine_accuracy
value: 0.9054661618551822
name: Cosine Accuracy
- type: dot_accuracy
value: 0.09808329389493611
name: Dot Accuracy
- type: manhattan_accuracy
value: 0.8983672503549456
name: Manhattan Accuracy
- type: euclidean_accuracy
value: 0.9013251301467108
name: Euclidean Accuracy
- type: max_accuracy
value: 0.9054661618551822
name: Max Accuracy
BGE-base-en-v1.5-Hotpotqa
This is a sentence-transformers model fine-tuned from BAAI/bge-base-en-v1.5 on the sentence-transformers/hotpotqa dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-base-en-v1.5
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: sentence-transformers/hotpotqa
- Language: en
- License: apache-2.0
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
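The three modules above amount to: run BERT, keep the hidden state of the first ([CLS]) token, and L2-normalize it. The last two steps can be sketched in plain Python. The token vectors below are made-up 4-dimensional stand-ins for the real 768-dimensional BERT outputs, and the function name `cls_pool_and_normalize` is illustrative, not part of the library.

```python
import math

def cls_pool_and_normalize(token_embeddings):
    """Mimic Pooling(pooling_mode_cls_token=True) followed by Normalize().

    token_embeddings: list of per-token vectors; index 0 is the [CLS] token.
    """
    cls_vector = token_embeddings[0]  # CLS pooling: keep only the first token
    norm = math.sqrt(sum(x * x for x in cls_vector))
    return [x / norm for x in cls_vector]  # L2 normalization to unit length

# Toy stand-in for BERT output: 3 tokens, 4 dimensions each (hypothetical values)
toy_tokens = [
    [3.0, 4.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.0, 0.5, 0.0],
]
sentence_embedding = cls_pool_and_normalize(toy_tokens)
print(sentence_embedding)
```

Because the full-size output has unit length, dot product and cosine similarity coincide for it.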
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
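from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub.
# "sentence_transformers_model_id" is a placeholder; replace it with this model's repository ID.
model = SentenceTransformer("sentence_transformers_model_id")

# One query followed by two candidate passages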
sentences = [
'James D. Farley, Jr. had an early interest in automobiles because of his grandfather who worked for what company?',
"Jim Farley (businessman) James D. Farley, Jr. (born June 1962) is an American automobile executive that currently serves as Ford Motor Company's Executive Vice President and president, Global Markets since June 2017. From 2015 to 2017, he was CEO and Chairman of Ford Europe. He had an early interest in automobiles, primarily spurred from his grandfather who worked at Henry Ford's River Rouge Plant starting in 1914.",
'Continental Motors Company Continental Motors Company was an American manufacturer of internal combustion engines. The company produced engines as a supplier to many independent manufacturers of automobiles, tractors, trucks, and stationary equipment (such as pumps, generators, and industrial machinery drives) from the 1900s through the 1960s. Continental Motors also produced Continental-branded automobiles in 1932–1933. The Continental Aircraft Engine Company was formed in 1929 to develop and produce its aircraft engines, and would become the core business of Continental Motors, Inc.',
]
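# Encode all sentences into 768-dimensional vectors
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Compute the pairwise similarity scores between the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])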
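Because the model was trained with MatryoshkaLoss over the dimensions 768, 512, 256, 128 and 64, embeddings can be shortened to one of those sizes to save storage. The sketch below shows the idea in plain Python: keep the leading components and rescale to unit length. The 8-dimensional vectors are toy stand-ins for real model outputs, and the helper names are illustrative. Recent Sentence Transformers releases also accept a `truncate_dim` argument when constructing `SentenceTransformer`, which may be more convenient in practice.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    if dim > len(embedding):
        raise ValueError("dim exceeds embedding size")
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 8-dimensional vectors standing in for 768-dimensional model outputs
query = [0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
passage = [0.5, 0.5, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0]

# Truncate both to 4 dimensions (in practice: 512, 256, 128 or 64)
short_query = truncate_and_normalize(query, 4)
short_passage = truncate_and_normalize(passage, 4)
print(len(short_query), cosine(short_query, short_passage))
```

The evaluation results in this card suggest that cosine accuracy changes little between 768 and 64 dimensions, so the smaller sizes may be a reasonable trade-off.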
Evaluation
Metrics
Triplet
- Dataset: dim_768

| Metric | Value |
|---|---|
| cosine_accuracy | 0.9069 |
| dot_accuracy | 0.0931 |
| manhattan_accuracy | 0.9066 |
| euclidean_accuracy | 0.9069 |
| max_accuracy | 0.9069 |
Triplet
- Dataset: dim_512

| Metric | Value |
|---|---|
| cosine_accuracy | 0.9075 |
| dot_accuracy | 0.0931 |
| manhattan_accuracy | 0.9056 |
| euclidean_accuracy | 0.9064 |
| max_accuracy | 0.9075 |
Triplet
- Dataset: dim_256

| Metric | Value |
|---|---|
| cosine_accuracy | 0.9074 |
| dot_accuracy | 0.0931 |
| manhattan_accuracy | 0.9063 |
| euclidean_accuracy | 0.9063 |
| max_accuracy | 0.9074 |
Triplet
- Dataset: dim_128

| Metric | Value |
|---|---|
| cosine_accuracy | 0.9061 |
| dot_accuracy | 0.0949 |
| manhattan_accuracy | 0.9014 |
| euclidean_accuracy | 0.9036 |
| max_accuracy | 0.9061 |
Triplet
- Dataset: dim_64

| Metric | Value |
|---|---|
| cosine_accuracy | 0.9055 |
| dot_accuracy | 0.0981 |
| manhattan_accuracy | 0.8984 |
| euclidean_accuracy | 0.9013 |
| max_accuracy | 0.9055 |
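The accuracies above are triplet accuracies: the share of (anchor, positive, negative) triplets in which the anchor is closer to the positive than to the negative under the named metric. A minimal sketch of the cosine variant follows, using toy 2-dimensional vectors; the function names are illustrative and this is not the library's evaluator code.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_triplet_accuracy(triplets):
    """Fraction of triplets whose anchor is more similar to the positive than to the negative."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative)
    )
    return correct / len(triplets)

# Toy (anchor, positive, negative) triplets with hypothetical values
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),   # positive is closer: correct
    ([1.0, 0.0], [0.0, 1.0], [0.8, 0.2]),   # negative is closer: wrong
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),   # correct
    ([1.0, 1.0], [1.0, 0.9], [-1.0, 0.5]),  # correct
]
accuracy = cosine_triplet_accuracy(toy_triplets)
print(accuracy)
```

Read this way, a cosine_accuracy of 0.9069 at 768 dimensions means that roughly 90.7% of the evaluation triplets were ranked correctly.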
Training Details
Training Dataset
sentence-transformers/hotpotqa
- Dataset: sentence-transformers/hotpotqa at f07d3cd
- Size: 76,064 training samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:

| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |
| details | min: 8 tokens; mean: 24.49 tokens; max: 108 tokens | min: 21 tokens; mean: 101.27 tokens; max: 512 tokens | min: 14 tokens; mean: 87.44 tokens; max: 407 tokens |
- Samples:

| anchor | positive | negative |
|---|---|---|
| What historical geographic region in Central-Eastern Europe was the birthplace of a soldier of the Austro-Hungarian Army? | Bruno Olbrycht Bruno Olbrycht (nom de guerre: Olza; 6 October 1895 – 23 March 1951) was a soldier of the Austro-Hungarian Army and officer (later general) of the Polish Army both in the Second Polish Republic and postwar Poland. Born on 6 October 1895 in Sanok, Austrian Galicia, Olbrycht fought in Polish Legions in World War I, Polish–Ukrainian War, Polish–Soviet War and the Invasion of Poland. He died on 23 March 1951 in Kraków. | Padáň The village was first recorded in 1254 as "Padan", an old Pecheneg settlement. On the territory of the village, there used to be "Petény" village as well, which was mentioned in 1298 as the appurtenance of Pressburg Castle. Until the end of World War I, it was part of Hungary and fell within the Dunaszerdahely district of Pozsony County. After the Austro-Hungarian army disintegrated in November 1918, Czechoslovakian troops occupied the area. After the Treaty of Trianon of 1920, the village became officially part of Czechoslovakia. In November 1938, the First Vienna Award granted the area to Hungary and it was held by Hungary until 1945. After Soviet occupation in 1945, Czechoslovakian administration returned and the village became officially part of Czechoslovakia in 1947. |
| Full Scale Assault is the fourth studio album by Dutch punk hardcore band Vitamin X, the album was recorded at Electrical Audio in Chicago by Steve Albini who previously recorded The Stooges, also known as Iggy and the Stooges, were an American rock band formed in Ann Arbor, Michigan in what year? | Full Scale Assault Full Scale Assault is the fourth studio album by Dutch punk hardcore band Vitamin X. Released through Tankcrimes on October 10, 2008 in the US, and Agipunk in Europe. The album was recorded at Electrical Audio in Chicago by Steve Albini who previously recorded Nirvana, Neurosis, PJ Harvey, High on Fire, Iggy Pop & The Stooges. It features guest vocals from Negative Approach's singer John Brannon. Art is by John Dyer Baizley. | The Dogs (US punk band) The Dogs are a three-piece proto-punk band formed in Lansing, Michigan, United States in 1969. They are noted for presaging the energy and sound of the later punk and hardcore genres. |
| Which popular music style was a modification of the marches from "The March King" with heavy influences from African American communities? | Ragtime Ragtime – also spelled rag-time or rag time – is a musical style that enjoyed its peak popularity between 1895 and 1918. Its cardinal trait is its syncopated, or "ragged", rhythm. The style has its origins in African-American communities in cities such as St. Louis. Ernest Hogan (1865–1909) was a pioneer of ragtime and was the first composer to have his ragtime pieces (or "rags") published as sheet music, beginning with the song "LA Pas Ma LA," published in 1895. Hogan has also been credited for coining the term "ragtime". The term is actually derived from his hometown "Shake Rag" in Bowling Green, Kentucky. Ben Harney, another Kentucky native, has often been credited for introducing the music to the mainstream public. His first ragtime composition, "You've Been a Good Old Wagon But You Done Broke", helped popularize the style. The composition was published in 1895, a few months after Ernest Hogan's "LA Pas Ma LA." Ragtime was also a modification of the march style popularized by John Philip Sousa, with additional polyrhythms coming from African music. Ragtime composer Scott Joplin ("ca." 1868–1917) became famous through the publication of the "Maple Leaf Rag" (1899) and a string of ragtime hits such as "The Entertainer" (1902), although he was later forgotten by all but a small, dedicated community of ragtime aficionados until the major ragtime revival in the early 1970s. For at least 12 years after its publication, "Maple Leaf Rag" heavily influenced subsequent ragtime composers with its melody lines, harmonic progressions or metric patterns. | Joropo The Joropo is a musical style resembling the fandango, and an accompanying dance. It has African, Native South American and European influences and originated in the plains called "Los Llanos" of what is now Colombia and Venezuela. It is a fundamental genre of "música criolla" (creole music). It is also the most popular "folk rhythm": the well-known song "Alma Llanera" is a joropo, considered the unofficial national anthem of Venezuela. |
- Loss: MatryoshkaLoss with these parameters:
  {
      "loss": "TripletLoss",
      "matryoshka_dims": [768, 512, 256, 128, 64],
      "matryoshka_weights": [1, 1, 1, 1, 1],
      "n_dims_per_step": -1
  }
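The configuration above sums a triplet loss over five truncated views of each embedding, all with weight 1. The sketch below illustrates that computation for a single triplet in plain Python. It rests on assumptions that the card does not state: Euclidean distance and a margin of 5 (believed to be the `TripletLoss` defaults in Sentence Transformers), and re-normalization after truncation. All function names are illustrative.

```python
import math

def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale to unit length (assumed behavior)."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin):
    """max(d(a, p) - d(a, n) + margin, 0) for one triplet."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)

def matryoshka_triplet_loss(anchor, positive, negative, dims, weights, margin=5.0):
    """Weighted sum of triplet losses over truncated embeddings.

    margin=5.0 is an assumption; it is not listed in the parameters above.
    """
    total = 0.0
    for dim, weight in zip(dims, weights):
        a, p, n = (truncate_and_normalize(v, dim) for v in (anchor, positive, negative))
        total += weight * triplet_loss(a, p, n, margin)
    return total

# Toy 4-dimensional embeddings with dims [4, 2] in place of [768, 512, 256, 128, 64]
anchor = [1.0, 0.0, 0.0, 0.0]
positive = [1.0, 0.0, 0.0, 0.0]
negative = [-1.0, 0.0, 0.0, 0.0]
loss = matryoshka_triplet_loss(anchor, positive, negative, dims=[4, 2], weights=[1, 1])
print(loss)
```

Under these assumptions, distances between unit vectors lie between 0 and 2, so each term is at least 3 and a five-term sum cannot drop below 15. That is in line with the loss values of roughly 20 to 24 reported in the training logs.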
Evaluation Dataset
sentence-transformers/hotpotqa
- Dataset: sentence-transformers/hotpotqa at f07d3cd
- Size: 8,452 evaluation samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:

| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |
| details | min: 8 tokens; mean: 23.94 tokens; max: 87 tokens | min: 16 tokens; mean: 101.15 tokens; max: 447 tokens | min: 12 tokens; mean: 86.87 tokens; max: 407 tokens |
- Samples:

| anchor | positive | negative |
|---|---|---|
| What is the birthdate of this American dancer and choreographer of modern dance, who helped found the Joseph Campbell Foundation with Robert Walter? | Robert Walter (editor) Robert Walter is an editor and an executive with several not-for-profit organizations. Most notably, he is the executive director and board president of the Joseph Campbell Foundation (JCF), an organization that he helped found in 1990 with choreographer Jean Erdman, Joseph Campbell's widow. | Miguel Terekhov Miguel Terekhov (August 22, 1928 – January 3, 2012) was a Uruguayan-born American ballet dancer and ballet instructor. Terekhov and his wife, Yvonne Chouteau, one of the Five Moons, a group of Native American ballet dancers, founded the School of Dance at the University of Oklahoma in 1961. |
| What is the difference between Konstantin Orbelyan and Haig P. Manoogian | Konstantin Orbelyan Konstantin Aghaparoni Orbelyan (Armenian: Կոնստանտին Աղապարոնի Օրբելյան ; Russian: Константин Агапаронович Орбелян , July 29, 1928 – April 24, 2014) was an Armenian pianist, composer, head of the State Estrada Orchestra of Armenia. | Mitrofan Lodyzhensky Mitrofan Vasilyevich Lodyzhensky (Russian: Митрофа́н Васи́льевич Лоды́женский , in some sources Лады́женский (Ladyzhensky ); February 27 [O.S. February 15] 1852 – May 31 [O.S. May 18] 1917 ) was a Russian religious philosopher, playwright, and statesman, best known for his "Mystical Trilogy" comprising "Super-consciousness and the Ways to Achieve It", "Light Invisible", and "Dark Force". |
| Which movie has more producers, Laura's Star or 9? | Laura's Star Laura's Star (German: Lauras Stern ) is a 2004 German animated feature film produced and directed by Thilo Rothkirch. It is based on the children's book "Lauras Stern" by Klaus Baumgart. It was released by Warner Bros. Family Entertainment. | Laura Mañá Laura Mañá (born January 12, 1968 in Barcelona, Catalonia, Spain) is an actress, film director and screenwriter. |
- Loss: MatryoshkaLoss with these parameters:
  {
      "loss": "TripletLoss",
      "matryoshka_dims": [768, 512, 256, 128, 64],
      "matryoshka_weights": [1, 1, 1, 1, 1],
      "n_dims_per_step": -1
  }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- gradient_accumulation_steps: 16
- learning_rate: 2e-05
- num_train_epochs: 5
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- bf16: True
- tf32: True
- load_best_model_at_end: True
- optim: adamw_torch_fused
- resume_from_checkpoint: bge-base-hotpotwa-matryoshka
- batch_sampler: no_duplicates
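The batch settings above combine into an effective batch size, which in turn relates optimizer steps to epochs. The arithmetic can be sketched as follows, assuming training on a single device (the card does not state the device count) and ignoring any samples the no_duplicates sampler may skip.

```python
train_samples = 76_064
per_device_train_batch_size = 32
gradient_accumulation_steps = 16
num_devices = 1  # assumption: the card does not state the device count

# Samples consumed per optimizer step
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_devices

# Optimizer steps needed to see every training sample once (fractional)
steps_per_epoch = train_samples / effective_batch_size

def epoch_at_step(step):
    """Approximate number of epochs completed after `step` optimizer steps."""
    return step / steps_per_epoch

print(effective_batch_size)          # 512
print(round(epoch_at_step(50), 4))   # 0.3366
print(round(epoch_at_step(500), 4))  # 3.3656
```

These values agree with the Epoch column of the training logs at steps 50 and 500, which supports the single-device assumption.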
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 16
- eval_accumulation_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 5
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: True
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: bge-base-hotpotwa-matryoshka
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
| Epoch | Step | Training Loss | loss | dim_128_cosine_accuracy | dim_256_cosine_accuracy | dim_512_cosine_accuracy | dim_64_cosine_accuracy | dim_768_cosine_accuracy |
|---|---|---|---|---|---|---|---|---|
| 0.3366 | 50 | 23.6925 | 21.8521 | 0.9285 | 0.9288 | 0.9334 | 0.9226 | 0.9365 |
| 0.6731 | 100 | 22.4254 | 20.8726 | 0.9102 | 0.9110 | 0.9156 | 0.9063 | 0.9168 |
| 1.0097 | 150 | 22.046 | 20.7027 | 0.9142 | 0.9162 | 0.9188 | 0.9098 | 0.9200 |
| 1.3462 | 200 | 21.871 | 20.6600 | 0.9227 | 0.9198 | 0.9233 | 0.9159 | 0.9232 |
| 1.6828 | 250 | 21.7 | 20.6425 | 0.9193 | 0.9192 | 0.9203 | 0.9148 | 0.9217 |
| 2.0194 | 300 | 21.5785 | 20.6416 | 0.9113 | 0.9133 | 0.9149 | 0.9082 | 0.9142 |
| 2.3559 | 350 | 21.4963 | 20.5366 | 0.9141 | 0.9139 | 0.9162 | 0.9107 | 0.9177 |
| 2.6925 | 400 | 21.4012 | 20.5315 | 0.9103 | 0.9114 | 0.9135 | 0.9081 | 0.9136 |
| 3.0290 | 450 | 21.3447 | 20.5096 | 0.9093 | 0.9089 | 0.9102 | 0.9057 | 0.9106 |
| 3.3656 | 500 | 21.3029 | 20.5548 | 0.9061 | 0.9074 | 0.9075 | 0.9055 | 0.9069 |
Framework Versions
- Python: 3.10.10
- Sentence Transformers: 3.0.1
- Transformers: 4.41.2
- PyTorch: 2.1.2+cu121
- Accelerate: 0.31.0
- Datasets: 2.19.1
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
TripletLoss
@misc{hermans2017defense,
title={In Defense of the Triplet Loss for Person Re-Identification},
author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
year={2017},
eprint={1703.07737},
archivePrefix={arXiv},
primaryClass={cs.CV}
}