LeoChiuu committed on
Commit
b21f908
1 Parent(s): 4c40e6c

Add new SentenceTransformer model.

Files changed (2)
  1. README.md +36 -34
  2. model.safetensors +1 -1
README.md CHANGED
@@ -11,38 +11,40 @@ tags:
  base_model: sentence-transformers/all-MiniLM-L6-v2
  datasets: []
  widget:
- - source_sentence: This store featured in the SavaCentre TV adverts in 1983.
  sentences:
- - I love the Scream movies and all horror movies and this one ranks way up there.
- - Development of synchronous toothed-belts was halted by the Gilmer company prior
- to 1940.
- - This store was not featured in the SavaCentre TV promotions in 1983.
- - source_sentence: In 2014, Nextgen earns KLAS Top Performance Honors for Ambulatory
- RCM Services.
  sentences:
- - These strategies employ reporter transposon s and in vitro expression technology
- (IVET).
  - In 2014, Nextgen fails to achieve KLAS Top Performance Honors for Ambulatory RCM
  Services.
- - The film's sole bright spot was Jonah Hill (who will look almost unrecognizable
- to fans of the recent Superbad due to the amount of weight he lost in the interim).
- - source_sentence: E105 has never been implicated in atopic asthma.
- sentences:
- - E105 has been implicated in non-atopic asthma.
- - The species is named in honor of the divorce of Sara Anderson and Malcolm Slaney.
- - Each annex to a filed document is not required to have page numbering.
- - source_sentence: Additionally, a church at San Lazaro in Orange Walk District escaped
- all damage.
  sentences:
- - Kuwait has a reputation for being the central music influence of the GCC countries.
- - Early settlers may have introduced it 4,000 years ago.
- - Additionally, a church at San Lazaro in Orange Walk District suffered severe damage.
- - source_sentence: The content in Australia is lower than in other reports.
  sentences:
- - Other reports also show a content lower than 0.1% in Australia.
- - Commercial DNP is unable to be utilized as an antiseptic or as a non-selective
- bioaccumulating pesticide.
- - Installation of Halon systems is mandated by the European Union.
  pipeline_tag: sentence-similarity
  ---
 
@@ -96,9 +98,9 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("LeoChiuu/all-MiniLM-L6-v2-negations")
  # Run inference
  sentences = [
- 'The content in Australia is lower than in other reports.',
- 'Other reports also show a content lower than 0.1% in Australia.',
- 'Installation of Halon systems is mandated by the European Union.',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
@@ -161,11 +163,11 @@ You can finetune this model on your own dataset.
  | type | string | string | int |
  | details | <ul><li>min: 9 tokens</li><li>mean: 16.36 tokens</li><li>max: 39 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 16.55 tokens</li><li>max: 43 tokens</li></ul> | <ul><li>0: ~61.33%</li><li>1: ~38.67%</li></ul> |
  * Samples:
- | sentence_0 | sentence_1 | label |
- |:---------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------|:---------------|
- | <code>It wasn't an inexpensive piece, but I would still have expected better quality.</code> | <code>It was an inexpensive piece, but I would still have expected better quality.</code> | <code>0</code> |
- | <code>My name is noncrucial.</code> | <code>My name is important.</code> | <code>0</code> |
- | <code>Hawthorne mostly wrote against his own religious belief.</code> | <code>Hawthorne wrote against his beliefs.</code> | <code>1</code> |
  * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
 
  base_model: sentence-transformers/all-MiniLM-L6-v2
  datasets: []
  widget:
+ - source_sentence: Truck production was added in May 1993.
  sentences:
+ - Commercial DNP is unable to be utilized as an antiseptic or as a non-selective
+ bioaccumulating pesticide.
+ - Research has suggested that the attentional focus is variable in size.
+ - Truck creation was removed in May 1993.
+ - source_sentence: This theory is controversial and has been rejected by other researchers.
  sentences:
+ - The problem of head-of-line blocking using Virtual Output Queues was discussed
+ in the paper.
+ - The theory has been rejected by other researchers.
+ - Omaha Public Power District (OPPD) is considered the state's smallest purchaser
+ of wind energy.
+ - source_sentence: In Yunnan, China, several ethnic minority groups produce Rushan
+ and Rubing from cow's milk.
+ sentences:
+ - In Yunnan, China, only the ethnic majority groups produce Rushan and Rubing from
+ cow's milk.
+ - I have always used corded headsets and the freedom from the wireless is very helpful.
  - In 2014, Nextgen fails to achieve KLAS Top Performance Honors for Ambulatory RCM
  Services.
+ - source_sentence: The sultan Muhammad al-Nasir Ibn Qalawun demolished the minaret
+ permanently after an earthquake in October 1318.
  sentences:
+ - Six other dams were unsuccessful that day, two were small and four were minor
+ in size.
+ - This store was not featured in the SavaCentre TV promotions in 1983.
+ - The sultan Muhammad al-Nasir Ibn Qalawun renovated the minaret after an earthquake
+ in October 1318.
+ - source_sentence: It must be taken at the start of main meals to have maximal effect.
  sentences:
+ - It must be taken at the start of the main meal.
+ - My name is important.
+ - The species is named in honor of the divorce of Sara Anderson and Malcolm Slaney.
  pipeline_tag: sentence-similarity
  ---
 
 
  model = SentenceTransformer("LeoChiuu/all-MiniLM-L6-v2-negations")
  # Run inference
  sentences = [
+ 'It must be taken at the start of main meals to have maximal effect.',
+ 'It must be taken at the start of the main meal.',
+ 'The species is named in honor of the divorce of Sara Anderson and Malcolm Slaney.',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
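
Reviewer note on the hunk above: the README trains with `CosineSimilarityLoss`, which scores a sentence pair by the cosine similarity of its two embeddings and penalizes the squared distance from the label. The following is a minimal pure-Python sketch of that computation, not the library's implementation. The three short vectors are hypothetical stand-ins for real 384-dimensional embeddings, and the reading that label 0 marks a contradicting pair and 1 a compatible one is an assumption based on the sample rows.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs, labels):
    """Mean squared error between each pair's cosine similarity and its label."""
    errors = [
        (cosine_similarity(u, v) - y) ** 2
        for (u, v), y in zip(pairs, labels)
    ]
    return sum(errors) / len(errors)


# Hypothetical 3-dimensional vectors standing in for model.encode(...) output.
anchor = [1.0, 0.0, 0.0]
negation = [0.0, 1.0, 0.0]    # orthogonal to the anchor: similarity 0
paraphrase = [2.0, 0.0, 0.0]  # same direction as the anchor: similarity 1

loss = cosine_similarity_loss(
    [(anchor, negation), (anchor, paraphrase)],
    [0, 1],  # assumed meaning: 0 = contradicting pair, 1 = compatible pair
)
print(loss)
```

A loss near zero here means the embeddings already separate the pairs the way the labels ask; with real embeddings the same two functions can be applied to rows of `embeddings`.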
 
  | type | string | string | int |
  | details | <ul><li>min: 9 tokens</li><li>mean: 16.36 tokens</li><li>max: 39 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 16.55 tokens</li><li>max: 43 tokens</li></ul> | <ul><li>0: ~61.33%</li><li>1: ~38.67%</li></ul> |
  * Samples:
+ | sentence_0 | sentence_1 | label |
+ |:--------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------|:---------------|
+ | <code>And I tell you something, it's fair.</code> | <code>And I tell you something, it's not fair.</code> | <code>0</code> |
+ | <code>The meeting was commemorated with his image on a stamp from the Vatican post office.</code> | <code>The meeting was commemorated with only his name on a stamp from the Vatican post office.</code> | <code>0</code> |
+ | <code>On June 22, 2009, the SEC also filed civil fraud charges.</code> | <code>On June 22, 2009, the SEC kept itself from filing civil fraud charges.</code> | <code>0</code> |
  * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2b3f93fc93c0fbdf4be9f9217841543915515a6610212538520a608457a9d4a7
  size 90864192

  version https://git-lfs.github.com/spec/v1
+ oid sha256:cacdfc32e4c184d153ee335484bf5b37116ffec455f4fb6f046a962c9d09db7a
  size 90864192
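
Reviewer note: `model.safetensors` is stored as a Git LFS pointer, so this diff only shows the SHA-256 `oid` changing while `size` stays at 90864192 bytes. As a rough sketch of how a downloaded file could be checked against the new pointer, the snippet below parses the pointer text and compares size and digest. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any Git LFS tooling.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Check that raw bytes have the size and SHA-256 digest the pointer records."""
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# Pointer contents copied from the "+" side of the diff above.
new_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:cacdfc32e4c184d153ee335484bf5b37116ffec455f4fb6f046a962c9d09db7a
size 90864192
""")
print(new_pointer["algo"], new_pointer["size"])
```

In practice one would read the downloaded file in binary mode and pass its bytes to `matches_pointer`; for a file of this size, hashing in chunks would use less memory than loading it whole.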