knguyennguyen committed on
Commit 18a57a7 · verified · 1 Parent(s): 7cb3a85

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
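
The pooling config above enables only `pooling_mode_mean_tokens`, so the sentence vector is the average of the token vectors. The sketch below illustrates that idea in plain Python on toy 4-dimensional vectors (the real model uses 768). The names `mean_pool`, `tokens`, and `mask` are hypothetical, and the library performs the equivalent operation on tensors rather than lists.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        # No real tokens: return a zero vector instead of dividing by zero.
        return [0.0] * dim
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]


# Toy input: three token vectors of width 4; the last one is padding.
tokens = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 2.0, 1.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],
]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [2.0, 2.0, 2.0, 2.0]
```

Masking matters because padded positions would otherwise pull the average toward whatever values the padding tokens happen to carry.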
README.md ADDED
@@ -0,0 +1,507 @@
1
+ ---
2
+ tags:
3
+ - sentence-transformers
4
+ - sentence-similarity
5
+ - feature-extraction
6
+ - generated_from_trainer
7
+ - dataset_size:7598
8
+ - loss:MultipleNegativesRankingLoss
9
+ base_model: sentence-transformers/all-mpnet-base-v2
10
+ widget:
11
+ - source_sentence: men's jacket featuring a waterproof outer layer, an insulated inner
12
+ layer, and multiple secure pockets for storage.
13
+ sentences:
14
+ - "Title: Pure Leather Biker Short Jacket Black Men’s Bike Jackets - Cowhide Top\
15
+ \ Grain Classic Belted Strap Descripion: ['We Believe in Beauty of Simplicity'\
16
+ \ 'PURE LEATHER'\n 'jackets are negligible and made for regular wear. Our plans\
17
+ \ are ageless any style without overwhelming it, in light of the fact that there\
18
+ \ is'\n 'Beauty in Simplicity' '.' 'Product Detail:'\n 'This black leather jacket\
19
+ \ for men is traditional yet contemporary and useful ranging from your weekend\
20
+ \ sport coat. With great style, this cowhide leather cool leather jackets for\
21
+ \ men will keep you warm and comfortable in all types of weather. The style is\
22
+ \ similar to our PR-40 in Natural Sheep Leather but this one comes in a different\
23
+ \ leather altogether Cow Natural. Molds need to be removed because they are prone\
24
+ \ to breeding when three Parts of it are in excess. Parts are dirt, its multiplicity,\
25
+ \ and temperature. Firstly, you have to hang it on a hanger in a nicely-ventilated\
26
+ \ place.'\n 'Note:'\n 'The color will become slightly darker due to its transparent\
27
+ \ appearance. Do not stretch it for extended time as it will fall prey to different\
28
+ \ proportions. Jackets from Pure Leather are made by caring people who love their\
29
+ \ jobs, so you won’t get annoyed or disappointed when wearing them!! (Do Not Wash.\
30
+ \ Do Not Bleach. Do Not Tumble. Do Not Iron, Dry Clean only- Satisfaction guaranteed'\n\
31
+ \ 'A Great Gift:'\n 'These beautiful Jackets Would be a perfect gift for that\
32
+ \ special someone in your life. Buy these jackets for your Husband, Son, Brother,\
33
+ \ or best friend, and without a doubt you have purchased the perfect present for\
34
+ \ the any occasion, whether it is for Father’s Day, Valentine-day, Christmas,\
35
+ \ Graduation or their Birthday.'\n 'Material:' 'Cowhide Leather' 'Colors:' 'Black'\
36
+ \ 'Click' 'ADD TO CART'\n 'to order your Ancient Jacket' 'TODAY!']"
37
+ - 'Title: Pendleton Men''s Jacquard Sherpa-Lined Shirt Jacket Descripion: [''New
38
+ style for Pendleton but using our iconic Harding print. Taking our classic shirt
39
+ jacket and lining it with soft sherpa for warmth.'']'
40
+ - "Title: THE NORTH FACE Men's Clement Triclimate Jacket Descripion: ['The North\
41
+ \ Face Size Chart' 'The North Face Size Chart'\n 'Please note, the logo and hardware\
42
+ \ color may vary in styles marked as Prior Season.'\n 'Please note, the logo and\
43
+ \ hardware color may vary in styles marked as Prior Season.'\n 'Your go-to jacket\
44
+ \ all season long, this versatile The North Face® Clement Triclimate Jacket system\
45
+ \ pairs a breathable, waterproof shell with a warm, insulated liner jacket'\n\
46
+ \ 'Your go-to jacket all season long, this versatile The North Face® Clement Triclimate\
47
+ \ Jacket system pairs a breathable, waterproof shell with a warm, insulated liner\
48
+ \ jacket'\n 'Waterproof, insulated three-in-one ski jacket.'\n 'Waterproof, insulated\
49
+ \ three-in-one ski jacket.'\n 'Removable, helmet-compatible hood.' 'Removable,\
50
+ \ helmet-compatible hood.'\n 'Secure-zip chest and hand pockets.' 'Secure-zip\
51
+ \ chest and hand pockets.'\n 'Internal goggle pocket.' 'Internal goggle pocket.'\n\
52
+ \ 'Underarm vents for added breathability.'\n 'Underarm vents for added breathability.'\n\
53
+ \ 'Secure-zip wrist pocket with goggle wipe.'\n 'Secure-zip wrist pocket with\
54
+ \ goggle wipe.' 'Zip-in integration.'\n 'Zip-in integration.' 'Secure-zip hand\
55
+ \ pockets.'\n 'Secure-zip hand pockets.' '100% polyester.' '100% polyester.'\n\
56
+ \ '50D 73 G/M² 100% Recycled Polyester With Non-PFC DWR Finish.'\n '50D 73 G/M²\
57
+ \ 100% Recycled Polyester With Non-PFC DWR Finish.'\n 'Machine wash cold, hang\
58
+ \ dry.' 'Machine wash cold, hang dry.' 'Imported.'\n 'Imported.']"
59
+ - source_sentence: a lightweight performance jacket for outdoor activities
60
+ sentences:
61
+ - 'Title: Puma Golf Men''s First Mile Wind Jacket Descripion: [''Head turner. Impact
62
+ maker. The First Mile Wind Jacket is all about the important details—from its
63
+ classic style to sustainable materials. The jacket uses yarn from recycled plastic
64
+ bottles, sustainably sourced from communities of collectors.'']'
65
+ - 'Title: Tommy Hilfiger Men''s Lightweight Performance Softshell Hoody Jacket Descripion:
66
+ [''Lightweight perfomance jacket in a breathable and water resistant softshell
67
+ with microfleece backing center front zipper closure adjustable drawstring hood
68
+ adjustable snap cuffs Tommy Hilfiger flag screenprint on left chest round silicon
69
+ logo patch on left sleeve and two lower welt zipper pockets'']'
70
+ - 'Title: Columbia Women''s Pike Lake Long Jacket Descripion: ["Don''t let cold
71
+ weather ice out your plans. This puffer is ideal when the temp drops, combining
72
+ a Storm-Lite shell for durability, a heat-holding thermal-reflective lining, and
73
+ Thermarator insulation for warmth without bulk. Snaps at the side seams make it
74
+ easy to move. Your new puffy favorite, complete with zippered hand pockets, interior
75
+ security pocket, drawcord adjustable hood and hem, plus handy snap-adjustable
76
+ side seams. This is the perfect long coat to manage the harshest of cold weather
77
+ adventures. This women’s winter jacket is offered in multiple sizes and colors.
78
+ Available in extended sizing. Relaxed and Regular Fit. To ensure the size you
79
+ choose is right, utilize our sizing chart and the following measurement instructions:
80
+ For the sleeves, start at the center back of your neck and measure across the
81
+ shoulder and down to the sleeve. If you come up with a partial number, round up
82
+ to the next even number. For the chest, measure at the fullest part of the chest,
83
+ under the armpits and over the shoulder blades, keeping the tape measure firm
84
+ and level."]'
85
+ - source_sentence: baby's outerwear jumpsuit with a hood, soft fabric, and a front
86
+ zipper for easy access.
87
+ sentences:
88
+ - 'Title: Lykmera Baby Coat Toddler Kimono Solid Silk Robes Kids Clothes Sleepwear
89
+ Bathrobe Girls Baby Satin Girls Coat Jacket Descripion: ["2.Casual style top,
90
+ , cute and comfy baby clothes 3.Great idea for a baby clothes, there''s no doubt
91
+ in our mind your little baby will be the cutest Package include:1PC Bathrobe+1PC
92
+ Ribbons 1.It is made of high quality materials,Soft hand feeling, no any harm
93
+ to your baby skin Clothing Length:Regular Pattern Type:Solid Gender:Girls Please
94
+ note that slight color difference should be acceptable due to the light and screen.
95
+ Both hand wash and machine wash is OK Occasion:Casual Material:Polyester Attention
96
+ plz: If your kid is , we recomend choosing a larger size, thanks."]'
97
+ - 'Title: Loloda Baby Girls Toddler Fur Faux Coat Vest Winter Warm Solid Color Sleeveless
98
+ Waistcoat Jacket Descripion: ["Set Include: 1Pc Vest Condition: Brand New Material:
99
+ Polyester Tag No.---|---Recommended Size---|------Chest------|------Length---
100
+ ---80-----|------9-12 Months-----|---23.6''''/60cm---|----13.8''''/35cm ---90-----|-----12-18
101
+ Months-----|---25.2''''/64cm---|----14.2''''/36cm --100-----|-----18-24 Months-----|---26.0''''/66cm---|----14.6''''/37cm
102
+ --110-----|----------2-3 Years---------|---26.8''''/68cm---|----15.0''''/38cm
103
+ --120-----|----------3-4 Years---------|---27.6''''/70cm---|----16.5''''/42cm
104
+ --130-----|----------5-6 Years---------|---28.3''''/72cm---|---17.3''''/44 cm
105
+ --140-----|----------7-8 Years---------|---30.7''''/78cm---|---19.3''''/49 cm
106
+ --150-----|---------9-10 Years---------|---31.5''''/80cm---|---19.7''''/50 cm"]'
107
+ - 'Title: Baby''s hooded jumpsuit outerwear toddler jacket, Cosplay zipper full-body
108
+ jumpsuit Descripion: [''Product Description: Features: Cute animal hooded onesie,
109
+ warm onesie tunic. High-grade facecloth fabric, soft to the touch, so that the
110
+ baby is comfortable and cozy. Hooded design makes the baby warm all over. Anti-pinch
111
+ collar design takes care of baby\''s neck. Suitable for loungewear, parties, holidays,
112
+ festivals, photo shoots, etc. One-piece only, does not include any other accessories.
113
+ Specification.: Color: fox (orange) / tiger (yellow) / rabbit (pink) optional
114
+ Size: 80CM / 90CM / 100CM (optional) Note: 1. The size is just for your reference.
115
+ Please check the measurements to choose the right size for your baby! Meanwhile,
116
+ please choose the larger size because babies at the same age may have different
117
+ height. 2. Please allow 1-3cm (0.4-1.18") difference due to manual measurement
118
+ and slight color variation for different display setting Package Including: 1
119
+ x Jumpsuit'']'
120
+ - source_sentence: a packable rain jacket for bowhunters during early season downpours
121
+ sentences:
122
+ - 'Title: OAKI Rain & Trail Suit - Adult One Piece Rain Jacket & Pant Orange, Medium
123
+ Descripion: [''Oaki Adults One-Piece Waterproof Trail Rain Suit are tough and
124
+ weather-resistant, giving your overall protection with ripstop fabric. This one-piece
125
+ rain suit is made of Nylon Taslon PU coated fabric. Unlike rubber jumpsuits, this
126
+ rain suit is breathable. Whether you’re camping or playing with your kids in the
127
+ backyard, these comfortably breathable rainsuits will keep you dry.'']'
128
+ - 'Title: HIDDEN MAKERS Womens Trendy Thickened Parka Jacket with Muskrat Fur &
129
+ Belt Descripion: [''You can never be ready to face the winter chills unless you
130
+ have your down puffed winter jacket, here’s one with a trendy twist to it! A
131
+ sight to behold, designed with an edgy twist to parka jackets, this one is surely
132
+ gonna keep you warm and comfortable because of double front layering. The muskrat
133
+ fur and detachable belt adds to the sleek silhouette of the long winter jacket.
134
+ Padded with goose feather down to lock the heat, the slim-fit cut will add glamor
135
+ to your walks down the frosty city streets. COLOR VARIATIONS Black SIZE (inch) ·
136
+ S : Shoulder 15.5, Chest 38.5, Waist 35.8, Bottom Width 43.3, Body Length(back)
137
+ 32.6, Sleeve Width 13.7, Sleeve Length 23.4 · M<: Shoulder 15.9, Chest 41.3,
138
+ Waist 38.1, Bottom Width 47.2, Body Length(back) 33.6, Sleeve Width 14.5, Sleeve
139
+ Length 23.6 · L : Shoulder 16.1, Chest 41.4, Waist 39.3 , Bottom Width 48, Body
140
+ Length(back) 34, Sleeve Width 15.3, Sleeve Length 23.8 MODEL SIZE · Height
141
+ 5.6ft MATERIAL & COMPOSITION · Outshell & Lining : 100% polyester · Coloration
142
+ : 100% muskrat fur · Filling1 : 80% goose down & 20% goose feather · Filling2
143
+ : 100% polyester WASHING & KEEPING TIPS · Wash inside out and dry clean or
144
+ hand wash · Do not use chlorine or oxygen detergents · Avoid direct sunlight
145
+ and dry In shade · Wash dark colors separately · Keep away from fire DESIGN ·
146
+ Snap button closure on front · Two outside pockets · Dual Layered front design
147
+ to block wind · 100% muskrat fur from hoodie to hem · Removable fur on the hood ·
148
+ Detachable waist belt · Windbreaker and waterproof fabric · Suitable length
149
+ and fit to allow optimum movement MULTI PURPOSE · Made of waterproof and windproof
150
+ fabrics for everyday life · It can be worn comfortably and warmly in everyday
151
+ life'']'
152
+ - 'Title: ASIO Gear Packable Rain Jacket Descripion: [''Ideal for keeping bowhunters
153
+ dry during those early season pop-up downpours, this durable camo hunting jacket
154
+ features pit zips for superior breathability, slips easily over ASIO Gear’s mid
155
+ and late-season gear for ultimate rain protection, and, when paired with our packable
156
+ hunting rain pants, packs down to the size of a softball. No matter what part
157
+ of the whitetail range you hunt, you can be sure you’ll be super dry at the end
158
+ of your sit with this packable hunting jacket (and when paired with our equally
159
+ as awesome packable hunting rain pants).'']'
160
+ - source_sentence: a pair of statement stud earrings for brightening up outfits
161
+ sentences:
162
+ - 'Title: Kendra Scott Sienna Ear Jacket Earrings Descripion: [''The Kendra Scott
163
+ Sienna Ear Jacket Earrings are the best bright-colored statement stud. Straight
164
+ post backing. Removeable jackets. Gold plated brass. Shell detailing. Remove jewelry
165
+ when swimming, bathing, or exercising. Imported.'']'
166
+ - 'Title: TrailCrest Infant-Toddler Boys & Girls Fleece Full Zip Mock Neck Soft
167
+ Jacket Descripion: [''Feel the chill! On chilly summer nights out in the wilderness
168
+ or crisp spring and autumn days, these semi-fitted fleece zips are the perfect
169
+ cover to keep you warm. These fleece zips are great for camping, hiking and just
170
+ about any sport related activities. Featuring bright accent colors with realistic
171
+ camo print to make the experience all the more exciting. Fully equipped with a
172
+ mock neck collar for extra warmth, two hand pockets with hidden zippers, an adjustable
173
+ drawstring at the bottom, and elastic cuffs. All set for the outdoors!'']'
174
+ - 'Title: Tainilawo Womens Hooies Long Sleeve Shirts Sweatershirt Giraffe Prints
175
+ Hooded Pullover Overszied Coats Jackets Descripion: ["Size: M US: 10 EU: 40 Bust:
176
+ 100cm/39.37'''' Shoulder: 47.5cm/18.70'''' Sleeve: 55.5cm/21.85'''' Length: 61cm/24.02''''
177
+ Size: L US: 12 EU: 42 Bust: 104cm/40.94'''' Shoulder: 49.5cm/19.49'''' Sleeve:
178
+ 57.5cm/22.64'''' Length: 63cm/24.80'''' Size: XL US: 14 EU: 44 Bust: 108cm/42.52''''
179
+ Shoulder: 51.5cm/20.28'''' Sleeve: 59.5cm/23.43'''' Length: 65cm/25.59'''' Size:
180
+ XXL US: 16 EU: 46 Bust: 112cm/44.09'''' Shoulder: 53.5cm/21.06'''' Sleeve: 61.5cm/24.21''''
181
+ Length: 67cm/26.38'''' Size: XXXL US: 18 EU: 48 Bust: 116cm/45.67'''' Shoulder:
182
+ 55.5cm/21.85'''' Sleeve: 63.5cm/25.00'''' Length: 69cm/27.17'''' cardigan sweaters
183
+ for women sweaters for women womens sweaters oversized sweaters for women fall
184
+ sweaters for women crew neck sweatershirt women white sweaters for women long
185
+ sweaters for women sweaters aesthetic cute sweaters womens cardigan sweaters plus
186
+ size sweaters for women girls sweaters black sweaters for women christmas sweaters
187
+ for women halloween sweaters for women sweaters for women cardigan open front
188
+ crewneck sweatshirts sweatshirts for teen girls womens sweatshirts graphic sweatshirts
189
+ cute sweatshirts hoodies for teen girls hoodies aesthetic hoodies for girls hoodies
190
+ for teens womens hoodies pullover womens hoodies zip up crop hoodies for women
191
+ tie dye hoodies for women zip up hoodie hoodies for teen girls zip up hoodie women
192
+ oversized hoodie brown zip up hoodie cropped zip up hoodie black zip up hoodie
193
+ womens zip up hoodie work blouses for women office plus size blouses for women
194
+ black blouses for women white blouses for women dressy women shirts and blouses
195
+ blouses for women plus size"]'
196
+ pipeline_tag: sentence-similarity
197
+ library_name: sentence-transformers
198
+ ---
199
+
200
+ # SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
201
+
202
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
203
+
204
+ ## Model Details
205
+
206
+ ### Model Description
207
+ - **Model Type:** Sentence Transformer
208
+ - **Base model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) <!-- at revision 9a3225965996d404b775526de6dbfe85d3368642 -->
209
+ - **Maximum Sequence Length:** 128 tokens
210
+ - **Output Dimensionality:** 768 dimensions
211
+ - **Similarity Function:** Cosine Similarity
212
+ <!-- - **Training Dataset:** Unknown -->
213
+ <!-- - **Language:** Unknown -->
214
+ <!-- - **License:** Unknown -->
215
+
216
+ ### Model Sources
217
+
218
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
219
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
220
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
221
+
222
+ ### Full Model Architecture
223
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: MPNetModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
230
+
231
+ ## Usage
232
+
233
+ ### Direct Usage (Sentence Transformers)
234
+
235
+ First install the Sentence Transformers library:
236
+
237
+ ```bash
238
+ pip install -U sentence-transformers
239
+ ```
240
+
241
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("knguyennguyen/mpnet_jacket4k")
+ # Run inference
+ sentences = [
+     'a pair of statement stud earrings for brightening up outfits',
+     "Title: Kendra Scott Sienna Ear Jacket Earrings Descripion: ['The Kendra Scott Sienna Ear Jacket Earrings are the best bright-colored statement stud. Straight post backing. Removeable jackets. Gold plated brass. Shell detailing. Remove jewelry when swimming, bathing, or exercising. Imported.']",
+     "Title: TrailCrest Infant-Toddler Boys & Girls Fleece Full Zip Mock Neck Soft Jacket Descripion: ['Feel the chill! On chilly summer nights out in the wilderness or crisp spring and autumn days, these semi-fitted fleece zips are the perfect cover to keep you warm. These fleece zips are great for camping, hiking and just about any sport related activities. Featuring bright accent colors with realistic camo print to make the experience all the more exciting. Fully equipped with a mock neck collar for extra warmth, two hand pockets with hidden zippers, an adjustable drawstring at the bottom, and elastic cuffs. All set for the outdoors!']",
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
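
For retrieval, the similarity scores are typically used to rank product descriptions against a query. The following is a minimal sketch of that ranking step in plain Python, using toy 3-dimensional vectors as stand-ins for `model.encode(...)` output. The helpers `cosine` and `rank` and the toy vectors are hypothetical and not part of the library.

```python
import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query: List[float], candidates: List[List[float]]) -> List[Tuple[int, float]]:
    """Return (candidate index, score) pairs, best match first."""
    scored = [(i, cosine(query, c)) for i, c in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for embeddings (real vectors have 768 values).
query_vec = [1.0, 0.0, 1.0]
product_vecs = [
    [0.0, 1.0, 0.0],  # unrelated direction
    [2.0, 0.0, 2.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # partial overlap
]
ranking = rank(query_vec, product_vecs)
print([index for index, _ in ranking])  # [1, 2, 0]
```

Cosine similarity ignores vector length, which is why the second candidate scores highest even though its magnitude differs from the query's.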
262
+
263
+ <!--
264
+ ### Direct Usage (Transformers)
265
+
266
+ <details><summary>Click to see the direct usage in Transformers</summary>
267
+
268
+ </details>
269
+ -->
270
+
271
+ <!--
272
+ ### Downstream Usage (Sentence Transformers)
273
+
274
+ You can finetune this model on your own dataset.
275
+
276
+ <details><summary>Click to expand</summary>
277
+
278
+ </details>
279
+ -->
280
+
281
+ <!--
282
+ ### Out-of-Scope Use
283
+
284
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
285
+ -->
286
+
287
+ <!--
288
+ ## Bias, Risks and Limitations
289
+
290
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
291
+ -->
292
+
293
+ <!--
294
+ ### Recommendations
295
+
296
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
297
+ -->
298
+
299
+ ## Training Details
300
+
301
+ ### Training Dataset
302
+
303
+ #### Unnamed Dataset
304
+
305
+
306
+ * Size: 7,598 training samples
307
+ * Columns: <code>sentence_0</code> and <code>sentence_1</code>
308
+ * Approximate statistics based on the first 1000 samples:
309
+ | | sentence_0 | sentence_1 |
310
+ |:--------|:---------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
311
+ | type | string | string |
312
+ | details | <ul><li>min: 4 tokens</li><li>mean: 18.2 tokens</li><li>max: 51 tokens</li></ul> | <ul><li>min: 35 tokens</li><li>mean: 105.77 tokens</li><li>max: 128 tokens</li></ul> |
313
+ * Samples:
314
+ | sentence_0 | sentence_1 |
315
+ |:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
316
+ | <code>men's parka jacket with a concealed front closure, multiple storage pockets, and a detachable hood.</code> | <code>Title: DKNY Men's Welded Short Parka Jacket Descripion: ["FUNCTIONALITY: Center front placket with hidden zipper closure, two lower flap pockets with side entry, removable hood. STYLISH FEATURES: DKNY logo on wearer's left sleeve, DKNY small logo metal zipper puller. Water resistant outer shell with faux down fill insulation."]</code> |
317
+ | <code>a track jacket and pant set for active girls</code> | <code>Title: PUMA girls Track Jacket & Pant Set Descripion: ['PUMA, a Global athletic brand, provides consumers with innovative products that successfully fuses the creative influences from the world of sport, lifestyle, and fashion.']</code> |
318
+ | <code>kids' hoodies made from soft fleece material, featuring a zip closure and pockets for convenience. suitable for both boys and girls, these garments are designed for warmth and comfort during colder seasons.</code> | <code>Title: Eddie Bauer Kids’ Jacket - 2 Pack Ultra Soft Sherpa Fleece Hoodie Sweatshirt for Boys and Girls (5-20) Descripion: ["Eddie Bauer Kids' Plush Sherpa Fleece Zip Hoodie Sweatshirt for Boys and Girls is a great choice for the chilly fall and winter weather. This fun sweatshirt is high quality, and long lasting. This everyday clothing is a great gift for birthdays and holidays. We have several designs available, perfect for cold weather. High Quality fabric and durable, high stitch density give your boy or girl a long lasting hoodie that they'll be able to wear for years. Fashionable sherpa fleece sweatshirt for a soft feel and comfort. Full Zip jacket and quarter zip pullover allows her to have the ability to layer other tops underneath for a casual look. Comfort Fit jacket has split kangaroo pockets and cuffed sleeves to keep it from bunching and riding up. Simply machine wash and tumble dry; Please Reference the Variations for All Available Sizes & Colors! Eddie Bauer Offers Premium Clothing at Affordable Prices because we value every customer that visits our listings! Stop by Our Storefront to See the Rest of Our Great Deals; we're confident you're going to find items that anyone who needs a gift will absolutely love and adore!"]</code> |
319
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
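
To make the two loss parameters concrete, here is a rough sketch of the computation behind a multiple-negatives ranking loss: each `sentence_0` is scored against every `sentence_1` in the batch with cosine similarity, the scores are multiplied by `scale`, and cross-entropy rewards the matching pair. This is an illustration in plain Python on toy vectors, not the library's implementation; `cos_sim` and `mnr_loss` are hypothetical helper names.

```python
import math
from typing import List


def cos_sim(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors: List[List[float]], positives: List[List[float]], scale: float = 20.0) -> float:
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])   # each anchor matches its own positive
shuffled = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])  # positives swapped
print(aligned < shuffled)  # True
```

With a scale of 20, a well-separated batch drives the loss close to zero, while mismatched pairs are penalised heavily; every other positive in the batch acts as a negative for a given anchor.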
326
+
327
+ ### Training Hyperparameters
328
+ #### Non-Default Hyperparameters
329
+
330
+ - `per_device_train_batch_size`: 128
331
+ - `per_device_eval_batch_size`: 128
332
+ - `num_train_epochs`: 5
333
+ - `multi_dataset_batch_sampler`: round_robin
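
As a rough check of what these settings imply, the arithmetic below derives the number of optimizer steps from the dataset size (7,598 samples), the batch size, and the epoch count. It assumes a single device, which the card does not state, and relies on `gradient_accumulation_steps` being 1 and `dataloader_drop_last` being False as listed under All Hyperparameters.

```python
import math

num_samples = 7598   # "Size: 7,598 training samples"
batch_size = 128     # per_device_train_batch_size
epochs = 5           # num_train_epochs
num_devices = 1      # assumption: the card does not state the device count

steps_per_epoch = math.ceil(num_samples / (batch_size * num_devices))
total_steps = steps_per_epoch * epochs
# Size of the final, partial batch in each epoch (kept because drop_last is False).
last_batch = num_samples - (steps_per_epoch - 1) * batch_size * num_devices

print(steps_per_epoch, total_steps, last_batch)  # 60 300 46
```

Because the loss uses in-batch negatives, the batch size also sets how many negatives each query sees: up to 127 in a full batch, fewer in the final partial batch.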
334
+
335
+ #### All Hyperparameters
336
+ <details><summary>Click to expand</summary>
337
+
338
+ - `overwrite_output_dir`: False
339
+ - `do_predict`: False
340
+ - `eval_strategy`: no
341
+ - `prediction_loss_only`: True
342
+ - `per_device_train_batch_size`: 128
343
+ - `per_device_eval_batch_size`: 128
344
+ - `per_gpu_train_batch_size`: None
345
+ - `per_gpu_eval_batch_size`: None
346
+ - `gradient_accumulation_steps`: 1
347
+ - `eval_accumulation_steps`: None
348
+ - `torch_empty_cache_steps`: None
349
+ - `learning_rate`: 5e-05
350
+ - `weight_decay`: 0.0
351
+ - `adam_beta1`: 0.9
352
+ - `adam_beta2`: 0.999
353
+ - `adam_epsilon`: 1e-08
354
+ - `max_grad_norm`: 1
355
+ - `num_train_epochs`: 5
356
+ - `max_steps`: -1
357
+ - `lr_scheduler_type`: linear
358
+ - `lr_scheduler_kwargs`: {}
359
+ - `warmup_ratio`: 0.0
360
+ - `warmup_steps`: 0
361
+ - `log_level`: passive
362
+ - `log_level_replica`: warning
363
+ - `log_on_each_node`: True
364
+ - `logging_nan_inf_filter`: True
365
+ - `save_safetensors`: True
366
+ - `save_on_each_node`: False
367
+ - `save_only_model`: False
368
+ - `restore_callback_states_from_checkpoint`: False
369
+ - `no_cuda`: False
370
+ - `use_cpu`: False
371
+ - `use_mps_device`: False
372
+ - `seed`: 42
373
+ - `data_seed`: None
374
+ - `jit_mode_eval`: False
375
+ - `use_ipex`: False
376
+ - `bf16`: False
377
+ - `fp16`: False
378
+ - `fp16_opt_level`: O1
379
+ - `half_precision_backend`: auto
380
+ - `bf16_full_eval`: False
381
+ - `fp16_full_eval`: False
382
+ - `tf32`: None
383
+ - `local_rank`: 0
384
+ - `ddp_backend`: None
385
+ - `tpu_num_cores`: None
386
+ - `tpu_metrics_debug`: False
387
+ - `debug`: []
388
+ - `dataloader_drop_last`: False
389
+ - `dataloader_num_workers`: 0
390
+ - `dataloader_prefetch_factor`: None
391
+ - `past_index`: -1
392
+ - `disable_tqdm`: False
393
+ - `remove_unused_columns`: True
394
+ - `label_names`: None
395
+ - `load_best_model_at_end`: False
396
+ - `ignore_data_skip`: False
397
+ - `fsdp`: []
398
+ - `fsdp_min_num_params`: 0
399
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
400
+ - `fsdp_transformer_layer_cls_to_wrap`: None
401
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
402
+ - `deepspeed`: None
403
+ - `label_smoothing_factor`: 0.0
404
+ - `optim`: adamw_torch
405
+ - `optim_args`: None
406
+ - `adafactor`: False
407
+ - `group_by_length`: False
408
+ - `length_column_name`: length
409
+ - `ddp_find_unused_parameters`: None
410
+ - `ddp_bucket_cap_mb`: None
411
+ - `ddp_broadcast_buffers`: False
412
+ - `dataloader_pin_memory`: True
413
+ - `dataloader_persistent_workers`: False
414
+ - `skip_memory_metrics`: True
415
+ - `use_legacy_prediction_loop`: False
416
+ - `push_to_hub`: False
417
+ - `resume_from_checkpoint`: None
418
+ - `hub_model_id`: None
419
+ - `hub_strategy`: every_save
420
+ - `hub_private_repo`: False
421
+ - `hub_always_push`: False
422
+ - `gradient_checkpointing`: False
423
+ - `gradient_checkpointing_kwargs`: None
424
+ - `include_inputs_for_metrics`: False
425
+ - `eval_do_concat_batches`: True
426
+ - `fp16_backend`: auto
427
+ - `push_to_hub_model_id`: None
428
+ - `push_to_hub_organization`: None
429
+ - `mp_parameters`:
430
+ - `auto_find_batch_size`: False
431
+ - `full_determinism`: False
432
+ - `torchdynamo`: None
433
+ - `ray_scope`: last
434
+ - `ddp_timeout`: 1800
435
+ - `torch_compile`: False
436
+ - `torch_compile_backend`: None
437
+ - `torch_compile_mode`: None
438
+ - `dispatch_batches`: None
439
+ - `split_batches`: None
440
+ - `include_tokens_per_second`: False
441
+ - `include_num_input_tokens_seen`: False
442
+ - `neftune_noise_alpha`: None
443
+ - `optim_target_modules`: None
444
+ - `batch_eval_metrics`: False
445
+ - `eval_on_start`: False
446
+ - `use_liger_kernel`: False
447
+ - `eval_use_gather_object`: False
448
+ - `batch_sampler`: batch_sampler
449
+ - `multi_dataset_batch_sampler`: round_robin
450
+
451
+ </details>
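
The list above combines `learning_rate` 5e-05, `lr_scheduler_type` linear, and zero warmup. A minimal sketch of what such a schedule computes is shown below. The function `linear_lr` is a hypothetical helper rather than the trainer's own code, and the total of 300 steps is an assumption derived from 7,598 samples, batch size 128, and 5 epochs on a single device.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-5, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = 300  # assumption: single device, see lead-in
for step in (0, 150, 300):
    print(step, f"{linear_lr(step, total_steps):.2e}")
# 0 5.00e-05
# 150 2.50e-05
# 300 0.00e+00
```

With no warmup, training starts at the full learning rate and decays steadily, so the earliest batches have the largest influence on the fine-tuned weights.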
452
+
453
+ ### Framework Versions
454
+ - Python: 3.11.11
455
+ - Sentence Transformers: 3.1.1
456
+ - Transformers: 4.45.2
457
+ - PyTorch: 2.5.1+cu121
458
+ - Accelerate: 1.2.1
459
+ - Datasets: 3.2.0
460
+ - Tokenizers: 0.20.3
461
+
462
+ ## Citation
463
+
464
+ ### BibTeX
465
+
466
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
478
+
479
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
490
+
491
+ <!--
492
+ ## Glossary
493
+
494
+ *Clearly define terms in order to be accessible across audiences.*
495
+ -->
496
+
497
+ <!--
498
+ ## Model Card Authors
499
+
500
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
501
+ -->
502
+
503
+ <!--
504
+ ## Model Card Contact
505
+
506
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
507
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "sentence-transformers/all-mpnet-base-v2",
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "torch_dtype": "float32",
+   "transformers_version": "4.45.2",
+   "vocab_size": 30527
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.1.1",
+     "transformers": "4.45.2",
+     "pytorch": "2.5.1+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2f6b2e0cccb11c8e7a240beb60e88d1cdd6a5a1b57c525b519f99c617b2111d5
+ size 437967672
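The pointer above is a Git LFS stub, not the weights themselves. As a rough sanity check, the sketch below parses the three pointer fields and estimates how many float32 values the real file can hold. It assumes nearly all of the file is float32 tensor data, which matches `"torch_dtype": "float32"` in `config.json`. The safetensors header adds some overhead, so the figure is slightly high. The helper name `parse_lfs_pointer` is hypothetical.

```python
# Sketch: parse the Git LFS pointer text and estimate the parameter count.
# Assumption: the file is almost entirely float32 data (4 bytes per value);
# the safetensors JSON header makes the estimate slightly too large.

POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2f6b2e0cccb11c8e7a240beb60e88d1cdd6a5a1b57c525b519f99c617b2111d5
size 437967672
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of an LFS pointer into a dict entry."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = parse_lfs_pointer(POINTER_TEXT)
size_bytes = int(pointer["size"])
approx_params = size_bytes // 4  # 4 bytes per float32 value

print(f"{size_bytes} bytes, roughly {approx_params / 1e6:.1f}M float32 values")
```

The estimate is in the range expected for an MPNet-base encoder with a 768-dimensional hidden size and 12 layers.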
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
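`modules.json` chains a Transformer module into a Pooling module, and `1_Pooling/config.json` in this commit enables only `pooling_mode_mean_tokens`. The sketch below illustrates what masked mean pooling computes, in plain Python with toy 2-dimensional vectors instead of the model's 768 dimensions. It is not the library's implementation: the function name `mean_pool` is hypothetical, and the guard against an all-padding input is a simplification.

```python
# Sketch: masked mean pooling over token embeddings (toy dimensions).
# Assumption: attention_mask uses 1 for real tokens and 0 for padding.
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of non-padding tokens into one sentence vector."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                totals[i] += value
    count = max(count, 1)  # avoid division by zero when every token is padding
    return [total / count for total in totals]


# Two real tokens followed by one padding token that must be ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)
```

Because padding positions are excluded from both the sum and the count, the sentence vector does not depend on how much a batch was padded.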
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 128,
+   "do_lower_case": false
+ }
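`max_seq_length` of 128 means longer inputs are cut off before encoding. The sketch below shows the effect on a list of token ids. It assumes the limit counts the `<s>` and `</s>` special tokens (ids 0 and 2 in `config.json`) and that truncation removes tokens from the right, as `truncation_side` in `tokenizer_config.json` indicates. The function name `truncate_and_wrap` is hypothetical and the code does not call the real tokenizer.

```python
# Sketch: right-side truncation to max_seq_length, special tokens included.
# Assumption: the 128-token budget covers <s> (id 0) and </s> (id 2),
# leaving 126 positions for the text itself.
from typing import List

MAX_SEQ_LENGTH = 128
BOS_ID = 0  # "<s>"
EOS_ID = 2  # "</s>"


def truncate_and_wrap(content_ids: List[int], max_len: int = MAX_SEQ_LENGTH) -> List[int]:
    """Keep the leading tokens that fit, then add the two special tokens."""
    budget = max_len - 2  # reserve room for <s> and </s>
    return [BOS_ID] + content_ids[:budget] + [EOS_ID]


long_input = list(range(1000, 1300))  # 300 made-up token ids
encoded = truncate_and_wrap(long_input)
print(len(encoded), encoded[0], encoded[-1])
```

Under these assumptions, anything past roughly the first 126 word pieces of a product description would not influence its embedding.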
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,72 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "30526": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "<s>",
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "max_length": 128,
+   "model_max_length": 128,
+   "pad_to_multiple_of": null,
+   "pad_token": "<pad>",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "</s>",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff