Commit ebb673c (verified) by knguyennguyen · 1 Parent(s): 400dffd

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
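
The config above enables only `pooling_mode_mean_tokens`. As a rough illustration of what that setting implies, here is a minimal pure-Python sketch, assuming the usual masked-mean formulation in which padded positions are ignored and the remaining token vectors are averaged. The helper name `mean_pool` and the toy 2-dimensional inputs are hypothetical. The real model works on 768-dimensional tensors.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention_mask entry is 1 (illustrative helper)."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                totals[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [total / count for total in totals]


# Toy input: two real tokens and one padding token that must not affect the result.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)
```

Because the padding vector is masked out, only the first two token vectors contribute to the sentence vector.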
README.md ADDED
@@ -0,0 +1,508 @@
1
+ ---
2
+ tags:
3
+ - sentence-transformers
4
+ - sentence-similarity
5
+ - feature-extraction
6
+ - generated_from_trainer
7
+ - dataset_size:4693
8
+ - loss:MultipleNegativesRankingLoss
9
+ base_model: FacebookAI/roberta-base
10
+ widget:
11
+ - source_sentence: I'm looking for a pair of sleek and comfortable footwear designed
12
+ for running. They should offer a lightweight feel and have a striking appearance.
13
+ sentences:
14
+ - 'Title: Wood Sunglasses for Men and Women Vintage Polarized Lenses Uv Protection
15
+ Bamboo Wooden Sun Glasses Descripion: [''SUNMER WOOD (Brand) Your love for the
16
+ outdoors and warm summer days shouldn’t come at the cost of your eyes. While the
17
+ sun creates the perfect setting for exploring nature, its harmful UVA and UVB
18
+ rays can cause immense damage to your eyes and skin, causing not just visual deterioration
19
+ but wrinkles too. At Sunmer Wood, we are a team of designers specializing in making
20
+ premium sunglasses that give your eyes protection and your outfit a stylish upgrade.
21
+ We use organic wood and heavy-duty materials to ensure unmatched durability and
22
+ world-class comfort.'']'
23
+ - 'Title: Nike Men''s Epic React Flyknit Running Shoes Descripion: [''The Nike Epic
24
+ React Flyknit provides a smooth, lightweight performance and a bold look.'']'
25
+ - 'Title: Metal D Rings Heavy Duty 1 Inch D Shape Rings for Sewing, Keychains, Straps
26
+ Ties, Belts, Crafts and Dog Leash (50 Pack) Descripion: [''50pcs 1inch Metal D
27
+ Rings Buckles for Straps Ties Belts Bags, Silver ☛Specification : √Material:
28
+ alloy √Color: Silver √Inner width: 25mm √Inner high: 16mm √Thickness: 3mm ☛LOAD
29
+ BEARING : Made from strong metal, these D ring fasteners have good load bearing
30
+ characteristics and can resist sudden impact. ☛GOOD GIFTS : The package includes
31
+ 50 pieces Silver d rings. You can use them to make beautiful bags to send your
32
+ friends or your family. ☛APPLICATION: Suitable for DIY Fashion Belts. Suitable
33
+ for Pets Collars. Suitable for Strap. ☛Package included: 50 x D Rings(1 inch) ☛Note:
34
+ 1.Please allow 1-3mm minor deviation due to manual measurement. 2.Due to the difference
35
+ between different monitors, the picture may not reflect the actual color of the
36
+ item.'']'
37
+ - source_sentence: I'm looking for a versatile outdoor garment that can handle various
38
+ weather conditions while providing ample storage options. It should be comfortable
39
+ to wear for activities like fishing or photography and have a hood that can be
40
+ removed.
41
+ sentences:
42
+ - "Title: Yimidear Unisex Outdoor Casual Quick-Drying Extra Pockets Fishing Vest\
43
+ \ Travel Photography Vest with Detachable Hood Descripion: ['Features:'\n 'The\
44
+ \ characteristics of the multi-functional leisure vest features large capacity,\
45
+ \ highly breathable, can protect you from wind and rain. And this multi-purpose\
46
+ \ vest with high strength and good extensibility, will make you feel comfortable.Condition:\
47
+ \ 100% Brand New.Material: Nylon & Polyester MeshGender: Men&Women'\n 'Size:'\n\
48
+ \ 'M: Bust: 110cm/43.3\", Shoulder Width: 42cm/16.54\", Clothes Length: 62cm/24.4\"\
49
+ .L: Bust: 114cm/44.88\", Shoulder Width: 43cm/16.93\", Clothes Length: 66cm/25.98\"\
50
+ .XL: Bust: 116cm/45.67\", Shoulder Width: 44cm/17.32\", Clothes Length: 68cm/26.77\"\
51
+ .XXL: Bust: 122cm/48\", Shoulder Width: 46cm/18.1\", Clothes Length: 70cm/27.56\"\
52
+ .'\n 'Note:'\n '1-2cm error of measuring is a reasonable range due to different\
53
+ \ measurement methods.Please kingly understand that.Due to different camera lens\
54
+ \ and light environments, the real item color which you receive may be a little\
55
+ \ vary from the listing picture. Thanks for your understanding.'\n 'Package include'\
56
+ \ '1 x Vest']"
57
+ - 'Title: 1-3 Pack Famous TIK Tok Butt Lift High Waist Yoga Workout Pants Pattern
58
+ Scrunch Tummy Control Sliming Leggings for Women Descripion: [''72% Polyester,
59
+ 28% Spandex Tiktok Internet celebrity recommendation. Turn your gear inside out
60
+ when washing; wash separately. Air dry or tumble dry low Crafted from a brushed,
61
+ buttery soft and stretch fabric for delivering you the naked sensation and the
62
+ unrestricted movement Experience the comfort of yoga pants with a seamless waistband
63
+ that does not dig in Side pockets allow you to store your personal items when
64
+ you’re on the go Flatlock construction minimize chafe. Approx. 25” inseam; 7/8
65
+ length'']'
66
+ - 'Title: I Love You 3000 Keychain Iron Man for Women Men Valentine Day Gifts for
67
+ Lover Couple Christmas Birthday Anniversary Keychain Gifts for Boyfriend Husband
68
+ Love You Gifts for Fiance for Him Her Descripion: [''★"I love you 3,000", a line
69
+ originally said by Tony Stark\''s daughter Morgan in Avengers: Endgame, is definitely
70
+ going down in history as one of the most impactful quotes in the Marvel Cinematic
71
+ Universe (MCU).★Although he may be gone, his influence lives in every one of us.
72
+ Buy these keychains to show as a proof to others that Tony Stark has a Heart and
73
+ your love for him is 3000.★Exquisite and Useful: Delicate Keychain pendant look
74
+ chic, great on your purses backpack handbags, and also fit for as DIY accessories
75
+ to connect charms, links and other ornaments★After-Sales Service: 90-Day money
76
+ back guarantee or replacement; We are engaged in providing the best shopping experience
77
+ for you.★Notice: The little connecting ring is soldered to hold the plates securely.'']'
78
+ - source_sentence: I'm looking for a collectible set that celebrates a specific game
79
+ location, featuring a unique character. It should include a decorative pin and
80
+ be a fun addition to a gaming collection.
81
+ sentences:
82
+ - 'Title: Pokemon Champions Path Pin Collection Hammerlocke Gym Featuring Duraludon
83
+ Descripion: [''Pokemon Champions Path Pin Collection Hammerlocke Gym Featuring
84
+ Duraludon'']'
85
+ - 'Title: Kiddus Fashionable Girls Watch for Kids. Children’s Analogue Wristwatch
86
+ with Educational Exercises. Japanese Quartz Movement. Cute, Stylish, Elegant &
87
+ Fabulous Descripion: ["STYLISH & EDUCATIONAL WATCH FOR KIDS: Designed for children
88
+ who are learning to read the time AND who want to be fabulously fashionable. Cute
89
+ designs, vibrant colours and glitter all over make them super attractive. Your
90
+ child will love his fancy watch! RECOMMENDED AGE: For Children from 5 years old.
91
+ Not suitable for children under 3 years of age due to small parts which may cause
92
+ a chocking hazard. RELIABLE & ADJUSTABLE: Provided with a HIGH-QUALITY Japanese
93
+ Mechanism and LONG LASTING Japanese battery, our kids watch line features also
94
+ a SHOCK RESISTANT CASE, nickel-free stainless steel backside, and 8 adjustment
95
+ holes on the strap to fit wrists large and small. Our watches for kids are water
96
+ resistant, so they can withstand splashes while washing hands or playing in the
97
+ rain but should be removed before bathing or swimming. PERFECT GIFT - GIVE THE
98
+ GIFT OF TIME: Available in a variety of styles and colours, our childrens'' watches
99
+ come packaged in a GIFT BOX so you can watch your child’s face light up as the
100
+ box is opened. Also included is a worksheet with specific Time Teacher exercises
101
+ to learn to read the time with your child. PURCHASE WITHOUT WORRY: As our products
102
+ are rigorously tested and made with love, we firmly believe in the high quality
103
+ of our product. So we offer a 30 day unconditional MONEY BACK GUARANTEE and a
104
+ 12-month warranty. Even if you give it to someone else, it''s still covered! If
105
+ you have problems or issues with the watch, contact us and we will help you."]'
106
+ - 'Title: Kayhoma Extra Soft Artificial Wool Leg Warmer Descripion: [''Kayhoma Extra
107
+ Soft Artificial Wool Leg Warmer Thicker - Increased the density of knitted fabric
108
+ by 30% than last design, which become thicker and warmer. Softer - From natural
109
+ cotton upgrade to artificial wool. It is almost as soft as wool. and not easy
110
+ out of shape after worn and washed. Good elasticity makes it possible to completely
111
+ cover the calf and not be too tight, protecting your circulation. Stay Up Well
112
+ - After multiple tests and experiments, the leg warmer will stay up all day, feel
113
+ free to walk or move as you like, not having to keep tugging at them to keep them
114
+ up.'']'
115
+ - source_sentence: I'm looking for a reusable face covering that offers protection
116
+ against dust. It should have ear loops for a secure fit and come with a filter
117
+ option.
118
+ sentences:
119
+ - "Title: Balaclava Face Mask - New Range 3 Pack Now with 20 PM 2.5 Filters - Comfortable\
120
+ \ Cooling Neck Gaiter with Filter and Ear Loops, Bandana Face Mask Black Grey,\
121
+ \ Silk face mask, Sports Mask with Filters. Descripion: ['Free Shipping $25+ orders,\
122
+ \ save a few dollars, shipping 5-8 days'\n 'Free Shipping $25+ orders, save a\
123
+ \ few dollars, shipping 5-8 days'\n 'Super comfortable spandex material for easy\
124
+ \ fit'\n 'Super comfortable spandex material for easy fit'\n 'Stylish and fashionable,\
125
+ \ can be worn for sports or social'\n 'Stylish and fashionable, can be worn for\
126
+ \ sports or social'\n 'Breathable face mask with filter to capture dust & pollen'\n\
127
+ \ 'Breathable face mask with filter to capture dust & pollen'\n 'One size fits\
128
+ \ most people' 'One size fits most people'\n 'The neck gaiters are a quality made\
129
+ \ product'\n 'The neck gaiters are a quality made product'\n 'Wear with or without\
130
+ \ the filter depending on your requirements'\n 'Wear with or without the filter\
131
+ \ depending on your requirements'\n 'Each pack includes 3 Masks (1x Jet Black,\
132
+ \ 1x Black Violet, 1x Deep Grey) + extra 20x PM 2.5 filters'\n 'Each pack includes\
133
+ \ 3 Masks (1x Jet Black, 1x Black Violet, 1x Deep Grey) + extra 20x PM 2.5 filters'\n\
134
+ \ 'We use these ourselves on the motorcycles and current stage 4 restrictions\
135
+ \ in Australia as an outdoor face mask.'\n 'Very comfortable face mask, stretchy\
136
+ \ with spandex fabric that fits most men, women and teenagers.'\n 'Perfect to\
137
+ \ wear for sports and outdoors while walking, cycling, hiking, skiing, fishing,\
138
+ \ motorcycle or horse riding, also good as a UV protection face mask to protect\
139
+ \ you from direct sunlight and sunburn, would assist people with hay fever.'\n\
140
+ \ 'Not only covers the face but also covers the neck for maximum protection, being\
141
+ \ soft and light weight, the fabric moves over the skin which has a cooling effect.'\n\
142
+ \ 'Reusable simply wash the balaclava and replace the PM 2.5 filters as needed.'\n\
143
+ \ 'More filters can be purchased in large pack to make it affordable to replace\
144
+ \ regularly as recommended.'\n 'Perfect for a thoughtful gift that family and\
145
+ \ friends will definitely use and enjoy for years to come.'\n 'What are your favorite\
146
+ \ colours and styles, let us know, we are expanding our range to meet our clients\
147
+ \ requirements. Shipping directly from the USA home land to your home in 1-5 days,\
148
+ \ we are hope you enjoy this product and can wear while doing most activities\
149
+ \ and social environments.']"
150
+ - 'Title: Men''s genuine fullgrain tanned leather jeans belt with buckle Descripion:
151
+ ["GENUINE MEN’S LEATHER BELT: The ''s Leather Belt is made with 100% genuine leather
152
+ and has a single-loop antique-finish buckle"]'
153
+ - 'Title: Ruikim Mouth Bandana For Dust Protection Face Bandana Washable Earloop
154
+ -Pm2.5 Filter Chip Descripion: [''100% Satisfy Service: 12 Month Quality Guarantee,
155
+ Buy With Confidence'']'
156
+ - source_sentence: I'm looking for a charming ring that embodies innocence and purity,
157
+ suitable for daily wear. It should have a minimalistic design and be stackable
158
+ with other rings. Durability is key, and I prefer it to be available in a unique
159
+ metallic finish.
160
+ sentences:
161
+ - "Title: CozzySayido Daisy Flower Ring Bands for Woman Innocent Daisy Promise Dainty\
162
+ \ Delicate Design Minimalistic Stackable Available in Silver and Rose Gold Descripion:\
163
+ \ ['“Always have something beautiful in sight, even if it’s just a daisy in a\
164
+ \ jelly glass.”'\n '- H. Jackson, Brown Jr.'\n 'Simple, sweet, stackable everyday\
165
+ \ ring for the sweetest ones.' 'Perks'\n 'Silver or rose gold Stainless steel\
166
+ \ No corrosion No peeling No spotting or staining No green fingers Resistant to\
167
+ \ perfume, sweat, and salt water Styling versatility (stack rings, minimalistic\
168
+ \ one-piece, knuckle, toe, stopper) Medical-grade stainless steel Unbreakable\
169
+ \ Unbendable Dainty, sweet, and delicate design 30-day full refund'\n 'Sizing'\n\
170
+ \ 'Available in size 3-10 (For sizing specs and how to know your size, please\
171
+ \ consult out sizing guide in the photo panel above.)'\n 'CozzySayido'\n 'We brainstorm\
172
+ \ for our customer to get the best value product. Proudly present, a daisy flower\
173
+ \ ring. We chose stainless-steel material which is no allergies for sensitive\
174
+ \ skin, durable and tough in any conditions, no matter if you wash your hand with\
175
+ \ alcohol sanitizer or washing tons of dishes or swimming in chlorine and salt\
176
+ \ water, it never gets spotting, staining or turn your finger green or any other\
177
+ \ color. Unbreakable and unbendable no matter, how you wear it.'\n 'Meaningful\
178
+ \ design, daisy flower is the symbol of innocence, purity, true love and new beginning\
179
+ \ to make every of your day the start of something new.'\n 'Designs as well as\
180
+ \ on-trend fashion jewelry for women with minimalist, dainty, sweet, and delicate\
181
+ \ style.']"
182
+ - 'Title: Chelsea FC Official Soccer Gift Mens Graphic T-Shirt Navy XXL Descripion:
183
+ [''Official CFC mens T-shirt Large club crest & text print to front Garment Size
184
+ (Chest): Sm. 40"; Med. 41"; Lge. 42"; XL 44"; XXL 48"; 3XL 52" 100% cotton, top
185
+ quality T-shirt Many more gift ideas for him @ FootballShopOnline'']'
186
+ - "Title: TMVFPYR Youth Pretty Cotton Moisture Wicking Extra Heavy Cushion Crew\
187
+ \ Socks… Descripion: ['PRODUCT SPECIFICATION'\n '- Size: 7.9in- Weight: 0.26ib/120g-\
188
+ \ Material: Polyester- Style: Individual, FashionPackage includes:1 Socks'\n 'PRODUCT\
189
+ \ FEATURES' 'DURABLE AND LONG LASTING'\n 'Superior quality fabrics makes them\
190
+ \ long lasting and durable. They won’t rip, tear or shred and they’ll maintain\
191
+ \ their outstanding look and feel through machine washing.'\n 'IDEAL FOR OUTDOOR\
192
+ \ SPORTS'\n 'Trekking, walking, running, camping, mountaineering, climbing, skiing,\
193
+ \ snowboarding, backpacking, traveling, various athletic pursuits or daily wear'\n\
194
+ \ 'COZY AND COMFORTABLE'\n 'Breathable materials give your feet the comfort they\
195
+ \ deserve. Keep your feet warm, cool and dry all day long. Luxury materials won’t\
196
+ \ absorb sweat and feel great on your feet.']"
197
+ pipeline_tag: sentence-similarity
198
+ library_name: sentence-transformers
199
+ ---
200
+
201
+ # SentenceTransformer based on FacebookAI/roberta-base
202
+
203
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [FacebookAI/roberta-base](https://huggingface.co/FacebookAI/roberta-base). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
204
+
205
+ ## Model Details
206
+
207
+ ### Model Description
208
+ - **Model Type:** Sentence Transformer
209
+ - **Base model:** [FacebookAI/roberta-base](https://huggingface.co/FacebookAI/roberta-base) <!-- at revision e2da8e2f811d1448a5b465c236feacd80ffbac7b -->
210
+ - **Maximum Sequence Length:** 128 tokens
211
+ - **Output Dimensionality:** 768 dimensions
212
+ - **Similarity Function:** Cosine Similarity
213
+ <!-- - **Training Dataset:** Unknown -->
214
+ <!-- - **Language:** Unknown -->
215
+ <!-- - **License:** Unknown -->
216
+
217
+ ### Model Sources
218
+
219
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
220
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
221
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
222
+
223
+ ### Full Model Architecture
224
+
225
+ ```
226
+ SentenceTransformer(
227
+ (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: RobertaModel
228
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
229
+ )
230
+ ```
231
+
232
+ ## Usage
233
+
234
+ ### Direct Usage (Sentence Transformers)
235
+
236
+ First install the Sentence Transformers library:
237
+
238
+ ```bash
239
+ pip install -U sentence-transformers
240
+ ```
241
+
242
+ Then you can load this model and run inference.
243
+ ```python
244
+ from sentence_transformers import SentenceTransformer
245
+
246
+ # Download from the 🤗 Hub
247
+ model = SentenceTransformer("knguyennguyen/fashion_5k")
248
+ # Run inference
249
+ sentences = [
250
+ "I'm looking for a charming ring that embodies innocence and purity, suitable for daily wear. It should have a minimalistic design and be stackable with other rings. Durability is key, and I prefer it to be available in a unique metallic finish.",
251
+ "Title: CozzySayido Daisy Flower Ring Bands for Woman Innocent Daisy Promise Dainty Delicate Design Minimalistic Stackable Available in Silver and Rose Gold Descripion: ['“Always have something beautiful in sight, even if it’s just a daisy in a jelly glass.”'\n '- H. Jackson, Brown Jr.'\n 'Simple, sweet, stackable everyday ring for the sweetest ones.' 'Perks'\n 'Silver or rose gold Stainless steel No corrosion No peeling No spotting or staining No green fingers Resistant to perfume, sweat, and salt water Styling versatility (stack rings, minimalistic one-piece, knuckle, toe, stopper) Medical-grade stainless steel Unbreakable Unbendable Dainty, sweet, and delicate design 30-day full refund'\n 'Sizing'\n 'Available in size 3-10 (For sizing specs and how to know your size, please consult out sizing guide in the photo panel above.)'\n 'CozzySayido'\n 'We brainstorm for our customer to get the best value product. Proudly present, a daisy flower ring. We chose stainless-steel material which is no allergies for sensitive skin, durable and tough in any conditions, no matter if you wash your hand with alcohol sanitizer or washing tons of dishes or swimming in chlorine and salt water, it never gets spotting, staining or turn your finger green or any other color. Unbreakable and unbendable no matter, how you wear it.'\n 'Meaningful design, daisy flower is the symbol of innocence, purity, true love and new beginning to make every of your day the start of something new.'\n 'Designs as well as on-trend fashion jewelry for women with minimalist, dainty, sweet, and delicate style.']",
252
+ 'Title: Chelsea FC Official Soccer Gift Mens Graphic T-Shirt Navy XXL Descripion: [\'Official CFC mens T-shirt Large club crest & text print to front Garment Size (Chest): Sm. 40"; Med. 41"; Lge. 42"; XL 44"; XXL 48"; 3XL 52" 100% cotton, top quality T-shirt Many more gift ideas for him @ FootballShopOnline\']',
253
+ ]
254
+ embeddings = model.encode(sentences)
255
+ print(embeddings.shape)
256
+ # [3, 768]
257
+
258
+ # Get the similarity scores for the embeddings
259
+ similarities = model.similarity(embeddings, embeddings)
260
+ print(similarities.shape)
261
+ # [3, 3]
262
+ ```
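
The card lists cosine similarity as the similarity function, so the score matrix returned above can be read as pairwise cosine scores. The sketch below shows, under that assumption, how one query vector could be ranked against candidate vectors. It uses plain Python lists and made-up 3-dimensional vectors in place of real 768-dimensional embeddings, and the helper names `cosine_similarity` and `rank_candidates` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_candidates(query_vector, candidate_vectors):
    """Return candidate indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vector, c) for c in candidate_vectors]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy stand-ins for a query embedding and three product embeddings.
query = [1.0, 0.0, 1.0]
candidates = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]
ranking = rank_candidates(query, candidates)
print(ranking)
```

Cosine similarity ignores vector length, so the second candidate, which points in the same direction as the query, ranks first even though its magnitude differs.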
263
+
264
+ <!--
265
+ ### Direct Usage (Transformers)
266
+
267
+ <details><summary>Click to see the direct usage in Transformers</summary>
268
+
269
+ </details>
270
+ -->
271
+
272
+ <!--
273
+ ### Downstream Usage (Sentence Transformers)
274
+
275
+ You can finetune this model on your own dataset.
276
+
277
+ <details><summary>Click to expand</summary>
278
+
279
+ </details>
280
+ -->
281
+
282
+ <!--
283
+ ### Out-of-Scope Use
284
+
285
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
286
+ -->
287
+
288
+ <!--
289
+ ## Bias, Risks and Limitations
290
+
291
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
292
+ -->
293
+
294
+ <!--
295
+ ### Recommendations
296
+
297
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
298
+ -->
299
+
300
+ ## Training Details
301
+
302
+ ### Training Dataset
303
+
304
+ #### Unnamed Dataset
305
+
306
+
307
+ * Size: 4,693 training samples
308
+ * Columns: <code>sentence_0</code> and <code>sentence_1</code>
309
+ * Approximate statistics based on the first 1000 samples:
310
+ | | sentence_0 | sentence_1 |
311
+ |:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
312
+ | type | string | string |
313
+ | details | <ul><li>min: 26 tokens</li><li>mean: 44.9 tokens</li><li>max: 87 tokens</li></ul> | <ul><li>min: 21 tokens</li><li>mean: 106.64 tokens</li><li>max: 128 tokens</li></ul> |
314
+ * Samples:
315
+ | sentence_0 | sentence_1 |
316
+ |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
317
+ | <code>I'm looking for a spooky mask that can enhance my costume for a Halloween event. It should have a unique design, feature some lighting effects, and come with hair, suitable for adults.</code> | <code>Title: Scary Clown Mask for Penny It Cosplay Costume Halloween Led Light Up Joker Mask with Hair Latex Horror Adult Mask Party Props Descripion: ['This clown costume mask for adults is available in a standard size that fits most adults and teens and is perfect for Halloween, themed parties, haunted houses, and more. Does not include shoes or costume. Care for this 100% latex mask with attached synthetic polyester hair']</code> |
318
+ | <code>I'm looking for a festive accessory set to celebrate a special birthday. It should include a stylish decorative piece that can be adjusted for comfort and a fun headpiece that adds a touch of sparkle, perfect for both intimate gatherings and lively celebrations.</code> | <code>Title: Birthday Queen Sash & Rhinestone Headband Set - Silver Glitter Birthday Sash Birthday Gifts for Women Birthday Party Supplies Descripion: ['"Birthday Queen" sash & rhinestone headband set ↑ No need to keep looking...it\'s the ultimate birthday party gifts set! ✓ Silver glitter sash + black lettering looks great on Instagram. Make the birthday queen feel special and stand out from the crowd. ✓ It\'s party tested and approved...will last day into night! Not only perfect for the cozy birthday parties with family but also for the crazy night at Vegas. ✓ No size trouble + Comfortable wearing: Sash can be adjust by clip on to fit from all type body figure. Headband can sit comfortably on the head and the letters are large enough to be clearly identifiable We had so much fun designing this birthday gifts set, we hope they add just as much fun to your parties too. Get the sash and headband at the same time and be prepare for the birthday celebration!']</code> |
319
+ | <code>I'm looking for a cozy and stylish outerwear option for the colder months, ideally with a hood and a playful design. It should be warm and plush, perfect for layering, and have a comfortable fit.</code> | <code>Title: OutTop Sherpa Jacket Women Fall Winter Plush Warm Hooded Stripe Color Block Thicken Warm Fleece Coats Parka Outwear Descripion: ['Package Include:1 PC Coats'<br> '==========================================================================='<br> 'SIZE TABLE' ': International standard : 1 inch = 2.54 cm☺'<br> "Size:S____US:4____Bust:100cm/39.37''____Sleeve:56.5cm/22.24''____Length:88cm/34.65''"<br> "Size:M____US:6____Bust:105cm/41.34''____Sleeve:57cm/22.44''____Length:89cm/35.04''"<br> "Size:L____US:8____Bust:110cm/43.31''____Sleeve:57.5cm/22.64''____Length:90cm/35.43''"<br> "Size:XL____US:10____Bust:115cm/45.28''____Sleeve:58cm/22.83''____Length:91cm/35.83''"<br> "Size:XXL____US:12____Bust:120cm/47.24''____Sleeve:58.5cm/23.03''____Length:92cm/36.22''"<br> "Size:XXXL____US:14____Bust:125cm/49.21''____Sleeve:59cm/23.23''____Length:93cm/36.61''"<br> "Size:XXXXL____US:16____Bust:130cm/51.18''____Sleeve:59.5cm/23.43''____Length:94cm/37.01''"<br> "Size:XXXXXL____US:18____Bust:135cm/53.15''____Sleeve:60cm/23.62''____Length:95cm/37.40''"<br> '==========================================================================='<br> 'Any questions, please feel free to contact us.☺☺' 'Delivery:'<br> 'Standard express would take 7-20 days to deliver. Expedited express need 5-7 days.☺☺']</code> |
320
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
321
+ ```json
322
+ {
323
+ "scale": 20.0,
324
+ "similarity_fct": "cos_sim"
325
+ }
326
+ ```
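
To make the two parameters above concrete, here is a simplified pure-Python sketch of the idea behind this loss, assuming the standard in-batch-negatives formulation: each anchor is scored against every positive in the batch with cosine similarity, the scores are multiplied by `scale`, and cross-entropy is taken with the matching pair as the target. This is not the library's implementation, and the function names are hypothetical.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each anchor matches its own positive
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each anchor matches the other row's positive

loss_aligned = multiple_negatives_ranking_loss(anchors, aligned)
loss_swapped = multiple_negatives_ranking_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

With `scale` at 20.0, a well-aligned batch yields a loss near zero, while a batch whose pairs are mismatched is penalized heavily. This is why the other rows of a batch can serve as negatives without explicit negative labels.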
327
+
328
+ ### Training Hyperparameters
329
+ #### Non-Default Hyperparameters
330
+
331
+ - `per_device_train_batch_size`: 128
332
+ - `per_device_eval_batch_size`: 128
333
+ - `num_train_epochs`: 5
334
+ - `multi_dataset_batch_sampler`: round_robin
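
As a back-of-the-envelope check on these settings, the arithmetic below estimates the number of optimizer steps. It assumes a single device, with `gradient_accumulation_steps` of 1 and `dataloader_drop_last` set to False as listed in the full hyperparameter list. The variable names are illustrative.

```python
import math

num_samples = 4693   # training set size reported in this card
batch_size = 128     # per_device_train_batch_size
num_epochs = 5       # num_train_epochs

# The last, smaller batch is kept because dataloader_drop_last is False.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * num_epochs

# With in-batch negatives, every other pair in a full batch acts as a negative.
in_batch_negatives = batch_size - 1
print(steps_per_epoch, total_steps, in_batch_negatives)
```

Under these assumptions the run amounts to 37 steps per epoch and 185 steps in total, with up to 127 in-batch negatives per anchor in a full batch.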
335
+
336
+ #### All Hyperparameters
337
+ <details><summary>Click to expand</summary>
338
+
339
+ - `overwrite_output_dir`: False
340
+ - `do_predict`: False
341
+ - `eval_strategy`: no
342
+ - `prediction_loss_only`: True
343
+ - `per_device_train_batch_size`: 128
344
+ - `per_device_eval_batch_size`: 128
345
+ - `per_gpu_train_batch_size`: None
346
+ - `per_gpu_eval_batch_size`: None
347
+ - `gradient_accumulation_steps`: 1
348
+ - `eval_accumulation_steps`: None
349
+ - `torch_empty_cache_steps`: None
350
+ - `learning_rate`: 5e-05
351
+ - `weight_decay`: 0.0
352
+ - `adam_beta1`: 0.9
353
+ - `adam_beta2`: 0.999
354
+ - `adam_epsilon`: 1e-08
355
+ - `max_grad_norm`: 1
356
+ - `num_train_epochs`: 5
357
+ - `max_steps`: -1
358
+ - `lr_scheduler_type`: linear
359
+ - `lr_scheduler_kwargs`: {}
360
+ - `warmup_ratio`: 0.0
361
+ - `warmup_steps`: 0
362
+ - `log_level`: passive
363
+ - `log_level_replica`: warning
364
+ - `log_on_each_node`: True
365
+ - `logging_nan_inf_filter`: True
366
+ - `save_safetensors`: True
367
+ - `save_on_each_node`: False
368
+ - `save_only_model`: False
369
+ - `restore_callback_states_from_checkpoint`: False
370
+ - `no_cuda`: False
371
+ - `use_cpu`: False
372
+ - `use_mps_device`: False
373
+ - `seed`: 42
374
+ - `data_seed`: None
375
+ - `jit_mode_eval`: False
376
+ - `use_ipex`: False
377
+ - `bf16`: False
378
+ - `fp16`: False
379
+ - `fp16_opt_level`: O1
380
+ - `half_precision_backend`: auto
381
+ - `bf16_full_eval`: False
382
+ - `fp16_full_eval`: False
383
+ - `tf32`: None
384
+ - `local_rank`: 0
385
+ - `ddp_backend`: None
386
+ - `tpu_num_cores`: None
387
+ - `tpu_metrics_debug`: False
388
+ - `debug`: []
389
+ - `dataloader_drop_last`: False
390
+ - `dataloader_num_workers`: 0
391
+ - `dataloader_prefetch_factor`: None
392
+ - `past_index`: -1
393
+ - `disable_tqdm`: False
394
+ - `remove_unused_columns`: True
395
+ - `label_names`: None
396
+ - `load_best_model_at_end`: False
397
+ - `ignore_data_skip`: False
398
+ - `fsdp`: []
399
+ - `fsdp_min_num_params`: 0
400
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
401
+ - `fsdp_transformer_layer_cls_to_wrap`: None
402
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
403
+ - `deepspeed`: None
404
+ - `label_smoothing_factor`: 0.0
405
+ - `optim`: adamw_torch
406
+ - `optim_args`: None
407
+ - `adafactor`: False
408
+ - `group_by_length`: False
409
+ - `length_column_name`: length
410
+ - `ddp_find_unused_parameters`: None
411
+ - `ddp_bucket_cap_mb`: None
412
+ - `ddp_broadcast_buffers`: False
413
+ - `dataloader_pin_memory`: True
414
+ - `dataloader_persistent_workers`: False
415
+ - `skip_memory_metrics`: True
416
+ - `use_legacy_prediction_loop`: False
417
+ - `push_to_hub`: False
418
+ - `resume_from_checkpoint`: None
419
+ - `hub_model_id`: None
420
+ - `hub_strategy`: every_save
421
+ - `hub_private_repo`: False
422
+ - `hub_always_push`: False
423
+ - `gradient_checkpointing`: False
424
+ - `gradient_checkpointing_kwargs`: None
425
+ - `include_inputs_for_metrics`: False
426
+ - `eval_do_concat_batches`: True
427
+ - `fp16_backend`: auto
428
+ - `push_to_hub_model_id`: None
429
+ - `push_to_hub_organization`: None
430
+ - `mp_parameters`:
431
+ - `auto_find_batch_size`: False
432
+ - `full_determinism`: False
433
+ - `torchdynamo`: None
434
+ - `ray_scope`: last
435
+ - `ddp_timeout`: 1800
436
+ - `torch_compile`: False
437
+ - `torch_compile_backend`: None
438
+ - `torch_compile_mode`: None
439
+ - `dispatch_batches`: None
440
+ - `split_batches`: None
441
+ - `include_tokens_per_second`: False
442
+ - `include_num_input_tokens_seen`: False
443
+ - `neftune_noise_alpha`: None
444
+ - `optim_target_modules`: None
445
+ - `batch_eval_metrics`: False
446
+ - `eval_on_start`: False
447
+ - `use_liger_kernel`: False
448
+ - `eval_use_gather_object`: False
449
+ - `batch_sampler`: batch_sampler
450
+ - `multi_dataset_batch_sampler`: round_robin
451
+
452
+ </details>
453
+
454
+ ### Framework Versions
455
+ - Python: 3.11.11
456
+ - Sentence Transformers: 3.1.1
457
+ - Transformers: 4.45.2
458
+ - PyTorch: 2.5.1+cu121
459
+ - Accelerate: 1.2.1
460
+ - Datasets: 3.2.0
461
+ - Tokenizers: 0.20.3
462
+
463
+ ## Citation
464
+
465
+ ### BibTeX
466
+
467
+ #### Sentence Transformers
468
+ ```bibtex
469
+ @inproceedings{reimers-2019-sentence-bert,
470
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
471
+ author = "Reimers, Nils and Gurevych, Iryna",
472
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
473
+ month = "11",
474
+ year = "2019",
475
+ publisher = "Association for Computational Linguistics",
476
+ url = "https://arxiv.org/abs/1908.10084",
477
+ }
478
+ ```
479
+
480
+ #### MultipleNegativesRankingLoss
481
+ ```bibtex
482
+ @misc{henderson2017efficient,
483
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
484
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
485
+ year={2017},
486
+ eprint={1705.00652},
487
+ archivePrefix={arXiv},
488
+ primaryClass={cs.CL}
489
+ }
490
+ ```
491
+
492
+ <!--
493
+ ## Glossary
494
+
495
+ *Clearly define terms in order to be accessible across audiences.*
496
+ -->
497
+
498
+ <!--
499
+ ## Model Card Authors
500
+
501
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
502
+ -->
503
+
504
+ <!--
505
+ ## Model Card Contact
506
+
507
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
508
+ -->
config.json ADDED
@@ -0,0 +1,27 @@
1
+ {
2
+ "_name_or_path": "roberta-base",
3
+ "architectures": [
4
+ "RobertaModel"
5
+ ],
6
+ "attention_probs_dropout_prob": 0.1,
7
+ "bos_token_id": 0,
8
+ "classifier_dropout": null,
9
+ "eos_token_id": 2,
10
+ "hidden_act": "gelu",
11
+ "hidden_dropout_prob": 0.1,
12
+ "hidden_size": 768,
13
+ "initializer_range": 0.02,
14
+ "intermediate_size": 3072,
15
+ "layer_norm_eps": 1e-05,
16
+ "max_position_embeddings": 514,
17
+ "model_type": "roberta",
18
+ "num_attention_heads": 12,
19
+ "num_hidden_layers": 12,
20
+ "pad_token_id": 1,
21
+ "position_embedding_type": "absolute",
22
+ "torch_dtype": "float32",
23
+ "transformers_version": "4.45.2",
24
+ "type_vocab_size": 1,
25
+ "use_cache": true,
26
+ "vocab_size": 50265
27
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "3.1.1",
4
+ "transformers": "4.45.2",
5
+ "pytorch": "2.5.1+cu121"
6
+ },
7
+ "prompts": {},
8
+ "default_prompt_name": null,
9
+ "similarity_fn_name": "cosine"
10
+ }
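
Reviewer note: `similarity_fn_name` is set to `cosine`, so embeddings from this model are meant to be compared by the cosine of the angle between them. The following is a minimal pure-Python sketch of that measure, not the library's implementation; the function name `cosine_similarity` and the zero-vector fallback are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumption for this sketch: treat a zero vector as having similarity 0.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


query = [0.2, 0.1, 0.7]
product = [0.4, 0.2, 1.4]  # same direction as query, twice the length
score = cosine_similarity(query, product)
```

Because cosine similarity ignores vector length, `score` here is 1 up to floating-point rounding even though the two vectors differ in magnitude.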
merges.txt ADDED
The diff for this file is too large to render. See raw diff
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b7ab3e65e7878528ad21caa89369f453fecb97ba8c66f43dff54f924a26b5722
3
+ size 498604904
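
Reviewer note: the three lines above are a Git LFS pointer, not the weights themselves. As a rough sanity check, the `size` field can be divided by 4 bytes per value, since `config.json` declares `torch_dtype` as `float32`. The sketch below does this; the helper name `parse_lfs_pointer` is hypothetical, and the result is only an upper bound because the safetensors file also contains a JSON header.

```python
def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:b7ab3e65e7878528ad21caa89369f453fecb97ba8c66f43dff54f924a26b5722
size 498604904
"""

fields = parse_lfs_pointer(pointer)
size_bytes = int(fields["size"])

# float32 uses 4 bytes per parameter; header bytes are ignored in this estimate.
approx_params = size_bytes // 4
```

The estimate comes out at roughly 124.7 million values, which is in the range expected for a RoBERTa-base encoder.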
modules.json ADDED
@@ -0,0 +1,14 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ }
14
+ ]
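
Reviewer note: `modules.json` chains a `Transformer` module into a `Pooling` module, and the pooling config in this commit enables `pooling_mode_mean_tokens`. A minimal pure-Python sketch of masked mean pooling follows; it illustrates the idea only and is not the library's code. The function name `mean_pool` and the toy 2-dimensional vectors are assumptions (the real model uses 768 dimensions).

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention mask entry is 1."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                totals[i] += value
    # Assumption for this sketch: an all-padding input yields a zero vector.
    if count == 0:
        return totals
    return [total / count for total in totals]


# Two real tokens followed by one padding position.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
```

The padding vector is excluded by the mask, so only the first two token vectors contribute to the sentence embedding.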
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 128,
3
+ "do_lower_case": false
4
+ }
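
Reviewer note: `max_seq_length` is 128, so longer inputs are cut before encoding. The sketch below shows one way to picture the effect, assuming the limit counts the two special tokens (`<s>` with id 0 and `</s>` with id 2, per `config.json`). The function name and the placeholder token ids are hypothetical, and the real tokenizer may differ in details.

```python
def truncate_with_special_tokens(token_ids, max_seq_length=128, bos_id=0, eos_id=2):
    """Keep at most max_seq_length ids, including the <s> and </s> markers."""
    budget = max_seq_length - 2  # reserve room for the two special tokens
    return [bos_id] + list(token_ids[:budget]) + [eos_id]


long_input = list(range(1000, 1500))  # 500 placeholder token ids
encoded = truncate_with_special_tokens(long_input)
```

Under this assumption, only the first 126 content tokens of a long product description would influence the embedding.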
special_tokens_map.json ADDED
@@ -0,0 +1,15 @@
1
+ {
2
+ "bos_token": "<s>",
3
+ "cls_token": "<s>",
4
+ "eos_token": "</s>",
5
+ "mask_token": {
6
+ "content": "<mask>",
7
+ "lstrip": true,
8
+ "normalized": false,
9
+ "rstrip": false,
10
+ "single_word": false
11
+ },
12
+ "pad_token": "<pad>",
13
+ "sep_token": "</s>",
14
+ "unk_token": "<unk>"
15
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
1
+ {
2
+ "add_prefix_space": false,
3
+ "added_tokens_decoder": {
4
+ "0": {
5
+ "content": "<s>",
6
+ "lstrip": false,
7
+ "normalized": true,
8
+ "rstrip": false,
9
+ "single_word": false,
10
+ "special": true
11
+ },
12
+ "1": {
13
+ "content": "<pad>",
14
+ "lstrip": false,
15
+ "normalized": true,
16
+ "rstrip": false,
17
+ "single_word": false,
18
+ "special": true
19
+ },
20
+ "2": {
21
+ "content": "</s>",
22
+ "lstrip": false,
23
+ "normalized": true,
24
+ "rstrip": false,
25
+ "single_word": false,
26
+ "special": true
27
+ },
28
+ "3": {
29
+ "content": "<unk>",
30
+ "lstrip": false,
31
+ "normalized": true,
32
+ "rstrip": false,
33
+ "single_word": false,
34
+ "special": true
35
+ },
36
+ "50264": {
37
+ "content": "<mask>",
38
+ "lstrip": true,
39
+ "normalized": false,
40
+ "rstrip": false,
41
+ "single_word": false,
42
+ "special": true
43
+ }
44
+ },
45
+ "bos_token": "<s>",
46
+ "clean_up_tokenization_spaces": false,
47
+ "cls_token": "<s>",
48
+ "eos_token": "</s>",
49
+ "errors": "replace",
50
+ "mask_token": "<mask>",
51
+ "model_max_length": 128,
52
+ "pad_token": "<pad>",
53
+ "sep_token": "</s>",
54
+ "tokenizer_class": "RobertaTokenizer",
55
+ "trim_offsets": true,
56
+ "unk_token": "<unk>"
57
+ }
vocab.json ADDED
The diff for this file is too large to render. See raw diff