Marco127 committed · verified
Commit 5459d52 · 1 parent: d87179a

Add new SentenceTransformer model
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
    "word_embedding_dimension": 768,
    "pooling_mode_cls_token": true,
    "pooling_mode_mean_tokens": false,
    "pooling_mode_max_tokens": false,
    "pooling_mode_mean_sqrt_len_tokens": false,
    "pooling_mode_weightedmean_tokens": false,
    "pooling_mode_lasttoken": false,
    "include_prompt": true
}
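For readers unfamiliar with these flags: `pooling_mode_cls_token: true` means the sentence embedding is simply the hidden state of the first ([CLS]) token, rather than a mean or max over all token states. A minimal numpy sketch of that pooling step, using a random placeholder array in place of real transformer output (only the shapes are meaningful):

```python
import numpy as np

# Placeholder transformer output for one sentence: (num_tokens, hidden_size).
# The contents are random; a real model would produce contextual states here.
token_embeddings = np.random.default_rng(0).normal(size=(12, 768))

# pooling_mode_cls_token=true: take the first token's hidden state as the
# sentence embedding; no averaging over tokens is performed.
sentence_embedding = token_embeddings[0]

print(sentence_embedding.shape)  # (768,)
```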
README.md ADDED
@@ -0,0 +1,569 @@
---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:672
- loss:ContrastiveLoss
base_model: sentence-transformers/multi-qa-mpnet-base-dot-v1
widget:
- source_sentence: 'Animals may not be allowed onto beds or other furniture, which
    serves for guests. It is not permitted to use baths, showers or washbasins for
    bathing or washing animals.'
  sentences:
  - 'Please advise of any special needs such as high-chairs and sleeping cots.'
  - 'Animals may not be allowed onto beds or other furniture, which serves for
    guests. It is not permitted to use baths, showers or washbasins for bathing
    or washing animals.'
  - 'It is strongly advised that you arrange adequate insurance cover such as
    cancellation due to illness, accident or injury, personal accident and
    personal liability, loss of or damage to baggage and sport equipment (Note
    that is not an exhaustive list). We will not be responsible or liable if you
    fail to take adequate insurance cover or none at all.'
- source_sentence: 'Owners are responsible for ensuring that animals are kept
    quiet between the hours of 10:00 pm and 06:00 am. In the case of failure to
    abide by this regulation the guest may be asked to leave the hotel without a
    refund of the price of the night''s accommodation.'
  sentences:
  - 'Visitors are not allowed in the rooms and must be entertained in the lounges
    and/or other public areas provided.'
  - 'To ensure the safety and comfort of everyone in the hotel, the Management
    reserves the right to terminate the accommodation of guests who fail to
    comply with the following rules and regulations.'
  - 'Owners are responsible for ensuring that animals are kept quiet between the
    hours of 10:00 pm and 06:00 am. In the case of failure to abide by this
    regulation the guest may be asked to leave the hotel without a refund of the
    price of the night''s accommodation.'
- source_sentence: 'We ask all guests to behave in such a way that they do not
    disturb other guests and the neighborhood. The hotel staff is authorized to
    refuse services to a person who violates this rule.'
  sentences:
  - 'Please take note of the limitation specified for the room you have booked.
    If such number is exceeded, whether temporarily or over-night, we reserve the
    right to do one or more of the following: cancel your booking; retain all the
    monies you''ve paid; request you to vacate your room(s) forthwith, charge a
    higher rate for the room or recover all monies due.'
  - 'We ask all guests to behave in such a way that they do not disturb other
    guests and the neighborhood. The hotel staff is authorized to refuse services
    to a person who violates this rule.'
  - 'We will only deal with your information as indicated in the booking/reservation
    and we will only process your personal information (both terms as defined in
    the Protection of Personal Information Act, act 4 of 2013 [''the POPIA''] and
    the European Union General Data Protection Regulation – (''GDPR'') and any
    Special Personal Information (as defined in the GDPR & POPIA), which
    processing includes amongst others the ''collecting, storing and
    dissemination'' of your personal information (as defined in GDPR & POPIA).'
- source_sentence: 'All articles stored in the luggage storage room are received
    at the owner’s own risk.'
  sentences:
  - 'Unregistered visitors are not permitted to enter guest rooms or other areas
    of the hotel. An additional fee for unregistered guests will be charged to
    the account of the guest(s) registered to the room.'
  - 'Please advise us if you anticipate arriving late as bookings will be
    cancelled by 17:00 on the day of arrival, unless we have been so notified.'
  - 'All articles stored in the luggage storage room are received at the owner’s
    own risk.'
- source_sentence: 'In the event of a disturbance, one polite request (warning)
    will be given to reduce the noise. If our request is not followed, the guest
    will be asked to leave the hotel without refund and may be charged Guest
    Compensation Disturbance Fee.'
  sentences:
  - 'Without limiting the generality of the aforementioned, it applies to
    pay-to-view TV programmes or videos, as well as telephone calls or any other
    expenses of a similar nature that is made from your room, you will be deemed
    to be the contracting party.'
  - 'Pets are not allowed in the restaurant during breakfast time (7:00 – 10:30)
    for hygienic reasons due to the breakfast’s buffet style. An exception is the
    case when the hotel terrace is open, as pets can be taken to the terrace
    through the hotel''s main entrance and they can stay there during breakfast.'
  - 'In the event of a disturbance, one polite request (warning) will be given to
    reduce the noise. If our request is not followed, the guest will be asked to
    leave the hotel without refund and may be charged Guest Compensation
    Disturbance Fee.'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- dot_accuracy
- dot_accuracy_threshold
- dot_f1
- dot_f1_threshold
- dot_precision
- dot_recall
- dot_ap
- dot_mcc
model-index:
- name: SentenceTransformer based on sentence-transformers/multi-qa-mpnet-base-dot-v1
  results:
  - task:
      type: binary-classification
      name: Binary Classification
    dataset:
      name: Unknown
      type: unknown
    metrics:
    - type: dot_accuracy
      value: 0.6745562130177515
      name: Dot Accuracy
    - type: dot_accuracy_threshold
      value: 49.0201301574707
      name: Dot Accuracy Threshold
    - type: dot_f1
      value: 0.4932735426008969
      name: Dot F1
    - type: dot_f1_threshold
      value: 35.02415466308594
      name: Dot F1 Threshold
    - type: dot_precision
      value: 0.32934131736526945
      name: Dot Precision
    - type: dot_recall
      value: 0.9821428571428571
      name: Dot Recall
    - type: dot_ap
      value: 0.3294144882113245
      name: Dot AP
    - type: dot_mcc
      value: -0.03920743101752848
      name: Dot MCC
---

# SentenceTransformer based on sentence-transformers/multi-qa-mpnet-base-dot-v1

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/multi-qa-mpnet-base-dot-v1](https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-dot-v1). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/multi-qa-mpnet-base-dot-v1](https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-dot-v1) <!-- at revision 4633e80e17ea975bc090c97b049da26062b054d3 -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Dot Product
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Marco127/Argu_T3")
# Run inference
sentences = [
    ' In the event of a disturbance, one polite request (warning) will\nbe given to reduce the noise. If our request is not followed, the guest will be asked to leave\nthe hotel without refund and may be charged Guest Compensation Disturbance Fee.',
    ' In the event of a disturbance, one polite request (warning) will\nbe given to reduce the noise. If our request is not followed, the guest will be asked to leave\nthe hotel without refund and may be charged Guest Compensation Disturbance Fee.',
    '\nWithout limiting the generality of the aforementioned, it applies to pay-to-view TV programmes or videos, as\nwell as telephone calls or any other expenses of a similar nature that is made from your room, you will be\ndeemed to be the contracting party.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
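Because this model's `similarity_fn_name` is `dot`, the `model.similarity(...)` call above reduces to a matrix of pairwise dot products. A library-free sketch of that computation, with random arrays standing in for the real `(3, 768)` embeddings:

```python
import numpy as np

# Random placeholders for the (3, 768) array returned by model.encode();
# real embeddings would come from the model, the shapes are what matter.
rng = np.random.default_rng(42)
embeddings = rng.normal(size=(3, 768)).astype(np.float32)

# Dot-product similarity: entry (i, j) is embeddings[i] . embeddings[j],
# i.e. the full pairwise matrix is a single matmul.
similarities = embeddings @ embeddings.T
print(similarities.shape)  # (3, 3)
```

Note that, unlike cosine similarity, dot-product scores are unbounded and scale with embedding norms, which is why the thresholds reported below are large values like 49.02 rather than numbers in [-1, 1].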

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Binary Classification

* Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)

| Metric                 | Value      |
|:-----------------------|:-----------|
| dot_accuracy           | 0.6746     |
| dot_accuracy_threshold | 49.0201    |
| dot_f1                 | 0.4933     |
| dot_f1_threshold       | 35.0242    |
| dot_precision          | 0.3293     |
| dot_recall             | 0.9821     |
| **dot_ap**             | **0.3294** |
| dot_mcc                | -0.0392    |
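All of these metrics come from thresholding the raw dot-product scores: the evaluator sweeps candidate thresholds and reports, e.g., the F1 achieved at `dot_f1_threshold`. A toy illustration of that mechanism with made-up scores and labels (none of these numbers are from the actual evaluation), showing how precision, recall, and F1 fall out of a single threshold:

```python
import numpy as np

# Hypothetical dot-product scores and gold labels (1 = similar pair).
scores = np.array([40.0, 36.2, 30.1, 50.3, 34.9, 37.8])
labels = np.array([1, 1, 0, 0, 1, 0])
threshold = 35.0  # illustrative, in the spirit of dot_f1_threshold

# Predict "similar" whenever the score clears the threshold.
pred = (scores >= threshold).astype(int)

tp = int(np.sum((pred == 1) & (labels == 1)))
fp = int(np.sum((pred == 1) & (labels == 0)))
fn = int(np.sum((pred == 0) & (labels == 1)))

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.5 0.667 0.571
```

The reported combination of very high recall (0.98) with low precision (0.33) indicates the chosen F1 threshold labels almost everything as positive, consistent with the near-zero MCC.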

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 672 training samples
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
* Approximate statistics based on the first 672 samples:
  |         | sentence1                                                                           | sentence2                                                                           | label                                           |
  |:--------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------|
  | type    | string                                                                              | string                                                                              | int                                             |
  | details | <ul><li>min: 11 tokens</li><li>mean: 48.63 tokens</li><li>max: 156 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 48.63 tokens</li><li>max: 156 tokens</li></ul> | <ul><li>0: ~66.67%</li><li>1: ~33.33%</li></ul> |
* Samples:
  | sentence1 | sentence2 | label |
  |:----------|:----------|:------|
  | <code><br>The pets can not be left without supervision if there is a risk of causing any<br>damage or might disturb other guests.</code> | <code><br>The pets can not be left without supervision if there is a risk of causing any<br>damage or might disturb other guests.</code> | <code>0</code> |
  | <code><br>Any guest in violation of these rules may be asked to leave the hotel with no refund. Extra copies of these<br>rules are available at the Front Desk upon request.</code> | <code><br>Any guest in violation of these rules may be asked to leave the hotel with no refund. Extra copies of these<br>rules are available at the Front Desk upon request.</code> | <code>0</code> |
  | <code><br>Consuming the products from the minibar involves additional costs. You can find the<br>prices in the kitchen area.</code> | <code><br>Consuming the products from the minibar involves additional costs. You can find the<br>prices in the kitchen area.</code> | <code>0</code> |
* Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
  ```json
  {
      "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
      "margin": 0.5,
      "size_average": true
  }
  ```
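For reference, the contrastive objective (Hadsell et al., 2006, cited below) penalizes similar pairs by their squared distance and dissimilar pairs only while they remain inside the margin. A scalar sketch of the per-pair term under the parameters above (constant scaling factors omitted; `distance` here would be the cosine distance between the two embeddings):

```python
def contrastive_term(distance: float, label: int, margin: float = 0.5) -> float:
    """Per-pair contrastive loss term, constant scaling omitted.

    label 1 (similar pair)    -> distance ** 2
    label 0 (dissimilar pair) -> max(0, margin - distance) ** 2
    """
    if label == 1:
        return distance ** 2
    return max(0.0, margin - distance) ** 2

print(contrastive_term(0.25, 1))  # 0.0625: similar pair, penalized for any separation
print(contrastive_term(0.25, 0))  # 0.0625: dissimilar pair still inside the margin
print(contrastive_term(0.75, 0))  # 0.0: dissimilar pair already beyond the margin
```

So dissimilar pairs stop contributing gradient once their distance exceeds `margin = 0.5`, while similar pairs are always pulled together.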

### Evaluation Dataset

#### Unnamed Dataset

* Size: 169 evaluation samples
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
* Approximate statistics based on the first 169 samples:
  |         | sentence1                                                                           | sentence2                                                                           | label                                           |
  |:--------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------|
  | type    | string                                                                              | string                                                                              | int                                             |
  | details | <ul><li>min: 13 tokens</li><li>mean: 46.01 tokens</li><li>max: 156 tokens</li></ul> | <ul><li>min: 13 tokens</li><li>mean: 46.01 tokens</li><li>max: 156 tokens</li></ul> | <ul><li>0: ~66.86%</li><li>1: ~33.14%</li></ul> |
* Samples:
  | sentence1 | sentence2 | label |
  |:----------|:----------|:------|
  | <code><br>I understand and accept that the BON Hotels Group collects the personal information ("personal<br>information") of all persons in my party for purposes of loyalty programmes and special offers. I, on behalf of<br>all in my party, expressly consent and grant permission to the BON Hotels Group to: -<br>collect, collate, process, study and use the personal information; and<br>communicate directly with me/us from time to time, unless I have stated to the contrary below.</code> | <code><br>I understand and accept that the BON Hotels Group collects the personal information ("personal<br>information") of all persons in my party for purposes of loyalty programmes and special offers. I, on behalf of<br>all in my party, expressly consent and grant permission to the BON Hotels Group to: -<br>collect, collate, process, study and use the personal information; and<br>communicate directly with me/us from time to time, unless I have stated to the contrary below.</code> | <code>0</code> |
  | <code>However, in lieu of the above, any such goods will only be kept by us for 6 (six) months. At the end of which<br>period, we reserve the right in our sole discretion to dispose thereof and you will have no right of recourse<br>against us.</code> | <code>However, in lieu of the above, any such goods will only be kept by us for 6 (six) months. At the end of which<br>period, we reserve the right in our sole discretion to dispose thereof and you will have no right of recourse<br>against us.</code> | <code>0</code> |
  | <code> In cases where the hotel<br>suffers damage (either physical, or moral) due to the guests’ violation of the above rules, it<br>may charge a compensation fee in proportion to the damage. Moral damage may be for<br>example disturbing other guests, thus ruining the reputation of the hotel.</code> | <code> In cases where the hotel<br>suffers damage (either physical, or moral) due to the guests’ violation of the above rules, it<br>may charge a compensation fee in proportion to the damage. Moral damage may be for<br>example disturbing other guests, thus ruining the reputation of the hotel.</code> | <code>0</code> |
* Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
  ```json
  {
      "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
      "margin": 0.5,
      "size_average": true
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `learning_rate`: 1e-05
- `num_train_epochs`: 2
- `warmup_ratio`: 0.1
- `fp16`: True
- `batch_sampler`: no_duplicates

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 1e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 2
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
| Epoch | Step | dot_ap |
|:-----:|:----:|:------:|
| -1    | -1   | 0.3294 |


### Framework Versions
- Python: 3.11.11
- Sentence Transformers: 3.4.1
- Transformers: 4.48.3
- PyTorch: 2.5.1+cu124
- Accelerate: 1.3.0
- Datasets: 3.2.0
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### ContrastiveLoss
```bibtex
@inproceedings{hadsell2006dimensionality,
    author={Hadsell, R. and Chopra, S. and LeCun, Y.},
    booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
    title={Dimensionality Reduction by Learning an Invariant Mapping},
    year={2006},
    volume={2},
    number={},
    pages={1735-1742},
    doi={10.1109/CVPR.2006.100}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,24 @@
{
  "_name_or_path": "sentence-transformers/multi-qa-mpnet-base-dot-v1",
  "architectures": [
    "MPNetModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "mpnet",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "relative_attention_num_buckets": 32,
  "torch_dtype": "float32",
  "transformers_version": "4.48.3",
  "vocab_size": 30527
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
{
  "__version__": {
    "sentence_transformers": "3.4.1",
    "transformers": "4.48.3",
    "pytorch": "2.5.1+cu124"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": "dot"
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:46d8061cc6b0f1e79d939b97790c41c637b6141ed3a398eeb8f5d841d78a93b0
size 437967672
modules.json ADDED
@@ -0,0 +1,14 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  }
]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 512,
  "do_lower_case": false
}
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "cls_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "<mask>",
    "lstrip": true,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<pad>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,73 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<pad>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "104": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "30526": {
      "content": "<mask>",
      "lstrip": true,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<s>",
  "clean_up_tokenization_spaces": false,
  "cls_token": "<s>",
  "do_lower_case": true,
  "eos_token": "</s>",
  "extra_special_tokens": {},
  "mask_token": "<mask>",
  "max_length": 250,
  "model_max_length": 512,
  "pad_to_multiple_of": null,
  "pad_token": "<pad>",
  "pad_token_type_id": 0,
  "padding_side": "right",
  "sep_token": "</s>",
  "stride": 0,
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "MPNetTokenizer",
  "truncation_side": "right",
  "truncation_strategy": "longest_first",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff