YxBxRyXJx committed on
Commit
b4e2770
·
verified ·
1 Parent(s): a37ac5c

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
README.md ADDED
@@ -0,0 +1,814 @@
1
+ ---
2
+ language:
3
+ - en
4
+ license: apache-2.0
5
+ tags:
6
+ - sentence-transformers
7
+ - sentence-similarity
8
+ - feature-extraction
9
+ - generated_from_trainer
10
+ - dataset_size:183
11
+ - loss:MatryoshkaLoss
12
+ - loss:MultipleNegativesRankingLoss
13
+ base_model: BAAI/bge-base-en-v1.5
14
+ widget:
15
+ - source_sentence: '14 William O. Douglas, quoted in Charles Hurd, Film Booking Issue
16
+ Ordered Reopened,” New York Times, May 4, 1948, 1. 15 Movie Crisis Laid to Video
17
+ Inroads And Dwindling of Foreign Market, New York Times, February 27, 1949, F1.
18
+ For details on the lawsuit and its effects, see Arthur De Vany and Henry McMillan,
19
+ Was the Antitrust Action that Broke Up the Movie Studios Good for the Movies?
20
+ Evidence from the Stock Market. American Law and Economics Review 6, no. 1 (2004):
21
+ 135-53; and J.C. Strick, The Economics of the Motion Picture Industry: A Survey,
22
+ Philosophy of the Social Sciences 8, no. 4 (December 1978): 406-17. 16 The Hollywood
23
+ feature films for which Eisler provided music are Hangmen Also Die (1942), None
24
+ But the Lonely Heart (1944), Jealousy (1945), The Spanish Main (1945); A Scandal
25
+ in Paris (1946), Deadline at Dawn (1946), Woman on the Beach (1947), and So Well
26
+ Remembered (1947). Most of these are middle-of-the-road genre pieces, but the
27
+ first NOTES 267'
28
+ sentences:
29
+ - What is the opinion of Ernest Irving, a pioneer of British film music, on the
30
+ overall quality of American film music?
31
+ - What is the title of the 2007 film directed by David Fincher, produced by Michael
32
+ Medavoy, and featuring a storyline based on a real-life serial killer, as mentioned
33
+ in the provided context information?
34
+ - What was the primary reason behind the lawsuit that led to the breakup of the
35
+ movie studios, as suggested by the article in the New York Times on February 27,
36
+ 1949?
37
+ - source_sentence: 'But Gorbman (who like Flinn and Kalinak approached film music
38
+ from a formal background not in musicology but in literary criticism) was certainly
39
+ not the first scholar engaged in so-called film studies44 to address the role
40
+ that extra-diegetic music played in classical-style films. Two years before Gorbman''s
41
+ book was published, the trio of Bordwell, Staiger, and Thompson brought out their
42
+ monumental The Classical Hollywood Cinema: Film Style and Production to 1960.
43
+ As noted above, and apropos of its title, the book focuses on filmic narrative
44
+ style and the technical devices that made this style possible. In its early pages,
45
+ however, it also contains insightful comments on classical cinema''s use of music.
46
+ The book''s first music-related passage lays a foundation for Gorbman''s point
47
+ about how a score might lend unity to a film by recycling distinctive themes that
48
+ within the THE GOLDEN AGE OF FILM MUSIC, 1933-49 143'
49
+ sentences:
50
+ - What is the possible reason, as suggested by David Thomson, for why David Lean's
51
+ filmmaking style may have declined after the movie "Summer Madness" (US, 1955)?
52
+ - What shift in the portrayal of hard body male characters in film, as exemplified
53
+ by the actors who played these roles in the 1980s and 1990s, suggests that societal
54
+ expectations and norms may be changing?
55
+ - What is the significance of the authors' formal background in literary criticism
56
+ rather than musicology, as mentioned in the context of Gorbman's approach to film
57
+ music?
58
+ - source_sentence: (1931); Georg Wilhelm Pabst's Kameradschaft (1931); Fritz Lang's
59
+ M (1931) and Das Testament der Dr. Mabuse (1932); and Carl Theodor Dreyer's Vampyr
60
+ (1932). These films’ subtle mix of actual silence with accompanying music and
61
+ more or less realistic sound effects has drawn and doubtless will continue to
62
+ draw serious analytical attention from film scholars.45 And even in their own
63
+ time they drew due attention aplenty from critics of avant-garde persuasion.46
64
+ The mere fact that these films differed from the sonic norm attracted the notice,
65
+ if not always the praise, of movie reviewers for the popular press. Writing from
66
+ London, a special correspondent for the New York Times observed that Hitchcock's
67
+ Blackmail goes some way to showing how the cinematograph and the microphone can
68
+ be mated without their union being forced upon the attention of a punctilious
69
+ world as VITAPHONE AND MOVIETONE, 1926-8 101
70
+ sentences:
71
+ - What was the primary limitation that led to the failure of Edison's first Kinetophone,
72
+ which was an early attempt at sound film featuring musical accompaniment?
73
+ - What was the specific sonic approach employed by the mentioned films of Georg
74
+ Wilhelm Pabst, Fritz Lang, and Carl Theodor Dreyer that drew serious analytical
75
+ attention from film scholars?
76
+ - What limitation in Martin Scorsese's background, as mentioned in the text, restricted
77
+ his choice of subjects at this stage in his career?
78
+ - source_sentence: "39\tdivided into small, three-dimensional cubes known as volumetric\
79
+ \ pixels, or voxels. When viewers are watching certain images, the voxel demonstrates\
80
+ \ how these images in the movie are mapped into brain activity. Clips of the movie\
81
+ \ are reconstructed through brain imaging and computer stimulation by associating\
82
+ \ visual patterns in the movie with the corresponding brain activity. However,\
83
+ \ these reconstructions are blurry and are hard to make because researchers say,\
84
+ \ blood flow signals measured using fMRI change much more slowly than the neural\
85
+ \ signals that encode dynamic information in movies. Psychology and neuroscience\
86
+ \ professor, Jack Gallant explains in an interview that primary visual cortex\
87
+ \ responds to the local features of the movie such as edges, colors, motion, and\
88
+ \ texture but this part of the brain cannot understand the objects in the movie.\
89
+ \ In addition, movies that show people are reconstructed with better accuracy\
90
+ \ than abstract images. Using Neuroimaging For Entertainment Success Can brain\
91
+ \ scans predict movie success in the box office? Two marketing researchers from\
92
+ \ the Rotterdam School of Management devised an experiment by using EEG on participants.\
93
+ \ EEG demonstrated that individual choice and box office success correlate with\
94
+ \ different types of brain activity. From article, How Neuroimaging Can Save The\
95
+ \ Entertainment Industry Millions of Dollars, it states, individual choice is\
96
+ \ predicted best by high frontocentral beta activity, the choice of the general\
97
+ \ population is predicted by frontal gamma activity. Perhaps, with quickly advanced\
98
+ \ technology, predicting movie genre and plots that can hit the box office could\
99
+ \ be successful. Neurocinema in Hollywood One strategy that helps filmmakers,\
100
+ \ producers, and distributors to achieve global market success is by using fMRI\
101
+ \ and EEG to make a better storyline, characters, sound effects, and other"
102
+ sentences:
103
+ - What significant change in the portrayal of Rocky's character is evident in the
104
+ 2015 movie Creed, as compared to the original 1976 film Rocky?
105
+ - What factors led to the selection of the films "Spider-man" (2002), "Cars" (2006),
106
+ and "Avatar" (2009) for the research project examining the relationship between
107
+ film and society in the early 2000s?
108
+ - What is the main reason why researchers find it challenging to reconstruct abstract
109
+ images from movie clips using brain imaging and computer stimulation?
110
+ - source_sentence: "11\tdocumentary film so unpleasant when most had sat through horror\
111
+ \ pictures that were appreciably more violent and bloody. The answer that McCauley\
112
+ \ came up with was that the fictional nature of horror films affords viewers a\
113
+ \ sense of control by placing psychological distance between them and the violent\
114
+ \ acts they have witnessed. Most people who view horror movies understand that\
115
+ \ the filmed events are unreal, which furnishes them with psychological distance\
116
+ \ from the horror portrayed in the film. In fact, there is evidence that young\
117
+ \ viewers who perceive greater realism in horror films are more negatively affected\
118
+ \ by their exposure to horror films than viewers who perceive the film as unreal\
119
+ \ (Hoekstra, Harris, & Helmick, 1999). Four Viewing Motivations for Graphic Horror\
120
+ \ According to Dr. Deirdre Johnston (1995) study Adolescents’ Motivations for\
121
+ \ Viewing Graphic Horror of Human Communication Research there are four different\
122
+ \ main reasons for viewing graphic horror. From the study of a small sample of\
123
+ \ 220 American adolescents who like watching horror movies, Dr. Johnston reported\
124
+ \ that: The four viewing motivations are found to be related to viewers’ cognitive\
125
+ \ and affective responses to horror films, as well as viewers’ tendency to identify\
126
+ \ with either the killers or victims in these films.\" Dr. Johnson notes that:\
127
+ \ 1) gore watchers typically had low empathy, high sensation seeking, and (among\
128
+ \ males only) a strong identification with the killer, 2) thrill watchers typically\
129
+ \ had both high empathy and sensation seeking, identified themselves more with\
130
+ \ the victims, and liked the suspense of the film, 3) independent watchers typically\
131
+ \ had a high empathy for the victim along with a high positive effect for overcoming\
132
+ \ fear, and 4) problem watchers typically had high empathy for the victim but\
133
+ \ were"
134
+ sentences:
135
+ - What was the name of the series published by Oliver Ditson from 1918-25 that contained
136
+ ensemble music for motion picture plays?
137
+ - What shift in the cultural, political, and social contexts of the 1980s and 1990s
138
+ may have led to the deconstruction of the hard body characters portrayed by actors
139
+ such as Stallone and Schwarzenegger in more recent movies?
140
+ - What is the primary reason why viewers who perceive greater realism in horror
141
+ films are more negatively affected by their exposure to horror films than viewers
142
+ who perceive the film as unreal?
143
+ datasets:
144
+ - YxBxRyXJx/QAsimple_for_BGE_241019
145
+ pipeline_tag: sentence-similarity
146
+ library_name: sentence-transformers
147
+ metrics:
148
+ - cosine_accuracy@1
149
+ - cosine_accuracy@3
150
+ - cosine_accuracy@5
151
+ - cosine_accuracy@10
152
+ - cosine_precision@1
153
+ - cosine_precision@3
154
+ - cosine_precision@5
155
+ - cosine_precision@10
156
+ - cosine_recall@1
157
+ - cosine_recall@3
158
+ - cosine_recall@5
159
+ - cosine_recall@10
160
+ - cosine_ndcg@10
161
+ - cosine_mrr@10
162
+ - cosine_map@100
163
+ model-index:
164
+ - name: BGE base Movie Matryoshka
165
+ results:
166
+ - task:
167
+ type: information-retrieval
168
+ name: Information Retrieval
169
+ dataset:
170
+ name: dim 768
171
+ type: dim_768
172
+ metrics:
173
+ - type: cosine_accuracy@1
174
+ value: 0.8205128205128205
175
+ name: Cosine Accuracy@1
176
+ - type: cosine_accuracy@3
177
+ value: 0.9743589743589743
178
+ name: Cosine Accuracy@3
179
+ - type: cosine_accuracy@5
180
+ value: 1.0
181
+ name: Cosine Accuracy@5
182
+ - type: cosine_accuracy@10
183
+ value: 1.0
184
+ name: Cosine Accuracy@10
185
+ - type: cosine_precision@1
186
+ value: 0.8205128205128205
187
+ name: Cosine Precision@1
188
+ - type: cosine_precision@3
189
+ value: 0.32478632478632485
190
+ name: Cosine Precision@3
191
+ - type: cosine_precision@5
192
+ value: 0.20000000000000004
193
+ name: Cosine Precision@5
194
+ - type: cosine_precision@10
195
+ value: 0.10000000000000002
196
+ name: Cosine Precision@10
197
+ - type: cosine_recall@1
198
+ value: 0.8205128205128205
199
+ name: Cosine Recall@1
200
+ - type: cosine_recall@3
201
+ value: 0.9743589743589743
202
+ name: Cosine Recall@3
203
+ - type: cosine_recall@5
204
+ value: 1.0
205
+ name: Cosine Recall@5
206
+ - type: cosine_recall@10
207
+ value: 1.0
208
+ name: Cosine Recall@10
209
+ - type: cosine_ndcg@10
210
+ value: 0.9207838928594967
211
+ name: Cosine Ndcg@10
212
+ - type: cosine_mrr@10
213
+ value: 0.8940170940170941
214
+ name: Cosine Mrr@10
215
+ - type: cosine_map@100
216
+ value: 0.8940170940170938
217
+ name: Cosine Map@100
218
+ - task:
219
+ type: information-retrieval
220
+ name: Information Retrieval
221
+ dataset:
222
+ name: dim 512
223
+ type: dim_512
224
+ metrics:
225
+ - type: cosine_accuracy@1
226
+ value: 0.8461538461538461
227
+ name: Cosine Accuracy@1
228
+ - type: cosine_accuracy@3
229
+ value: 0.9230769230769231
230
+ name: Cosine Accuracy@3
231
+ - type: cosine_accuracy@5
232
+ value: 1.0
233
+ name: Cosine Accuracy@5
234
+ - type: cosine_accuracy@10
235
+ value: 1.0
236
+ name: Cosine Accuracy@10
237
+ - type: cosine_precision@1
238
+ value: 0.8461538461538461
239
+ name: Cosine Precision@1
240
+ - type: cosine_precision@3
241
+ value: 0.30769230769230776
242
+ name: Cosine Precision@3
243
+ - type: cosine_precision@5
244
+ value: 0.20000000000000004
245
+ name: Cosine Precision@5
246
+ - type: cosine_precision@10
247
+ value: 0.10000000000000002
248
+ name: Cosine Precision@10
249
+ - type: cosine_recall@1
250
+ value: 0.8461538461538461
251
+ name: Cosine Recall@1
252
+ - type: cosine_recall@3
253
+ value: 0.9230769230769231
254
+ name: Cosine Recall@3
255
+ - type: cosine_recall@5
256
+ value: 1.0
257
+ name: Cosine Recall@5
258
+ - type: cosine_recall@10
259
+ value: 1.0
260
+ name: Cosine Recall@10
261
+ - type: cosine_ndcg@10
262
+ value: 0.9233350110390831
263
+ name: Cosine Ndcg@10
264
+ - type: cosine_mrr@10
265
+ value: 0.8982905982905982
266
+ name: Cosine Mrr@10
267
+ - type: cosine_map@100
268
+ value: 0.8982905982905982
269
+ name: Cosine Map@100
270
+ - task:
271
+ type: information-retrieval
272
+ name: Information Retrieval
273
+ dataset:
274
+ name: dim 256
275
+ type: dim_256
276
+ metrics:
277
+ - type: cosine_accuracy@1
278
+ value: 0.8461538461538461
279
+ name: Cosine Accuracy@1
280
+ - type: cosine_accuracy@3
281
+ value: 0.9230769230769231
282
+ name: Cosine Accuracy@3
283
+ - type: cosine_accuracy@5
284
+ value: 0.9487179487179487
285
+ name: Cosine Accuracy@5
286
+ - type: cosine_accuracy@10
287
+ value: 1.0
288
+ name: Cosine Accuracy@10
289
+ - type: cosine_precision@1
290
+ value: 0.8461538461538461
291
+ name: Cosine Precision@1
292
+ - type: cosine_precision@3
293
+ value: 0.30769230769230776
294
+ name: Cosine Precision@3
295
+ - type: cosine_precision@5
296
+ value: 0.18974358974358976
297
+ name: Cosine Precision@5
298
+ - type: cosine_precision@10
299
+ value: 0.10000000000000002
300
+ name: Cosine Precision@10
301
+ - type: cosine_recall@1
302
+ value: 0.8461538461538461
303
+ name: Cosine Recall@1
304
+ - type: cosine_recall@3
305
+ value: 0.9230769230769231
306
+ name: Cosine Recall@3
307
+ - type: cosine_recall@5
308
+ value: 0.9487179487179487
309
+ name: Cosine Recall@5
310
+ - type: cosine_recall@10
311
+ value: 1.0
312
+ name: Cosine Recall@10
313
+ - type: cosine_ndcg@10
314
+ value: 0.9234104189545929
315
+ name: Cosine Ndcg@10
316
+ - type: cosine_mrr@10
317
+ value: 0.898962148962149
318
+ name: Cosine Mrr@10
319
+ - type: cosine_map@100
320
+ value: 0.898962148962149
321
+ name: Cosine Map@100
322
+ - task:
323
+ type: information-retrieval
324
+ name: Information Retrieval
325
+ dataset:
326
+ name: dim 128
327
+ type: dim_128
328
+ metrics:
329
+ - type: cosine_accuracy@1
330
+ value: 0.7692307692307693
331
+ name: Cosine Accuracy@1
332
+ - type: cosine_accuracy@3
333
+ value: 0.8974358974358975
334
+ name: Cosine Accuracy@3
335
+ - type: cosine_accuracy@5
336
+ value: 0.9487179487179487
337
+ name: Cosine Accuracy@5
338
+ - type: cosine_accuracy@10
339
+ value: 0.9487179487179487
340
+ name: Cosine Accuracy@10
341
+ - type: cosine_precision@1
342
+ value: 0.7692307692307693
343
+ name: Cosine Precision@1
344
+ - type: cosine_precision@3
345
+ value: 0.29914529914529925
346
+ name: Cosine Precision@3
347
+ - type: cosine_precision@5
348
+ value: 0.18974358974358976
349
+ name: Cosine Precision@5
350
+ - type: cosine_precision@10
351
+ value: 0.09487179487179488
352
+ name: Cosine Precision@10
353
+ - type: cosine_recall@1
354
+ value: 0.7692307692307693
355
+ name: Cosine Recall@1
356
+ - type: cosine_recall@3
357
+ value: 0.8974358974358975
358
+ name: Cosine Recall@3
359
+ - type: cosine_recall@5
360
+ value: 0.9487179487179487
361
+ name: Cosine Recall@5
362
+ - type: cosine_recall@10
363
+ value: 0.9487179487179487
364
+ name: Cosine Recall@10
365
+ - type: cosine_ndcg@10
366
+ value: 0.8688480033444261
367
+ name: Cosine Ndcg@10
368
+ - type: cosine_mrr@10
369
+ value: 0.8418803418803418
370
+ name: Cosine Mrr@10
371
+ - type: cosine_map@100
372
+ value: 0.8443986568986569
373
+ name: Cosine Map@100
374
+ - task:
375
+ type: information-retrieval
376
+ name: Information Retrieval
377
+ dataset:
378
+ name: dim 64
379
+ type: dim_64
380
+ metrics:
381
+ - type: cosine_accuracy@1
382
+ value: 0.5641025641025641
383
+ name: Cosine Accuracy@1
384
+ - type: cosine_accuracy@3
385
+ value: 0.8717948717948718
386
+ name: Cosine Accuracy@3
387
+ - type: cosine_accuracy@5
388
+ value: 0.9230769230769231
389
+ name: Cosine Accuracy@5
390
+ - type: cosine_accuracy@10
391
+ value: 0.9487179487179487
392
+ name: Cosine Accuracy@10
393
+ - type: cosine_precision@1
394
+ value: 0.5641025641025641
395
+ name: Cosine Precision@1
396
+ - type: cosine_precision@3
397
+ value: 0.2905982905982907
398
+ name: Cosine Precision@3
399
+ - type: cosine_precision@5
400
+ value: 0.18461538461538465
401
+ name: Cosine Precision@5
402
+ - type: cosine_precision@10
403
+ value: 0.09487179487179488
404
+ name: Cosine Precision@10
405
+ - type: cosine_recall@1
406
+ value: 0.5641025641025641
407
+ name: Cosine Recall@1
408
+ - type: cosine_recall@3
409
+ value: 0.8717948717948718
410
+ name: Cosine Recall@3
411
+ - type: cosine_recall@5
412
+ value: 0.9230769230769231
413
+ name: Cosine Recall@5
414
+ - type: cosine_recall@10
415
+ value: 0.9487179487179487
416
+ name: Cosine Recall@10
417
+ - type: cosine_ndcg@10
418
+ value: 0.768187565996018
419
+ name: Cosine Ndcg@10
420
+ - type: cosine_mrr@10
421
+ value: 0.708119658119658
422
+ name: Cosine Mrr@10
423
+ - type: cosine_map@100
424
+ value: 0.7088711597999523
425
+ name: Cosine Map@100
426
+ ---
427
+
428
+ # BGE base Movie Matryoshka
429
+
430
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) on the [q_asimple_for_bge_241019](https://huggingface.co/datasets/YxBxRyXJx/QAsimple_for_BGE_241019) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
431
+
432
+ ## Model Details
433
+
434
+ ### Model Description
435
+ - **Model Type:** Sentence Transformer
436
+ - **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
437
+ - **Maximum Sequence Length:** 512 tokens
438
+ - **Output Dimensionality:** 768 dimensions
439
+ - **Similarity Function:** Cosine Similarity
440
+ - **Training Dataset:**
441
+ - [q_asimple_for_bge_241019](https://huggingface.co/datasets/YxBxRyXJx/QAsimple_for_BGE_241019)
442
+ - **Language:** en
443
+ - **License:** apache-2.0
444
+
445
+ ### Model Sources
446
+
447
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
448
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
449
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
450
+
451
+ ### Full Model Architecture
452
+
453
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
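The architecture above ends with CLS pooling followed by L2 normalization. The sketch below illustrates those two steps on toy data in plain Python. The helper names `cls_pool` and `l2_normalize` and the 4-dimensional vectors are assumptions made for illustration (the real model emits 768-dimensional token vectors), and this is not the library's own implementation.

```python
import math

def cls_pool(token_embeddings):
    """Use the first token's vector (the [CLS] position) as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy input: 3 tokens with 4-dimensional embeddings (hypothetical values).
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 2.0, 0.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity, which matches the similarity function listed in the model description.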
460
+
461
+ ## Usage
462
+
463
+ ### Direct Usage (Sentence Transformers)
464
+
465
+ First install the Sentence Transformers library:
466
+
467
+ ```bash
+ pip install -U sentence-transformers
+ ```
470
+
471
+ Then you can load this model and run inference.
472
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("YxBxRyXJx/bge-base-movie-matryoshka")
+ # Run inference
+ sentences = [
+     '11\tdocumentary film so unpleasant when most had sat through horror pictures that were appreciably more violent and bloody. The answer that McCauley came up with was that the fictional nature of horror films affords viewers a sense of control by placing psychological distance between them and the violent acts they have witnessed. Most people who view horror movies understand that the filmed events are unreal, which furnishes them with psychological distance from the horror portrayed in the film. In fact, there is evidence that young viewers who perceive greater realism in horror films are more negatively affected by their exposure to horror films than viewers who perceive the film as unreal (Hoekstra, Harris, & Helmick, 1999). Four Viewing Motivations for Graphic Horror According to Dr. Deirdre Johnston (1995) study Adolescents’ Motivations for Viewing Graphic Horror of Human Communication Research there are four different main reasons for viewing graphic horror. From the study of a small sample of 220 American adolescents who like watching horror movies, Dr. Johnston reported that: The four viewing motivations are found to be related to viewers’ cognitive and affective responses to horror films, as well as viewers’ tendency to identify with either the killers or victims in these films." Dr. Johnson notes that: 1) gore watchers typically had low empathy, high sensation seeking, and (among males only) a strong identification with the killer, 2) thrill watchers typically had both high empathy and sensation seeking, identified themselves more with the victims, and liked the suspense of the film, 3) independent watchers typically had a high empathy for the victim along with a high positive effect for overcoming fear, and 4) problem watchers typically had high empathy for the victim but were',
+     'What is the primary reason why viewers who perceive greater realism in horror films are more negatively affected by their exposure to horror films than viewers who perceive the film as unreal?',
+     'What shift in the cultural, political, and social contexts of the 1980s and 1990s may have led to the deconstruction of the hard body characters portrayed by actors such as Stallone and Schwarzenegger in more recent movies?',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
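Since the model was trained with `MatryoshkaLoss` over the dimensions 768, 512, 256, 128 and 64, its embeddings are meant to remain useful when only the leading dimensions are kept. The sketch below shows the truncate-then-renormalize idea in plain Python on made-up 8-dimensional vectors. The helper names and the numbers are assumptions for illustration, not output from this model. With the library itself, recent Sentence Transformers releases accept a `truncate_dim` argument when constructing `SentenceTransformer`, which should achieve the same effect.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale them to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

def dot(a, b):
    """Dot product; equals cosine similarity for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical full-size embeddings (8 dimensions standing in for 768).
full_a = [0.5, 0.5, 0.5, 0.5, 0.1, -0.1, 0.2, 0.0]
full_b = [0.4, 0.6, 0.5, 0.5, -0.3, 0.2, 0.0, 0.1]

similarity_by_dim = {}
for dim in (8, 4, 2):
    a = truncate_and_normalize(full_a, dim)
    b = truncate_and_normalize(full_b, dim)
    similarity_by_dim[dim] = dot(a, b)
    print(dim, round(similarity_by_dim[dim], 4))
```

Renormalizing after truncation matters because the leading slice of a unit vector is generally shorter than unit length, so an unnormalized dot product would no longer be a cosine similarity.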
492
+
493
+ <!--
494
+ ### Direct Usage (Transformers)
495
+
496
+ <details><summary>Click to see the direct usage in Transformers</summary>
497
+
498
+ </details>
499
+ -->
500
+
501
+ <!--
502
+ ### Downstream Usage (Sentence Transformers)
503
+
504
+ You can finetune this model on your own dataset.
505
+
506
+ <details><summary>Click to expand</summary>
507
+
508
+ </details>
509
+ -->
510
+
511
+ <!--
512
+ ### Out-of-Scope Use
513
+
514
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
515
+ -->
516
+
517
+ ## Evaluation
518
+
519
+ ### Metrics
520
+
521
+ #### Information Retrieval
522
+
523
+ * Datasets: `dim_768`, `dim_512`, `dim_256`, `dim_128` and `dim_64`
524
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
525
+
526
+ | Metric              | dim_768    | dim_512    | dim_256    | dim_128    | dim_64     |
+ |:--------------------|:-----------|:-----------|:-----------|:-----------|:-----------|
+ | cosine_accuracy@1   | 0.8205     | 0.8462     | 0.8462     | 0.7692     | 0.5641     |
+ | cosine_accuracy@3   | 0.9744     | 0.9231     | 0.9231     | 0.8974     | 0.8718     |
+ | cosine_accuracy@5   | 1.0        | 1.0        | 0.9487     | 0.9487     | 0.9231     |
+ | cosine_accuracy@10  | 1.0        | 1.0        | 1.0        | 0.9487     | 0.9487     |
+ | cosine_precision@1  | 0.8205     | 0.8462     | 0.8462     | 0.7692     | 0.5641     |
+ | cosine_precision@3  | 0.3248     | 0.3077     | 0.3077     | 0.2991     | 0.2906     |
+ | cosine_precision@5  | 0.2        | 0.2        | 0.1897     | 0.1897     | 0.1846     |
+ | cosine_precision@10 | 0.1        | 0.1        | 0.1        | 0.0949     | 0.0949     |
+ | cosine_recall@1     | 0.8205     | 0.8462     | 0.8462     | 0.7692     | 0.5641     |
+ | cosine_recall@3     | 0.9744     | 0.9231     | 0.9231     | 0.8974     | 0.8718     |
+ | cosine_recall@5     | 1.0        | 1.0        | 0.9487     | 0.9487     | 0.9231     |
+ | cosine_recall@10    | 1.0        | 1.0        | 1.0        | 0.9487     | 0.9487     |
+ | **cosine_ndcg@10**  | **0.9208** | **0.9233** | **0.9234** | **0.8688** | **0.7682** |
+ | cosine_mrr@10       | 0.894      | 0.8983     | 0.899      | 0.8419     | 0.7081     |
+ | cosine_map@100      | 0.894      | 0.8983     | 0.899      | 0.8444     | 0.7089     |
543
+
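In the table above, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k, which is what one would expect if every query has exactly one relevant passage. The sketch below works through that relationship. The rank list is hypothetical: it assumes 39 evaluation queries and a rank distribution chosen so that the results reproduce the `dim_768` column to four decimals. The actual per-query ranks are not part of this commit.

```python
import math

def accuracy_at_k(ranks, k):
    """Fraction of queries whose single relevant passage appears in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

def precision_at_k(ranks, k):
    """With one relevant passage per query, at most one of the top k is a hit."""
    return accuracy_at_k(ranks, k) / k

def mrr_at_k(ranks, k):
    """Mean reciprocal rank, counting only hits within the top k."""
    return sum(1.0 / r for r in ranks if r <= k) / len(ranks)

def ndcg_at_k(ranks, k):
    """NDCG with one relevant passage: the ideal DCG is 1, so DCG is the score."""
    return sum(1.0 / math.log2(r + 1) for r in ranks if r <= k) / len(ranks)

# Hypothetical 1-based ranks of the relevant passage for 39 queries.
ranks = [1] * 32 + [2] * 4 + [3] * 2 + [5] * 1

print(f"accuracy@1  {accuracy_at_k(ranks, 1):.4f}")
print(f"accuracy@3  {accuracy_at_k(ranks, 3):.4f}")
print(f"precision@3 {precision_at_k(ranks, 3):.4f}")
print(f"mrr@10      {mrr_at_k(ranks, 10):.4f}")
print(f"ndcg@10     {ndcg_at_k(ranks, 10):.4f}")
```

With an evaluation set this small, a single query changing rank moves accuracy@1 by about 0.026, so differences between neighbouring dimensions (for example `dim_768` versus `dim_512`) should be read with caution.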
544
+ <!--
545
+ ## Bias, Risks and Limitations
546
+
547
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
548
+ -->
549
+
550
+ <!--
551
+ ### Recommendations
552
+
553
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
554
+ -->
555
+
556
+ ## Training Details
557
+
558
+ ### Training Dataset
559
+
560
+ #### q_asimple_for_bge_241019
561
+
562
+ * Dataset: [q_asimple_for_bge_241019](https://huggingface.co/datasets/YxBxRyXJx/QAsimple_for_BGE_241019) at [66635cd](https://huggingface.co/datasets/YxBxRyXJx/QAsimple_for_BGE_241019/tree/66635cde6ada74a8cf5a84db10518119fc1c221d)
563
+ * Size: 183 training samples
564
+ * Columns: <code>positive</code> and <code>anchor</code>
565
+ * Approximate statistics based on the first 183 samples:
566
+ |         | positive | anchor |
+ |:--------|:---------|:-------|
+ | type    | string   | string |
+ | details | <ul><li>min: 191 tokens</li><li>mean: 356.1 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 16 tokens</li><li>mean: 36.04 tokens</li><li>max: 66 tokens</li></ul> |
570
+ * Samples:
571
+ | positive | anchor |
+ |:---------|:-------|
+ | <code>1 Introduction Why do we watch horror films? What makes horror films so exciting to watch? Why do our bodies sweat and muscles tense when we are scared? How do filmmakers, producers, sound engineers, and cinematographers specifically design a horror film? Can horror movies cause negative, lasting effects on the audience? These are some of the questions that are answered by exploring the aesthetics of horror films and the psychology behind horror movies. Chapter 1, The Allure of Horror Film, illustrates why we are drawn to scary films by studying different psychological theories and factors. Ideas include: catharsis, subconscious mind, curiosity, thrill, escape from reality, relevance, unrealism, and imagination. Also, this chapter demonstrates why people would rather watch fiction films than documentaries and the motivations for viewing graphic horror. Chapter 2, Mise-en-scène in Horror Movies, includes purposeful arrangement of scenery and stage properties of horror movie. Also...</code> | <code>What is the name of the emerging field of scientists and filmmakers that uses fMRI and EEG to read people's brain activity while watching movie scenes?</code> |
+ | <code>3 Chapter 1: The Allure of Horror Film Overview Although watching horror films can make us feel anxious and uneasy, we still continue to watch other horror films one after another. It is ironic how we hate the feeling of being scared, but we still enjoy the thrill. So why do we pay money to watch something to be scared? Eight Theories on why we watch Horror Films From research by philosophers, psychoanalysts, and psychologists there are theories that can explain why we are drawn to watching horror films. The first theory, psychoanalyst, Sigmund Freud portrays that horror comes from the “uncanny” emergence of images and thoughts of the primitive id. The purpose of horror films is to highlight unconscious fears, desire, urges, and primeval archetypes that are buried deep in our collective subconscious images of mothers and shadows play important roles because they are common to us all. For example, in Alfred Hitchcock's Psycho, a mother plays the role of evil in the main character...</code> | <code>What process, introduced by the Greek Philosopher Aristotle, involves the release of negative emotions through the observation of violent or scary events, resulting in a purging of aggressive emotions?</code> |
+ | <code>5 principle unknowable (Jancovich, 2002, p. 35). This meaning, the audience already knows that the plot and the characters are already disgusting, but the surprises in the horror narrative through the discovery of curiosity should give satisfaction. Marvin Zuckerman (1979) proposed that people who scored high in sensation seeking scale often reported a greater interest in exciting things like rollercoasters, bungee jumping and horror films. He argued more individuals who are attracted to horror movies desire the sensation of experience. However, researchers did not find the correlation to thrill-seeking activities and enjoyment of watching horror films always significant. The Gender Socialization theory (1986) by Zillman, Weaver, Mundorf and Aust exposed 36 male and 36 female undergraduates to a horror movie with the same age, opposite-gender companion of low or high initial appeal who expressed mastery, affective indifference, or distress. They reported that young men enjoyed the fi...</code> | <code>What is the proposed theory by Marvin Zuckerman (1979) regarding the relationship between sensation seeking and interest in exciting activities, including horror films?</code> |
576
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
577
+ ```json
578
+ {
579
+ "loss": "MultipleNegativesRankingLoss",
580
+ "matryoshka_dims": [
581
+ 768,
582
+ 512,
583
+ 256,
584
+ 128,
585
+ 64
586
+ ],
587
+ "matryoshka_weights": [
588
+ 1,
589
+ 1,
590
+ 1,
591
+ 1,
592
+ 1
593
+ ],
594
+ "n_dims_per_step": -1
595
+ }
596
+ ```
597
+
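The loss configuration above can be read as a weighted sum of one ranking loss evaluated at several truncated embedding sizes. The following is a minimal pure-Python sketch of that idea, not the library's implementation: the helper names, the toy 4-dimensional vectors, and the similarity `scale` of 20.0 are assumptions made for illustration, and the toy `dims` of `[4, 2]` stand in for the card's `[768, 512, 256, 128, 64]`.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def truncate(vec, dim):
    """Keep the first `dim` components, then re-normalize."""
    return l2_normalize(vec[:dim])


def mnr_loss(anchors, positives, scale=20.0):
    """In-batch-negatives cross-entropy over scaled cosine similarities.

    positives[i] is the match for anchors[i]; every other positive in the
    batch acts as a negative. `scale` is an assumed value for this sketch.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * sum(x * y for x, y in zip(anchor, p)) for p in positives]
        top = max(logits)
        log_z = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_z - logits[i]
    return total / len(anchors)


def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss at each truncated dimensionality."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        a = [truncate(v, dim) for v in anchors]
        p = [truncate(v, dim) for v in positives]
        total += weight * mnr_loss(a, p)
    return total


# Toy batch: two anchors with perfectly aligned positives.
toy_anchors = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
aligned_loss = matryoshka_loss(toy_anchors, toy_anchors, dims=[4, 2], weights=[1, 1])

# Same batch with the positives swapped, so every anchor prefers the wrong one.
swapped = [toy_anchors[1], toy_anchors[0]]
swapped_loss = matryoshka_loss(toy_anchors, swapped, dims=[4, 2], weights=[1, 1])
```

With equal weights, as in the configuration above, each truncation contributes equally, so the leading dimensions are pushed to remain useful on their own.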
598
+ ### Training Hyperparameters
599
+ #### Non-Default Hyperparameters
600
+
601
+ - `eval_strategy`: epoch
602
+ - `per_device_train_batch_size`: 32
603
+ - `per_device_eval_batch_size`: 16
604
+ - `gradient_accumulation_steps`: 16
605
+ - `learning_rate`: 2e-05
606
+ - `num_train_epochs`: 5
607
+ - `lr_scheduler_type`: cosine
608
+ - `warmup_ratio`: 0.1
609
+ - `bf16`: True
610
+ - `tf32`: True
611
+ - `load_best_model_at_end`: True
612
+ - `optim`: adamw_torch_fused
613
+ - `batch_sampler`: no_duplicates
614
+
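Two of the settings above interact in ways that are easy to misread: gradient accumulation multiplies the per-device batch size, and `warmup_ratio` combined with the `cosine` scheduler determines the learning rate at each optimizer step. The sketch below works through both under stated assumptions: the function name and the 100-step horizon are hypothetical, the device count is not given in this card, and the schedule is a common linear-warmup plus half-cosine formulation rather than a copy of the trainer's code.

```python
import math

PER_DEVICE_TRAIN_BATCH_SIZE = 32
GRADIENT_ACCUMULATION_STEPS = 16

# Samples contributing to one optimizer step on a single device.
# Multiply by the number of devices (unknown here) for the global figure.
effective_batch_per_device = PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS


def cosine_lr_with_warmup(step, total_steps, base_lr=2e-05, warmup_ratio=0.1):
    """Linear warmup to `base_lr`, then half-cosine decay towards zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical 100-step run, chosen only to make the curve easy to inspect.
schedule = [cosine_lr_with_warmup(s, total_steps=100) for s in range(101)]
```

In this sketch the rate peaks at the end of warmup (step 10 of 100) and falls to roughly half the base rate midway through the decay phase.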
615
+ #### All Hyperparameters
616
+ <details><summary>Click to expand</summary>
617
+
618
+ - `overwrite_output_dir`: False
619
+ - `do_predict`: False
620
+ - `eval_strategy`: epoch
621
+ - `prediction_loss_only`: True
622
+ - `per_device_train_batch_size`: 32
623
+ - `per_device_eval_batch_size`: 16
624
+ - `per_gpu_train_batch_size`: None
625
+ - `per_gpu_eval_batch_size`: None
626
+ - `gradient_accumulation_steps`: 16
627
+ - `eval_accumulation_steps`: None
628
+ - `torch_empty_cache_steps`: None
629
+ - `learning_rate`: 2e-05
630
+ - `weight_decay`: 0.0
631
+ - `adam_beta1`: 0.9
632
+ - `adam_beta2`: 0.999
633
+ - `adam_epsilon`: 1e-08
634
+ - `max_grad_norm`: 1.0
635
+ - `num_train_epochs`: 5
636
+ - `max_steps`: -1
637
+ - `lr_scheduler_type`: cosine
638
+ - `lr_scheduler_kwargs`: {}
639
+ - `warmup_ratio`: 0.1
640
+ - `warmup_steps`: 0
641
+ - `log_level`: passive
642
+ - `log_level_replica`: warning
643
+ - `log_on_each_node`: True
644
+ - `logging_nan_inf_filter`: True
645
+ - `save_safetensors`: True
646
+ - `save_on_each_node`: False
647
+ - `save_only_model`: False
648
+ - `restore_callback_states_from_checkpoint`: False
649
+ - `no_cuda`: False
650
+ - `use_cpu`: False
651
+ - `use_mps_device`: False
652
+ - `seed`: 42
653
+ - `data_seed`: None
654
+ - `jit_mode_eval`: False
655
+ - `use_ipex`: False
656
+ - `bf16`: True
657
+ - `fp16`: False
658
+ - `fp16_opt_level`: O1
659
+ - `half_precision_backend`: auto
660
+ - `bf16_full_eval`: False
661
+ - `fp16_full_eval`: False
662
+ - `tf32`: True
663
+ - `local_rank`: 0
664
+ - `ddp_backend`: None
665
+ - `tpu_num_cores`: None
666
+ - `tpu_metrics_debug`: False
667
+ - `debug`: []
668
+ - `dataloader_drop_last`: False
669
+ - `dataloader_num_workers`: 0
670
+ - `dataloader_prefetch_factor`: None
671
+ - `past_index`: -1
672
+ - `disable_tqdm`: False
673
+ - `remove_unused_columns`: True
674
+ - `label_names`: None
675
+ - `load_best_model_at_end`: True
676
+ - `ignore_data_skip`: False
677
+ - `fsdp`: []
678
+ - `fsdp_min_num_params`: 0
679
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
680
+ - `fsdp_transformer_layer_cls_to_wrap`: None
681
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
682
+ - `deepspeed`: None
683
+ - `label_smoothing_factor`: 0.0
684
+ - `optim`: adamw_torch_fused
685
+ - `optim_args`: None
686
+ - `adafactor`: False
687
+ - `group_by_length`: False
688
+ - `length_column_name`: length
689
+ - `ddp_find_unused_parameters`: None
690
+ - `ddp_bucket_cap_mb`: None
691
+ - `ddp_broadcast_buffers`: False
692
+ - `dataloader_pin_memory`: True
693
+ - `dataloader_persistent_workers`: False
694
+ - `skip_memory_metrics`: True
695
+ - `use_legacy_prediction_loop`: False
696
+ - `push_to_hub`: False
697
+ - `resume_from_checkpoint`: None
698
+ - `hub_model_id`: None
699
+ - `hub_strategy`: every_save
700
+ - `hub_private_repo`: False
701
+ - `hub_always_push`: False
702
+ - `gradient_checkpointing`: False
703
+ - `gradient_checkpointing_kwargs`: None
704
+ - `include_inputs_for_metrics`: False
705
+ - `include_for_metrics`: []
706
+ - `eval_do_concat_batches`: True
707
+ - `fp16_backend`: auto
708
+ - `push_to_hub_model_id`: None
709
+ - `push_to_hub_organization`: None
710
+ - `mp_parameters`:
711
+ - `auto_find_batch_size`: False
712
+ - `full_determinism`: False
713
+ - `torchdynamo`: None
714
+ - `ray_scope`: last
715
+ - `ddp_timeout`: 1800
716
+ - `torch_compile`: False
717
+ - `torch_compile_backend`: None
718
+ - `torch_compile_mode`: None
719
+ - `dispatch_batches`: None
720
+ - `split_batches`: None
721
+ - `include_tokens_per_second`: False
722
+ - `include_num_input_tokens_seen`: False
723
+ - `neftune_noise_alpha`: None
724
+ - `optim_target_modules`: None
725
+ - `batch_eval_metrics`: False
726
+ - `eval_on_start`: False
727
+ - `use_liger_kernel`: False
728
+ - `eval_use_gather_object`: False
729
+ - `average_tokens_across_devices`: False
730
+ - `prompts`: None
731
+ - `batch_sampler`: no_duplicates
732
+ - `multi_dataset_batch_sampler`: proportional
733
+
734
+ </details>
735
+
736
+ ### Training Logs
737
+ | Epoch | Step | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
738
+ |:-------:|:-----:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
739
+ | 1.0 | 1 | 0.8987 | 0.8983 | 0.8835 | 0.8419 | 0.7773 |
740
+ | 2.0 | 2 | 0.9218 | 0.9141 | 0.9075 | 0.8721 | 0.8124 |
741
+ | 1.0 | 1 | 0.9218 | 0.9141 | 0.9075 | 0.8721 | 0.8124 |
742
+ | 2.0 | 2 | 0.9356 | 0.9302 | 0.9118 | 0.8750 | 0.8057 |
743
+ | **3.0** | **4** | **0.9302** | **0.9233** | **0.9234** | **0.8783** | **0.7759** |
744
+ | 4.0 | 5 | 0.9208 | 0.9233 | 0.9234 | 0.8688 | 0.7682 |
745
+
746
+ * The bold row denotes the saved checkpoint.
747
+
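The columns in the table above report NDCG@10 for each embedding size. As a reminder of what that metric measures, here is a small pure-Python sketch for a single query. It assumes the relevance list covers the full ranking returned for that query, so the ideal ordering can be derived by sorting it; the function names are illustrative and this is not the evaluator's code.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k positions (rank 1 first)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, k=10):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# One relevant passage per query, as in a question-to-passage retrieval setup.
hit_at_rank_1 = ndcg_at_k([1, 0, 0])        # relevant passage ranked first
hit_at_rank_2 = ndcg_at_k([0, 1, 0])        # 1 / log2(3), about 0.631
miss_in_top_10 = ndcg_at_k([0] * 10 + [1])  # relevant passage at rank 11
```

A corpus-level score is the mean of these per-query values, which is why the score drops when truncating to fewer dimensions pushes correct passages further down the ranking.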
748
+ ### Framework Versions
749
+ - Python: 3.10.12
750
+ - Sentence Transformers: 3.3.1
751
+ - Transformers: 4.46.3
752
+ - PyTorch: 2.5.1+cu121
753
+ - Accelerate: 1.1.1
754
+ - Datasets: 3.1.0
755
+ - Tokenizers: 0.20.3
756
+
757
+ ## Citation
758
+
759
+ ### BibTeX
760
+
761
+ #### Sentence Transformers
762
+ ```bibtex
763
+ @inproceedings{reimers-2019-sentence-bert,
764
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
765
+ author = "Reimers, Nils and Gurevych, Iryna",
766
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
767
+ month = "11",
768
+ year = "2019",
769
+ publisher = "Association for Computational Linguistics",
770
+ url = "https://arxiv.org/abs/1908.10084",
771
+ }
772
+ ```
773
+
774
+ #### MatryoshkaLoss
775
+ ```bibtex
776
+ @misc{kusupati2024matryoshka,
777
+ title={Matryoshka Representation Learning},
778
+ author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
779
+ year={2024},
780
+ eprint={2205.13147},
781
+ archivePrefix={arXiv},
782
+ primaryClass={cs.LG}
783
+ }
784
+ ```
785
+
786
+ #### MultipleNegativesRankingLoss
787
+ ```bibtex
788
+ @misc{henderson2017efficient,
789
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
790
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
791
+ year={2017},
792
+ eprint={1705.00652},
793
+ archivePrefix={arXiv},
794
+ primaryClass={cs.CL}
795
+ }
796
+ ```
797
+
798
+ <!--
799
+ ## Glossary
800
+
801
+ *Clearly define terms in order to be accessible across audiences.*
802
+ -->
803
+
804
+ <!--
805
+ ## Model Card Authors
806
+
807
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
808
+ -->
809
+
810
+ <!--
811
+ ## Model Card Contact
812
+
813
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
814
+ -->
config.json ADDED
@@ -0,0 +1,32 @@
1
+ {
2
+ "_name_or_path": "BAAI/bge-base-en-v1.5",
3
+ "architectures": [
4
+ "BertModel"
5
+ ],
6
+ "attention_probs_dropout_prob": 0.1,
7
+ "classifier_dropout": null,
8
+ "gradient_checkpointing": false,
9
+ "hidden_act": "gelu",
10
+ "hidden_dropout_prob": 0.1,
11
+ "hidden_size": 768,
12
+ "id2label": {
13
+ "0": "LABEL_0"
14
+ },
15
+ "initializer_range": 0.02,
16
+ "intermediate_size": 3072,
17
+ "label2id": {
18
+ "LABEL_0": 0
19
+ },
20
+ "layer_norm_eps": 1e-12,
21
+ "max_position_embeddings": 512,
22
+ "model_type": "bert",
23
+ "num_attention_heads": 12,
24
+ "num_hidden_layers": 12,
25
+ "pad_token_id": 0,
26
+ "position_embedding_type": "absolute",
27
+ "torch_dtype": "float32",
28
+ "transformers_version": "4.46.3",
29
+ "type_vocab_size": 2,
30
+ "use_cache": true,
31
+ "vocab_size": 30522
32
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "3.3.1",
4
+ "transformers": "4.46.3",
5
+ "pytorch": "2.5.1+cu121"
6
+ },
7
+ "prompts": {},
8
+ "default_prompt_name": null,
9
+ "similarity_fn_name": "cosine"
10
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e287f2de7361be61126ea0543b523b9a6a6f5fbe5cbd1bbc994f745794a46b5f
3
+ size 437951328
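The three lines above are a Git LFS pointer rather than the weights themselves. A minimal sketch of reading such a pointer follows; the parser name is hypothetical, and the parameter estimate assumes 4 bytes per value (the `float32` dtype reported in `config.json`). Because the file also carries a header, the estimate is only an upper bound.

```python
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e287f2de7361be61126ea0543b523b9a6a6f5fbe5cbd1bbc994f745794a46b5f
size 437951328
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a pointer file into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = parse_lfs_pointer(POINTER_TEXT)
size_bytes = int(pointer["size"])
hash_algorithm, _, digest = pointer["oid"].partition(":")

# Upper bound on the parameter count, assuming 4-byte float32 values.
max_float32_params = size_bytes // 4
```

The bound comes out at roughly 109 million values, which is in the range expected for a BERT-base sized encoder.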
modules.json ADDED
@@ -0,0 +1,20 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Normalize",
18
+ "type": "sentence_transformers.models.Normalize"
19
+ }
20
+ ]
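`modules.json` chains three stages: a Transformer encoder, a Pooling layer (configured for CLS-token pooling in `1_Pooling/config.json`), and a Normalize layer. The sketch below illustrates only the last two stages on made-up token vectors. The function names and the toy 2-dimensional values are assumptions for illustration; the real model produces 768-dimensional token embeddings.

```python
import math


def cls_pool(token_embeddings):
    """CLS pooling: use the first token's vector as the sentence embedding."""
    return list(token_embeddings[0])


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Dot product; equals cosine similarity for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))


# Toy encoder outputs: one row per token, with the [CLS] position first.
sentence_a_tokens = [[3.0, 4.0], [10.0, 10.0], [-1.0, 2.0]]
sentence_b_tokens = [[6.0, 8.0], [0.5, -0.5]]

embedding_a = l2_normalize(cls_pool(sentence_a_tokens))
embedding_b = l2_normalize(cls_pool(sentence_b_tokens))
similarity = dot(embedding_a, embedding_b)
```

Because the final stage normalizes every embedding, a plain dot product between two outputs gives their cosine similarity, which matches the `cosine` similarity function named in `config_sentence_transformers.json`.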
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 512,
3
+ "do_lower_case": true
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "100": {
12
+ "content": "[UNK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "101": {
20
+ "content": "[CLS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "102": {
28
+ "content": "[SEP]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "103": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ }
43
+ },
44
+ "clean_up_tokenization_spaces": true,
45
+ "cls_token": "[CLS]",
46
+ "do_basic_tokenize": true,
47
+ "do_lower_case": true,
48
+ "mask_token": "[MASK]",
49
+ "model_max_length": 512,
50
+ "never_split": null,
51
+ "pad_token": "[PAD]",
52
+ "sep_token": "[SEP]",
53
+ "strip_accents": null,
54
+ "tokenize_chinese_chars": true,
55
+ "tokenizer_class": "BertTokenizer",
56
+ "unk_token": "[UNK]"
57
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff