SentenceTransformer

This is a sentence-transformers model trained on the parquet dataset (about 27.7 million anchor/positive pairs; see Training Details). It maps sentences and paragraphs to a 512-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Model Size: 42.5M parameters (F32 safetensors)
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 512 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • parquet
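
The sequence length and embedding dimensionality can be confirmed once the model is loaded; a minimal check using the same checkpoint as in the Usage section below:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("pankajrajdeo/Bioformer-8L-UMLS-Pubmed-TCE-Epoch-1")
print(model.max_seq_length)                      # 512
print(model.get_sentence_embedding_dimension())  # 512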

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 512, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
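
The Pooling module above uses mean pooling (pooling_mode_mean_tokens: True): token embeddings from the BertModel are averaged over non-padding positions to give one 512-dimensional vector per input. The sketch below reproduces that step with the plain transformers API; it assumes the checkpoint also loads with AutoModel/AutoTokenizer, and the sentence-transformers API shown under Usage remains the recommended path.

import torch
from transformers import AutoTokenizer, AutoModel

model_name = "pankajrajdeo/Bioformer-8L-UMLS-Pubmed-TCE-Epoch-1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
bert = AutoModel.from_pretrained(model_name)

def mean_pooling(last_hidden_state, attention_mask):
    # Average token embeddings, ignoring padding positions.
    mask = attention_mask.unsqueeze(-1).expand(last_hidden_state.size()).float()
    return (last_hidden_state * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

encoded = tokenizer(["An example biomedical sentence."], padding=True,
                    truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    output = bert(**encoded)
embeddings = mean_pooling(output.last_hidden_state, encoded["attention_mask"])
print(embeddings.shape)  # torch.Size([1, 512])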

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("pankajrajdeo/Bioformer-8L-UMLS-Pubmed-TCE-Epoch-1")
# Run inference
sentences = [
    'CLINICAL NOTE: CEREBELLAR COGNITIVE AFFECTIVE SYNDROME IMPROVEMENT WITH SELECTIVE INHIBITOR OF SEROTONIN RECAPTATION.',
    'Cerebellar cognitive affective syndrome (CCAS) is characterized by alterations at the cognitive level (dysexecutive syndrome, visuospatial deficit, language...), associated with affective / emotional changes. Its pathophysiology is not well known and there is currently no specific treatment. We describe a 64-year-old man with a rare condition of cognitive-behavioral disorder after an infarction in the left middle cerebral artery, dominated by executive dysfunctions, predominantly oral apraxia, interrupted divided attention, disturbed visuospatial organization and affective abnormalities with great apathy, and whose symptoms improved with a selective serotonin reuptake inhibitor (SSRI). In absence of cerebellar structural damage, a perfusion brain single photon emission computed tomography using 99mTc- hexamethyl-propylene-aminoxime (SPECT-HMPAO) showed left frontotemporal and parietoccipital hypoperfusion of known vascular etiology, and hypoperfusion in the right cerebellar hemisphere compatible with the phenomenon of crossed diaschisis. We hypothesize that cognitive and affective deficits are aggravated by the functional disruption of the reciprocal cerebellar interconnections with areas of cerebral association and paralimbic cortex, altering the contribution of the cerebellum to cognitive and affective processing and modulation. In the case described, both the clinical situation and the functional control images improved after treatment with SSRI, which increases the possibility that there is connectivity of some serotonergic transmission projections between cerebellum and contralateral association cortices, and that said connectivity dysfunctional is involved in the pathophysiology of CCAS.',
    'Precision radiotherapy is a critical and indispensable cancer treatment means in the modern clinical workflow with the goal of achieving "quality-up and cost-down" in patient care. The challenge of this therapy lies in developing computerized clinical-assistant solutions with precision, automation, and reproducibility built-in to deliver it at scale. In this work, we provide a comprehensive yet ongoing, incomplete survey of and discussions on the recent progress of utilizing advanced deep learning, semantic organ parsing, multimodal imaging fusion, neural architecture search and medical image analytical techniques to address four corner-stone problems or sub-problems required by all precision radiotherapy workflows, namely, organs at risk (OARs) segmentation, gross tumor volume (GTV) segmentation, metastasized lymph node (LN) detection, and clinical tumor volume (CTV) segmentation. Without loss of generality, we mainly focus on using esophageal and head-and-neck cancers as examples, but the methods can be extrapolated to other types of cancers. High-precision, automated and highly reproducible OAR/GTV/LN/CTV auto-delineation techniques have demonstrated their effectiveness in reducing the inter-practitioner variabilities and the time cost to permit rapid treatment planning and adaptive replanning for the benefit of patients. Through the presentation of the achievements and limitations of these techniques in this review, we hope to encourage more collective multidisciplinary precision radiotherapy workflows to transpire.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 512]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
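
Beyond pairwise similarity, the embeddings can drive semantic search over a corpus. The following is a minimal sketch using sentence_transformers.util.semantic_search; the corpus and query strings are placeholders, not part of the training data.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("pankajrajdeo/Bioformer-8L-UMLS-Pubmed-TCE-Epoch-1")

# Placeholder corpus; replace with your own document collection.
corpus = [
    "Cerebellar cognitive affective syndrome improved after SSRI treatment.",
    "Deep learning for organ-at-risk segmentation in radiotherapy planning.",
    "Machine learning guides polymer design for nucleic acid delivery.",
]
query = "selective serotonin reuptake inhibitors and cerebellar syndromes"

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# For each query, returns a ranked list of {"corpus_id": ..., "score": ...} dicts.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(f"{hit['score']:.3f}  {corpus[hit['corpus_id']]}")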

Training Details

Training Dataset

parquet

  • Dataset: parquet
  • Size: 27,719,606 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor: string; min: 9 tokens, mean: 38.85 tokens, max: 130 tokens
    • positive: string; min: 24 tokens, mean: 262.32 tokens, max: 512 tokens
  • Samples (each entry shows the anchor followed by its positive):
    ADDRESS OF COL. GARRICK MALLERY, U. S. ARMY. It may be conceded that after man had all his present faculties, he did not choose between the adoption of voice and gesture, and never with those faculties, was in a state where the one was used, to the absolute exclusion of the other. The epoch, however, to which our speculations relate is that in which he had not reached the present symmetric development of his intellect and of his bodily organs, and the inquiry is: Which mode of communication was earliest adopted to his single wants and informed intelligence? With the voice he could imitate distinictively but few sounds of nature, while with gesture he could exhibit actions, motions, positions, forms, dimensions, directions and distances, with their derivations and analogues. It would seem from this unequal division of capacity that oral speech remained rudimentary long after gesture had become an efficient mode of communication. With due allowance for all purely imitative sounds, and for the spontaneous action of vocal organs unde...
    How TO OBTAIN THE BRAIN OF THE CAT. How to obtain the Brain of the Cat, (Wilder).-Correction: Page 158, second column, line 7, "grains," should be "grams;" page 159, near middle of 2nd column, "successily," should be "successively;" page 161, the number of Flower's paper is 3.
    DOLBEAR ON THE NATURE AND CONSTITUTION OF MATTER. Mr. Dopp desires to make the following correction in his paper in the last issue: "In my article on page 200 of "Science", the expression and should have been and being the velocity of light.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
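
MultipleNegativesRankingLoss treats, for each anchor, its paired positive as the correct match and every other positive in the same batch as a negative, so the effective number of negatives grows with the batch size (128 here). A minimal sketch of how this loss is configured with the parameters above (the checkpoint name is only illustrative):

from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("pankajrajdeo/Bioformer-8L-UMLS-Pubmed-TCE-Epoch-1")

# scale multiplies the cosine similarities before the cross-entropy over
# in-batch candidates; util.cos_sim matches "similarity_fct": "cos_sim" above.
loss = losses.MultipleNegativesRankingLoss(
    model,
    scale=20.0,
    similarity_fct=util.cos_sim,
)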
    

Evaluation Dataset

parquet

  • Dataset: parquet
  • Size: 27,719,606 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor: string; min: 8 tokens, mean: 23.18 tokens, max: 55 tokens
    • positive: string; min: 6 tokens, mean: 263.22 tokens, max: 512 tokens
  • Samples (each entry shows the anchor followed by its positive):
    Machine learning makes magnificent macromolecules for medicine. At the University of Minnesota, scientists explore the application of machine learning to screen a multiparametric library of polymers to investigate the relationship between polymer attributes, payload type, and biological outcomes to optimize polymeric vector development for delivery of nucleic acid payloads.
    2022 WUOF/SIU International Consultation on Urological Diseases: Genetics and Tumor Microenvironment of Renal Cell Carcinoma. Renal cell carcinoma is a diverse group of diseases that can be distinguished by distinct histopathologic and genomic features. In this comprehensive review, we highlight recent advancements in our understanding of the genetic and microenvironmental hallmarks of kidney cancer. We begin with clear cell renal cell carcinoma (ccRCC), the most common subtype of this disease. We review the chromosomal and genetic alterations that drive initiation and progression of ccRCC, which has recently been shown to follow multiple highly conserved evolutionary trajectories that in turn impact disease progression and prognosis. We also review the diverse genetic events that define the many recently recognized rare subtypes within non-clear cell RCC. Finally, we discuss our evolving understanding of the ccRCC microenvironment, which has been revolutionized by recent bulk and single-cell transcriptomic analyses, suggesting potential biomarkers for guiding systemic therapy in the management of advanced cc...
    Single-bone versus both-bone plating of unstable paediatric both-bone forearm fractures. A randomized controlled clinical trial. PURPOSE: This clinical trial compares the functional and radiological outcomes of single-bone fixation to both-bone fixation of unstable paediatric both-bone forearm fractures. METHODS: This individually randomized two-group parallel clinical trial was performed following the Consolidated Standards of Reporting Trials or the both-bone fixation group (control). Primary outcomes were forearm range of motion and fracture union, while secondary outcomes were forearm function (price criteria), radius re-angulation, wrist and elbow range of motion, and surgical time RESULTS: A total of 50 children were included. Out of these 50 children, 25 were randomized to either arm of the study. All children in either group received the treatment assigned by randomization. Fifty (100%) children were available for final follow-up at six months post-operatively. The mean age of single-bone and both-bone fixation groups was 11.48 ± 1.93 and 13 ± 1.75 years, respectively, with a statistically significant di...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 128
  • learning_rate: 2e-05
  • max_steps: 617193
  • log_level: info
  • fp16: True
  • dataloader_num_workers: 16
  • load_best_model_at_end: True
  • resume_from_checkpoint: True
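
Putting the pieces together, the non-default hyperparameters above map onto SentenceTransformerTrainingArguments as in the sketch below. The base checkpoint name and Parquet file paths are placeholders (they are not published in this card), so treat this as an outline of the setup rather than an exact reproduction script.

from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

# Placeholder paths and base model; the actual files are not part of this card.
train_dataset = load_dataset("parquet", data_files="data/train/*.parquet", split="train")
eval_dataset = load_dataset("parquet", data_files="data/eval/*.parquet", split="train")

model = SentenceTransformer("bioformer-8L")  # hypothetical base checkpoint
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0)

args = SentenceTransformerTrainingArguments(
    output_dir="output",
    eval_strategy="steps",
    per_device_train_batch_size=128,
    learning_rate=2e-5,
    max_steps=617_193,
    log_level="info",
    fp16=True,
    dataloader_num_workers=16,
    load_best_model_at_end=True,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()  # the card also lists resume_from_checkpoint: True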

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: 617193
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: info
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 16
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: True
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss
0.0000 1 3.8953 -
0.0049 1000 0.7985 -
0.0097 2000 0.2636 -
0.0146 3000 0.2381 -
0.0194 4000 0.2166 -
0.0243 5000 0.1595 -
0.0292 6000 0.1696 -
0.0340 7000 0.1483 -
0.0389 8000 0.1377 -
0.0437 9000 0.1347 -
0.0486 10000 0.1524 -
0.0535 11000 0.1374 -
0.0583 12000 0.0999 -
0.0632 13000 0.1363 -
0.0680 14000 0.1424 -
0.0729 15000 0.0918 -
0.0778 16000 0.2012 -
0.0826 17000 0.0949 -
0.0875 18000 0.1418 -
0.0924 19000 0.0889 -
0.0972 20000 0.1648 -
0.1021 21000 0.097 -
0.1069 22000 0.1613 -
0.1118 23000 0.0852 -
0.1167 24000 0.102 -
0.1215 25000 0.0808 -
0.1264 26000 0.1621 -
0.1312 27000 0.0897 -
0.1361 28000 0.1687 -
0.1410 29000 0.0869 -
0.1458 30000 0.1747 -
0.1507 31000 0.0823 -
0.1555 32000 0.1312 -
0.1604 33000 0.1322 -
0.1653 34000 0.0924 -
0.1701 35000 0.1746 -
0.1750 36000 0.0852 -
0.1798 37000 0.0936 -
0.1847 38000 0.1498 -
0.1896 39000 0.094 -
0.1944 40000 0.1487 -
0.1993 41000 0.104 -
0.2041 42000 0.0879 -
0.2090 43000 0.1405 -
0.2139 44000 0.0893 -
0.2187 45000 0.099 -
0.2236 46000 0.1192 -
0.2285 47000 0.081 -
0.2333 48000 0.1004 -
0.2382 49000 0.1091 -
0.2430 50000 0.0911 -
0.2479 51000 0.0789 -
0.2528 52000 0.081 -
0.2576 53000 0.1084 -
0.2625 54000 0.097 -
0.2673 55000 0.0696 -
0.2722 56000 0.0804 -
0.2771 57000 0.0767 -
0.2819 58000 0.0794 -
0.2868 59000 0.0736 -
0.2916 60000 0.0909 -
0.2965 61000 0.0576 -
0.3014 62000 0.0642 -
0.3062 63000 0.0705 -
0.3111 64000 0.0835 -
0.3159 65000 0.0814 -
0.3208 66000 0.0664 -
0.3257 67000 0.0509 -
0.3305 68000 0.0669 -
0.3354 69000 0.0844 -
0.3402 70000 0.0952 -
0.3451 71000 0.1012 -
0.3500 72000 0.0431 -
0.3548 73000 0.0526 -
0.3597 74000 0.0643 -
0.3646 75000 0.0556 -
0.3694 76000 0.1135 -
0.3743 77000 0.0641 -
0.3791 78000 0.0784 -
0.3840 79000 0.0432 -
0.3889 80000 0.0693 -
0.3937 81000 0.0841 -
0.3986 82000 0.0518 -
0.4034 83000 0.0581 -
0.4083 84000 0.0749 -
0.4132 85000 0.0376 -
0.4180 86000 0.0485 -
0.4229 87000 0.0542 -
0.4277 88000 0.0821 -
0.4326 89000 0.068 -
0.4375 90000 0.054 -
0.4423 91000 0.042 -
0.4472 92000 0.0477 -
0.4520 93000 0.0556 -
0.4569 94000 0.0557 -
0.4618 95000 0.0954 -
0.4666 96000 0.037 -
0.4715 97000 0.0431 -
0.4763 98000 0.0394 -
0.4812 99000 0.0425 -
0.4861 100000 0.0347 -
0.4909 101000 0.0482 -
0.4958 102000 0.053 -
0.5007 103000 0.0289 -
0.5055 104000 0.0364 -
0.5104 105000 0.0332 -
0.5152 106000 0.0275 -
0.5201 107000 0.0626 -
0.5250 108000 0.0552 -
0.5298 109000 0.0365 -
0.5347 110000 0.045 -
0.5395 111000 0.0476 -
0.5444 112000 0.049 -
0.5493 113000 0.0334 -
0.5541 114000 0.0412 -
0.5590 115000 0.0547 -
0.5638 116000 0.0375 -
0.5687 117000 0.0776 -
0.5736 118000 0.0481 -
0.5784 119000 0.0403 -
0.5833 120000 0.0467 -
0.5881 121000 0.0347 -
0.5930 122000 0.0478 -
0.5979 123000 0.0455 -
0.6027 124000 0.0541 -
0.6076 125000 0.0483 -
0.6124 126000 0.0583 -
0.6173 127000 0.0499 -
0.6222 128000 0.0488 -
0.6270 129000 0.0403 -
0.6319 130000 0.0314 -
0.6368 131000 0.0253 -
0.6416 132000 0.0473 -
0.6465 133000 0.0433 -
0.6513 134000 0.039 -
0.6562 135000 0.0334 -
0.6611 136000 0.0427 -
0.6659 137000 0.0401 -
0.6708 138000 0.0465 -
0.6756 139000 0.0393 -
0.6805 140000 0.0481 -
0.6854 141000 0.0504 -
0.6902 142000 0.0381 -
0.6951 143000 0.0326 -
0.6999 144000 0.0314 -
0.7048 145000 0.0335 -
0.7097 146000 0.0273 -
0.7145 147000 0.034 -
0.7194 148000 0.043 -
0.7242 149000 0.0395 -
0.7291 150000 0.0335 -
0.7340 151000 0.0414 -
0.7388 152000 0.0423 -
0.7437 153000 0.0336 -
0.7485 154000 0.0424 -
0.7534 155000 0.0394 -
0.7583 156000 0.0408 -
0.7631 157000 0.0381 -
0.7680 158000 0.0348 -
0.7729 159000 0.0428 -
0.7777 160000 0.0426 -
0.7826 161000 0.0311 -
0.7874 162000 0.0276 -
0.7923 163000 0.0207 -
0.7972 164000 0.0264 -
0.8020 165000 0.0265 -
0.8069 166000 0.0214 -
0.8117 167000 0.0279 -
0.8166 168000 0.0265 -
0.8215 169000 0.0265 -
0.8263 170000 0.029 -
0.8312 171000 0.0281 -
0.8360 172000 0.0314 -
0.8409 173000 0.0316 -
0.8458 174000 0.0265 -
0.8506 175000 0.0305 -
0.8555 176000 0.0278 -
0.8603 177000 0.0314 -
0.8652 178000 0.0274 -
0.8701 179000 0.0232 -
0.8749 180000 0.0225 -
0.8798 181000 0.0262 -
0.8846 182000 0.0271 -
0.8895 183000 0.0253 -
0.8944 184000 0.0255 -
0.8992 185000 0.0259 -
0.9041 186000 0.0248 -
0.9089 187000 0.0223 -
0.9138 188000 0.0221 -
0.9187 189000 0.0236 -
0.9235 190000 0.0239 -
0.9284 191000 0.0359 -
0.9333 192000 0.0248 -
0.9381 193000 0.0195 -
0.9430 194000 0.0217 -
0.9478 195000 0.0209 -
0.9527 196000 0.0197 -
0.9576 197000 0.0199 -
0.9624 198000 0.021 -
0.9673 199000 0.0201 -
0.9721 200000 0.0203 -
0.9770 201000 0.018 -
0.9819 202000 0.02 -
0.9867 203000 0.0184 -
0.9916 204000 0.0178 -
0.9964 205000 0.0222 -
1.0000 205731 - 0.0007

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.2
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}