SentenceTransformer based on Snowflake/snowflake-arctic-embed-l-v2.0
This is a sentence-transformers model finetuned from Snowflake/snowflake-arctic-embed-l-v2.0. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Snowflake/snowflake-arctic-embed-l-v2.0
- Maximum Sequence Length: 1024 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
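In other words, the model takes the final-layer [CLS] token embedding of the underlying XLMRobertaModel and L2-normalizes it, so cosine similarity between outputs reduces to a dot product. A minimal sketch of the equivalent computation with plain 🤗 Transformers (assuming this repository exposes standard transformers-compatible weights and tokenizer files) might look like:

import torch
from transformers import AutoModel, AutoTokenizer

# Assumption: the repo can be loaded directly with AutoModel/AutoTokenizer.
tokenizer = AutoTokenizer.from_pretrained("Jrinky/snowflake")
model = AutoModel.from_pretrained("Jrinky/snowflake")

texts = ["Why is it important to keep moving over the summer"]
batch = tokenizer(texts, padding=True, truncation=True, max_length=1024, return_tensors="pt")

with torch.no_grad():
    token_embeddings = model(**batch).last_hidden_state  # (batch, seq_len, 1024)

cls_embeddings = token_embeddings[:, 0]  # CLS pooling (pooling_mode_cls_token=True)
embeddings = torch.nn.functional.normalize(cls_embeddings, p=2, dim=1)  # Normalize() module
print(embeddings.shape)  # torch.Size([1, 1024])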
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Jrinky/snowflake")
# Run inference
sentences = [
'Why is it important to keep moving over the summer',
"It's important to keep moving over the summer!",
'2008. CHENG HF, LEE YM, Chu CH, Leung WK & Mok TMY. - Journal Editor (Hong Kong Medical Journal) 2008\n- Editor-in-Chief (Hong Kong Dental Journal) 2007\n- Editor-in-Chief (Hong Kong Dental Journal) 2006\n- Deputy Editor (Hong Kong Dental Journal) 2004',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
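The same encode/similarity calls can also rank a small corpus against a query for semantic search. A minimal sketch (the corpus and query below are made up for illustration):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Jrinky/snowflake")

# Hypothetical corpus and query, for illustration only
corpus = [
    "It's important to keep moving over the summer!",
    "Coxeter, Regular Polytopes, 3rd Edition, Dover New York, 1973",
]
query = "Why is it important to keep moving over the summer"

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode([query])

# Cosine similarities between the query and every corpus entry
scores = model.similarity(query_embedding, corpus_embeddings)  # shape [1, len(corpus)]
best = scores.argmax(dim=1).item()
print(corpus[best], float(scores[0, best]))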
Training Details
Training Dataset
Unnamed Dataset
- Size: 69,500 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:
| | anchor | positive |
|---|---|---|
| type | string | string |
| details | min: 6 tokens, mean: 17.47 tokens, max: 44 tokens | min: 3 tokens, mean: 113.33 tokens, max: 1024 tokens |
- Samples:
| anchor | positive |
|---|---|
| What might have been unnecessary if better emergency plans had been implemented | If better emergency plans had been in place, maybe chemical dipersants wouldn't be needed. And on and on. |
| What was the year of publication for the 3rd Edition of 'Regular Polytopes' by H.S.M. Coxeter | Coxeter, Regular Polytopes, 3rd Edition, Dover New York, 1973 Kaleidoscopes: Selected Writings of H.S.M. Coxeter, edited by F. Arthur Sherk, Peter McMullen, Anthony C. Thompson, Asia Ivic Weiss, Wiley-Interscience Publication, 1995, (Paper 22) H.S.M. |
| Who is the author of the GURPS Shapeshifters supplement | GURPS Shapeshifters () is a supplement by Robert M. Schroeck for the GURPS role-playing game system, third edition. |
- Loss: selfloss.Infonce with these parameters:
  { "scale": 20.0, "similarity_fct": "cos_sim" }
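selfloss.Infonce appears to be a custom InfoNCE (in-batch negatives) loss; with scale 20.0 and cosine similarity it behaves like the standard contrastive objective used for (anchor, positive) pairs. A minimal sketch of that objective (names and shapes assumed, not the exact training code used here):

import torch
import torch.nn.functional as F

def infonce_loss(anchor_emb, positive_emb, scale=20.0):
    """In-batch-negatives InfoNCE: each anchor's own positive is the target,
    all other positives in the batch act as negatives."""
    # Cosine similarity matrix, scaled (similarity_fct = cos_sim, scale = 20.0)
    anchor_emb = F.normalize(anchor_emb, dim=1)
    positive_emb = F.normalize(positive_emb, dim=1)
    scores = anchor_emb @ positive_emb.T * scale  # (batch, batch)
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)

# Toy check with random 1024-dimensional embeddings
a, p = torch.randn(3, 1024), torch.randn(3, 1024)
print(infonce_loss(a, p))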
Evaluation Dataset
Unnamed Dataset
- Size: 17,376 evaluation samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:
| | anchor | positive |
|---|---|---|
| type | string | string |
| details | min: 6 tokens, mean: 16.87 tokens, max: 45 tokens | min: 6 tokens, mean: 115.36 tokens, max: 1024 tokens |
- Samples:
| anchor | positive |
|---|---|
| What impressive achievements did the Warriors accomplish during their last season in Division III | The Warriors were among the most lethal offensive teams in Division III this past year, posting a team batting average of .344 and averaging nearly seven runs per game, smacking 29 home runs, and collecting nearly 600 total bases. They shared the Little East Conference regular-season championship and later knocked off the top seed in the NCAA regional tournament (Montclair State) en route to their winningest season in 14 years. |
| How many bars had nectar and capped honey on them | Eight of the bars had nectar and capped honey on them. There are eighteen bars with brood in some form on them and a mix of workers and drones. |
| What idea is being requested regarding the 'triangle' | Next up...the "triangle". Please, seriously, if anyone could float me an idea, I would really appreciate it. |
- Loss: selfloss.Infonce with these parameters:
  { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 3
- per_device_eval_batch_size: 3
- learning_rate: 5e-06
- num_train_epochs: 5
- warmup_ratio: 0.1
- fp16: True
- batch_sampler: no_duplicates
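Assuming the standard Sentence Transformers v3 training API, these non-default values map onto the trainer roughly as follows (the dataset loading and loss below are placeholders, not the exact script used for this model):

from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("Snowflake/snowflake-arctic-embed-l-v2.0")

# Placeholder: any dataset with "anchor" and "positive" columns
dataset = load_dataset("csv", data_files="pairs.csv")["train"].train_test_split(test_size=0.2)

# MultipleNegativesRankingLoss is the built-in InfoNCE-style loss;
# the card's selfloss.Infonce is a custom equivalent (scale=20.0, cos_sim).
loss = MultipleNegativesRankingLoss(model, scale=20.0)

args = SentenceTransformerTrainingArguments(
    output_dir="snowflake-finetune",
    eval_strategy="steps",
    per_device_train_batch_size=3,
    per_device_eval_batch_size=3,
    learning_rate=5e-6,
    num_train_epochs=5,
    warmup_ratio=0.1,
    fp16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    loss=loss,
)
trainer.train()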
All Hyperparameters
Click to expand
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 3
- per_device_eval_batch_size: 3
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-06
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 5
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: True
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
0.0777 | 150 | 0.0257 | 0.0134 |
0.1554 | 300 | 0.0136 | 0.0082 |
0.2332 | 450 | 0.0079 | 0.0062 |
0.3109 | 600 | 0.0065 | 0.0051 |
0.3886 | 750 | 0.0059 | 0.0045 |
0.4663 | 900 | 0.0057 | 0.0040 |
0.5440 | 1050 | 0.0064 | 0.0037 |
0.6218 | 1200 | 0.005 | 0.0034 |
0.6995 | 1350 | 0.0052 | 0.0034 |
0.7772 | 1500 | 0.0041 | 0.0032 |
Framework Versions
- Python: 3.12.3
- Sentence Transformers: 3.2.0
- Transformers: 4.44.2
- PyTorch: 2.6.0+cu124
- Accelerate: 1.3.0
- Datasets: 2.19.0
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Infonce
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}