--- language: - en license: apache-2.0 tags: - sentence-transformers - sentence-similarity - feature-extraction - generated_from_trainer - dataset_size:6300 - loss:MatryoshkaLoss - loss:MultipleNegativesRankingLoss base_model: Snowflake/snowflake-arctic-embed-m-v1.5 widget: - source_sentence: Cost of net revenues represents costs associated with customer support, site operations, and payment processing. Significant components of these costs primarily consist of employee compensation (including stock-based compensation), contractor costs, facilities costs, depreciation of equipment and amortization expense, bank transaction fees, credit card interchange and assessment fees, authentication costs, shipping costs and digital services tax. sentences: - What was the allowance for loan losses on GM Financial’s retail finance receivables portfolio at the end of 2023? - What are the key components of cost of net revenues? - What percentage of McLane's consolidated sales in 2023 was comprised by grocery sales? - source_sentence: The net cash used in operating activities was reported as $215.2 million, $628.5 million, and $614.1 million for three respective periods. sentences: - What was the net cash used in operating activities for the respective periods listed? - What was the total growth investment capital expenditures in 2022? - Where is the Financial Statement Schedule in IBM’s 2023 Form 10-K located? - source_sentence: In 2023, the total operating expenses amounted to $4,331.6 million, including costs of services, selling, general and administrative expenses, and depreciation and amortization. sentences: - What were the total operating expenses for the company in 2023? - How does CMS adjust the company's Medicare Advantage and Part D premium revenues? - What was the average stockholders' deficit over the past five fiscal years up to 2023? - source_sentence: Johnson & Johnson reported cash and cash equivalents of $21,859 million as of the end of 2023. sentences: - Who are GameStop's main competitors in the global gaming industry? - What was the amount of cash and cash equivalents reported by Johnson & Johnson at the end of 2023? - By what percentage has Chevron's UK oil-equivalent production increased from 2022 to 2023? - source_sentence: As of December 31, 2023, Bank of America reported gross derivative assets and liabilities totaling $290.3 billion and $301.2 billion, respectively. After accounting for legally enforceable master netting agreements and cash collateral, these figures were adjusted to $39.3 billion in assets and $43.4 billion in liabilities. sentences: - What is the significant raw material used by MiTek and how does its supply impact the company? - By what percentage did HIV product sales increase in 2023 compared to the previous year? - What were the total derivative assets and liabilities at Bank of America as of December 31, 2023, after adjusting for master netting agreements and cash collateral? 
pipeline_tag: sentence-similarity library_name: sentence-transformers metrics: - cosine_accuracy@1 - cosine_accuracy@3 - cosine_accuracy@5 - cosine_accuracy@10 - cosine_precision@1 - cosine_precision@3 - cosine_precision@5 - cosine_precision@10 - cosine_recall@1 - cosine_recall@3 - cosine_recall@5 - cosine_recall@10 - cosine_ndcg@10 - cosine_mrr@10 - cosine_map@100 model-index: - name: BGE base Financial Matryoshka results: - task: type: information-retrieval name: Information Retrieval dataset: name: dim 768 type: dim_768 metrics: - type: cosine_accuracy@1 value: 0.7542857142857143 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.8614285714285714 name: Cosine Accuracy@3 - type: cosine_accuracy@5 value: 0.8914285714285715 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 0.9328571428571428 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.7542857142857143 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.28714285714285714 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.17828571428571424 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.09328571428571428 name: Cosine Precision@10 - type: cosine_recall@1 value: 0.7542857142857143 name: Cosine Recall@1 - type: cosine_recall@3 value: 0.8614285714285714 name: Cosine Recall@3 - type: cosine_recall@5 value: 0.8914285714285715 name: Cosine Recall@5 - type: cosine_recall@10 value: 0.9328571428571428 name: Cosine Recall@10 - type: cosine_ndcg@10 value: 0.8430593058746703 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.814359410430839 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.8171120142759164 name: Cosine Map@100 - task: type: information-retrieval name: Information Retrieval dataset: name: dim 512 type: dim_512 metrics: - type: cosine_accuracy@1 value: 0.7542857142857143 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.8614285714285714 name: Cosine Accuracy@3 - type: cosine_accuracy@5 value: 0.8914285714285715 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 0.93 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.7542857142857143 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.28714285714285714 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.17828571428571427 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.09299999999999999 name: Cosine Precision@10 - type: cosine_recall@1 value: 0.7542857142857143 name: Cosine Recall@1 - type: cosine_recall@3 value: 0.8614285714285714 name: Cosine Recall@3 - type: cosine_recall@5 value: 0.8914285714285715 name: Cosine Recall@5 - type: cosine_recall@10 value: 0.93 name: Cosine Recall@10 - type: cosine_ndcg@10 value: 0.8409010665384006 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.8124268707482996 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.8153207256101372 name: Cosine Map@100 - task: type: information-retrieval name: Information Retrieval dataset: name: dim 256 type: dim_256 metrics: - type: cosine_accuracy@1 value: 0.7557142857142857 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.86 name: Cosine Accuracy@3 - type: cosine_accuracy@5 value: 0.8942857142857142 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 0.9285714285714286 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.7557142857142857 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.2866666666666667 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.17885714285714283 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.09285714285714286 
name: Cosine Precision@10 - type: cosine_recall@1 value: 0.7557142857142857 name: Cosine Recall@1 - type: cosine_recall@3 value: 0.86 name: Cosine Recall@3 - type: cosine_recall@5 value: 0.8942857142857142 name: Cosine Recall@5 - type: cosine_recall@10 value: 0.9285714285714286 name: Cosine Recall@10 - type: cosine_ndcg@10 value: 0.8408862139768868 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.8128662131519274 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.8157678611118373 name: Cosine Map@100 - task: type: information-retrieval name: Information Retrieval dataset: name: dim 128 type: dim_128 metrics: - type: cosine_accuracy@1 value: 0.7442857142857143 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.85 name: Cosine Accuracy@3 - type: cosine_accuracy@5 value: 0.8871428571428571 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 0.9142857142857143 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.7442857142857143 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.2833333333333333 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.1774285714285714 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.09142857142857141 name: Cosine Precision@10 - type: cosine_recall@1 value: 0.7442857142857143 name: Cosine Recall@1 - type: cosine_recall@3 value: 0.85 name: Cosine Recall@3 - type: cosine_recall@5 value: 0.8871428571428571 name: Cosine Recall@5 - type: cosine_recall@10 value: 0.9142857142857143 name: Cosine Recall@10 - type: cosine_ndcg@10 value: 0.8298257719970505 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.802593537414966 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.8061119393433516 name: Cosine Map@100 - task: type: information-retrieval name: Information Retrieval dataset: name: dim 64 type: dim_64 metrics: - type: cosine_accuracy@1 value: 0.7 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.8157142857142857 name: Cosine Accuracy@3 - type: cosine_accuracy@5 value: 0.8571428571428571 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 0.9071428571428571 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.7 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.2719047619047619 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.1714285714285714 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.09071428571428569 name: Cosine Precision@10 - type: cosine_recall@1 value: 0.7 name: Cosine Recall@1 - type: cosine_recall@3 value: 0.8157142857142857 name: Cosine Recall@3 - type: cosine_recall@5 value: 0.8571428571428571 name: Cosine Recall@5 - type: cosine_recall@10 value: 0.9071428571428571 name: Cosine Recall@10 - type: cosine_ndcg@10 value: 0.8023275744891828 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.7689109977324261 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.7722063607472032 name: Cosine Map@100 --- # BGE base Financial Matryoshka This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Snowflake/snowflake-arctic-embed-m-v1.5](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5) on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. 
## Model Details

### Model Description

- **Model Type:** Sentence Transformer
- **Base model:** [Snowflake/snowflake-arctic-embed-m-v1.5](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
  - json
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference:

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Abinaya/snowflake-arctic-embed-financial-matryoshka")

# Run inference
sentences = [
    'As of December 31, 2023, Bank of America reported gross derivative assets and liabilities totaling $290.3 billion and $301.2 billion, respectively. After accounting for legally enforceable master netting agreements and cash collateral, these figures were adjusted to $39.3 billion in assets and $43.4 billion in liabilities.',
    'What were the total derivative assets and liabilities at Bank of America as of December 31, 2023, after adjusting for master netting agreements and cash collateral?',
    'By what percentage did HIV product sales increase in 2023 compared to the previous year?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
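Because the model was trained with `MatryoshkaLoss` over the dimensions 768, 512, 256, 128 and 64, its embeddings can be truncated to a smaller size with only a modest drop in retrieval quality (see the evaluation table below). The following is a minimal sketch using the `truncate_dim` argument of Sentence Transformers; the example texts are drawn from this card's sample pairs:

```python
from sentence_transformers import SentenceTransformer

# Load the model so that its output embeddings are truncated to 256 dimensions,
# one of the Matryoshka dimensions this model was trained with.
model = SentenceTransformer(
    "Abinaya/snowflake-arctic-embed-financial-matryoshka",
    truncate_dim=256,
)

queries = ["What are the key components of cost of net revenues?"]
docs = [
    "Cost of net revenues represents costs associated with customer support, "
    "site operations, and payment processing.",
]

query_embeddings = model.encode(queries)
doc_embeddings = model.encode(docs)
print(query_embeddings.shape)
# (1, 256)

# Cosine similarity between the truncated query and document embeddings
print(model.similarity(query_embeddings, doc_embeddings))
```

Smaller dimensions reduce index size and search latency; the evaluation below quantifies the quality trade-off at each size.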
## Evaluation

### Metrics

#### Information Retrieval

* Datasets: `dim_768`, `dim_512`, `dim_256`, `dim_128` and `dim_64`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | dim_768    | dim_512    | dim_256    | dim_128    | dim_64     |
|:--------------------|:-----------|:-----------|:-----------|:-----------|:-----------|
| cosine_accuracy@1   | 0.7543     | 0.7543     | 0.7557     | 0.7443     | 0.7        |
| cosine_accuracy@3   | 0.8614     | 0.8614     | 0.86       | 0.85       | 0.8157     |
| cosine_accuracy@5   | 0.8914     | 0.8914     | 0.8943     | 0.8871     | 0.8571     |
| cosine_accuracy@10  | 0.9329     | 0.93       | 0.9286     | 0.9143     | 0.9071     |
| cosine_precision@1  | 0.7543     | 0.7543     | 0.7557     | 0.7443     | 0.7        |
| cosine_precision@3  | 0.2871     | 0.2871     | 0.2867     | 0.2833     | 0.2719     |
| cosine_precision@5  | 0.1783     | 0.1783     | 0.1789     | 0.1774     | 0.1714     |
| cosine_precision@10 | 0.0933     | 0.093      | 0.0929     | 0.0914     | 0.0907     |
| cosine_recall@1     | 0.7543     | 0.7543     | 0.7557     | 0.7443     | 0.7        |
| cosine_recall@3     | 0.8614     | 0.8614     | 0.86       | 0.85       | 0.8157     |
| cosine_recall@5     | 0.8914     | 0.8914     | 0.8943     | 0.8871     | 0.8571     |
| cosine_recall@10    | 0.9329     | 0.93       | 0.9286     | 0.9143     | 0.9071     |
| **cosine_ndcg@10**  | **0.8431** | **0.8409** | **0.8409** | **0.8298** | **0.8023** |
| cosine_mrr@10       | 0.8144     | 0.8124     | 0.8129     | 0.8026     | 0.7689     |
| cosine_map@100      | 0.8171     | 0.8153     | 0.8158     | 0.8061     | 0.7722     |
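The per-dimension columns above come from running the same information-retrieval evaluation once per Matryoshka dimension, truncating embeddings to that size. A minimal sketch of how such an evaluation can be set up; the queries, corpus, and relevance judgments below are small placeholders, not the actual evaluation split:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator, SequentialEvaluator

model = SentenceTransformer("Abinaya/snowflake-arctic-embed-financial-matryoshka")

# Placeholder evaluation data: IDs mapped to texts, and each query mapped to its relevant corpus IDs.
queries = {"q1": "What are the key components of cost of net revenues?"}
corpus = {
    "d1": "Cost of net revenues represents costs associated with customer support, "
    "site operations, and payment processing."
}
relevant_docs = {"q1": {"d1"}}

# One evaluator per Matryoshka dimension, mirroring the dim_768 ... dim_64 columns above.
evaluators = [
    InformationRetrievalEvaluator(
        queries=queries,
        corpus=corpus,
        relevant_docs=relevant_docs,
        name=f"dim_{dim}",
        truncate_dim=dim,
    )
    for dim in (768, 512, 256, 128, 64)
]
results = SequentialEvaluator(evaluators)(model)
print(results)
```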
## Training Details

### Training Dataset

#### json

* Dataset: json
* Size: 6,300 training samples
* Columns: `positive` and `anchor`
* Approximate statistics based on the first 1000 samples:

  |         | positive | anchor |
  |:--------|:---------|:-------|
  | type    | string   | string |
  | details |          |        |

* Samples:

  | positive | anchor |
  |:---------|:-------|
  | Opioids Related Securities Class Actions and Derivative Litigation: Three derivative complaints and two securities class actions drawing heavily on the allegations of the DOJ complaint have been filed in Delaware naming the Company and various current and former directors and certain current and former officers as defendants. The plaintiffs in the derivative suits (in which the Company is a nominal defendant) allege, among other things, that the defendants breached their fiduciary duties in connection with oversight of opioids dispensing and distribution and that the defendants violated Section 14(a) of the Securities Exchange Act of 1934, as amended (the 'Exchange Act'), and are liable for contribution under Section 10(b) of the Exchange Act in connection with the Company's disclosures about opioids. | What kind of claims are involved in the securities and derivative litigation against the Company listed in the document? |
  | Walmart's fintech venture, ONE, provides financial services such as money orders, prepaid access, money transfers, check cashing, bill payment, and certain types of installment lending. | What types of financial services are offered through Walmart's fintech venture, ONE? |
  | Juice and juice concentrate from various fruits, particularly orange juice and orange juice concentrate, are principal raw materials for juice and juice drink products, and milk is the principal raw material for dairy products managed through fairlife, LLC. | What are the primary raw materials for the company's juice and dairy products? |

* Loss: [MatryoshkaLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:

  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          768,
          512,
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```
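In code, this configuration corresponds to wrapping `MultipleNegativesRankingLoss` in `MatryoshkaLoss`. A short sketch under the parameters above, using the base model named in the Model Description:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m-v1.5")

# The inner loss treats each (anchor, positive) pair as a positive and the other
# texts in the batch as negatives; the Matryoshka wrapper applies it at every dimension.
inner_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768, 512, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1, 1],
    n_dims_per_step=-1,  # -1 trains on all listed dimensions at every step
)
```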
### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 4
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `bf16`: True
- `tf32`: True
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates

#### All Hyperparameters

<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 16
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 4
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: True
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>
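The non-default hyperparameters above map roughly onto the following training-arguments setup. This is a sketch, not the exact training script: the output directory is illustrative, and `save_strategy` is an assumption (it must match `eval_strategy` when `load_best_model_at_end` is enabled):

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="snowflake-arctic-embed-financial-matryoshka",  # illustrative output path
    num_train_epochs=4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,
    tf32=True,
    optim="adamw_torch_fused",
    eval_strategy="epoch",
    save_strategy="epoch",  # assumed: required to match eval_strategy for load_best_model_at_end
    load_best_model_at_end=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoid duplicate texts within a batch (helps in-batch negatives)
)
```

These arguments, together with the training dataset, the MatryoshkaLoss shown earlier, and the per-dimension evaluators, would then be passed to a `SentenceTransformerTrainer`.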
### Training Logs

| Epoch     | Step   | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
|:---------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
| 0.8122    | 10     | 1.5521        | -                      | -                      | -                      | -                      | -                     |
| 1.0       | 13     | -             | 0.8136                 | 0.8108                 | 0.8143                 | 0.7949                 | 0.7552                |
| 1.5685    | 20     | 0.4812        | -                      | -                      | -                      | -                      | -                     |
| 2.0       | 26     | -             | 0.8405                 | 0.8388                 | 0.8384                 | 0.8284                 | 0.7990                |
| 2.3249    | 30     | 0.3585        | -                      | -                      | -                      | -                      | -                     |
| 3.0       | 39     | -             | 0.8420                 | 0.8409                 | 0.8408                 | 0.8290                 | 0.8009                |
| 3.0812    | 40     | 0.3101        | -                      | -                      | -                      | -                      | -                     |
| **3.731** | **48** | **-**         | **0.8431**             | **0.8409**             | **0.8409**             | **0.8298**             | **0.8023**            |

* The bold row denotes the saved checkpoint.

### Framework Versions

- Python: 3.10.15
- Sentence Transformers: 3.4.0
- Transformers: 4.47.1
- PyTorch: 2.5.1+cu124
- Accelerate: 1.0.1
- Datasets: 3.0.1
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss

```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```