metadata

tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:1664
  - loss:MultipleNegativesRankingLoss
base_model: sentence-transformers/all-MiniLM-L6-v2
widget:
  - source_sentence: >-
      Explain the importance of having a "Portfolio" section on a website. How
      can showcasing past work benefit a business or individual?
    sentences:
      - |-
        ### Sitemap

        - Home
        - About us
        - How we work
        - Services
        - Portfolio
        - Blog
        - Career
      - >-
        Business keeps asking questions about what has been done or what is
        delivered to our clients.


        What you can do:
      - >-
        Appunite always pushes boundaries and is very creative in ensuring that
        there is knowledge shared amongst the workspace and to our external
        stakeholders. This extends itself to overcoming the problem of articles,
        which are a longer form of writing, that they can be time consuming.
        There has recently been the addition of the TIL section on our website
        and is thanks to the initiative of a few Appuniter who said they would
        like to make something like this happen. They took on the task and are
        now live on our site:  This is a valuable add to companies where peoples
        main focus is not writing and is somewhere else, in this case Software
        development. It allows for shortened pieces which are compacted with
        knowledge and what one learns is shared regularly.


        ### Akai


        As Appunite believes that the future is in the hands and minds of the
        youth, what better way to take advantage of this and benefit from a
        fruitful collaboration.
  - source_sentence: >-
      How did the narrator's feelings and thoughts change as they tried to
      figure out the reason behind the problem with the reaction counters in the
      posts?
    sentences:
      - >-
        - About us

        - How we work

        - Services

        - Portfolio

        - Blog

        - Career


        <!-- image -->


        <!-- image -->


        <!-- image -->


        Miłosz Cisowski


        Product Manager


        <!-- image -->


        <!-- image -->


        # Workation 2.0 - How we work and travel without compromises


        <!-- image -->


        <!-- image -->


        ## The future is now, old man


        I bet right now you are sitting in the office or at home, thinking about
        how it would be possible to work, but also to travel. This is a great
        opportunity to live your life as a local, to enjoy the everyday life of
        people in different places that gives an amazing perspective on how you
        can walk, how you live and how you think in a different environment. It
        gives an opportunity to develop our minds, skills and business ideas.
      - >-
        So I made the deployment.

        Waited…

        Nice, everything seems to be fine! The deployment was finished, and all
        servers were up and running. We were home!

        I went to the kitchen to make some coffee with a smile on my face. I did
        a really hard job, I deserved a break.

        After about half an hour after deployment, the client texted us in our
        shared Slack channel. Something was wrong, reaction counters in the
        posts showed 0 for everything!...

        My face turned white and my stomach felt weird. Is that possible that I
        did something pretty much irreversible? No, it is impossible that I
        broke it up so hard! How is it even possible it doesn’t work after all
        the tests that I had written were successful?!

        I sat in front of the computer and immediately started to look for
        reasons. Why did it happen? It’s impossible! Why, why, why? Wait, I
        think I got it!
      - >-
        My assumption was that the onCreateViewHolder method should be called
        several times only when there are not enough ViewHolders available.
        Then, it should retrieve them from some ✨ magical pool ✨ , thus only
        onBindViewHolder should be called further during the scroll. Seems
        convenient, but believe me or not, I was struck dumb when I realised
        that in this situation, new ViewHolders are created every single time
        the user scrolls through the list! 😱
  - source_sentence: How can you ensure that contacts are well synchronized in an application?
    sentences:
      - >-
        There is one more fantastic change. This year, we have created several
        groups that will visit different countries. The best part is you can go
        to any of them. This means that you can have not one but several
        Workation and you pay only for a transfer.


        This year Appuniters chose Montenegro, Tenerife, Istanbul, Zakopane (in
        Poland), Austria and Mexico. It seems impossible, but that's how we do
        it in Appunite.


        Follow us on our social media to get more info about our trips.


        Join Appunite and create your own story!


        <!-- image -->


        <!-- image -->


        <!-- image -->


        Martyna Kowalska


        Employer Branding Specialist


        At Appunite, I'm responsible for employer branding. After work, I'm a
        student of clinical psychology. I believe that customers will never love
        a company until the employees love it first.


        <!-- image -->


        <!-- image -->


        ## Related articles


        <!-- image -->


        ### One minor change to regain your focus


        - #People

        - #Teamwork


        <!-- image -->
      - >-
        You should check here whether all contacts are actually well
        synchronized.


        It is worth noting whether the application asks for access to the
        contacts, whether the message is formulated in the right way (there must
        be specific information for the user on how these contacts will be
        processed) and whether it is displayed in the right place.


        If the application is on the foreign market, it is also worth noting
        whether the invitations with text messages come to the recipient (this
        may be related to the directional prefix and the replacement of for
        example "+" with 00 digits).


        ## Update


        When the application is already in the store, it is worth testing what
        the update will look like. You should then download the store version
        and install the version that will be released as an update to it. There
        may also be various popups, buttons in settings, etc. in the application
        in order to inform the user about available updates.


        ### Examples of problems:
      - >-
        <!-- image -->


        <!-- image -->


        <!-- image -->


        Jacek Marchwicki


        Team Leader, Engineer, Android, Flutter


        Senior developer, team and tech leader, 12 years in the field, and 10
        additional years as a coding enthusiast.


        <!-- image -->


        <!-- image -->


        <!-- image -->


        <!-- image -->


        <!-- image -->


        ## Related articles


        <!-- image -->


        ### Modular software design - 10 common mistakes


        - #Technology

        - #Backend


        <!-- image -->


        ### What is mobile application maintenance? 4 types of mobile
        application maintenance


        - #Technology

        - #Maintenance


        <!-- image -->


        ### Mastering development process


        - #Technology

        - #Strategy


        ### Appunite S.A.


        - Droga Dębińska 3A/3

        - 61-555 Poznań, Poland


        - REGON: 385381222

        - VAT ID (EU): PL 7831812112


        - [email protected]

        - [email protected]


        ### Follow us


        - Our LinkedIn

        - Our Facebook

        - Our X/Twitter

        - Our Medium

        - Our YouTube

        - Our Dribbble

        - Our Instagram

        - Our GitHub

        - Our Behance


        ### Sitemap
  - source_sentence: >-
      What has Hubert learned and experienced during his time at Appunite, and
      how has his knowledge and skills in Elixir development evolved since
      joining the company?
    sentences:
      - |-
        <!-- image -->

        <!-- image -->

        <!-- image -->

        Maciej Kaszubowski

        <!-- image -->

        <!-- image -->

        ## Related articles

        <!-- image -->

        ### Mastering development process

        - #Technology
        - #Strategy

        <!-- image -->

        ### Proposing architectural changes

        - #Technology
        - #Strategy
        - #Teamwork

        <!-- image -->

        ### Why is QA important in software development?

        - #Strategy
        - #QA

        ### Appunite S.A.

        - Droga Dębińska 3A/3
        - 61-555 Poznań, Poland

        - REGON: 385381222
        - VAT ID (EU): PL 7831812112

        - [email protected]
        - [email protected]

        ### Follow us

        - Our LinkedIn
        - Our Facebook
        - Our X/Twitter
        - Our Medium
        - Our YouTube
        - Our Dribbble
        - Our Instagram
        - Our GitHub
        - Our Behance

        ### Sitemap

        - Home
        - About us
        - How we work
        - Services
        - Portfolio
        - Blog
        - Career
      - >-
        ### Step #3


        The next step was to meet as a team and discuss the issue of everyone
        not  being on the same page when it comes to performing a great Code
        Review. I shared the document that I prepared for Tom with advice for
        the rest of the team, so we could have a base for discussion. Based on
        this document, we identified what needed to be fixed and what was
        excessive in our Code Review process. During the meeting, we prepared a
        document of Team Agreements on how to approach the Code Review process
        and what values should be taken into account.


        To avoid causing arguments among employees, I refrained everyone from
        blaming anyone personally. Instead, I focused on identifying the team's
        problems and finding solutions. Our Appunite value is "Be a team player.
        We succeed and fail as a team".


        ### Step #4


        In the end, the actions taken helped the person become a better software
        engineer. I also spoke with his leader about setting personal goals to
        help him improve on a daily basis.
      - >-
        - About us

        - How we work

        - Services

        - Portfolio

        - Blog

        - Career


        <!-- image -->


        <!-- image -->


        <!-- image -->


        Hubert Salwin


        Elixir Developer


        <!-- image -->


        <!-- image -->


        # The journey of an Appuniter - Hubert


        <!-- image -->


        <!-- image -->


        ## How it all started


        I started working at Appunite a little more than a year ago. At the time
        when I applied to the company I was still completing my bachelor’s
        degree and I had just started learning Elixir a few months before. I had
        never imagined that in such a short amount of time I would meet so many
        new people, tackle many problems and learn so much stuff.
  - source_sentence: >-
      Compare and contrast the build times for the project under different
      approaches, including non-virtualized architecture, virtualization with no
      cache, and virtualization with several improvements. How did the build
      times change with each iteration?
    sentences:
      - >-
        After so many improvements we've noticed that the build started to be
        even faster when compared to the non-virtualized architecture times


        Approximate build times for the same project:


        - 1st approach with no virtualization ~ 10 minutes

        - 1st iteration of virtualization with no cache and other improvements ~
        30 minutes

        - virtualization with several improvements ~ 4-5 minutes


        As you can see, we've achieved a very scalable CI system that seems to
        be ideal...


        ## Current issues
      - >-
        Pro-tips first:


        - note, that in order to use FireStick in your app, you need to download
        AmazonFling and WhisperPlay jars from here and include them in your
        project.

        - while launching the app after integrating FireStick you can get a
        similar crash:java.lang.NoClassDefFoundError: Failed resolution of:
        Lorg/apache/http/conn/util/InetAddressUtils;
              at com.amazon.whisperlink.android.util.RouteUtil.createRoute(RouteUtil.java:78)
              at com.amazon.whisperlink.android.util.RouteUtil.createRoute(RouteUtil.java:51)
                    ...
        Caused by: java.lang.ClassNotFoundException: Didn't find class
        "org.apache.http.conn.util.InetAddressUtils"


        It is caused by lack of apache networking library in android runtime,
        since Android 6. To fix it, add the following code to your
        AndroidManifest.
      - |-
        ### Appunite S.A.

        - Droga Dębińska 3A/3
        - 61-555 Poznań, Poland

        - REGON: 385381222
        - VAT ID (EU): PL 7831812112

        - [email protected]
        - [email protected]

        ### Follow us

        - Our LinkedIn
        - Our Facebook
        - Our X/Twitter
        - Our Medium
        - Our YouTube
        - Our Dribbble
        - Our Instagram
        - Our GitHub
        - Our Behance

        ### Sitemap

        - Home
        - About us
        - How we work
        - Services
        - Portfolio
        - Blog
        - Career
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: Unknown
          type: unknown
        metrics:
          - type: cosine_accuracy@1
            value: 0.8421052631578947
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.9497607655502392
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.9760765550239234
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.992822966507177
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.8421052631578947
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.31658692185007975
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19521531100478468
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09928229665071771
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.8421052631578947
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.9497607655502392
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.9760765550239234
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.992822966507177
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9219480799267699
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8986557302346777
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8990201319148687
            name: Cosine Map@100

SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: sentence-transformers/all-MiniLM-L6-v2
Maximum Sequence Length: 256 tokens
Output Dimensionality: 384 dimensions
Similarity Function: Cosine Similarity

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
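
The pooling and normalization stages listed above can be illustrated without the library. The following is a minimal sketch in plain Python, assuming the token embeddings and attention mask for one sentence are already available as lists. The helper names `mean_pool` and `l2_normalize` are illustrative and not part of sentence-transformers.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]

def l2_normalize(vec):
    """Scale a vector to unit length, mirroring what a Normalize() stage does."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Toy input: three token slots with 2-dimensional vectors; the last slot is padding.
tokens = [[1.0, 2.0], [3.0, 2.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vec = l2_normalize(mean_pool(tokens, mask))
```

Because the final vector has unit length, the cosine similarity between two such vectors reduces to their dot product.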

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Compare and contrast the build times for the project under different approaches, including non-virtualized architecture, virtualization with no cache, and virtualization with several improvements. How did the build times change with each iteration?',
    "After so many improvements we've noticed that the build started to be even faster when compared to the non-virtualized architecture times\n\nApproximate build times for the same project:\n\n- 1st approach with no virtualization ~ 10 minutes\n- 1st iteration of virtualization with no cache and other improvements ~ 30 minutes\n- virtualization with several improvements ~ 4-5 minutes\n\nAs you can see, we've achieved a very scalable CI system that seems to be ideal...\n\n## Current issues",
    'Pro-tips first:\n\n- note, that in order to use FireStick in your app, you need to download AmazonFling and WhisperPlay jars from here and include them in your project.\n- while launching the app after integrating FireStick you can get a similar crash:java.lang.NoClassDefFoundError: Failed resolution of: Lorg/apache/http/conn/util/InetAddressUtils;\n      at com.amazon.whisperlink.android.util.RouteUtil.createRoute(RouteUtil.java:78)\n      at com.amazon.whisperlink.android.util.RouteUtil.createRoute(RouteUtil.java:51)\n            ...\nCaused by: java.lang.ClassNotFoundException: Didn\'t find class "org.apache.http.conn.util.InetAddressUtils"\n\nIt is caused by lack of apache networking library in android runtime, since Android 6. To fix it, add the following code to your AndroidManifest.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
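
For retrieval, the first sentence is typically treated as a query and the remaining ones as candidate passages. The sketch below shows that ranking step in plain Python, assuming the embeddings have been converted to lists of floats (for instance with `embeddings.tolist()`). `cosine` and `rank_passages` are hypothetical helpers, not library functions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_passages(query_vec, passage_vecs):
    """Return (passage_index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy 2-dimensional vectors standing in for 384-dimensional embeddings.
query = [1.0, 0.0]
passages = [[0.0, 1.0], [2.0, 0.1], [1.0, 1.0]]
ranking = rank_passages(query, passages)
```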

Evaluation

Metrics

Information Retrieval

Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	0.8421
cosine_accuracy@3	0.9498
cosine_accuracy@5	0.9761
cosine_accuracy@10	0.9928
cosine_precision@1	0.8421
cosine_precision@3	0.3166
cosine_precision@5	0.1952
cosine_precision@10	0.0993
cosine_recall@1	0.8421
cosine_recall@3	0.9498
cosine_recall@5	0.9761
cosine_recall@10	0.9928
cosine_ndcg@10	0.9219
cosine_mrr@10	0.8987
cosine_map@100	0.899
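
The reported precision@k values equal accuracy@k divided by k (for example 0.9498 / 3 ≈ 0.3166), which is what one would expect if every query has exactly one relevant passage. The card does not state this explicitly, so treat it as an assumption. Under that assumption the metrics can be sketched as follows; the function name and input format are illustrative and are not the InformationRetrievalEvaluator API.

```python
def retrieval_metrics(ranks, k):
    """Compute accuracy, precision, recall and MRR at cutoff k.

    ranks: for each query, the 1-based rank of its single relevant
    passage, or None if the passage was not retrieved.
    """
    n = len(ranks)
    hits = sum(1 for r in ranks if r is not None and r <= k)
    reciprocal = sum(1.0 / r for r in ranks if r is not None and r <= k)
    return {
        "accuracy": hits / n,         # share of queries with a hit in the top k
        "precision": hits / (n * k),  # at most one of the k results is relevant
        "recall": hits / n,           # one relevant passage per query
        "mrr": reciprocal / n,        # misses contribute 0
    }

# Five toy queries: two found at rank 1, one at rank 2, one at rank 4, one missed.
example = retrieval_metrics([1, 1, 2, 4, None], k=3)
```

With one relevant passage per query, accuracy@k and recall@k coincide, which matches the identical values in the table.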

Training Details

Training Dataset

Unnamed Dataset

Size: 1,664 training samples
Columns: sentence_0 and sentence_1
Approximate statistics based on the first 1000 samples:

	sentence_0	sentence_1
type	string	string
details	min: 13 tokens mean: 30.91 tokens max: 88 tokens	min: 8 tokens mean: 175.1 tokens max: 256 tokens

Samples:

sentence_0	sentence_1
`How does the 'code-coverage' job in the workflow ensure that it has access to the necessary test results before running the code coverage analysis?`	``` code-coverage: name: Merged code coverage runs-on: ubuntu-20.04 permissions: pull-requests: write needs: - unit-tests steps: - name: Checkout uses: actions/checkout@v4 - name: Download tests results for both jobs uses: actions/download-artifact@v4 with: name: test-results-unit name: test-results - name: Run code coverage run: ./gradlew codeCoverage - name: Store HTML coverage report uses: actions/upload-artifact@v4 with: name: coverage-report path:
`Describe the purpose and functionality of the 'Download tests results for both jobs' and 'Store HTML coverage report' steps in the workflow.`	``` code-coverage: name: Merged code coverage runs-on: ubuntu-20.04 permissions: pull-requests: write needs: - unit-tests steps: - name: Checkout uses: actions/checkout@v4 - name: Download tests results for both jobs uses: actions/download-artifact@v4 with: name: test-results-unit name: test-results - name: Run code coverage run: ./gradlew codeCoverage - name: Store HTML coverage report uses: actions/upload-artifact@v4 with: name: coverage-report path:
`Explain the purpose of the Payment.Worker module in the given code snippet. How does it utilize the GenServer behavior and what is the significance of the @interval attribute?`	``` defmodule Payment.Worker do use GenServer @interval 10 * 6000 def start_link() do ... end def init() do Process.send_after(self(), :work, @interval) {:ok, %{}} end def handle_info(:work, state) do Repo.transaction(fn -> Documents.Accepted.fetch()

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
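
To make the two parameters concrete, here is a minimal plain-Python sketch of the objective, assuming the usual in-batch-negatives formulation: for pair i, sentence_1[i] is the positive and every other sentence_1 in the batch serves as a negative, with cosine similarities multiplied by the scale before a softmax cross-entropy. It illustrates the idea only and is not the library's implementation.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where the i-th positive is the correct class for anchor i."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch of two pairs with 2-dimensional embeddings.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])  # positives match anchors
swapped = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])  # positives are mismatched
```

The loss is close to zero when each anchor is most similar to its own positive and grows when another passage in the batch scores higher. A larger scale sharpens the softmax and penalizes such confusions more strongly.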

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
per_device_train_batch_size: 10
per_device_eval_batch_size: 10
num_train_epochs: 1
multi_dataset_batch_sampler: round_robin

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 10
per_device_eval_batch_size: 10
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 1
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin
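
With lr_scheduler_type set to linear, warmup_steps at 0 and a learning rate of 5e-05, the rate would be expected to decay linearly towards zero over the run. The sketch below assumes 167 total optimizer steps, taken from the final step reported in the training logs; `linear_lr` is an illustrative helper rather than a library function.

```python
BASE_LR = 5e-05    # learning_rate from the list above
TOTAL_STEPS = 167  # assumed: final step reported in the training logs
WARMUP_STEPS = 0   # warmup_steps from the list above

def linear_lr(step):
    """Learning rate after `step` optimizer updates under linear warmup and decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)

# Learning rate at the logged evaluation steps.
schedule = {step: linear_lr(step) for step in (0, 50, 100, 150, 167)}
```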

Training Logs

Epoch	Step	cosine_ndcg@10
0.2994	50	0.9175
0.5988	100	0.9152
0.8982	150	0.9211
1.0	167	0.9219
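
The step and epoch columns follow from the dataset size and batch size. The arithmetic below assumes a single device, no gradient accumulation, and that the final partial batch is kept (dataloader_drop_last is False).

```python
import math

dataset_size = 1664  # training samples
batch_size = 10      # per_device_train_batch_size

# 1664 / 10 = 166.4, so the last batch is partial and still counts as a step.
steps_per_epoch = math.ceil(dataset_size / batch_size)

def epoch_at(step):
    """Fraction of the epoch completed after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)

logged = [epoch_at(s) for s in (50, 100, 150, 167)]
```

Evaluation ran every 50 steps and once more at the end of the single epoch, which accounts for the four rows in the table.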

Framework Versions

Python: 3.11.11
Sentence Transformers: 3.3.1
Transformers: 4.47.1
PyTorch: 2.5.1+cu124
Accelerate: 1.2.1
Datasets: 3.2.0
Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}