akin-em7
This is a sentence-transformers model finetuned from zerbaUst/cs-em6 on the json dataset. It maps sentences and paragraphs to an 896-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: zerbaUst/cs-em6
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 896 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: json
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: Qwen2Model
(1): Pooling({'word_embedding_dimension': 896, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
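The Pooling and Normalize stages listed above can be illustrated with a small pure-Python sketch. It assumes the Transformer stage has already produced one vector per token plus an attention mask; the toy vectors have width 4 rather than 896, and the helper names `mean_pool` and `l2_normalize` are illustrative, not part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, skipping positions whose mask value is 0."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]

def l2_normalize(vector):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

# Toy input: two real tokens followed by one padding token.
tokens = [[1.0, 0.0, 2.0, 0.0], [3.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
```

Because the final stage normalizes to unit length, the cosine similarity of two sentence embeddings reduces to their dot product.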
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("zerbaUst/akin-em7")
# Run inference
sentences = [
'ive lost my account due to my mum making my account and shes lost the details\n\nlost email address: [EMAIL]',
"issue no1:the player is looking to recover their account. - if there is no relevant target account information, please redirect the player to the recovery form available on the help portal.- if you received a recovery form case and have found the player's account information, please follow the dedicated processes you can find in signavio (step 1a below)issue no2:the player is looking to recover their account, but, there is suspicion of account reselling/sharing.there are a few factors you can look out for to determine whether there is a suspicion of account reselling or sharing >- the log-in country in the p360 activity log, can be split into different continuous sections between the account owner and other accesses- the owner will be reaching out to us to recover the account at all times over many cases- the owner disables 2fa, and changes the email address and the display name on the account, just before the activity country changes- there are previous account recovery cases on the account already- the owner is usually able to provide all the information required for aov- you may see duplicate game activations from purchases made by the buyer/other people who access the account from different ip'splease also note:- if a player admits they are not the original owner, we should refuse the recovery request. 
this includes mentioning the email address on their account belongs to someone else.- if a player mentions that they share the account with other people, we should also refuse recovery in this situation, as this account sharing breaks our terms of service.- if a player admits that they sold their account, we also deny recovery.- if a player also directly admits to purchasing an account, you may deny further support as they have admitted not being the original owner of the account (please leave a private note on the account mentioning player admitting to purchasing account the account)- if they admit to selling this account, we deny recovery also as they broke our terms of service (please leave a private note on the account mentioning the player admitting to selling the account)----------------------------------------------------------case handling:1) determine which of the 2x issues noted above, your contact falls into issue 1 or issue 2.a) for issue 1, please follow this process > https://editor.signavio.com/p/hub/model/ad240bbab2d5496c8e9e9e285b253258b) for issue 2, if you suspect the account is being shared or resold, please make sure to check the salesforce notes. if you can see an existing note mentioning account reselling or sharing on the account (placed after 08/10/2020), deny account recovery and do not proceed with further steps.if you suspect the account being shared or resold and there are no salesforce notes please perform aov (they need to pass aov) and escalate to tier 2.if you are a tier 2 specialist please follow process:\xa0https://editor.signavio.com/p/hub/model/094692df25534e7389cf86aaa89f73f8note: only escalate cases where we suspect the original account owner has sold the account. 
no need to escalate cases where the account selling was initiated by a hacker.2) next check if the player can pass aov and recover the account by following the process: https://goto.ubisoft.org/jtguf3) if the player passes the aov process, and we suspect that the account is being resold or shared, please make sure the player is aware of all account security measures available to them, and let them know that we may not be able to recover the account again in future. please also make sure to provide the following account security faq: ubisoft.com/help/article/0000627644) place a salesforce note on the account, that we suspect is being resold or shared.please note the usual <[account] recovery> subject line still applies to these cases.---------------------------------------------------------additional information:as we are unable to prove beyond doubt that an account is being deliberately resold or shared, please make sure you do not accuse the player directly of account reselling or sharing. rather than focusing on the reselling/sharing assumption, we would like the focus to be on account security and the fact that our previous security instructions were not followed.",
'issue: player is reporting a cheater/hacker in gamecase handling: please thank the user for reporting the player. and advise we cannot communicate the outcome of any such investigation.please make sure that the reported username field in the ticket is filled out with the username of the suspected cheater.please advise the player to report the cheater within the game.additional information:if the player mentions being ddos during his game, please use the following kb : [report a player] cheating / ddos - 000084534 - i would like to report a player ddosing the game',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 896]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
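Since the model was trained with MatryoshkaLoss (see Training Details), a prefix of each embedding can in principle be used as a smaller embedding. The following is a minimal sketch of that idea in pure Python, using made-up 8-dimensional vectors in place of real `model.encode` output; `truncate_and_normalize` is a hypothetical helper. The Sentence Transformers library also offers a `truncate_dim` argument on `SentenceTransformer`, which is likely the more convenient route in practice.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale the prefix to unit length."""
    prefix = list(embedding[:dim])
    norm = math.sqrt(sum(v * v for v in prefix))
    return [v / norm for v in prefix] if norm > 0 else prefix

def cosine_similarity(a, b):
    """Dot product; equals cosine similarity because inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))

# Made-up vectors standing in for two rows of `embeddings`.
query = [0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
doc = [0.5, 0.5, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0]

full_score = cosine_similarity(
    truncate_and_normalize(query, 8), truncate_and_normalize(doc, 8)
)
short_score = cosine_similarity(
    truncate_and_normalize(query, 2), truncate_and_normalize(doc, 2)
)
```

Scores computed at different truncation sizes are not directly comparable, so a retrieval index should use one dimensionality consistently.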
Evaluation
Metrics
Information Retrieval
- Datasets: dim_896, dim_768, dim_512, dim_256, dim_128 and dim_64
- Evaluated with InformationRetrievalEvaluator
Metric | dim_896 | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
---|---|---|---|---|---|---|
cosine_accuracy@1 | 0.0158 | 0.0173 | 0.0152 | 0.0149 | 0.0122 | 0.0099 |
cosine_accuracy@3 | 0.0346 | 0.0349 | 0.0331 | 0.0328 | 0.029 | 0.0224 |
cosine_accuracy@5 | 0.0463 | 0.046 | 0.0448 | 0.0439 | 0.0388 | 0.0319 |
cosine_accuracy@10 | 0.066 | 0.0657 | 0.0624 | 0.0615 | 0.0582 | 0.0472 |
cosine_precision@1 | 0.0158 | 0.0173 | 0.0152 | 0.0149 | 0.0122 | 0.0099 |
cosine_precision@3 | 0.0115 | 0.0116 | 0.011 | 0.0109 | 0.0097 | 0.0075 |
cosine_precision@5 | 0.0093 | 0.0092 | 0.009 | 0.0088 | 0.0078 | 0.0064 |
cosine_precision@10 | 0.0066 | 0.0066 | 0.0062 | 0.0061 | 0.0058 | 0.0047 |
cosine_recall@1 | 0.0158 | 0.0173 | 0.0152 | 0.0149 | 0.0122 | 0.0099 |
cosine_recall@3 | 0.0346 | 0.0349 | 0.0331 | 0.0328 | 0.029 | 0.0224 |
cosine_recall@5 | 0.0463 | 0.046 | 0.0448 | 0.0439 | 0.0388 | 0.0319 |
cosine_recall@10 | 0.066 | 0.0657 | 0.0624 | 0.0615 | 0.0582 | 0.0472 |
cosine_ndcg@10 | 0.0376 | 0.0382 | 0.0361 | 0.0354 | 0.0321 | 0.0258 |
cosine_mrr@10 | 0.029 | 0.0299 | 0.0281 | 0.0274 | 0.0242 | 0.0192 |
cosine_map@100 | 0.0326 | 0.0334 | 0.0318 | 0.0309 | 0.0276 | 0.0226 |
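In the table above, accuracy@k equals recall@k, and precision@k equals recall@k divided by k (for example 0.0346 / 3 ≈ 0.0115). That pattern is what one would expect if every query has exactly one relevant document, although the card does not state this explicitly. The sketch below shows how the per-query metrics relate under that assumption; the rankings and document ids are invented for illustration, and this is not the evaluator's actual implementation.

```python
def retrieval_metrics(ranked_ids, relevant_id, k):
    """Metrics at cutoff k for one query with exactly one relevant document."""
    top_k = ranked_ids[:k]
    hit = 1.0 if relevant_id in top_k else 0.0
    reciprocal_rank = 1.0 / (top_k.index(relevant_id) + 1) if hit else 0.0
    return {
        "accuracy": hit,        # at least one relevant document in the top k
        "precision": hit / k,   # relevant documents in the top k, divided by k
        "recall": hit,          # found relevant documents over all relevant (1)
        "mrr": reciprocal_rank,
    }

def average(per_query):
    """Average each metric over all queries."""
    keys = per_query[0].keys()
    return {key: sum(m[key] for m in per_query) / len(per_query) for key in keys}

# Hypothetical rankings: (retrieved ids in order, id of the relevant document).
queries = [
    (["d3", "d1", "d7"], "d1"),  # relevant document at rank 2
    (["d2", "d9", "d4"], "d2"),  # relevant document at rank 1
    (["d5", "d6", "d8"], "d0"),  # relevant document not retrieved
]
scores_at_3 = average([retrieval_metrics(r, rel, 3) for r, rel in queries])
```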
Training Details
Training Dataset
json
- Dataset: json
- Size: 13,396 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:
 | anchor | positive |
---|---|---|
type | string | string |
details | min: 4 tokens, mean: 64.85 tokens, max: 512 tokens | min: 0 tokens, mean: 398.47 tokens, max: 512 tokens |
- Samples:
anchor | positive |
---|---|
the email i use for this account got hacked and i can no longer access the email. so i can not login into ubisoft and i no longer feel comfortable with that email on my account. lost email address: [EMAIL] | issue no1:the player is looking to recover their account. - if there is no relevant target account information, please redirect the player to the recovery form available on the help portal.- if you received a recovery form case and have found the player's account information, please follow the dedicated processes you can find in signavio (step 1a below)issue no2:the player is looking to recover their account, but, there is suspicion of account reselling/sharing.there are a few factors you can look out for to determine whether there is a suspicion of account reselling or sharing >- the log-in country in the p360 activity log, can be split into different continuous sections between the account owner and other accesses- the owner will be reaching out to us to recover the account at all times over many cases- the owner disables 2fa, and changes the email address and the display name on the account, just before the activity country changes- there are previous account recovery cases on the a... |
i am unable to change my email on my account as the old email i no longer have access to it. the old email is showing on my account as [EMAIL] but i need it to be changed to [EMAIL]. | issue no1:the player is looking to recover their account. - if there is no relevant target account information, please redirect the player to the recovery form available on the help portal.- if you received a recovery form case and have found the player's account information, please follow the dedicated processes you can find in signavio (step 1a below)issue no2:the player is looking to recover their account, but, there is suspicion of account reselling/sharing.there are a few factors you can look out for to determine whether there is a suspicion of account reselling or sharing >- the log-in country in the p360 activity log, can be split into different continuous sections between the account owner and other accesses- the owner will be reaching out to us to recover the account at all times over many cases- the owner disables 2fa, and changes the email address and the display name on the account, just before the activity country changes- there are previous account recovery cases on the a... |
It seems like there is no text provided for me to process. Please provide the support ticket text that you would like me to mask. | issuebased on the customer's description, we are unclear on what the issue might be and we need to ask them for more information, to help us find the correct kbcase handling1) check what information the player has provided us with in their first message, and check keywords in sowa to see if you can find a relevant kbif the player and issue is unclear, please ask for further information, such as >is this issue related to their ubisoft account / a purchase or subscription / missing content / a ban or player report / a game bug / a technical issue / feedback on a game or our servicescan the player further explain what the problem is?do they have any screenshots or videos they can provide us with? are they seeing any error messages on their side they can share with us?2) note: this is a placeholder subject line and the sl should always be updated on the case, once we receive enough info to confirm if we have a relevant kb we can use.if the issue turns out to be the player is reporting an u... |
- Loss: MatryoshkaLoss with these parameters:
{
    "loss": "MultipleNegativesRankingLoss",
    "matryoshka_dims": [896, 768, 512, 256, 128, 64],
    "matryoshka_weights": [1, 1, 1, 1, 1, 1],
    "n_dims_per_step": -1
}
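To make the loss configuration above more concrete, here is a rough pure-Python sketch of the idea: an in-batch ranking loss (each anchor should score its own positive above the other positives in the batch) is evaluated on several prefixes of the embeddings and the results are summed with the given weights. This is a simplified illustration, not the library's implementation. The similarity scale of 20.0 is an assumption based on the library's usual default, and the toy vectors use dimensions [4, 2] instead of [896, ..., 64].

```python
import math

def normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch softmax cross-entropy: positives[i] is the target for anchors[i]."""
    anchors = [normalize(a) for a in anchors]
    positives = [normalize(p) for p in positives]
    total = 0.0
    for i, a in enumerate(anchors):
        # Scaled cosine similarity between this anchor and every positive.
        logits = [scale * sum(x * y for x, y in zip(a, p)) for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss computed on each embedding prefix."""
    return sum(
        w * mnr_loss([a[:d] for a in anchors], [p[:d] for p in positives])
        for d, w in zip(dims, weights)
    )

# Toy batch of two (anchor, positive) pairs.
anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
positives = [[0.9, 0.1, 0.0, 0.2], [0.1, 0.8, 0.3, 0.0]]
loss = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
```

With equal weights, every prefix length contributes equally to the gradient, which is what lets the shorter prefixes remain usable as embeddings on their own.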
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- per_device_train_batch_size: 2
- per_device_eval_batch_size: 2
- gradient_accumulation_steps: 32
- learning_rate: 2e-05
- num_train_epochs: 4
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- fp16: True
- tf32: False
- optim: adamw_torch_fused
- batch_sampler: no_duplicates
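The hyperparameters above imply an effective batch size and a step count that can be checked against the Training Logs. The arithmetic below assumes a single training device and that optimizer steps per epoch are the number of batches floor-divided by the accumulation steps; both are assumptions, though the resulting 209 steps per epoch and 836 total steps agree with the logged evaluation steps. The warmup and cosine formulas are a generic sketch of a linear-warmup, cosine-decay schedule rather than a copy of the trainer's code.

```python
import math

num_samples = 13_396
per_device_batch = 2
grad_accum = 32
epochs = 4
warmup_ratio = 0.1
base_lr = 2e-05

effective_batch = per_device_batch * grad_accum                # 64 (one device assumed)
batches_per_epoch = math.ceil(num_samples / per_device_batch)  # 6698
steps_per_epoch = batches_per_epoch // grad_accum              # 209
total_steps = steps_per_epoch * epochs                         # 836
warmup_steps = math.ceil(total_steps * warmup_ratio)           # 84

def learning_rate(step):
    """Linear warmup to base_lr, then half-cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The fractional epoch values in the logs follow the same arithmetic: step 209 has seen 209 × 64 = 13,376 of 13,396 samples, which is 0.9985 of an epoch.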
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 2
- per_device_eval_batch_size: 2
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 32
- eval_accumulation_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: False
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | dim_896_cosine_ndcg@10 | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
---|---|---|---|---|---|---|---|---|
0.0478 | 10 | 0.7342 | - | - | - | - | - | - |
0.0956 | 20 | 0.359 | - | - | - | - | - | - |
0.1433 | 30 | 0.6206 | - | - | - | - | - | - |
0.1911 | 40 | 0.3286 | - | - | - | - | - | - |
0.2389 | 50 | 0.4635 | - | - | - | - | - | - |
0.2867 | 60 | 0.4779 | - | - | - | - | - | - |
0.3344 | 70 | 0.6539 | - | - | - | - | - | - |
0.3822 | 80 | 0.5646 | - | - | - | - | - | - |
0.4300 | 90 | 0.5571 | - | - | - | - | - | - |
0.4778 | 100 | 0.4717 | - | - | - | - | - | - |
0.5255 | 110 | 0.3666 | - | - | - | - | - | - |
0.5733 | 120 | 0.692 | - | - | - | - | - | - |
0.6211 | 130 | 0.6166 | - | - | - | - | - | - |
0.6689 | 140 | 0.618 | - | - | - | - | - | - |
0.7166 | 150 | 0.4731 | - | - | - | - | - | - |
0.7644 | 160 | 0.5375 | - | - | - | - | - | - |
0.8122 | 170 | 0.4384 | - | - | - | - | - | - |
0.8600 | 180 | 0.4214 | - | - | - | - | - | - |
0.9077 | 190 | 0.7847 | - | - | - | - | - | - |
0.9555 | 200 | 0.7723 | - | - | - | - | - | - |
0.9985 | 209 | - | 0.0381 | 0.0366 | 0.0382 | 0.0369 | 0.0343 | 0.0271 |
1.0033 | 210 | 0.5171 | - | - | - | - | - | - |
1.0511 | 220 | 0.5229 | - | - | - | - | - | - |
1.0988 | 230 | 0.3208 | - | - | - | - | - | - |
1.1466 | 240 | 0.361 | - | - | - | - | - | - |
1.1944 | 250 | 0.1921 | - | - | - | - | - | - |
1.2422 | 260 | 0.2428 | - | - | - | - | - | - |
1.2899 | 270 | 0.214 | - | - | - | - | - | - |
1.3377 | 280 | 0.5747 | - | - | - | - | - | - |
1.3855 | 290 | 0.4278 | - | - | - | - | - | - |
1.4333 | 300 | 0.2921 | - | - | - | - | - | - |
1.4810 | 310 | 0.3406 | - | - | - | - | - | - |
1.5288 | 320 | 0.3055 | - | - | - | - | - | - |
1.5766 | 330 | 0.4052 | - | - | - | - | - | - |
1.6244 | 340 | 0.3753 | - | - | - | - | - | - |
1.6721 | 350 | 0.2922 | - | - | - | - | - | - |
1.7199 | 360 | 0.324 | - | - | - | - | - | - |
1.7677 | 370 | 0.2779 | - | - | - | - | - | - |
1.8155 | 380 | 0.3366 | - | - | - | - | - | - |
1.8632 | 390 | 0.4493 | - | - | - | - | - | - |
1.9110 | 400 | 0.3796 | - | - | - | - | - | - |
1.9588 | 410 | 0.4291 | - | - | - | - | - | - |
1.9970 | 418 | - | 0.0378 | 0.0387 | 0.0361 | 0.0346 | 0.0309 | 0.0257 |
2.0066 | 420 | 0.3842 | - | - | - | - | - | - |
2.0543 | 430 | 0.4343 | - | - | - | - | - | - |
2.1021 | 440 | 0.3238 | - | - | - | - | - | - |
2.1499 | 450 | 0.2563 | - | - | - | - | - | - |
2.1977 | 460 | 0.3092 | - | - | - | - | - | - |
2.2454 | 470 | 0.2376 | - | - | - | - | - | - |
2.2932 | 480 | 0.2644 | - | - | - | - | - | - |
2.3410 | 490 | 0.5582 | - | - | - | - | - | - |
2.3888 | 500 | 0.3216 | - | - | - | - | - | - |
2.4365 | 510 | 0.2821 | - | - | - | - | - | - |
2.4843 | 520 | 0.2969 | - | - | - | - | - | - |
2.5321 | 530 | 0.2768 | - | - | - | - | - | - |
2.5799 | 540 | 0.3804 | - | - | - | - | - | - |
2.6277 | 550 | 0.3968 | - | - | - | - | - | - |
2.6754 | 560 | 0.2676 | - | - | - | - | - | - |
2.7232 | 570 | 0.3127 | - | - | - | - | - | - |
2.7710 | 580 | 0.2596 | - | - | - | - | - | - |
2.8188 | 590 | 0.3421 | - | - | - | - | - | - |
2.8665 | 600 | 0.493 | - | - | - | - | - | - |
2.9143 | 610 | 0.3426 | - | - | - | - | - | - |
2.9621 | 620 | 0.4613 | - | - | - | - | - | - |
2.9955 | 627 | - | 0.0363 | 0.0368 | 0.0358 | 0.0348 | 0.0319 | 0.0253 |
3.0099 | 630 | 0.3526 | - | - | - | - | - | - |
3.0576 | 640 | 0.4347 | - | - | - | - | - | - |
3.1054 | 650 | 0.3257 | - | - | - | - | - | - |
3.1532 | 660 | 0.2329 | - | - | - | - | - | - |
3.2010 | 670 | 0.3199 | - | - | - | - | - | - |
3.2487 | 680 | 0.2374 | - | - | - | - | - | - |
3.2965 | 690 | 0.2711 | - | - | - | - | - | - |
3.3443 | 700 | 0.5732 | - | - | - | - | - | - |
3.3921 | 710 | 0.293 | - | - | - | - | - | - |
3.4398 | 720 | 0.2809 | - | - | - | - | - | - |
3.4876 | 730 | 0.3323 | - | - | - | - | - | - |
3.5354 | 740 | 0.2609 | - | - | - | - | - | - |
3.5832 | 750 | 0.3763 | - | - | - | - | - | - |
3.6309 | 760 | 0.3886 | - | - | - | - | - | - |
3.6787 | 770 | 0.2631 | - | - | - | - | - | - |
3.7265 | 780 | 0.3211 | - | - | - | - | - | - |
3.7743 | 790 | 0.2488 | - | - | - | - | - | - |
3.8220 | 800 | 0.3503 | - | - | - | - | - | - |
3.8698 | 810 | 0.4986 | - | - | - | - | - | - |
3.9176 | 820 | 0.3986 | - | - | - | - | - | - |
3.9654 | 830 | 0.4216 | - | - | - | - | - | - |
3.9940 | 836 | - | 0.0376 | 0.0382 | 0.0361 | 0.0354 | 0.0321 | 0.0258 |
Framework Versions
- Python: 3.10.11
- Sentence Transformers: 3.3.1
- Transformers: 4.41.2
- PyTorch: 2.1.2+cu121
- Accelerate: 0.33.0
- Datasets: 3.0.1
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}