BGE base Financial Matryoshka

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
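
The listing shows a three-module pipeline: a BERT encoder, CLS-token pooling, and L2 normalization (so dot product and cosine similarity coincide). As a minimal sketch, the same stack can be rebuilt by hand with the sentence-transformers models API; the construction below is illustrative, not the exact code used to produce this checkpoint.

from sentence_transformers import SentenceTransformer, models

# BERT encoder with a 512-token window, as in module (0)
word_embedding = models.Transformer("BAAI/bge-base-en-v1.5", max_seq_length=512)
# CLS-token pooling over the 768-dim word embeddings, as in module (1)
pooling = models.Pooling(word_embedding.get_word_embedding_dimension(), pooling_mode="cls")
# Unit-normalize the sentence embedding, as in module (2)
normalize = models.Normalize()

model = SentenceTransformer(modules=[word_embedding, pooling, normalize])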

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Shashwat13333/bge-base-en-v1.5_v4")
# Run inference
sentences = [
    "What's it like working here?",
    "Yes! We are constantly looking for talented individuals. Check our Careers Page for openings.\nYou can submit your resume on our website or via our LinkedIn page.\nWe foster a people-centric, collaborative, and innovation-driven culture.\nGlassdoor: 4.3 + rating\n\nSharing Stories from Our Team\n\nDiscover firsthand experiences, growth journeys, and the vibrant culture that fuels our success.\n\nI have been a part of Techchefz for 3 years, and I can confidently say it's been a remarkable journey. From day one, I was welcomed into a vibrant community that values collaboration, creativity, and personal growth. The company culture here isn't just a buzzword, it's tangible in every interaction and initiative.\nprofileImg\n\nAashish Massand\n\nSr. Manager Delivery\n\nTechChefz has been a transformative journey, equipping me with invaluable skills and fostering a supportive community. From coding fundamentals to advanced techniques, I've gained confidence and expertise. Grateful for this experience and opportunity.\nprofileImg\n\nPankaj Datt\n\nAssociate Technology                                                                                                                                                                                                                                                                                                                                                                                         \n                                                                                                                                                                                                                                                                                                                                                                                                                                 Being a member of TechChefz's HR team is truly uplifting. The genuine positivity, motivation, and mentorship we share make each day uniquely rewarding. Fun is woven into the work culture, fostering authentic connections. I'm grateful for the authenticity and dynamism that make every day at TechChefz truly special.\nprofileImg\n\nShreya Shukla\n\nHuman Resource Associate\n\nAdvancing from an initial team member to leading a new division, the opportunity to innovate and take risks drives my motivation. It's rare to find an organization that both nurtures passion and aligns with your mindset.\nprofileImg\n\nMohit Kumar\n\nLead - DevSecOps\n\nWorking at TechChefz has been an exceptional journey. The commitment to continuous learning and professional development is unparalleled, ensuring employees stay at the forefront of the tech industry. The leadership is forward-thinking, promoting transparency and a harmonious work-life balance.\nprofileImg\n\nHarprit Singh Kohli\n\nSr. Manager - Delivery",
    "Business Process Automation: LLM Powered Agents\nOverview:\nOur AI-powered agents are designed to automate complex, time-consuming business processes. By leveraging the latest technologies such as Langchain and MongoDB, these agents can handle a variety of tasks, including data collection and web navigation, all tailored to your specific business needs.\nHow it works:\nAI Agents can be customized to automate business workflows.\nUse of Langchain ensures advanced capabilities for handling and processing web content and data.\nIntegration with MongoDB enables efficient data storage and management.\nImpact:\nSignificant time savings by automating repetitive tasks.\nEnhanced productivity, allowing employees to focus on higher-value work.\nIncreased accuracy and reduction of human errors in critical processes.\n\nCustomer Service: Sentiment Classification Using BERT\nOverview:\nThis accelerator leverages Advanced AI and Hugging Face transformers to categorize customer reviews into positive, negative, or neutral sentiments. It helps businesses gain actionable insights into customer feedback by analyzing reviews, enabling companies to respond proactively to customer concerns and improve their offerings.\nHow it works:\nBERT (Bidirectional Encoder Representations from Transformers) is fine-tuned to understand sentiment in customer feedback.\nThe tool processes customer reviews, assigning sentiment categories that can help prioritize responses or improvements.\nThe solution helps automate feedback analysis, improving the speed and accuracy of customer service interactions.\nImpact:\nActionable insights for customer service teams to improve engagement and responses.\nProactive improvements based on real-time feedback analysis.\nEnhanced customer experience by responding quickly and appropriately to customer sentiments.\n\nArtificial Intelligence: Fine-Tuning Large Language Models\nOverview:\nLarge Language Models (LLMs), such as Mistral 7B, are at the forefront of AI innovation. In this accelerator, TechChefz Digital showcases its technical prowess by fine-tuning the Mistral 7B model on the VIGO dataset, which is an open-source dataset for training large-scale language models.\nHow it works:\nWe fine-tune Mistral 7B to improve its accuracy and relevancy based on specific datasets like VIGO.\nThis allows the AI to be more context-aware and aligned with your particular business needs.\nImpact:\nEnhanced AI accuracy, enabling more effective AI-driven applications.\nOpen-source innovation that leads to cost-effective AI solutions.\nPotential to integrate fine-tuned models into a range of industries like customer service, marketing, and automation.\n\nE-Commerce: Image Similarity Search\nOverview:\nIn the e-commerce world, product discovery can be a challenge. Our AI-powered Image Similarity Search Accelerator revolutionizes product search by allowing customers to upload an image and instantly find matching or similar items. 
This tool enhances the shopping experience and drives higher engagement by enabling users to find products more efficiently.\nHow it works:\nCustomers upload an image of the product they are looking for.\nThe AI model uses image recognition to search for visually similar products from the e-commerce site’s catalog.\nProduct recommendations are delivered instantly based on visual matches.\nImpact:\nImproved user experience, reducing time spent searching for products.\nHigher conversion rates as customers can quickly find what they are looking for.\nDifferentiation from competitors with advanced AI-powered product discovery features.\n\nCustomer Service: RAG Chatbots\nOverview:\nRAG (Retrieval-Augmented Generation) chatbots are designed to provide timely and relevant responses to customer queries. By combining retrieval-based models with generative capabilities, these chatbots offer precise and contextually accurate answers. Our RAG-powered chatbots provide a more intelligent and tailored experience, making customer interactions smoother and more efficient.\nHow it works:\nThe chatbot first retrieves relevant data from a knowledge base or external sources.\nIt then generates responses based on this data, ensuring that each answer is aligned with the user’s query.\nRAG technology ensures that chatbots provide relevant and up-to-date information.\nImpact:\nEnhanced customer satisfaction with more precise and relevant answers.\n24/7 support capabilities, improving overall customer service availability.\nIncreased operational efficiency by automating customer interactions at scale.\n\nRequest a Demo and Browse Accelerators:\nIn our company, we offer you the opportunity to experience the power and potential of these accelerators first-hand. Whether you're looking to streamline business processes, enhance customer service, or accelerate your e-commerce strategies, our solution accelerators are built to drive fast, impactful results.\nGet started today by requesting a demo or browsing through our accelerator offerings to discover the right tools that will transform your tech projects and accelerate your success.\n",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
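
Because the model was trained with MatryoshkaLoss, the leading coordinates of each embedding form usable lower-dimensional embeddings on their own. Below is a minimal sketch of truncated inference using the library's truncate_dim argument; the choice of 256 dimensions is illustrative.

from sentence_transformers import SentenceTransformer

# Load the model so it emits only the first 256 Matryoshka dimensions
model = SentenceTransformer("Shashwat13333/bge-base-en-v1.5_v4", truncate_dim=256)

embeddings = model.encode(["What's it like working here?"])
print(embeddings.shape)
# (1, 256)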

Evaluation

Metrics

Information Retrieval

Metric               dim_768   dim_512   dim_256   dim_128   dim_64
cosine_accuracy@1    0.0       0.0       0.0       0.125     0.0
cosine_accuracy@3    0.375     0.375     0.375     0.375     0.375
cosine_accuracy@5    0.5       0.5       0.375     0.375     0.5
cosine_accuracy@10   0.875     0.875     0.875     0.875     0.625
cosine_precision@1   0.0       0.0       0.0       0.125     0.0
cosine_precision@3   0.125     0.125     0.125     0.125     0.125
cosine_precision@5   0.1       0.1       0.075     0.075     0.1
cosine_precision@10  0.0875    0.0875    0.0875    0.0875    0.0625
cosine_recall@1      0.0       0.0       0.0       0.125     0.0
cosine_recall@3      0.375     0.375     0.375     0.375     0.375
cosine_recall@5      0.5       0.5       0.375     0.375     0.5
cosine_recall@10     0.875     0.875     0.875     0.875     0.625
cosine_ndcg@10       0.3691    0.3731    0.3758    0.425     0.3157
cosine_mrr@10        0.2165    0.2213    0.2274    0.2927    0.2158
cosine_map@100       0.2234    0.2297    0.2378    0.3023    0.248
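
The table reports standard retrieval metrics (accuracy@k, precision@k, recall@k, NDCG@10, MRR@10, MAP@100), each computed with the embeddings truncated to the given Matryoshka dimension. A hedged sketch of how such numbers are produced with the library's InformationRetrievalEvaluator; the query, corpus, and relevance data below are placeholders, not the actual evaluation split.

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

# Evaluate at a truncated dimension, e.g. 128
model = SentenceTransformer("Shashwat13333/bge-base-en-v1.5_v4", truncate_dim=128)

# Placeholder data: id -> text for queries and corpus, plus relevance labels
queries = {"q1": "Do you have any job openings right now?"}
corpus = {"d1": "Yes! We are constantly looking for talented individuals. ..."}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="dim_128")
results = evaluator(model)
# results maps metric names (e.g. "dim_128_cosine_ndcg@10") to scores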

Training Details

Training Dataset

Unnamed Dataset

  • Size: 16 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 16 samples:

             anchor              positive
    type     string              string
    min      8 tokens            332 tokens
    mean     11.12 tokens        479.62 tokens
    max      14 tokens           512 tokens
  • Samples:
    anchor: What e-commerce platforms do you develop on?
    positive:
      Web & Mobile Development
      Frontend Development:
      HTML5: Markup language for structuring web content.
      CSS/JS: Styling and interactivity for dynamic user experiences.
      React JS: JavaScript library for building user interfaces with a component-based approach.
      Angular JS: JavaScript framework for developing dynamic, single-page applications (SPAs).
      Vue JS: Progressive JavaScript framework for building user interfaces and SPAs.
      Next JS: React-based framework for server-side rendering and static website generation.
      Mobile Development:
      React Native: Framework for building native mobile applications using React.
      Flutter: UI toolkit for building natively compiled applications for mobile, web, and desktop from a single codebase.
      Backend Development:
      Node JS: JavaScript runtime for building scalable backend services.
      Python: High-level programming language used for backend services, machine learning, and data science.
      Frappe: Full-stack web application framework based on Python and JavaScript.
      Java:...

    anchor: Do you have any job openings right now?
    positive:
      Yes! We are constantly looking for talented individuals. Check our Careers Page for openings.
      You can submit your resume on our website or via our LinkedIn page.
      We foster a people-centric, collaborative, and innovation-driven culture.
      Glassdoor: 4.3 + rating

      Sharing Stories from Our Team

      Discover firsthand experiences, growth journeys, and the vibrant culture that fuels our success.

      I have been a part of Techchefz for 3 years, and I can confidently say it's been a remarkable journey. From day one, I was welcomed into a vibrant community that values collaboration, creativity, and personal growth. The company culture here isn't just a buzzword, it's tangible in every interaction and initiative.
      profileImg

      Aashish Massand

      Sr. Manager Delivery

      TechChefz has been a transformative journey, equipping me with invaluable skills and fostering a supportive community. From coding fundamentals to advanced techniques, I've gained confidence and expertise. Grateful for this experience and oppo...

    anchor: What does the CEO of your company do?
    positive:
      Mayank Maggon – Founder, CEO & CTO
      Mayank Maggon is the visionary leader of TechChefz Digital, responsible for driving the company's strategic direction. With over 15 years of experience in entrepreneurship and technology, Mayank has a deep understanding of AI, cloud technologies, and digital transformation. His leadership is centered on building a culture of innovation, operational excellence, and customer-first solutions.
      Mayank holds a postgraduate certificate in Technology Leadership & Innovation from MIT and a postgraduate diploma in Sales & Marketing Communication from MICA, making him uniquely equipped to lead the company's growth in an ever-evolving digital ecosystem.
      Link: https://www.linkedin.com/in/mayankmaggon/
      Other Leadership Members:
      Rahul Aggarwal – Senior Director, Technology: Rahul brings a wealth of technical expertise, overseeing the delivery of high-impact digital solutions.
      Link: https://www.linkedin.com/in/rahul-aggarwal-84288a59/
      Akshit Maggon – Associate Dire...
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
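
In code, this configuration wraps MultipleNegativesRankingLoss (in-batch negatives over the anchor/positive pairs) in MatryoshkaLoss, which applies the same objective at each truncated dimension with equal weight. A minimal sketch, assuming model is the SentenceTransformer being finetuned:

from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

# Ranking loss that treats the other in-batch positives as negatives
inner_loss = MultipleNegativesRankingLoss(model)
# Apply the same loss at 768/512/256/128/64 dims, equally weighted
loss = MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768, 512, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1, 1],
)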
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • gradient_accumulation_steps: 4
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • push_to_hub: True
  • hub_model_id: Shashwat13333/bge-base-en-v1.5_v4
  • push_to_hub_model_id: bge-base-en-v1.5_v4
  • batch_sampler: no_duplicates
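
These settings map directly onto SentenceTransformerTrainingArguments. A sketch of the equivalent configuration; the output directory is illustrative, and save_strategy="epoch" is an assumption added because load_best_model_at_end requires the save and eval strategies to match.

from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="bge-base-en-v1.5_v4",  # illustrative path
    num_train_epochs=4,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    learning_rate=1e-5,
    weight_decay=0.01,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="epoch",
    save_strategy="epoch",  # must match eval_strategy for load_best_model_at_end
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
    push_to_hub=True,
    hub_model_id="Shashwat13333/bge-base-en-v1.5_v4",
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoid duplicate texts in a batch
)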

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 4
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: Shashwat13333/bge-base-en-v1.5_v4
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: bge-base-en-v1.5_v4
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

All dim_* columns report cosine_ndcg@10 at that embedding dimension.

Epoch  Step  Training Loss  dim_768  dim_512  dim_256  dim_128  dim_64
1.0    1     13.1742        0.3388   0.4638   0.2489   0.3873   0.2666
2.0    2     -              0.4500   0.4278   0.2872   0.4212   0.3400
3.0    3     -              0.3936   0.3370   0.3036   0.3289   0.4710
4.0    4     -              0.3691   0.3731   0.3758   0.4250   0.3157

  • The epoch 4.0 row denotes the saved checkpoint; its cosine_ndcg@10 values match the Evaluation section above.
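
Putting the pieces together, a run like the one logged above goes through SentenceTransformerTrainer. A hedged end-to-end sketch that reuses the args, loss, and evaluator objects from the earlier sketches; the training pairs below are placeholders for the unpublished 16-sample dataset.

from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer

model = SentenceTransformer("BAAI/bge-base-en-v1.5")

# Placeholder (anchor, positive) pairs shaped like the dataset described above
train_dataset = Dataset.from_dict({
    "anchor": ["Do you have any job openings right now?"],
    "positive": ["Yes! We are constantly looking for talented individuals. ..."],
})

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,            # SentenceTransformerTrainingArguments sketched earlier
    train_dataset=train_dataset,
    loss=loss,            # MatryoshkaLoss sketched earlier
    evaluator=evaluator,  # InformationRetrievalEvaluator sketched earlier
)
trainer.train()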

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.3
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.3.1
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}