BGE base Financial Matryoshka

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
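
The listing shows a three-module pipeline: a BERT encoder, CLS-token pooling, and L2 normalization (so dot product and cosine similarity coincide). As a minimal sketch, the same stack can be rebuilt by hand with the sentence-transformers models API; the construction below is illustrative, not the exact code used to produce this checkpoint.

from sentence_transformers import SentenceTransformer, models

# BERT encoder with a 512-token window, as in module (0)
word_embedding = models.Transformer("BAAI/bge-base-en-v1.5", max_seq_length=512)
# CLS-token pooling over the 768-dim word embeddings, as in module (1)
pooling = models.Pooling(word_embedding.get_word_embedding_dimension(), pooling_mode="cls")
# Unit-normalize the sentence embedding, as in module (2)
normalize = models.Normalize()

model = SentenceTransformer(modules=[word_embedding, pooling, normalize])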

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Shashwat13333/bge-base-en-v1.5_v4")
# Run inference
sentences = [
    "What's it like working here?",
    "Yes! We are constantly looking for talented individuals. Check our Careers Page for openings.\nYou can submit your resume on our website or via our LinkedIn page.\nWe foster a people-centric, collaborative, and innovation-driven culture.\nGlassdoor: 4.3 + rating\n\nSharing Stories from Our Team\n\nDiscover firsthand experiences, growth journeys, and the vibrant culture that fuels our success.\n\nI have been a part of Techchefz for 3 years, and I can confidently say it's been a remarkable journey. From day one, I was welcomed into a vibrant community that values collaboration, creativity, and personal growth. The company culture here isn't just a buzzword, it's tangible in every interaction and initiative.\nprofileImg\n\nAashish Massand\n\nSr. Manager Delivery\n\nTechChefz has been a transformative journey, equipping me with invaluable skills and fostering a supportive community. From coding fundamentals to advanced techniques, I've gained confidence and expertise. Grateful for this experience and opportunity.\nprofileImg\n\nPankaj Datt\n\nAssociate Technology                                                                                                                                                                                                                                                                                                                                                                                         \n                                                                                                                                                                                                                                                                                                                                                                                                                                 Being a member of TechChefz's HR team is truly uplifting. The genuine positivity, motivation, and mentorship we share make each day uniquely rewarding. Fun is woven into the work culture, fostering authentic connections. I'm grateful for the authenticity and dynamism that make every day at TechChefz truly special.\nprofileImg\n\nShreya Shukla\n\nHuman Resource Associate\n\nAdvancing from an initial team member to leading a new division, the opportunity to innovate and take risks drives my motivation. It's rare to find an organization that both nurtures passion and aligns with your mindset.\nprofileImg\n\nMohit Kumar\n\nLead - DevSecOps\n\nWorking at TechChefz has been an exceptional journey. The commitment to continuous learning and professional development is unparalleled, ensuring employees stay at the forefront of the tech industry. The leadership is forward-thinking, promoting transparency and a harmonious work-life balance.\nprofileImg\n\nHarprit Singh Kohli\n\nSr. Manager - Delivery",
    "Business Process Automation: LLM Powered Agents\nOverview:\nOur AI-powered agents are designed to automate complex, time-consuming business processes. By leveraging the latest technologies such as Langchain and MongoDB, these agents can handle a variety of tasks, including data collection and web navigation, all tailored to your specific business needs.\nHow it works:\nAI Agents can be customized to automate business workflows.\nUse of Langchain ensures advanced capabilities for handling and processing web content and data.\nIntegration with MongoDB enables efficient data storage and management.\nImpact:\nSignificant time savings by automating repetitive tasks.\nEnhanced productivity, allowing employees to focus on higher-value work.\nIncreased accuracy and reduction of human errors in critical processes.\n\nCustomer Service: Sentiment Classification Using BERT\nOverview:\nThis accelerator leverages Advanced AI and Hugging Face transformers to categorize customer reviews into positive, negative, or neutral sentiments. It helps businesses gain actionable insights into customer feedback by analyzing reviews, enabling companies to respond proactively to customer concerns and improve their offerings.\nHow it works:\nBERT (Bidirectional Encoder Representations from Transformers) is fine-tuned to understand sentiment in customer feedback.\nThe tool processes customer reviews, assigning sentiment categories that can help prioritize responses or improvements.\nThe solution helps automate feedback analysis, improving the speed and accuracy of customer service interactions.\nImpact:\nActionable insights for customer service teams to improve engagement and responses.\nProactive improvements based on real-time feedback analysis.\nEnhanced customer experience by responding quickly and appropriately to customer sentiments.\n\nArtificial Intelligence: Fine-Tuning Large Language Models\nOverview:\nLarge Language Models (LLMs), such as Mistral 7B, are at the forefront of AI innovation. In this accelerator, TechChefz Digital showcases its technical prowess by fine-tuning the Mistral 7B model on the VIGO dataset, which is an open-source dataset for training large-scale language models.\nHow it works:\nWe fine-tune Mistral 7B to improve its accuracy and relevancy based on specific datasets like VIGO.\nThis allows the AI to be more context-aware and aligned with your particular business needs.\nImpact:\nEnhanced AI accuracy, enabling more effective AI-driven applications.\nOpen-source innovation that leads to cost-effective AI solutions.\nPotential to integrate fine-tuned models into a range of industries like customer service, marketing, and automation.\n\nE-Commerce: Image Similarity Search\nOverview:\nIn the e-commerce world, product discovery can be a challenge. Our AI-powered Image Similarity Search Accelerator revolutionizes product search by allowing customers to upload an image and instantly find matching or similar items. 
This tool enhances the shopping experience and drives higher engagement by enabling users to find products more efficiently.\nHow it works:\nCustomers upload an image of the product they are looking for.\nThe AI model uses image recognition to search for visually similar products from the e-commerce site’s catalog.\nProduct recommendations are delivered instantly based on visual matches.\nImpact:\nImproved user experience, reducing time spent searching for products.\nHigher conversion rates as customers can quickly find what they are looking for.\nDifferentiation from competitors with advanced AI-powered product discovery features.\n\nCustomer Service: RAG Chatbots\nOverview:\nRAG (Retrieval-Augmented Generation) chatbots are designed to provide timely and relevant responses to customer queries. By combining retrieval-based models with generative capabilities, these chatbots offer precise and contextually accurate answers. Our RAG-powered chatbots provide a more intelligent and tailored experience, making customer interactions smoother and more efficient.\nHow it works:\nThe chatbot first retrieves relevant data from a knowledge base or external sources.\nIt then generates responses based on this data, ensuring that each answer is aligned with the user’s query.\nRAG technology ensures that chatbots provide relevant and up-to-date information.\nImpact:\nEnhanced customer satisfaction with more precise and relevant answers.\n24/7 support capabilities, improving overall customer service availability.\nIncreased operational efficiency by automating customer interactions at scale.\n\nRequest a Demo and Browse Accelerators:\nIn our company, we offer you the opportunity to experience the power and potential of these accelerators first-hand. Whether you're looking to streamline business processes, enhance customer service, or accelerate your e-commerce strategies, our solution accelerators are built to drive fast, impactful results.\nGet started today by requesting a demo or browsing through our accelerator offerings to discover the right tools that will transform your tech projects and accelerate your success.\n",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
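
Because the model was trained with MatryoshkaLoss, the leading coordinates of each embedding form usable lower-dimensional embeddings on their own. Below is a minimal sketch of truncated inference using the library's truncate_dim argument; the choice of 256 dimensions is illustrative.

from sentence_transformers import SentenceTransformer

# Load the model so it emits only the first 256 Matryoshka dimensions
model = SentenceTransformer("Shashwat13333/bge-base-en-v1.5_v4", truncate_dim=256)

embeddings = model.encode(["What's it like working here?"])
print(embeddings.shape)
# (1, 256)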

Evaluation

Metrics

Information Retrieval

Metric               dim_768   dim_512   dim_256   dim_128   dim_64
cosine_accuracy@1    0.0       0.0       0.0       0.125     0.0
cosine_accuracy@3    0.375     0.375     0.375     0.375     0.375
cosine_accuracy@5    0.5       0.5       0.375     0.375     0.5
cosine_accuracy@10   0.875     0.875     0.875     0.875     0.625
cosine_precision@1   0.0       0.0       0.0       0.125     0.0
cosine_precision@3   0.125     0.125     0.125     0.125     0.125
cosine_precision@5   0.1       0.1       0.075     0.075     0.1
cosine_precision@10  0.0875    0.0875    0.0875    0.0875    0.0625
cosine_recall@1      0.0       0.0       0.0       0.125     0.0
cosine_recall@3      0.375     0.375     0.375     0.375     0.375
cosine_recall@5      0.5       0.5       0.375     0.375     0.5
cosine_recall@10     0.875     0.875     0.875     0.875     0.625
cosine_ndcg@10       0.3691    0.3731    0.3758    0.425     0.3157
cosine_mrr@10        0.2165    0.2213    0.2274    0.2927    0.2158
cosine_map@100       0.2234    0.2297    0.2378    0.3023    0.248
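
The table reports standard retrieval metrics (accuracy@k, precision@k, recall@k, NDCG@10, MRR@10, MAP@100), each computed with the embeddings truncated to the given Matryoshka dimension. A hedged sketch of how such numbers are produced with the library's InformationRetrievalEvaluator; the query, corpus, and relevance data below are placeholders, not the actual evaluation split.

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

# Evaluate at a truncated dimension, e.g. 128
model = SentenceTransformer("Shashwat13333/bge-base-en-v1.5_v4", truncate_dim=128)

# Placeholder data: id -> text for queries and corpus, plus relevance labels
queries = {"q1": "Do you have any job openings right now?"}
corpus = {"d1": "Yes! We are constantly looking for talented individuals. ..."}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="dim_128")
results = evaluator(model)
# results maps metric names (e.g. "dim_128_cosine_ndcg@10") to scores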

Training Details

Training Dataset

Unnamed Dataset

  • Size: 16 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 16 samples:

             anchor              positive
    type     string              string
    min      8 tokens            332 tokens
    mean     11.12 tokens        479.62 tokens
    max      14 tokens           512 tokens
  • Samples:
    anchor: What e-commerce platforms do you develop on?
    positive:
      Web & Mobile Development
      Frontend Development:
      HTML5: Markup language for structuring web content.
      CSS/JS: Styling and interactivity for dynamic user experiences.
      React JS: JavaScript library for building user interfaces with a component-based approach.
      Angular JS: JavaScript framework for developing dynamic, single-page applications (SPAs).
      Vue JS: Progressive JavaScript framework for building user interfaces and SPAs.
      Next JS: React-based framework for server-side rendering and static website generation.
      Mobile Development:
      React Native: Framework for building native mobile applications using React.
      Flutter: UI toolkit for building natively compiled applications for mobile, web, and desktop from a single codebase.
      Backend Development:
      Node JS: JavaScript runtime for building scalable backend services.
      Python: High-level programming language used for backend services, machine learning, and data science.
      Frappe: Full-stack web application framework based on Python and JavaScript.
      Java:...

    anchor: Do you have any job openings right now?
    positive:
      Yes! We are constantly looking for talented individuals. Check our Careers Page for openings.
      You can submit your resume on our website or via our LinkedIn page.
      We foster a people-centric, collaborative, and innovation-driven culture.
      Glassdoor: 4.3 + rating

      Sharing Stories from Our Team

      Discover firsthand experiences, growth journeys, and the vibrant culture that fuels our success.

      I have been a part of Techchefz for 3 years, and I can confidently say it's been a remarkable journey. From day one, I was welcomed into a vibrant community that values collaboration, creativity, and personal growth. The company culture here isn't just a buzzword, it's tangible in every interaction and initiative.
      profileImg

      Aashish Massand

      Sr. Manager Delivery

      TechChefz has been a transformative journey, equipping me with invaluable skills and fostering a supportive community. From coding fundamentals to advanced techniques, I've gained confidence and expertise. Grateful for this experience and oppo...

    anchor: What does the CEO of your company do?
    positive:
      Mayank Maggon – Founder, CEO & CTO
      Mayank Maggon is the visionary leader of TechChefz Digital, responsible for driving the company's strategic direction. With over 15 years of experience in entrepreneurship and technology, Mayank has a deep understanding of AI, cloud technologies, and digital transformation. His leadership is centered on building a culture of innovation, operational excellence, and customer-first solutions.
      Mayank holds a postgraduate certificate in Technology Leadership & Innovation from MIT and a postgraduate diploma in Sales & Marketing Communication from MICA, making him uniquely equipped to lead the company's growth in an ever-evolving digital ecosystem.
      Link: https://www.linkedin.com/in/mayankmaggon/
      Other Leadership Members:
      Rahul Aggarwal – Senior Director, Technology: Rahul brings a wealth of technical expertise, overseeing the delivery of high-impact digital solutions.
      Link: https://www.linkedin.com/in/rahul-aggarwal-84288a59/
      Akshit Maggon – Associate Dire...
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
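
In code, this configuration wraps MultipleNegativesRankingLoss (in-batch negatives over the anchor/positive pairs) in MatryoshkaLoss, which applies the same objective at each truncated dimension with equal weight. A minimal sketch, assuming model is the SentenceTransformer being finetuned:

from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

# Ranking loss that treats the other in-batch positives as negatives
inner_loss = MultipleNegativesRankingLoss(model)
# Apply the same loss at 768/512/256/128/64 dims, equally weighted
loss = MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768, 512, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1, 1],
)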
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • gradient_accumulation_steps: 4
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • push_to_hub: True
  • hub_model_id: Shashwat13333/bge-base-en-v1.5_v4
  • push_to_hub_model_id: bge-base-en-v1.5_v4
  • batch_sampler: no_duplicates
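
These settings map directly onto SentenceTransformerTrainingArguments. A sketch of the equivalent configuration; the output directory is illustrative, and save_strategy="epoch" is an assumption added because load_best_model_at_end requires the save and eval strategies to match.

from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="bge-base-en-v1.5_v4",  # illustrative path
    num_train_epochs=4,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    learning_rate=1e-5,
    weight_decay=0.01,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="epoch",
    save_strategy="epoch",  # must match eval_strategy for load_best_model_at_end
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
    push_to_hub=True,
    hub_model_id="Shashwat13333/bge-base-en-v1.5_v4",
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoid duplicate texts in a batch
)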

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 4
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: Shashwat13333/bge-base-en-v1.5_v4
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: bge-base-en-v1.5_v4
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

All dim_* columns report cosine_ndcg@10 at that embedding dimension.

Epoch  Step  Training Loss  dim_768  dim_512  dim_256  dim_128  dim_64
1.0    1     13.1742        0.3388   0.4638   0.2489   0.3873   0.2666
2.0    2     -              0.4500   0.4278   0.2872   0.4212   0.3400
3.0    3     -              0.3936   0.3370   0.3036   0.3289   0.4710
4.0    4     -              0.3691   0.3731   0.3758   0.4250   0.3157

  • The epoch 4.0 row denotes the saved checkpoint; its cosine_ndcg@10 values match the Evaluation section above.
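
Putting the pieces together, a run like the one logged above goes through SentenceTransformerTrainer. A hedged end-to-end sketch that reuses the args, loss, and evaluator objects from the earlier sketches; the training pairs below are placeholders for the unpublished 16-sample dataset.

from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer

model = SentenceTransformer("BAAI/bge-base-en-v1.5")

# Placeholder (anchor, positive) pairs shaped like the dataset described above
train_dataset = Dataset.from_dict({
    "anchor": ["Do you have any job openings right now?"],
    "positive": ["Yes! We are constantly looking for talented individuals. ..."],
})

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,            # SentenceTransformerTrainingArguments sketched earlier
    train_dataset=train_dataset,
    loss=loss,            # MatryoshkaLoss sketched earlier
    evaluator=evaluator,  # InformationRetrievalEvaluator sketched earlier
)
trainer.train()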

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.3
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.3.1
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}