OpenLLaMA 7Bv2 Model Card

Model Description

OpenLLaMA 7Bv2 is a language model trained to produce high-quality, contextually relevant text predictions. It was trained on a composite dataset of web-crawled data, scholarly articles, books, and question-answer pairs, chosen for broad domain coverage.

Training Data

The model was trained on a composite dataset that includes:

  • Falcon RefinedWeb dataset
  • StarCoder dataset
  • Wikipedia, for encyclopedic knowledge
  • Academic papers from arXiv, for scientific understanding
  • Books spanning multiple genres
  • Stack Exchange data curated by RedPajama

Training Procedure

  • Learning Rate: A maximum learning rate of 3e-4 and a minimum learning rate of 3e-5.
  • Batch Size: 4 million tokens per batch.
  • Learning Rate Scheduler: Closely follows the schedule used in Llama 2 (a warmup followed by cosine decay to 10% of the peak rate), which is consistent with the maximum and minimum learning rates listed above.
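
The training settings above can be sketched as a small schedule function. This is a minimal illustration, not the authors' training code: the maximum and minimum learning rates and the 4M-token batch come from this card, while the 2000-step linear warmup is borrowed from the Llama 2 paper and the total token budget is a purely hypothetical parameter. The helper names `steps_for_tokens` and `lr_at` are made up for this example, and "4 million" is taken literally as 4,000,000 tokens.

```python
import math

MAX_LR = 3e-4             # maximum learning rate (from this card)
MIN_LR = 3e-5             # minimum learning rate (from this card)
BATCH_TOKENS = 4_000_000  # batch size in tokens (from this card, taken literally)


def steps_for_tokens(total_tokens, batch_tokens=BATCH_TOKENS):
    """Number of optimizer steps needed to consume `total_tokens` (rounded up)."""
    return -(-total_tokens // batch_tokens)


def lr_at(step, total_steps, warmup_steps=2000):
    """Linear warmup to MAX_LR, then cosine decay down to MIN_LR.

    `warmup_steps=2000` is an assumption taken from Llama 2, not from this card.
    """
    if step < warmup_steps:
        return MAX_LR * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (MAX_LR - MIN_LR) * cosine


if __name__ == "__main__":
    # Hypothetical budget of 1 trillion tokens, chosen only for illustration.
    total_steps = steps_for_tokens(1_000_000_000_000)
    for step in (0, 1999, total_steps // 2, total_steps):
        print(step, lr_at(step, total_steps))
```

With these assumptions the rate peaks at 3e-4 when warmup ends and reaches 3e-5 at the final step, so the minimum is exactly 10% of the maximum.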