Mistral 7B Zephyr Orpo
The Zephyr ORPO recipe applied to Mistral 7B v0.2 (a new recipe on a new Mistral base model).
Model description
- Model type: A 7.2B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
- Language(s) (NLP): Primarily English
- Finetuned from model: wandb/Mistral-7B-v0.2
Recipe
We trained with the alignment handbook recipe and logged the run to W&B.
Visit the W&B workspace here
Results:
- MT bench
model                turn     score
zephyr-orpo-7b-v0.2  first    7.44375
zephyr-orpo-7b-v0.2  second   6.875
zephyr-orpo-7b-v0.2  average  7.159375
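The reported average is consistent with the unweighted mean of the two per-turn scores. A minimal sketch of that arithmetic follows; the helper name `mean_score` is hypothetical, and the calculation assumes both turns contribute the same number of judged answers:

```python
def mean_score(turn_scores):
    """Return the unweighted mean of per-turn MT-bench scores."""
    return sum(turn_scores) / len(turn_scores)


# Per-turn scores as reported in the results above.
first_turn = 7.44375
second_turn = 6.875

average = mean_score([first_turn, second_turn])
print(f"average = {average:.6f}")  # average = 7.159375
```

The drop from the first turn to the second is 7.44375 - 6.875 = 0.56875 points.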
Trained on a single H100 for 2 hours!