wandb
/

zephyr-orpo-7b-v0.2

Text Generation

Generated from Trainer

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

Mistral 7B Zephyr Orpo

The Zephyr Orpo recipe applied on top of Mistral 7B v0.2 (new recipe with new Mistral base model)

Model description

Model type: A 7.2B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
Language(s) (NLP): Primarily English
Finetuned from model: wandb/Mistral-7B-v0.2

Recipe

We trained using the alignment handbook recipe and logging to W&B

Visit the W&B workspace here

Results:

MT bench

########## First turn ##########
                            score
model               turn
zephyr-orpo-7b-v0.2 1     7.44375

########## Second turn ##########
                          score
model               turn
zephyr-orpo-7b-v0.2 2     6.875

########## Average ##########
                        score
model
zephyr-orpo-7b-v0.2  7.159375

Trained on a single H100 for 2 hours!

Downloads last month: 7

Safetensors

Model size

7.24B params

Tensor type

BF16

·

Inference Providers NEW

Text Generation

This model is not currently available via any of the supported third-party Inference Providers, and the model is not deployed on the HF Inference API.

Model tree for wandb/zephyr-orpo-7b-v0.2

Base model

wandb/Mistral-7B-v0.2

Finetuned

(1)

this model

Quantizations

1 model

Dataset used to train wandb/zephyr-orpo-7b-v0.2