chanchan7/llama-7b-dpo-qlora-relu
Tags: PEFT, TensorBoard, Safetensors, HuggingFaceH4/ultrafeedback_binarized, llama, alignment-handbook, Generated from Trainer, trl, dpo
llama-7b-dpo-qlora-relu/train_results.json
Last commit: "Model save" by chanchan7 (7221958, verified), 12 months ago
196 Bytes
{
  "epoch": 1.0,
  "train_loss": 0.6582325137556925,
  "train_runtime": 111789.0929,
  "train_samples": 61135,
  "train_samples_per_second": 0.547,
  "train_steps_per_second": 0.034
}
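
The reported rates can be cross-checked against the raw counts. The sketch below embeds the values from this file and derives the runtime in hours, the throughput recomputed from `train_samples / train_runtime`, and a rough samples-per-step figure. The helper name `summarize` and the derived field names are hypothetical, not part of any library. The samples-per-step value is only an estimate from two rounded rates, since the file does not state the batch size. The comparison against ln 2 assumes the standard sigmoid DPO loss, which starts at ln 2 when policy and reference agree; the file does not confirm which loss variant was used.

```python
import json
import math

# Values copied from train_results.json above.
RAW = """
{
  "epoch": 1.0,
  "train_loss": 0.6582325137556925,
  "train_runtime": 111789.0929,
  "train_samples": 61135,
  "train_samples_per_second": 0.547,
  "train_steps_per_second": 0.034
}
"""


def summarize(results: dict) -> dict:
    """Derive a few sanity-check figures from a train_results.json dict.

    Hypothetical helper for illustration only.
    """
    runtime_s = results["train_runtime"]
    return {
        # Wall-clock training time in hours.
        "runtime_hours": runtime_s / 3600.0,
        # Throughput recomputed from raw counts (one epoch in this file).
        "samples_per_second": results["train_samples"] / runtime_s,
        # Estimate only: ratio of two rounded rates, not a stated batch size.
        "approx_samples_per_step": (
            results["train_samples_per_second"]
            / results["train_steps_per_second"]
        ),
        # Assumption: sigmoid DPO loss, whose value at zero margin is ln 2.
        "loss_below_ln2": results["train_loss"] < math.log(2),
    }


results = json.loads(RAW)
summary = summarize(results)

if __name__ == "__main__":
    for key, value in summary.items():
        print(f"{key}: {value}")
```

The recomputed throughput should agree with the reported `train_samples_per_second` after rounding to three decimals, and the runtime works out to roughly 31 hours.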