End of training
a325cd7
{
  "epoch": 3.97,
  "total_flos": 2.721730793951232e+17,
  "train_loss": 0.2819982838063013,
  "train_runtime": 13750.6292,
  "train_samples_per_second": 0.801,
  "train_steps_per_second": 0.006
}
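
The metrics above can be turned into a few derived quantities, sketched below. This assumes the file follows the Hugging Face `Trainer` convention, where `train_runtime` is in seconds and the two throughput fields are averages over the whole run; the source does not state this. `summarize` is a hypothetical helper name, and because the throughput values are rounded in the file (especially `train_steps_per_second`), the sample and step counts are rough estimates only.

```python
import json

# Metrics copied verbatim from the JSON above.
RESULTS = json.loads("""
{
  "epoch": 3.97,
  "total_flos": 2.721730793951232e+17,
  "train_loss": 0.2819982838063013,
  "train_runtime": 13750.6292,
  "train_samples_per_second": 0.801,
  "train_steps_per_second": 0.006
}
""")


def summarize(results):
    """Derive rough totals from the reported runtime and throughput.

    Assumption: train_runtime is in seconds and the *_per_second fields
    are run-wide averages (Hugging Face Trainer convention).
    """
    runtime_s = results["train_runtime"]
    return {
        "runtime_hours": runtime_s / 3600,
        # Estimates only: the throughput values are rounded in the file.
        "approx_samples_seen": runtime_s * results["train_samples_per_second"],
        "approx_optimizer_steps": runtime_s * results["train_steps_per_second"],
    }


if __name__ == "__main__":
    for name, value in summarize(RESULTS).items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the run lasted a little under four hours. The step estimate is the least reliable figure, since `train_steps_per_second` carries only one significant digit.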