RTO-RL/Llama3-8B-DPO
Reinforced Token Optimization
Safetensors
HuggingFaceH4/ultrafeedback_binarized
llama
main
Commit History
Update README.md (5c7eecb, verified): zkshan2002 committed 11 days ago
Create README.md (b878162, verified): zkshan2002 committed on Oct 14, 2024
initial commit (e0fe0f0, verified): zkshan2002 committed on Oct 14, 2024
initial commit (9f73655, verified): zkshan2002 committed on Oct 14, 2024