---
license: apache-2.0
datasets:
- Intel/orca_dpo_pairs
language:
- en
- es
---
# Barcenas Tiny 1.1b DPO
Barcenas Tiny 1.1b DPO is a model based on the popular TinyLlama/TinyLlama-1.1B-Chat-v1.0, trained with DPO (Direct Preference Optimization) on the Intel/orca_dpo_pairs dataset.
With this preference-based training we hope to substantially improve the Tiny model, producing better responses while keeping a size small enough to remain accessible to most people.
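
A minimal inference sketch with the `transformers` pipeline. The repo id below is an assumption (substitute the actual model id on the Hub), and it assumes the model keeps TinyLlama's chat template:

```python
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="Danielbrdz/Barcenas-Tiny-1.1b-DPO",  # assumed repo id; replace with the real one
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain DPO training in one sentence."},
]

# TinyLlama-Chat ships a chat template, so the tokenizer can format
# the conversation before generation.
prompt = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
outputs = pipe(prompt, max_new_tokens=128, do_sample=True, temperature=0.7)
print(outputs[0]["generated_text"])
```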
Many thanks to Maxime Labonne (mlabonne) for his tutorial on training an LLM with DPO; without it, this model would not have been possible.
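
For readers who want to reproduce a setup like this one, here is a condensed sketch of DPO training on this dataset, loosely in the spirit of the tutorial referenced above. It assumes a recent `trl` release (with `DPOConfig` and the `processing_class` argument); the hyperparameters are illustrative, not the exact recipe used for this model:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Intel/orca_dpo_pairs ships `system`, `question`, `chosen`, `rejected`
# columns; DPOTrainer expects `prompt`, `chosen`, `rejected`.
def to_dpo_format(row):
    return {
        "prompt": row["system"] + "\n" + row["question"],
        "chosen": row["chosen"],
        "rejected": row["rejected"],
    }

dataset = load_dataset("Intel/orca_dpo_pairs", split="train").map(
    to_dpo_format, remove_columns=["system", "question"]
)

config = DPOConfig(
    output_dir="barcenas-tiny-dpo",
    beta=0.1,  # strength of the preference penalty (illustrative)
    per_device_train_batch_size=2,
    num_train_epochs=1,
)
trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```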
Made with ❤️ in Guadalupe, Nuevo Leon, Mexico 🇲🇽