---
license: apache-2.0
datasets:
- Intel/orca_dpo_pairs
language:
- en
- es
---
# Barcenas Tiny 1.1b DPO
Barcenas Tiny 1.1b DPO is a model based on the popular TinyLlama/TinyLlama-1.1B-Chat-v1.0, trained with DPO (Direct Preference Optimization) on the Intel/orca_dpo_pairs dataset.
With this preference-based training we hope to substantially improve the Tiny model, producing better responses while keeping a size small enough to remain accessible to most people.
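
A minimal inference sketch with the `transformers` pipeline. The repo id below is an assumption (substitute the actual model id on the Hub), and it assumes the model keeps TinyLlama's chat template:

```python
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="Danielbrdz/Barcenas-Tiny-1.1b-DPO",  # assumed repo id; replace with the real one
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain DPO training in one sentence."},
]

# TinyLlama-Chat ships a chat template, so the tokenizer can format
# the conversation before generation.
prompt = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
outputs = pipe(prompt, max_new_tokens=128, do_sample=True, temperature=0.7)
print(outputs[0]["generated_text"])
```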
Many thanks to Maxime Labonne (mlabonne) for his tutorial on training an LLM with DPO; without it, this model would not have been possible.
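
For readers who want to reproduce a setup like this one, here is a condensed sketch of DPO training on this dataset, loosely in the spirit of the tutorial referenced above. It assumes a recent `trl` release (with `DPOConfig` and the `processing_class` argument); the hyperparameters are illustrative, not the exact recipe used for this model:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Intel/orca_dpo_pairs ships `system`, `question`, `chosen`, `rejected`
# columns; DPOTrainer expects `prompt`, `chosen`, `rejected`.
def to_dpo_format(row):
    return {
        "prompt": row["system"] + "\n" + row["question"],
        "chosen": row["chosen"],
        "rejected": row["rejected"],
    }

dataset = load_dataset("Intel/orca_dpo_pairs", split="train").map(
    to_dpo_format, remove_columns=["system", "question"]
)

config = DPOConfig(
    output_dir="barcenas-tiny-dpo",
    beta=0.1,  # strength of the preference penalty (illustrative)
    per_device_train_batch_size=2,
    num_train_epochs=1,
)
trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```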
Made with ❤️ in Guadalupe, Nuevo Leon, Mexico 🇲🇽