Post
1588
ššš° šššš„š¢šš§ šš¦šš„š„ ššš§š š®šš š ššØššš„š¬: ššš¦š¦š šššØš šš§šš¬š¢š¬ ššØš„š„šššš¢šØš§ ššš®š¹
I am happy to release two new language models for the Italian Language!
šŖ Gemma 2 9B Neogenesis ITA
anakin87/gemma-2-9b-neogenesis-ita
Building on the impressive work by VAGO Solutions, I applied Direct Preference Optimization with a mix of Italian and English data.
Using Spectrum, I trained 20% of model layers.
š Evaluated on the Open ITA LLM leaderboard ( mii-llm/open_ita_llm_leaderboard), this model achieves strong performance.
To beat it on this benchmark, you'd need a 27B model š
š¤ Gemma 2 2B Neogenesis ITA
anakin87/gemma-2-2b-neogenesis-ita
This smaller variant is fine-tuned from the original Gemma 2 2B it by Google.
Through a combination of Supervised Fine-Tuning and Direct Preference Optimization, I trained 25% of the layers using Spectrum.
š Compared to the original model, it shows improved Italian proficiency, good for its small size.
Both models were developed during the recent #gemma competition on Kaggle.
š Training code: https://www.kaggle.com/code/anakin87/post-training-gemma-for-italian-and-beyond
š Thanks @FinancialSupport and mii-llm for the help during evaluation.
I am happy to release two new language models for the Italian Language!
šŖ Gemma 2 9B Neogenesis ITA
anakin87/gemma-2-9b-neogenesis-ita
Building on the impressive work by VAGO Solutions, I applied Direct Preference Optimization with a mix of Italian and English data.
Using Spectrum, I trained 20% of model layers.
š Evaluated on the Open ITA LLM leaderboard ( mii-llm/open_ita_llm_leaderboard), this model achieves strong performance.
To beat it on this benchmark, you'd need a 27B model š
š¤ Gemma 2 2B Neogenesis ITA
anakin87/gemma-2-2b-neogenesis-ita
This smaller variant is fine-tuned from the original Gemma 2 2B it by Google.
Through a combination of Supervised Fine-Tuning and Direct Preference Optimization, I trained 25% of the layers using Spectrum.
š Compared to the original model, it shows improved Italian proficiency, good for its small size.
Both models were developed during the recent #gemma competition on Kaggle.
š Training code: https://www.kaggle.com/code/anakin87/post-training-gemma-for-italian-and-beyond
š Thanks @FinancialSupport and mii-llm for the help during evaluation.