Alina Lozovskaya

alozowski

AI & ML interests

NLP in all aspects

Organizations

Hugging Face, Evaluation datasets, Hugging Test Lab, Hugging Face H4, InternLM, Hugging Face TB Research, Open LLM Leaderboard, Qwen, gg-hf, IBM Granite, Social Post Explorers, HuggingFaceEval, nltpt, open-llm-leaderboard-react, Prompt Leaderboard, wut?, Your Bench

Posts 2

Post
2676
Do I need to make it a tradition to post here every Friday? Well, here we are again!

This week, I'm happy to share that we have two official Mistral models on the Leaderboard! 🔥 You can check them out: mistralai/Mixtral-8x22B-Instruct-v0.1 and mistralai/Mixtral-8x22B-v0.1

The most exciting thing here? The mistralai/Mixtral-8x22B-Instruct-v0.1 model took first place among pretrained models with an impressive average score of 79.15! 🥇 Not far behind is Mixtral-8x22B-v0.1, in second place with an average score of 74.47! Well done, Mistral AI! 👏

Check out my screenshot here or explore it yourself at https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

The second piece of news is that the CohereForAI/c4ai-command-r-plus model in 4-bit quantization got a great average score of 70.08. Cool stuff, Cohere! 😎 (I also have a screenshot for this, don't miss it)

The last piece of news might seem small but is still significant: the Leaderboard frontpage now supports Python 3.12.1. This means we're on our way to speeding up the Leaderboard's performance! 🚀

If you have any comments or suggestions, feel free to tag me on X (Twitter) as well, and I'll try to help: [at]ailozovskaya

Have a nice weekend! ✨

Articles 1

Article
18

CO₂ Emissions and Models Performance: Insights from the Open LLM Leaderboard

Models

None public yet

Datasets

None public yet