Remek committed on
Commit 1dd6e6b · verified · 1 Parent(s): afb46e7

Update README.md

Files changed (1): README.md +2 -2
README.md CHANGED
@@ -258,11 +258,11 @@ The Bielik-11B-v2.3-Instruct (16 bit) model was also evaluated using the MT-Bench
 
 Key observations on Bielik-11B-v2.3 performance:
 
-1. Strong performance among mid-sized models: Bielik-11B-v2.3-Instruct scored **7.996875**, placing it ahead of several well-known models like GPT-3.5-turbo (7.868750) and Mixtral-8x7b (7.637500). This indicates that Bielik-11B-v2.3-Instruct is competitive among mid-sized models, particularly those in the 11B-70B parameter range.
+1. Strong performance among mid-sized models: Bielik-11B-v2.3-Instruct scored **8.556250**, placing it ahead of several well-known models like GPT-3.5-turbo (7.868750) and Mixtral-8x7b (7.637500). This indicates that Bielik-11B-v2.3-Instruct is competitive among mid-sized models, particularly those in the 11B-70B parameter range.
 
 2. Competitive against larger models: Bielik-11B-v2.3-Instruct performs close to Meta-Llama-3.1-70B-Instruct (8.150000), Meta-Llama-3.1-405B-Instruct (8.168750) and even Mixtral-8x22b (8.231250), which have significantly more parameters. This efficiency relative to size could make it an attractive option for tasks where resource constraints are a consideration. Bielik generated 100% of its answers in Polish, while other models (not typically trained for Polish) may answer Polish questions in English.
 
-3. Significant improvement over previous versions: compared to its predecessor, **Bielik-7B-Instruct-v0.1**, which scored **6.081250**, Bielik-11B-v2.3-Instruct shows a significant improvement. The score increased by almost **2 points**, highlighting substantial advancements in model quality, optimization and training methodology.
+3. Significant improvement over previous versions: compared to its predecessor, **Bielik-7B-Instruct-v0.1**, which scored **6.081250**, Bielik-11B-v2.3-Instruct shows a significant improvement. The score increased by almost **2.5 points**, highlighting substantial advancements in model quality, optimization and training methodology.
 
 For more information, including answers to test tasks and per-category scores, visit the [MT-Bench PL](https://huggingface.co/spaces/speakleash/mt-bench-pl) website.
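
The revised delta in point 3 follows directly from the scores quoted in the diff. Against the Bielik-7B-Instruct-v0.1 baseline, the corrected MT-Bench PL score gives

$$8.556250 - 6.081250 = 2.475000 \approx 2.5 \text{ points},$$

while the previous score of 7.996875 yielded $7.996875 - 6.081250 = 1.915625$, consistent with the "almost 2 points" wording that this commit replaces.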