Update README.md
README.md CHANGED

@@ -258,11 +258,11 @@ The Bielik-11B-v2.3-Instruct (16 bit) model was also evaluated using the MT-Bench

Key observations on Bielik-11B-v2.3 performance:

-1. Strong performance among mid-sized models: Bielik-11B-v2.3-Instruct scored
+1. Strong performance among mid-sized models: Bielik-11B-v2.3-Instruct scored **8.556250**, placing it ahead of several well-known models such as GPT-3.5-turbo (7.868750) and Mixtral-8x7b (7.637500). This indicates that Bielik-11B-v2.3-Instruct is competitive among mid-sized models, particularly those in the 11B-70B parameter range.

2. Competitive against larger models: Bielik-11B-v2.3-Instruct performs close to Meta-Llama-3.1-70B-Instruct (8.150000), Meta-Llama-3.1-405B-Instruct (8.168750) and even Mixtral-8x22b (8.231250), all of which have significantly more parameters. This performance efficiency relative to size could make it an attractive option for tasks where resource constraints are a consideration. Bielik generated 100% of its answers in Polish, while other models (not typically trained for Polish) may answer Polish questions in English.

-3. Significant improvement over previous versions: compared to its predecessor, **Bielik-7B-Instruct-v0.1**, which scored **6.081250**, the Bielik-11B-v2.3-Instruct shows a significant improvement. The score increased by almost
+3. Significant improvement over previous versions: compared to its predecessor, **Bielik-7B-Instruct-v0.1**, which scored **6.081250**, Bielik-11B-v2.3-Instruct shows a marked improvement. The score increased by almost **2.5 points**, highlighting substantial advances in model quality, optimization, and training methodology.

For more information, including answers to the test tasks and per-category scores, visit the [MT-Bench PL](https://huggingface.co/spaces/speakleash/mt-bench-pl) website.
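As a supplementary note (not part of the commit itself), here is a minimal Python sketch that sanity-checks the comparisons quoted in the updated text, including the "almost **2.5 points**" gap over the predecessor (8.556250 - 6.081250 = 2.475). The score values are copied from the README text above; running the benchmark itself happens via the MT-Bench PL space, not this snippet.

```python
# Sanity-check the MT-Bench PL scores quoted in the README.
# Values are copied from the text above, not computed here.
scores = {
    "Bielik-11B-v2.3-Instruct": 8.556250,
    "Mixtral-8x22b": 8.231250,
    "Meta-Llama-3.1-405B-Instruct": 8.168750,
    "Meta-Llama-3.1-70B-Instruct": 8.150000,
    "GPT-3.5-turbo": 7.868750,
    "Mixtral-8x7b": 7.637500,
    "Bielik-7B-Instruct-v0.1": 6.081250,
}

bielik = scores["Bielik-11B-v2.3-Instruct"]

# Gap to the predecessor: 8.556250 - 6.081250 = 2.475, i.e. "almost 2.5 points".
print(f"Improvement over v0.1: {bielik - scores['Bielik-7B-Instruct-v0.1']:.6f}")

# Signed difference of each listed model relative to Bielik-11B-v2.3-Instruct.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:30s} {score:.6f} ({score - bielik:+.6f})")
```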