Remek committed on
Commit 1dd6e6b · verified · 1 Parent(s): afb46e7

Update README.md

Files changed (1): README.md +2 -2
README.md CHANGED
@@ -258,11 +258,11 @@ The Bielik-11B-v2.3-Instruct (16 bit) model was also evaluated using the MT-Bench
 
 Key observations on Bielik-11B-v2.3 performance:
 
-1. Strong performance among mid-sized models: Bielik-11B-v2.3-Instruct scored **7.996875**, placing it ahead of several well-known models like GPT-3.5-turbo (7.868750) and Mixtral-8x7b (7.637500). This indicates that Bielik-11B-v2.3-Instruct is competitive among mid-sized models, particularly those in the 11B-70B parameter range.
+1. Strong performance among mid-sized models: Bielik-11B-v2.3-Instruct scored **8.556250**, placing it ahead of several well-known models like GPT-3.5-turbo (7.868750) and Mixtral-8x7b (7.637500). This indicates that Bielik-11B-v2.3-Instruct is competitive among mid-sized models, particularly those in the 11B-70B parameter range.
 
 2. Competitive against larger models: Bielik-11B-v2.3-Instruct performs close to Meta-Llama-3.1-70B-Instruct (8.150000), Meta-Llama-3.1-405B-Instruct (8.168750) and even Mixtral-8x22b (8.231250), which have significantly more parameters. This efficiency relative to size could make it an attractive option for tasks where resource constraints are a consideration. Bielik generated 100% of its answers in Polish, while other models (not typically trained for Polish) may answer Polish questions in English.
 
-3. Significant improvement over previous versions: compared to its predecessor, **Bielik-7B-Instruct-v0.1**, which scored **6.081250**, Bielik-11B-v2.3-Instruct shows a significant improvement. The score increased by almost **2 points**, highlighting substantial advancements in model quality, optimization and training methodology.
+3. Significant improvement over previous versions: compared to its predecessor, **Bielik-7B-Instruct-v0.1**, which scored **6.081250**, Bielik-11B-v2.3-Instruct shows a significant improvement. The score increased by almost **2.5 points**, highlighting substantial advancements in model quality, optimization and training methodology.
 
 For more information, including answers to test tasks and per-category scores, visit the [MT-Bench PL](https://huggingface.co/spaces/speakleash/mt-bench-pl) website.
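
The revised delta in point 3 follows directly from the scores quoted in the diff. Against the Bielik-7B-Instruct-v0.1 baseline, the corrected MT-Bench PL score gives

$$8.556250 - 6.081250 = 2.475000 \approx 2.5 \text{ points},$$

while the previous score of 7.996875 yielded $7.996875 - 6.081250 = 1.915625$, consistent with the "almost 2 points" wording that this commit replaces.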