QuangPH3 committed
Commit 54608d2 · 1 Parent(s): a957073
update README: vmlu test
README.md CHANGED
@@ -102,9 +102,15 @@ We evaluated our model via peer comparison on multiple publicly available datase

 Based on this results, our model performs on-par or better than most models for tasks in Vietnamese and demonstrate that this approach is extremely potential.

+While this model primarily specializes in multi-turn conversational scenarios, it has demonstrated its competence in various multiple-choice question and answer tasks during testing. Below, you can find the results, fairly evaluated by the [VMLU team](https://vmlu.ai), in comparison to other open-source models, including VBD-LLaMA2-7B-50b-Chat. (We extend our gratitude to the VMLU team for their diligent work in creating an open-source public evaluation dataset).
+
+<p align="center"> <img src="vmlu.png" width="500" /> </p>
+
+<p align="center"> Table 3. <a href="https://vmlu.ai/leaderboard"> Benchmark on VMLU datasets </a> </p>
+
 Pretraining loss:

-<p align="
+<p align="center"> <img src="loss.png" width="500" /> </p>

 <h3> Run the model </h3>

vmlu.png ADDED