Update Indic Evals in Readme
README.md CHANGED
@@ -98,13 +98,13 @@ We use the LM Evaluation Harness to evaluate our model on the En benchmarks task
 | IndicSentiment (0-shot) | Accuracy | 0.65 | 0.70 | 0.95 | 0.96 |0.99 | 0.98 |
 | IndicCOPA (0-shot) | Accuracy | 0.51 | 0.58 | 0.80 | 0.83 | 0.88 | 0.91 |
 | IndicXParaphrase (0-shot) | Accuracy | 0.67 | 0.74 | 0.88 | 0.87 | 0.89 | 0.91 |
-| IndicXNLI (0-shot) | Accuracy | 0.47 | 0.54 | 0.55 | 0.61 |
+| IndicXNLI (0-shot) | Accuracy | 0.47 | 0.54 | 0.55 | 0.61 | 0.70 | 0.67 |
 | IndicQA (0-shot) | Bert Score | 0.90 | 0.90 | 0.91 | 0.89 | 0.94 | TBD |
 | CrossSumIN (1-shot) | chrF++ | 0.04 | 0.17 | 0.21 | 0.26 | 0.24 | 0.24 |
 | FloresIN Translation xx-en (1-shot) | chrF++ | 0.54 | 0.50 | 0.58 | 0.60 | 0.62 | 0.63 |
 | FloresIN Translation en-xx (1-shot) | chrF++ | 0.41 | 0.34 | 0.48 | 0.46 | 0.47 | 0.48 |
-| IN22 Translation xx-en (0-shot) | chrF++ | 0.50 | 0.48 | 0.57 | 0.58 | 0.55 | 0.
-| IN22 Translation en-xx (0-shot) | chrF++ | 0.36 | 0.33 | 0.45 | 0.42 | 0.44 | 0.
+| IN22 Translation xx-en (0-shot) | chrF++ | 0.50 | 0.48 | 0.57 | 0.58 | 0.55 | 0.60 |
+| IN22 Translation en-xx (0-shot) | chrF++ | 0.36 | 0.33 | 0.45 | 0.42 | 0.44 | 0.44 |


 ### BharatBench
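The hunk header above notes that these benchmark scores come from the LM Evaluation Harness. As a minimal sketch of how such a 0-shot run might be reproduced, the snippet below uses lm-evaluation-harness's `simple_evaluate` entry point; the task names (`indicsentiment`, `indicxnli`) and the model identifier are illustrative assumptions rather than names confirmed by this repository, and the Indic benchmarks in the table may rely on custom task definitions.

```python
# Hypothetical sketch: scoring a model on two 0-shot Indic tasks with
# EleutherAI's lm-evaluation-harness. Task names and the model id are
# assumptions for illustration; the Indic benchmarks in the README table
# may come from custom task configurations.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                    # Hugging Face transformers backend
    model_args="pretrained=your-org/your-model",   # placeholder model id
    tasks=["indicsentiment", "indicxnli"],         # assumed task names for the 0-shot rows
    num_fewshot=0,
    batch_size=8,
)

# results["results"] maps each task name to its metric dict (e.g. accuracy),
# which is what the Accuracy column in the README table reports.
for task, metrics in results["results"].items():
    print(task, metrics)
```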