Update Indic Evals in Readme
README.md CHANGED
@@ -98,13 +98,13 @@ We use the LM Evaluation Harness to evaluate our model on the En benchmarks task
 | IndicSentiment (0-shot) | Accuracy | 0.65 | 0.70 | 0.95 | 0.96 |0.99 | 0.98 |
 | IndicCOPA (0-shot) | Accuracy | 0.51 | 0.58 | 0.80 | 0.83 | 0.88 | 0.91 |
 | IndicXParaphrase (0-shot) | Accuracy | 0.67 | 0.74 | 0.88 | 0.87 | 0.89 | 0.91 |
-| IndicXNLI (0-shot) | Accuracy | 0.47 | 0.54 | 0.55 | 0.61 |
+| IndicXNLI (0-shot) | Accuracy | 0.47 | 0.54 | 0.55 | 0.61 | 0.70 | 0.67 |
 | IndicQA (0-shot) | Bert Score | 0.90 | 0.90 | 0.91 | 0.89 | 0.94 | TBD |
 | CrossSumIN (1-shot) | chrF++ | 0.04 | 0.17 | 0.21 | 0.26 | 0.24 | 0.24 |
 | FloresIN Translation xx-en (1-shot) | chrF++ | 0.54 | 0.50 | 0.58 | 0.60 | 0.62 | 0.63 |
 | FloresIN Translation en-xx (1-shot) | chrF++ | 0.41 | 0.34 | 0.48 | 0.46 | 0.47 | 0.48 |
-| IN22 Translation xx-en (0-shot) | chrF++ | 0.50 | 0.48 | 0.57 | 0.58 | 0.55 | 0.
-| IN22 Translation en-xx (0-shot) | chrF++ | 0.36 | 0.33 | 0.45 | 0.42 | 0.44 | 0.
+| IN22 Translation xx-en (0-shot) | chrF++ | 0.50 | 0.48 | 0.57 | 0.58 | 0.55 | 0.60 |
+| IN22 Translation en-xx (0-shot) | chrF++ | 0.36 | 0.33 | 0.45 | 0.42 | 0.44 | 0.44 |


 ### BharatBench
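The hunk header above notes that these benchmark scores come from the LM Evaluation Harness. As a minimal sketch of how such a 0-shot run might be reproduced, the snippet below uses lm-evaluation-harness's `simple_evaluate` entry point; the task names (`indicsentiment`, `indicxnli`) and the model identifier are illustrative assumptions rather than names confirmed by this repository, and the Indic benchmarks in the table may rely on custom task definitions.

```python
# Hypothetical sketch: scoring a model on two 0-shot Indic tasks with
# EleutherAI's lm-evaluation-harness. Task names and the model id are
# assumptions for illustration; the Indic benchmarks in the README table
# may come from custom task configurations.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                    # Hugging Face transformers backend
    model_args="pretrained=your-org/your-model",   # placeholder model id
    tasks=["indicsentiment", "indicxnli"],         # assumed task names for the 0-shot rows
    num_fewshot=0,
    batch_size=8,
)

# results["results"] maps each task name to its metric dict (e.g. accuracy),
# which is what the Accuracy column in the README table reports.
for task, metrics in results["results"].items():
    print(task, metrics)
```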