PyTorch
mistral
Krutrim
language-model
krutrim-admin committed on
Commit 3b544b5 · verified · 1 Parent(s): 6d984eb

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -29,6 +29,8 @@ After fine-tuning, the model underwent Direct Preference Optimization (DPO) to e
 
 The model delivers best-in-class performance across Indic tasks and a promising performance on English benchmarks equivalent to models 5-10x the size. We present details of the model architecture, pre-training, post-training and evaluation results. We also publicly release the post-trained versions of the model. We are continuously improving the model through post-training techniques such as RLHF.
 
+[![Krutrim 2](https://img.youtube.com/vi/beqXNHq67xg/0.jpg)](https://www.youtube.com/watch?v=beqXNHq67xg)
+
 ## Key Features
 - 12B parameter dense transformer model leading to better generalization compared to Krutrim-1 7B;
 - Supports context up to 128K tokens making it suitable for long multi-turn conversations, long-form generations, document translations and others;
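For readers who want to try the model described in the README above, here is a minimal loading-and-generation sketch using the Hugging Face transformers API. The repository ID `krutrim-ai-labs/Krutrim-2-instruct` is an assumption used for illustration; substitute this repository's actual model path.

```python
# Minimal sketch: load the 12B model and generate text with transformers.
# The repo ID below is a hypothetical placeholder; replace it with this
# repository's actual path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "krutrim-ai-labs/Krutrim-2-instruct"  # hypothetical ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # ~24 GB of weights for 12B params at bf16
    device_map="auto",            # requires the `accelerate` package
)

prompt = "Translate to Hindi: The weather is pleasant today."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```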