---
datasets:
- wikitext-2-v1
- wikitext
language:
- en
metrics:
- perplexity
- cross_entropy
---

**metrics on 1024 context**:
- valid_perplexity = 14.79
- valid_cross_entropy = 2.69
- train_perplexity = 13.77
- train_cross_entropy = 2.62

**metrics on 252 context**:
- valid_perplexity = 17.35

**metrics on 378 context**:
- valid_perplexity = 16.4

**metrics on 504 context**:
- valid_perplexity = 15.86

**Dependence of the cross-entropy loss on the context length used for prediction**
- x-axis × 128 = context length
- y-axis = cross entropy

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63c1ac8cc58fcfeac186bda2/JRsRd01VrzEmTsHySMn0q.png)
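
The perplexity values above are the exponentials of the corresponding cross-entropy losses (up to rounding). Below is a minimal sketch of how such numbers can be computed for a given context length, assuming a Hugging Face causal LM checkpoint (the `MODEL_ID` is a placeholder, not this repository's actual identifier) and non-overlapping windows over the wikitext-2 validation split:

```python
# Sketch: estimate cross entropy and perplexity on wikitext-2-v1 at a fixed context length.
# MODEL_ID is a placeholder; replace it with the actual checkpoint name.
import math

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your-username/your-model"  # placeholder
CONTEXT = 1024                         # evaluate at 1024, 504, 378, or 252 tokens

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID).eval()

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-v1", split="validation")["text"])
ids = tokenizer(text, return_tensors="pt").input_ids[0]

losses = []
with torch.no_grad():
    # Non-overlapping windows of CONTEXT tokens; each window's mean token loss
    # is its cross entropy, and perplexity is the exponential of the average.
    for start in range(0, ids.size(0) - CONTEXT, CONTEXT):
        chunk = ids[start : start + CONTEXT].unsqueeze(0)
        out = model(chunk, labels=chunk)
        losses.append(out.loss.item())

cross_entropy = sum(losses) / len(losses)
print(f"cross entropy: {cross_entropy:.2f}, perplexity: {math.exp(cross_entropy):.2f}")
```

Rerunning the sketch with `CONTEXT` set to 252, 378, or 504 reproduces the kind of context-length sweep plotted in the figure below.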