# ggml versions of OpenLLaMa 7B

For use with llama.cpp.
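A minimal usage sketch. The binary names and flags below follow the classic llama.cpp build (`./main` for inference, `./perplexity` for evaluation); the model path and prompt are placeholders, so adjust them to your setup.

```shell
# Run inference with one of the quantized files (path is a placeholder):
./main -m ./open-llama-7b-q4_0.bin -p "Once upon a time" -n 128

# Reproduce the perplexity numbers below on wiki.test.raw:
./perplexity -m ./open-llama-7b-q4_0.bin -f wiki.test.raw
```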

## Perplexity

Perplexity was calculated with llama.cpp using default settings (context 512, batch 512). Test data: wiki.test.raw from WikiText-103:

| Model | Perplexity |
| --- | --- |
| open-llama-7b-q2_K.bin | 8.5152 |
| open-llama-7b-q3_K_S.bin | 7.6623 |
| open-llama-7b-q3_K.bin | 7.3837 |
| open-llama-7b-q3_K_L.bin | 7.3043 |
| open-llama-7b-q4_0.bin | 7.2116 |
| open-llama-7b-q4_1.bin | 7.1609 |
| open-llama-7b-q4_K_S.bin | 7.1516 |
| open-llama-7b-q4_K.bin | 7.1116 |
| open-llama-7b-q5_0.bin | 7.0353 |
| open-llama-7b-q5_K_S.bin | 7.0325 |
| open-llama-7b-q5_1.bin | 7.0318 |
| open-llama-7b-q5_K.bin | 7.0272 |
| open-llama-7b-q6_K.bin | 7.0050 |
| open-llama-7b-q8_0.bin | 6.9968 |
| open-llama-7b-f16.bin | 6.9966 |
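To interpret these scores: perplexity is the exponential of the average negative log-likelihood per token, so lower is better, and the quantized files trade a small perplexity increase for a much smaller file size. A short sketch of the definition (the `token_logprobs` input is a made-up example, not llama.cpp output):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the average negative log-likelihood per token."""
    n = len(token_logprobs)
    avg_nll = -sum(token_logprobs) / n
    return math.exp(avg_nll)

# If a model assigned probability 0.5 to every token,
# its perplexity would be 2.
print(perplexity([math.log(0.5)] * 4))  # ≈ 2.0
```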