Miquella 120B GGUF

GGUF quantized weights for miquella-120b. Contains all quants.

I used importance matrices generated from the Q8_0 quant of the model. The calibration dataset used for them was random data, which tends to give the best all-round quant quality.
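
For reference, this is roughly how imatrix quants like these are produced with llama.cpp's imatrix and quantize tools. The calibration file name and f16 source file below are illustrative placeholders, not the exact commands used:

# Generate an importance matrix from the Q8_0 quant
# (random-data.txt stands in for the calibration file)
./imatrix -m miquella-120b.Q8_0.gguf -f random-data.txt -o miquella-120b.imatrix

# Quantize the full-precision weights using that importance matrix
./quantize --imatrix miquella-120b.imatrix miquella-120b.f16.gguf miquella-120b.Q3_K_L.gguf Q3_K_L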

Due to HF's file size limit, the larger files were split into multiple chunks. Instructions for joining them are below.
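
For reference, splits with this aa/ab naming scheme can be produced with the standard split utility. The 40G chunk size below is an assumption chosen to stay under HF's per-file limit, not necessarily the exact size used:

# Split the GGUF into chunks with two-letter suffixes (aa, ab, ...)
split -b 40G miquella-120b.Q3_K_L.gguf miquella-120b.Q3_K_L.gguf_part_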

Linux

Example uses Q3_K_L. Replace the names appropriately for your quant of choice.

cat miquella-120b.Q3_K_L.gguf_part_* > miquella-120b.Q3_K_L.gguf && rm miquella-120b.Q3_K_L.gguf_part_*

Windows

Example uses Q3_K_L. Replace the names appropriately for your quant of choice.

COPY /B miquella-120b.Q3_K_L.gguf_part_aa + miquella-120b.Q3_K_L.gguf_part_ab miquella-120b.Q3_K_L.gguf

Once the copy succeeds, delete the two split files.
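
Once joined, the file loads like any other GGUF. A minimal llama.cpp example, assuming the main binary (llama-cli in newer builds); the context size and GPU layer count are placeholders to adjust for your hardware:

# Run the joined quant with llama.cpp
./main -m miquella-120b.Q3_K_L.gguf -c 4096 -ngl 40 -p "Your prompt here"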

Model size: 118B parameters. Architecture: llama.
