eaddario committed · Commit 278f691 · verified · 1 Parent(s): 10b95e2

Update README.md

Files changed (1): README.md (+1 −1)
README.md CHANGED

@@ -29,7 +29,7 @@ The process to produce the quantized [GGUF](https://huggingface.co/docs/hub/en/g
 2. Estimate the Perplexity score for the F16 model (base) using [wikitext-2-raw-v1](https://huggingface.co/datasets/Salesforce/wikitext/tree/main/wikitext-2-raw-v1), and record the [logits](https://huggingface.co/eaddario/Watt-Tool-8B-GGUF/tree/main/logits)
 3. Generate the [imatrix](https://huggingface.co/eaddario/Watt-Tool-8B-GGUF/tree/main/imatrix) for each calibration dataset
 4. Create quantized versions of the base model using each imatrix per quant type
-5. Calculate the Perplexity and KL Divergence scores for each quantized model [(logs)](https://huggingface.co/eaddario/Watt-Tool-8B-GGUF/tree/main/scores)
+5. Calculate the Perplexity and KL Divergence scores for each quantized model [(scores)](https://huggingface.co/eaddario/Watt-Tool-8B-GGUF/tree/main/scores)
 6. For each quant type, keep the version with the best (usually the lowest) scores

 *[BF16](https://en.wikipedia.org/wiki/Bfloat16_floating-point_format) would be preferred, but Apple's GPUs don't support it yet, and therefore any operations are executed in the CPU, making it unacceptably slow. This is expected to change in the near term but until then, if you are using Apple kit avoid using any models tagged BF16
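The Perplexity and KL Divergence scores referenced in the steps above are produced by llama.cpp's own tooling against the recorded F16 logits. As a rough, self-contained sketch of the statistics involved (plain-Python helpers written for illustration, not the project's actual GGUF pipeline), both metrics reduce to simple functions of the per-token logits:

```python
import math

def log_softmax(logits):
    """Numerically stable log-probabilities for one row of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def perplexity(logits_per_pos, target_ids):
    """exp(mean negative log-likelihood of the true next token at each position)."""
    nll = 0.0
    for row, tok in zip(logits_per_pos, target_ids):
        nll -= log_softmax(row)[tok]
    return math.exp(nll / len(target_ids))

def kl_divergence(base_logits, quant_logits):
    """Mean KL(base || quantized) over positions; 0 means identical output distributions."""
    total = 0.0
    for b_row, q_row in zip(base_logits, quant_logits):
        b_lp, q_lp = log_softmax(b_row), log_softmax(q_row)
        total += sum(math.exp(bl) * (bl - ql) for bl, ql in zip(b_lp, q_lp))
    return total / len(base_logits)
```

A quantized model whose logits match the F16 base exactly would score a KL divergence of 0; lower values, like lower perplexity, indicate less quality loss, which is why step 6 keeps the variant with the lowest scores.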