Inference error: The current context does not support K-shift
#3 opened 4 days ago by lollmaolol
Tested Q6, uses 567 GB RAM
5
#2 opened 9 days ago by krustik
Using -ctk q4_0 -ctv q4_0 with llama.cpp server throws flash_attn error
#1 opened 11 days ago by softwareweaver