8-bit quantization · #20 opened about 4 hours ago by ramkumarkoppu
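For readers unfamiliar with the thread topic, here is a minimal sketch of symmetric 8-bit quantization in Python. GGUF's Q8_0 format additionally uses per-block scales; the single per-tensor scale here is a simplification for illustration:

```python
import numpy as np

def quantize_q8(w: np.ndarray):
    """Symmetric 8-bit quantization: map floats to int8 with one shared scale."""
    scale = np.abs(w).max() / 127.0  # largest magnitude maps to +/-127
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize_q8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from int8 values and the scale."""
    return q.astype(np.float32) * scale

w = np.random.randn(4096).astype(np.float32)
q, scale = quantize_q8(w)
w_hat = dequantize_q8(q, scale)
print("max abs error:", np.abs(w - w_hat).max())
```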
New research paper: R1-type reasoning models can be drastically improved in quality · 1 reply · #19 opened 3 days ago by krustik
MD5 / SHA-256 hashes, please · 1 reply · #18 opened 5 days ago by ivanvolosyuk
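Until official checksums are posted, a download can be fingerprinted locally; Hugging Face also exposes a SHA-256 digest in each LFS file's metadata. A minimal sketch using the standard library (the shard filename is hypothetical):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so multi-GB GGUF shards fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare the result against the published digest (filename is hypothetical).
print(sha256_of("DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf"))
```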
Is there a model removing non-shared MoE experts? · 4 replies · #17 opened 5 days ago by ghostplant
A step-by-step deployment guide with Ollama · 3 replies · #16 opened 7 days ago by snowkylin
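Once a model is pulled, a local Ollama server (default port 11434) can be queried over its REST API. A minimal sketch; the model tag below is hypothetical, so substitute whatever `ollama list` reports:

```python
import json
import urllib.request

# Assumes a local Ollama server with the model already pulled.
payload = {
    "model": "deepseek-r1:671b",  # hypothetical tag
    "prompt": "Why is the sky blue?",
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```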
No think tokens visible · 4 replies · #15 opened 7 days ago by sudkamath
Over 2 tok/sec aggregate, backed by NVMe SSD, on a 96 GB RAM + 24 GB VRAM AM5 rig with llama.cpp · 9 replies · #13 opened 8 days ago by ubergarm
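A minimal sketch of the kind of setup this thread describes, using the llama-cpp-python bindings: offload however many layers fit in VRAM and let the OS page the rest from NVMe via mmap. Paths and layer counts are assumptions, not values from the thread:

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Hypothetical path and layer count: a 24 GB GPU holds only part of the model,
# with the remaining layers mmap'd from NVMe and paged in on demand.
llm = Llama(
    model_path="DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf",
    n_gpu_layers=20,   # offload what fits in VRAM; remaining layers run on CPU
    n_ctx=2048,        # modest context to limit the KV cache footprint
    use_mmap=True,     # default; lets the OS page weights from disk as needed
)
out = llm("What is 2 + 2?", max_tokens=32)
print(out["choices"][0]["text"])
```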
Running the model with vLLM does not actually work · 8 replies · #12 opened 8 days ago by aikitoria
DeepSeek-R1-GGUF not available on LM Studio · 2 replies · #11 opened 8 days ago by 32SkyDive
Where did the BF16 come from? · 8 replies · #10 opened 8 days ago by gshpychka
Inference speed · 2 replies · #9 opened 9 days ago by Iker
Running this model using vLLM Docker · 2 replies · #8 opened 9 days ago by moficodes
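Once a vLLM container (e.g. the official vllm/vllm-openai image) is serving on port 8000, it can be queried through its OpenAI-compatible API; note that thread #12 above reports that this particular model does not actually run under vLLM. A minimal sketch, where the model name is whatever was passed to `--model` at launch:

```python
import json
import urllib.request

# Assumes a vLLM container already serving the OpenAI-compatible API on port 8000.
payload = {
    "model": "deepseek-ai/DeepSeek-R1",  # must match the name given at launch
    "prompt": "Hello",
    "max_tokens": 16,
}
req = urllib.request.Request(
    "http://localhost:8000/v1/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["choices"][0]["text"])
```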
UD-IQ1_M models for distilled R1 versions? · 3 replies · #6 opened 9 days ago by SamPurkis
Llama.cpp server chat template · 2 replies · #4 opened 12 days ago by softwareweaver
Are the Q4 and Q5 models R1 or R1-Zero? · 18 replies · #2 opened 16 days ago by gng2info
What is the VRAM requirement to run this? · 5 replies · #1 opened 17 days ago by RageshAntony
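A rough back-of-the-envelope answer: weight memory is roughly parameter count times bits per weight divided by 8, before accounting for the KV cache and activations. A sketch for the ~671B-parameter model; the bits-per-weight values are illustrative approximations, not measured figures:

```python
def approx_weight_memory_gb(params_b: float, bits_per_weight: float) -> float:
    """Rough lower bound: parameters x bits/8, ignoring KV cache and activations."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

# DeepSeek-R1 has ~671B total parameters; bpw values below are illustrative.
for name, bpw in [("Q8_0", 8.5), ("Q4_K_M", 4.8), ("IQ1_S (dynamic)", 1.58)]:
    print(f"{name}: ~{approx_weight_memory_gb(671, bpw):.0f} GB for weights alone")
```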