# DeepSeek-R1-Distill-Qwen-32B-Q2-6

This model was converted to MLX format from deepseek-ai/DeepSeek-R1-Distill-Qwen-32B using mixed 2/6-bit quantization: most weights are stored at 2 bits, while the more quality-sensitive layers are kept at 6 bits. This scheme preserves quality much better than standard 2-bit quantization.
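To reproduce this kind of conversion yourself, recent versions of mlx-lm expose mixed-bit quantization recipes. Below is a minimal sketch, assuming an mlx-lm release whose `convert` function accepts a `quant_predicate` recipe name such as `"mixed_2_6"`; the exact API has changed across releases, so check the docs for your installed version.

```python
# Sketch: convert a Hugging Face model to MLX with mixed 2/6-bit quantization.
# Assumption: a recent mlx-lm where `convert` takes a `quant_predicate`
# recipe name ("mixed_2_6" keeps sensitive layers at 6 bits, the rest at 2).
from mlx_lm import convert

convert(
    hf_path="deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
    mlx_path="DeepSeek-R1-Distill-Qwen-32B-Q2-6",  # local output directory
    quantize=True,
    quant_predicate="mixed_2_6",
)
```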

## Use with mlx

```bash
pip install mlx-lm
python -m mlx_lm.chat --model pcuenq/DeepSeek-R1-Distill-Qwen-32B-Q2-6 --max-tokens 10000 --temp 0.6 --top-p 0.7
```
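The model can also be used from Python through the mlx-lm API. A minimal sketch with the same sampling settings as the CLI example above, assuming a recent mlx-lm where sampling parameters are passed via `make_sampler` (older releases accepted `temp`/`top_p` directly in `generate`); the prompt text is just an illustrative placeholder:

```python
# Sketch: load the quantized model and generate a response in Python.
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

model, tokenizer = load("pcuenq/DeepSeek-R1-Distill-Qwen-32B-Q2-6")

# Apply the chat template so the distilled model sees the expected format.
messages = [{"role": "user", "content": "Explain mixed-bit quantization briefly."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Same sampling settings as the CLI command above.
sampler = make_sampler(temp=0.6, top_p=0.7)
text = generate(model, tokenizer, prompt=prompt, max_tokens=10000, sampler=sampler, verbose=True)
```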