s1 32B GGUF

Original model: s1 32B

Model creator: simplescaling

s1 is a reasoning model fine-tuned from Qwen2.5-32B-Instruct on just 1,000 examples. It matches o1-preview and exhibits test-time scaling via budget forcing.

This repo contains GGUF format model files for simplescaling’s s1 32B, an open reproduction of OpenAI’s o1-preview trained on 1,000 reasoning traces, with the model weights, source code, and data all released openly (see s1K).
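Budget forcing, the test-time scaling technique from the s1 paper, works by intervening on the model’s thinking phase: if the model tries to end its reasoning before a token budget is spent, the end-of-thinking delimiter is suppressed and a continuation cue like “Wait” is appended so decoding keeps going. A toy sketch of the control loop, where `generate` stands in for one step of any incremental decoding loop and `<|end_think|>` is a placeholder delimiter (not s1’s actual token):

```python
def budget_force(generate, prompt, min_thinking_steps=256):
    """Toy sketch of budget forcing: if the model emits its
    end-of-thinking delimiter before the minimum budget is spent,
    suppress it and append 'Wait' so reasoning continues."""
    thinking = []
    while True:
        chunk = generate(prompt + "".join(thinking))
        if chunk == "<|end_think|>":
            if len(thinking) < min_thinking_steps:
                thinking.append("Wait")  # force more test-time compute
                continue
            break  # budget satisfied; let the model stop thinking
        thinking.append(chunk)
    return "".join(thinking)
```

The paper also describes the symmetric cap (forcing thinking to *end* once a maximum budget is hit); the sketch above shows only the extension side.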

Learn more in simplescaling’s s1 GitHub repo and arXiv preprint.

What is GGUF?

GGUF is a file format for storing models for inference with llama.cpp. It was introduced by the llama.cpp team on August 21st, 2023, and replaces the older GGML format, which llama.cpp no longer supports. These files were converted with llama.cpp build 4628 (revision cde3833), using autogguf-rs.
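As a quick sanity check on a downloaded file: a GGUF file begins with the 4-byte ASCII magic `GGUF`, followed by a little-endian uint32 format version. A minimal header reader (illustrative helper, not part of llama.cpp):

```python
import struct

def read_gguf_header(path):
    """Read the 8-byte GGUF preamble: the ASCII magic 'GGUF'
    followed by a little-endian uint32 format version."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file (magic={magic!r})")
        (version,) = struct.unpack("<I", f.read(4))
    return version
```

Useful for catching truncated or mislabeled downloads before loading a 20+ GB file.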

Prompt template: ChatML

<|im_start|>system
{{system_message}}<|im_end|>
<|im_start|>user
{{prompt}}<|im_end|>
<|im_start|>assistant

Download & run with cnvrs on iPhone, iPad, and Mac!

cnvrs.ai

cnvrs is the best app for private, local AI on your device:

  • create & save Characters with custom system prompts & temperature settings
  • download and experiment with any GGUF model you can find on HuggingFace!
  • make it your own with custom Theme colors
  • powered by Metal ⚡️ & Llama.cpp, with haptics during response streaming!
  • try it out yourself today, on TestFlight!
  • follow cnvrs on Twitter to stay up to date

Original Model Evaluation

Table 1: s1-32B is an open and sample-efficient reasoning model. We evaluate s1-32B, Qwen, and Gemini (some entries are unknown (N.A.), see §4). Other results are from the respective reports (Qwen et al., 2024; Team, 2024b; OpenAI, 2024; DeepSeek-AI et al., 2025; Labs, 2025; Team, 2025).

# ex. = number examples used for reasoning finetuning; BF = budget forcing.

via s1: Simple test-time scaling (4.1 Results)

| Model | # ex. | AIME 2024 | MATH 500 | GPQA Diamond |
|---|---|---|---|---|
| **API only** | | | | |
| o1-preview | N.A. | 44.6 | 85.5 | 73.3 |
| o1-mini | N.A. | 70.0 | 90.0 | 60.0 |
| o1 | N.A. | 74.4 | 94.8 | 77.3 |
| Gemini 2.0 Flash Think. | N.A. | 60.0 | N.A. | N.A. |
| **Open Weights** | | | | |
| Qwen2.5-32B-Instruct | N.A. | 26.7 | 84.0 | 49.0 |
| QwQ-32B | N.A. | 50.0 | 90.6 | 65.2 |
| r1 | ≫800K | 79.8 | 97.3 | 71.5 |
| r1-distill | 800K | 72.6 | 94.3 | 62.1 |
| **Open Weights and Open Data** | | | | |
| Sky-T1 | 17K | 43.3 | 82.4 | 56.8 |
| Bespoke-32B | 17K | 63.3 | 93.0 | 58.1 |
| s1 w/o BF | 1K | 50.0 | 92.6 | 56.6 |
| s1-32B | 1K | 56.7 | 93.0 | 59.6 |
GGUF details

Architecture: qwen2
Model size: 32.8B params
Quantizations available: 1-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, and 16-bit