---
base_model: simplescaling/s1-32B
pipeline_tag: text-generation
inference: true
language:
- en
license: apache-2.0
model_creator: simplescaling
model_name: s1-32B
model_type: qwen2
datasets:
- simplescaling/s1K
quantized_by: brittlewis12
---

# s1 32B GGUF

**Original model**: [s1 32B](https://huggingface.co/simplescaling/s1-32B)

**Model creator**: [simplescaling](https://huggingface.co/simplescaling)

> s1 is a reasoning model finetuned from Qwen2.5-32B-Instruct on just 1,000 examples. It matches o1-preview & exhibits test-time scaling via budget forcing.

This repo contains GGUF format model files for simplescaling’s s1 32B, an open reproduction of OpenAI’s o1-preview trained on 1,000 reasoning traces, released with model weights, source code, and data (see [s1K](https://huggingface.co/datasets/simplescaling/s1K)).

Learn more on simplescaling’s [s1 GitHub repo](https://github.com/simplescaling/s1) & [arXiv preprint](https://arxiv.org/abs/2501.19393).

### What is GGUF?

GGUF is a file format for representing AI models. It is the third version of the format, introduced by the llama.cpp team on August 21st, 2023. It replaces GGML, which is no longer supported by llama.cpp.

Converted with llama.cpp build 4628 (revision [cde3833](https://github.com/ggerganov/llama.cpp/commits/cde383323959544abe10a4d79e1d3e1ee479933c)), using [autogguf-rs](https://github.com/brittlewis12/autogguf-rs).

### Prompt template: ChatML

```
<|im_start|>system
{{system_message}}<|im_end|>
<|im_start|>user
{{prompt}}<|im_end|>
<|im_start|>assistant
```

---

## Download & run with [cnvrs](https://twitter.com/cnvrsai) on iPhone, iPad, and Mac!
![cnvrs.ai](https://pbs.twimg.com/profile_images/1744049151241797632/0mIP-P9e_400x400.jpg)

[cnvrs](https://testflight.apple.com/join/sFWReS7K) is the best app for private, local AI on your device:
- create & save **Characters** with custom system prompts & temperature settings
- download and experiment with any **GGUF model** you can [find on HuggingFace](https://huggingface.co/models?library=gguf)!
- make it your own with custom **Theme colors**
- powered by Metal ⚡️ & [Llama.cpp](https://github.com/ggerganov/llama.cpp), with **haptics** during response streaming!
- **try it out** yourself today, on [TestFlight](https://testflight.apple.com/join/sFWReS7K)!
- follow [cnvrs on twitter](https://twitter.com/cnvrsai) to stay up to date

---

## Original Model Evaluation

> **Table 1**:
> s1-32B is an open and sample-efficient reasoning model.
> We evaluate s1-32B, Qwen, and Gemini (some entries are unknown (N.A.), see §4).
> Other results are from the respective reports (Qwen et al., 2024; Team, 2024b; OpenAI, 2024; DeepSeek-AI et al., 2025; Labs, 2025; Team, 2025).
>
> \# ex. = number of examples used for reasoning finetuning; BF = budget forcing.

via [s1: Simple test-time scaling (4.1 Results)](https://arxiv.org/html/2501.19393v2#S4:~:text=Table%201%3A,BF%20%3D%20budget%20forcing.)

| Model | # ex. | AIME 2024 | MATH 500 | GPQA Diamond |
|-------|-------|-----------|----------|--------------|
| **API only** | | | | |
| o1-preview | N.A. | 44.6 | 85.5 | 73.3 |
| o1-mini | N.A. | 70.0 | 90.0 | 60.0 |
| o1 | N.A. | **74.4** | **94.8** | **77.3** |
| Gemini 2.0 Flash Think. | N.A. | 60.0 | N.A. | N.A. |
| **Open Weights** | | | | |
| Qwen2.5-32B-Instruct | N.A. | 26.7 | 84.0 | 49.0 |
| QwQ-32B | N.A. | 50.0 | 90.6 | 65.2 |
| r1 | >>800K | **79.8** | **97.3** | **71.5** |
| r1-distill | 800K | 72.6 | 94.3 | 62.1 |
| **Open Weights and Open Data** | | | | |
| Sky-T1 | 17K | 43.3 | 82.4 | 56.8 |
| Bespoke-32B | 17K | **63.3** | **93.0** | 58.1 |
| s1 w/o BF | 1K | 50.0 | 92.6 | 56.6 |
| s1-32B | 1K | **56.7** | **93.0** | **59.6** |
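---

## Usage note: building a ChatML prompt

When running these GGUF files with a raw completion endpoint (rather than a chat API that applies the template for you), the ChatML prompt template shown earlier must be rendered manually. Below is a minimal, hedged sketch in Python of that rendering; the helper name `build_prompt` and the default system message are illustrative assumptions, not part of the original model card.

```python
# Minimal sketch: render the ChatML template from this card as a single
# prompt string for completion-style inference (e.g. llama.cpp's raw mode).
# The card writes placeholders as {{system_message}} / {{prompt}}; in
# Python's str.format, single braces serve the same role.

CHATML_TEMPLATE = (
    "<|im_start|>system\n"
    "{system_message}<|im_end|>\n"
    "<|im_start|>user\n"
    "{prompt}<|im_end|>\n"
    "<|im_start|>assistant\n"
)

def build_prompt(prompt: str, system_message: str = "You are a helpful assistant.") -> str:
    """Fill the ChatML template for a single-turn request.

    The trailing "<|im_start|>assistant\n" leaves the prompt open for the
    model to generate the assistant turn.
    """
    return CHATML_TEMPLATE.format(system_message=system_message, prompt=prompt)

print(build_prompt("How many r's are in 'raspberry'?"))
```

The assistant tag is deliberately left unclosed: generation should stop when the model emits `<|im_end|>`, which is typically configured as a stop token in the inference runtime.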