---
base_model: simplescaling/s1-32B
pipeline_tag: text-generation
inference: true
language:
- en
license: apache-2.0
model_creator: simplescaling
model_name: s1-32B
model_type: qwen2
datasets:
- simplescaling/s1K
quantized_by: brittlewis12
---

# s1 32B GGUF

**Original model**: [s1 32B](https://huggingface.co/simplescaling/s1-32B)

**Model creator**: [simplescaling](https://huggingface.co/simplescaling)

> s1 is a reasoning model finetuned from Qwen2.5-32B-Instruct on just 1,000 examples. It matches o1-preview & exhibits test-time scaling via budget forcing.

This repo contains GGUF format model files for simplescaling’s s1 32B, an open reproduction of OpenAI’s o1-preview trained on just 1,000 reasoning traces, with the model, source code, and data all released (see [s1K](https://huggingface.co/datasets/simplescaling/s1K)).

Learn more at simplescaling’s [s1 GitHub repo](https://github.com/simplescaling/s1) & the [arXiv preprint](https://arxiv.org/abs/2501.19393).
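
The "budget forcing" mentioned above is the paper's test-time scaling trick: bound how many thinking tokens the model may emit, and when more reasoning is wanted, suppress the end-of-thinking delimiter and append "Wait" so the model keeps going. A simplified sketch of such a decode loop (the `next_token` callback, token strings, and delimiter below are illustrative stand-ins, not this repo's actual API or s1's exact special tokens):

```python
def budget_force(next_token, min_thinking=8, max_thinking=32,
                 end_of_thinking="<|im_start|>answer"):
    """Generate thinking tokens while forcing the budget into [min, max].

    next_token: callable taking the tokens so far, returning the next token.
    If the model tries to stop reasoning before min_thinking tokens, the
    end-of-thinking delimiter is suppressed and "Wait" is appended instead.
    """
    tokens = []
    while len(tokens) < max_thinking:
        tok = next_token(tokens)
        if tok == end_of_thinking:
            if len(tokens) < min_thinking:
                tokens.append("Wait")  # suppress early stop, extend reasoning
                continue
            break  # budget satisfied: let the model move on to its answer
        tokens.append(tok)
    return tokens

# Toy stand-in decoder: emits three reasoning steps, then tries to stop.
def toy_model(tokens):
    return "step" if len(tokens) < 3 else "<|im_start|>answer"

print(budget_force(toy_model))  # padded out to min_thinking=8 tokens
```

With a real backend, `next_token` would wrap a sampling call; the loop logic stays the same.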
### What is GGUF?

GGUF is a file format for representing AI models. It is the third version of the format, introduced by the llama.cpp team on August 21st, 2023, as a replacement for GGML, which is no longer supported by llama.cpp.

Converted with llama.cpp build 4628 (revision [cde3833](https://github.com/ggerganov/llama.cpp/commits/cde383323959544abe10a4d79e1d3e1ee479933c)), using [autogguf-rs](https://github.com/brittlewis12/autogguf-rs).
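
A quick way to sanity-check a downloaded file: per the GGUF spec, every file begins with the ASCII magic `GGUF` followed by a little-endian `uint32` version (3 for files produced by current llama.cpp). A minimal header check in Python (illustrative only, not a full parser; the function name is ours):

```python
import os
import struct
import tempfile

def gguf_magic_version(path):
    """Return (magic, version) from a GGUF file's first 8 bytes."""
    with open(path, "rb") as f:
        magic = f.read(4)
        (version,) = struct.unpack("<I", f.read(4))
    if magic != b"GGUF":
        raise ValueError(f"not a GGUF file: magic={magic!r}")
    return magic.decode("ascii"), version

# Demo against a synthetic 8-byte header; a real model file works the same.
tmp = tempfile.NamedTemporaryFile(suffix=".gguf", delete=False)
tmp.write(b"GGUF" + struct.pack("<I", 3))
tmp.close()
print(gguf_magic_version(tmp.name))
os.unlink(tmp.name)
```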
### Prompt template: ChatML

```
<|im_start|>system
{{system_message}}<|im_end|>
<|im_start|>user
{{prompt}}<|im_end|>
<|im_start|>assistant

```
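
When calling the model through llama.cpp or its bindings directly (rather than a chat-aware frontend that applies the template for you), the template above has to be filled in manually. A minimal helper (the function name and example strings are ours, not part of llama.cpp):

```python
def chatml_prompt(system_message, prompt):
    """Render the ChatML template, leaving the assistant turn open
    so the model generates the reply."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

print(chatml_prompt("You are a helpful assistant.", "What is 1 + 1?"))
```

The rendered string can then be passed as the raw prompt to e.g. `llama-cli -p "..."` or llama-cpp-python; chat-completion APIs that already apply a chat template should not receive it pre-rendered.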
---
## Download & run with [cnvrs](https://twitter.com/cnvrsai) on iPhone, iPad, and Mac!

![cnvrs.ai](https://pbs.twimg.com/profile_images/1744049151241797632/0mIP-P9e_400x400.jpg)

[cnvrs](https://testflight.apple.com/join/sFWReS7K) is the best app for private, local AI on your device:
- create & save **Characters** with custom system prompts & temperature settings
- download and experiment with any **GGUF model** you can [find on HuggingFace](https://huggingface.co/models?library=gguf)!
- make it your own with custom **Theme colors**
- powered by Metal ⚡️ & [Llama.cpp](https://github.com/ggerganov/llama.cpp), with **haptics** during response streaming!
- **try it out** yourself today, on [TestFlight](https://testflight.apple.com/join/sFWReS7K)!
- follow [cnvrs on twitter](https://twitter.com/cnvrsai) to stay up to date

---
## Original Model Evaluation

> **Table 1**:
> s1-32B is an open and sample-efficient reasoning model.
> We evaluate s1-32B, Qwen, and Gemini (some entries are unknown (N.A.), see §4).
> Other results are from the respective reports (Qwen et al., 2024; Team, 2024b; OpenAI, 2024; DeepSeek-AI et al., 2025; Labs, 2025; Team, 2025).
>
> \# ex. = number of examples used for reasoning finetuning; BF = budget forcing.

via [s1: Simple test-time scaling (4.1 Results)](https://arxiv.org/html/2501.19393v2#S4:~:text=Table%201%3A,BF%20%3D%20budget%20forcing.)

| Model | # ex. | AIME 2024 | MATH 500 | GPQA Diamond |
|-------|-------|-----------|----------|--------------|
| **API only** | | | | |
| o1-preview | N.A. | 44.6 | 85.5 | 73.3 |
| o1-mini | N.A. | 70.0 | 90.0 | 60.0 |
| o1 | N.A. | **74.4** | **94.8** | **77.3** |
| Gemini 2.0 Flash Think. | N.A. | 60.0 | N.A. | N.A. |
| **Open Weights** | | | | |
| Qwen2.5-32B-Instruct | N.A. | 26.7 | 84.0 | 49.0 |
| QwQ-32B | N.A. | 50.0 | 90.6 | 65.2 |
| r1 | >>800K | **79.8** | **97.3** | **71.5** |
| r1-distill | 800K | 72.6 | 94.3 | 62.1 |
| **Open Weights and Open Data** | | | | |
| Sky-T1 | 17K | 43.3 | 82.4 | 56.8 |
| Bespoke-32B | 17K | **63.3** | **93.0** | 58.1 |
| s1 w/o BF | 1K | 50.0 | 92.6 | 56.6 |
| s1-32B | 1K | **56.7** | **93.0** | **59.6** |