---
base_model: simplescaling/s1-32B
pipeline_tag: text-generation
inference: true
language:
- en
license: apache-2.0
model_creator: simplescaling
model_name: s1-32B
model_type: qwen2
datasets:
- simplescaling/s1K
quantized_by: brittlewis12
---

# s1 32B GGUF

**Original model**: [s1 32B](https://huggingface.co/simplescaling/s1-32B)

**Model creator**: [simplescaling](https://huggingface.co/simplescaling)

> s1 is a reasoning model finetuned from Qwen2.5-32B-Instruct on just 1,000 examples. It matches o1-preview & exhibits test-time scaling via budget forcing.

This repo contains GGUF format model files for simplescaling’s s1 32B, an open reproduction of OpenAI’s o1-preview trained on just 1,000 reasoning traces, with the model, source code, and data all released (see [s1K](https://huggingface.co/datasets/simplescaling/s1K)).

Learn more on simplescaling’s [s1 GitHub repo](https://github.com/simplescaling/s1) & [arXiv preprint](https://arxiv.org/abs/2501.19393).

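Budget forcing, as described in the paper, controls thinking length at decode time: if the model tries to end its thinking phase before a minimum token budget is spent, a continuation string such as "Wait" is appended to encourage further reasoning; once a maximum budget is reached, the end-of-thinking delimiter is forced so the model moves on to its answer. A minimal, model-agnostic sketch — `generate_step`, the `Wait` string, and the delimiter below are illustrative assumptions, not the paper’s exact implementation:

```python
def budget_force(generate_step, prompt, min_tokens=0, max_tokens=512,
                 end_think="<|im_start|>answer", wait="Wait"):
    """Decode-time budget forcing (sketch).

    generate_step(text) -> next token string; an illustrative stand-in
    for a real decoding loop over a language model.
    """
    text = prompt
    spent = 0
    while spent < max_tokens:
        tok = generate_step(text)
        if tok == end_think:
            if spent >= min_tokens:
                break          # budget satisfied: let the answer phase begin
            tok = wait         # ended too early: force more thinking instead
        text += tok
        spent += 1
    return text + end_think    # force the end-of-thinking delimiter
```

In a real stack the same effect is achieved by suppressing the end-of-thinking delimiter in the sampler until the budget is met, rather than string matching.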
### What is GGUF?

GGUF is a file format for representing AI models. It is the third version of the format, introduced by the llama.cpp team on August 21st, 2023. It replaces GGML, which is no longer supported by llama.cpp.

Converted with llama.cpp build 4628 (revision [cde3833](https://github.com/ggerganov/llama.cpp/commits/cde383323959544abe10a4d79e1d3e1ee479933c)), using [autogguf-rs](https://github.com/brittlewis12/autogguf-rs).

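For illustration, a GGUF file opens with a small fixed header (the 4-byte magic `GGUF`, then a little-endian version, tensor count, and metadata key/value count), which can be inspected in a few lines of Python. Field names follow the GGUF specification in the ggml repository:

```python
import struct

def read_gguf_header(path):
    """Read the fixed GGUF header: 4-byte magic, uint32 version,
    uint64 tensor_count, uint64 metadata_kv_count (little-endian)."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file (magic={magic!r})")
        version, n_tensors, n_kv = struct.unpack("<IQQ", f.read(20))
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}
```

The metadata key/value section that follows the header carries the model architecture, tokenizer, and quantization details that llama.cpp reads at load time.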
### Prompt template: ChatML

```
<|im_start|>system
{{system_message}}<|im_end|>
<|im_start|>user
{{prompt}}<|im_end|>
<|im_start|>assistant

```

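The template above can be filled in programmatically; a small helper (the function name is illustrative) that renders a single ChatML turn, leaving the assistant tag open for the model to complete:

```python
def chatml_prompt(system_message, prompt):
    """Render a single-turn ChatML prompt; the assistant tag is left
    open so the model generates its response from there."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
```

Most llama.cpp frontends apply this template automatically when the GGUF metadata declares a chat template, so manual formatting is only needed for raw completion endpoints.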
---

## Download & run with [cnvrs](https://twitter.com/cnvrsai) on iPhone, iPad, and Mac!

![cnvrs.ai](https://pbs.twimg.com/profile_images/1744049151241797632/0mIP-P9e_400x400.jpg)

[cnvrs](https://testflight.apple.com/join/sFWReS7K) is the best app for private, local AI on your device:
- create & save **Characters** with custom system prompts & temperature settings
- download and experiment with any **GGUF model** you can [find on HuggingFace](https://huggingface.co/models?library=gguf)!
- make it your own with custom **Theme colors**
- powered by Metal ⚡️ & [Llama.cpp](https://github.com/ggerganov/llama.cpp), with **haptics** during response streaming!
- **try it out** yourself today, on [TestFlight](https://testflight.apple.com/join/sFWReS7K)!
- follow [cnvrs on twitter](https://twitter.com/cnvrsai) to stay up to date

---

## Original Model Evaluation

> **Table 1**:
> s1-32B is an open and sample-efficient reasoning model.
> We evaluate s1-32B, Qwen, and Gemini (some entries are unknown (N.A.), see §4).
> Other results are from the respective reports (Qwen et al., 2024; Team, 2024b; OpenAI, 2024; DeepSeek-AI et al., 2025; Labs, 2025; Team, 2025).
>
> \# ex. = number of examples used for reasoning finetuning; BF = budget forcing.

via [s1: Simple test-time scaling (4.1 Results)](https://arxiv.org/html/2501.19393v2#S4:~:text=Table%201%3A,BF%20%3D%20budget%20forcing.)

| Model | # ex. | AIME 2024 | MATH 500 | GPQA Diamond |
|-------|-------|-----------|----------|--------------|
| **API only** | | | | |
| o1-preview | N.A. | 44.6 | 85.5 | 73.3 |
| o1-mini | N.A. | 70.0 | 90.0 | 60.0 |
| o1 | N.A. | **74.4** | **94.8** | **77.3** |
| Gemini 2.0 Flash Think. | N.A. | 60.0 | N.A. | N.A. |
| **Open Weights** | | | | |
| Qwen2.5-32B-Instruct | N.A. | 26.7 | 84.0 | 49.0 |
| QwQ-32B | N.A. | 50.0 | 90.6 | 65.2 |
| r1 | >>800K | **79.8** | **97.3** | **71.5** |
| r1-distill | 800K | 72.6 | 94.3 | 62.1 |
| **Open Weights and Open Data** | | | | |
| Sky-T1 | 17K | 43.3 | 82.4 | 56.8 |
| Bespoke-32B | 17K | **63.3** | **93.0** | 58.1 |
| s1 w/o BF | 1K | 50.0 | 92.6 | 56.6 |
| s1-32B | 1K | **56.7** | **93.0** | **59.6** |