OpenThinker-32B
This model is a fine-tuned version of Qwen/Qwen2.5-32B-Instruct on the OpenThoughts-114k dataset.
The dataset was derived by distilling DeepSeek-R1 using the data pipeline available on GitHub. More information can be found on the OpenThoughts-114k dataset card.
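Below is a minimal usage sketch, assuming the standard Hugging Face `transformers` chat-template API; the prompt is only an illustration, and dtype/device settings should be adjusted to your hardware.

```python
# Minimal usage sketch for open-thoughts/OpenThinker-32B (illustrative, not the
# reference inference setup). Requires transformers and a sufficiently large GPU.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-thoughts/OpenThinker-32B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick the checkpoint's native precision
    device_map="auto",    # spread the 32B model across available GPUs
)

messages = [{"role": "user", "content": "How many positive divisors does 360 have?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```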
The numbers reported in the table below were evaluated with our open-source evaluation tool Evalchemy.
Model Name | Dataset Size | AIME24 I/II | AIME25 I | MATH500 | GPQA Diamond | LCBv2 |
---|---|---|---|---|---|---|
LIMO-32B | 0.8k | 56.7 | 49.3 | 86.6 | 58.1 | 60.0 |
s1-32B | 1k | 36.0 | 25.3 | 84.8 | 50.5 | 40.9 |
s1.1-32B | 1k | 64.7 | 49.3 | 89.0 | 60.1 | 65.5 |
DeepSeek-R1-Distill-Qwen-32B | 800k (closed) | 76.7 | 55.9 | 89.4 | 57.6 | 71.2 |
OpenThinker-32B | 114k | 66.0 | 53.3 | 90.6 | 61.6 | 68.9 |
We are fully open-source. Our model weights, datasets, data generation code, evaluation code, and training code are all publicly available.
| | Open Weights | Open Data | Open Code |
|---|---|---|---|
| OpenThinker-32B | ✅ | ✅ | ✅ |
| DeepSeek-R1-Distill-Qwen-32B | ✅ | ❌ | ❌ |
| OpenAI/Gemini | ❌ | ❌ | ❌ |
Intended uses & limitations
Apache 2.0 License
Training procedure
We finetune Qwen2.5-32B-Instruct on OpenThoughts-114k for 3 epochs with a 16k context length using LlamaFactory. Our full training configuration is provided in our repository. Training the 32B model on OpenThoughts-114k was done on AWS SageMaker with 8xH100 P5 nodes; on 4 nodes, this took around 90 hours. For training on OpenThoughts-Unverified-173k, we used 96 4xA100 nodes (64 GB per GPU) on the Leonardo Supercomputer; training took 30 hours, for a total of 11,520 A100-hours.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 32
- gradient_accumulation_steps: 3
- total_train_batch_size: 96
- total_eval_batch_size: 256
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3.0
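For readability, here is a rough sketch of how these hyperparameters map onto `transformers.TrainingArguments`. The actual run used LlamaFactory with the configuration in our repository, so the output path and any settings not listed above are hypothetical; the total train batch size of 96 follows from 32 devices × 1 per-device batch × 3 accumulation steps.

```python
# Illustrative mapping of the hyperparameters above onto TrainingArguments.
# Not the actual training entry point (that was LlamaFactory).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="openthinker-32b-sft",   # hypothetical output directory
    learning_rate=1e-5,
    per_device_train_batch_size=1,      # 32 GPUs x 1 x 3 accumulation = 96 total
    per_device_eval_batch_size=8,       # 32 GPUs x 8 = 256 total eval batch size
    gradient_accumulation_steps=3,
    num_train_epochs=3.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    seed=42,
)
```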
Framework versions
- Transformers 4.46.1
- PyTorch 2.3.0
- Datasets 3.1.0
- Tokenizers 0.20.3
More info can be found in our repository: https://github.com/open-thoughts/open-thoughts.
Citation
```bibtex
@misc{openthoughts,
  author = {Team, OpenThoughts},
  month = jan,
  title = {{Open Thoughts}},
  howpublished = {https://open-thoughts.ai},
  year = {2025}
}
```
Links
- Open Thoughts Launch Blog Post
- Open Thoughts Measuring Reasoning with Evalchemy Blog Post
- Open Thoughts OpenThinker-32B Post
- Open Thoughts GitHub Repository
- OpenThoughts-114k dataset
- OpenThoughts-Unverified-173k dataset
- OpenThinker-7B model
- OpenThinker-7B-Unverified model
- OpenThinker-32B model (this model)
- OpenThinker-32B-Unverified model