Model Card for benjaminzwhite/Qwen2.5-3B-Instruct_GSM8K-GRPO_16bit
This is an early experiment in training reasoning models with the GRPOTrainer and the Unsloth library. It is not intended for real use, but it should work adequately for simple prompt tests and easy mathematics questions.
(You can run this using the code below on a free Colab/Kaggle basic GPU account for testing.)
NOTE: If you are interested in reasoning models and research in this area, I maintain an up-to-date resource list here: https://github.com/benjaminzwhite/reasoning-models
Example query:
"What is the smallest prime number greater than 50 ?"
Example response:
"<reasoning>\nTo find the smallest prime number greater than 50, we can start checking from 51 onwards for primality. A prime number is a number that has no divisors other than 1 and itself. We check each number to see if it's divisible by any number other than 1 and itself.\n</reasoning>\n<answer>\n53\n</answer>"
How to Get Started with the Model
To use this with standard HuggingFace code, I recommend starting with the code below, which is about 95% the default code shown on the base model page: https://huggingface.co/Qwen/Qwen2.5-3B-Instruct
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "benjaminzwhite/Qwen2.5-3B-Instruct_GSM8K-GRPO_16bit"
# model loading
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# system prompt used during training
SYSTEM_PROMPT = """
Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>
"""
# your query goes here
user_prompt = "What is the smallest prime number greater than 50 ?"
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": user_prompt}
]
# default Qwen2.5 code from this point ...
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# keep only the newly generated tokens (drop the prompt tokens from each output)
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
# sample answer obtained to my query, to show expected format
# (note that the answer, 53, is correct here)
"""
"<reasoning>\nTo find the smallest prime number greater than 50, we can start checking from 51 onwards for primality. A prime number is a number that has no divisors other than 1 and itself. We check each number to see if it's divisible by any number other than 1 and itself.\n</reasoning>\n<answer>\n53\n</answer>"
"""
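If you want just the final answer rather than the full response string, here is a minimal sketch of how the tagged output could be parsed. It assumes the model kept to the <reasoning>/<answer> format from the system prompt, which is not guaranteed for every query. The helper name extract_tag is hypothetical: it is not part of transformers and was not used in training.

```python
import re
from typing import Optional


def extract_tag(response: str, tag: str) -> Optional[str]:
    """Return the text inside the first <tag>...</tag> pair, or None if absent.

    Hypothetical helper for illustration only; assumes the response follows
    the <reasoning>/<answer> format requested by SYSTEM_PROMPT.
    """
    pattern = rf"<{re.escape(tag)}>\s*(.*?)\s*</{re.escape(tag)}>"
    match = re.search(pattern, response, flags=re.DOTALL)
    return match.group(1) if match else None


# shortened version of the sample response shown above
sample_response = (
    "<reasoning>\nTo find the smallest prime number greater than 50, "
    "we can start checking from 51 onwards for primality.\n</reasoning>\n"
    "<answer>\n53\n</answer>"
)

answer = extract_tag(sample_response, "answer")
reasoning = extract_tag(sample_response, "reasoning")
print(answer)
```

In the generation code above you would call extract_tag(response, "answer") on the decoded response. A None result means the model did not produce the expected tags, so it is worth checking for that before using the value.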
Training Details
Training Data
Trained on the GSM8K mathematics dataset.