Update README.md
README.md CHANGED
@@ -39,16 +39,15 @@ This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggi
It has been trained using [TRL](https://github.com/huggingface/trl).

-question = "Mia can decorate 2 dozen Easter eggs per hour. Her little brother Billy can only decorate 10 eggs per hour. They need to decorate 170 eggs for the Easter egg hunt. If they work together, how long will it take them to decorate all the eggs?"
-generator = pipeline("text-generation", model="justinj92/Qwen2.5-1.5B-Thinking", device="cuda")
-output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
-print(output["generated_text"])
-```
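For reference, the removed quick-start snippet calls `pipeline` without a visible import or opening fence (those lines are not legible in this hunk). A self-contained version of what was removed would look roughly like the sketch below, assuming a recent `transformers` release whose text-generation pipeline accepts chat-style message lists:

```python
# Self-contained form of the removed quick-start snippet (reference only).
from transformers import pipeline

question = (
    "Mia can decorate 2 dozen Easter eggs per hour. Her little brother Billy can only "
    "decorate 10 eggs per hour. They need to decorate 170 eggs for the Easter egg hunt. "
    "If they work together, how long will it take them to decorate all the eggs?"
)

# device="cuda" assumes a GPU is available; use device="cpu" otherwise.
generator = pipeline("text-generation", model="justinj92/Qwen2.5-1.5B-Thinking", device="cuda")

# Chat-style input relies on the pipeline applying the model's chat template.
output = generator(
    [{"role": "user", "content": question}],
    max_new_tokens=128,
    return_full_text=False,
)[0]
print(output["generated_text"])
```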
+### Usage Recommendations
+
+**We recommend adhering to the following configuration when using the model, including for benchmarking, to achieve the expected performance:**
+
+1. Set the temperature within the range of 0.5-0.7 (0.6 is recommended) to prevent endless repetition or incoherent output.
+2. **For mathematical problems, include a directive in your prompt such as: "Please reason step by step, and put your final answer within \boxed{}."** (see the example below)
+3. When evaluating model performance, run multiple tests and average the results.
+4. The model has not been tuned for domains other than maths.
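Putting the recommendations above into practice with the same `pipeline` API: the sketch below samples at the recommended temperature of 0.6, appends the step-by-step/`\boxed{}` directive to the prompt, runs several generations, and keeps the most common boxed answer. The number of runs, `top_p`, `max_new_tokens`, and the regex-based answer extraction are illustrative assumptions, not settings taken from the card.

```python
import re
from collections import Counter

from transformers import pipeline

generator = pipeline("text-generation", model="justinj92/Qwen2.5-1.5B-Thinking", device="cuda")

question = (
    "Mia can decorate 2 dozen Easter eggs per hour. Her little brother Billy can only "
    "decorate 10 eggs per hour. They need to decorate 170 eggs for the Easter egg hunt. "
    "If they work together, how long will it take them to decorate all the eggs?"
)
# Recommendation 2: ask for step-by-step reasoning and a boxed final answer.
prompt = question + " Please reason step by step, and put your final answer within \\boxed{}."

answers = []
for _ in range(4):  # Recommendation 3: sample several times instead of trusting one run.
    text = generator(
        [{"role": "user", "content": prompt}],
        max_new_tokens=512,
        do_sample=True,
        temperature=0.6,  # Recommendation 1: stay in the 0.5-0.7 range.
        top_p=0.95,
        return_full_text=False,
    )[0]["generated_text"]
    match = re.search(r"\\boxed\{([^}]*)\}", text)
    if match:
        answers.append(match.group(1).strip())

# Majority vote over the extracted answers (one simple way to aggregate runs).
print(Counter(answers).most_common(1)[0][0] if answers else "no boxed answer found")
```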
## Evals
| Model | GSM8k 0-Shot | GSM8k Few-Shot |
@@ -57,17 +56,13 @@ print(output["generated_text"])
| Qwen2.5-1.5B-Thinking | 14.4 | 63.31 |
## Training procedure
<img src="https://raw.githubusercontent.com/wandb/wandb/fc186783c86c33980e5c73f13363c13b2c5508b1/assets/logo-dark.svg" alt="Weights & Biases Logged" width="150" height="24"/>
<img src="https://huggingface.co/justinj92/Qwen2.5-1.5B-Thinking/resolve/main/w%26b_qwen_r1.png" width="1024" height="800"/>
-Trained on 1xH100 96GB via Azure Cloud.
-GRPO'd on Maths related problems due to GPU Credit constraints.
+Trained on 1xH100 96GB via Azure Cloud (East US2).

This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://huggingface.co/papers/2402.03300).
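The card names TRL as the training library and GRPO as the method, but does not include the training script. For orientation only, here is a minimal sketch of how a GRPO run can be wired up with TRL's `GRPOTrainer`; the dataset, reward function, and hyperparameters are illustrative placeholders rather than the configuration actually used for this model.

```python
# Hypothetical GRPO setup with TRL's GRPOTrainer (assumes a recent trl release).
# Dataset, reward function, and hyperparameters are placeholders, not the
# author's actual training configuration.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# GRPOTrainer expects a dataset with a "prompt" column; GSM8K questions are used
# here purely as an example of maths-style prompts.
dataset = load_dataset("openai/gsm8k", "main", split="train")
dataset = dataset.map(lambda row: {"prompt": row["question"]})

def boxed_reward(completions, **kwargs):
    # Toy reward: favour completions that produce a \boxed{...} final answer.
    return [1.0 if "\\boxed{" in completion else 0.0 for completion in completions]

training_args = GRPOConfig(output_dir="qwen2.5-1.5b-grpo", per_device_train_batch_size=4)
trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-1.5B-Instruct",
    reward_funcs=boxed_reward,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```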