justinj92 committed · Commit e14b4a3 · verified · 1 Parent(s): 2050059

Update README.md

Files changed (1): README.md (+8 -13)

README.md CHANGED
@@ -39,16 +39,15 @@ This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggi
 It has been trained using [TRL](https://github.com/huggingface/trl).
 
 
-## Quick start
-
-```python
-from transformers import pipeline
-
-question = "Mia can decorate 2 dozen Easter eggs per hour. Her little brother Billy can only decorate 10 eggs per hour. They need to decorate 170 eggs for the Easter egg hunt. If they work together, how long will it take them to decorate all the eggs?"
-generator = pipeline("text-generation", model="justinj92/Qwen2.5-1.5B-Thinking", device="cuda")
-output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
-print(output["generated_text"])
-```
 ## Evals
 
 | Model | GSM8k 0-Shot | GSM8k Few-Shot |
@@ -57,17 +56,13 @@ print(output["generated_text"])
 | Qwen2.5-1.5B-Thinking | 14.4 | 63.31 |
 
 
-
-
 ## Training procedure
 
 <img src="https://raw.githubusercontent.com/wandb/wandb/fc186783c86c33980e5c73f13363c13b2c5508b1/assets/logo-dark.svg" alt="Weights & Biases Logged" width="150" height="24"/>
 
 <img src="https://huggingface.co/justinj92/Qwen2.5-1.5B-Thinking/resolve/main/w%26b_qwen_r1.png" width="1024" height="800"/>
 
-Trained on 1xH100 96GB via Azure Cloud.
-
-GRPO'd on Maths related problems due to GPU Credit constraints.
 
 This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://huggingface.co/papers/2402.03300).
 It has been trained using [TRL](https://github.com/huggingface/trl).
 
 
+### Usage Recommendations
+
+**We recommend adhering to the following configurations when using the model, including for benchmarking, to achieve the expected performance:**
+
+1. Set the temperature within the range of 0.5-0.7 (0.6 is recommended) to prevent endless repetition or incoherent output.
+2. **For mathematical problems, it is advisable to include a directive in your prompt such as: "Please reason step by step, and put your final answer within \boxed{}."**
+3. When evaluating model performance, it is recommended to conduct multiple tests and average the results.
+4. This model is not enhanced for domains other than Maths.
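The recommended settings above can be sketched with the `transformers` text-generation pipeline from the removed Quick start snippet. The helper names, the local import, and the 512-token budget are illustrative assumptions, not part of the model card:

```python
# Sketch: applying the recommended settings (temperature 0.6, \boxed{} directive).

# Directive from recommendation 2 above.
DIRECTIVE = "Please reason step by step, and put your final answer within \\boxed{}."

def build_messages(question: str) -> list:
    """Append the step-by-step directive to a user question."""
    return [{"role": "user", "content": f"{question}\n{DIRECTIVE}"}]

def ask(question: str, model_id: str = "justinj92/Qwen2.5-1.5B-Thinking") -> str:
    """Generate an answer with the recommended sampling settings."""
    from transformers import pipeline  # heavy import kept local

    generator = pipeline("text-generation", model=model_id, device="cuda")
    output = generator(
        build_messages(question),
        max_new_tokens=512,   # leave room for the reasoning trace
        do_sample=True,
        temperature=0.6,      # recommended range: 0.5-0.7
        return_full_text=False,
    )[0]
    return output["generated_text"]
```

For benchmarking (recommendation 3), call `ask` several times per question and average the scores.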
 
 
 
 
 
 
 ## Evals
 
 | Model | GSM8k 0-Shot | GSM8k Few-Shot |
 | Qwen2.5-1.5B-Thinking | 14.4 | 63.31 |
 
 ## Training procedure
 
 <img src="https://raw.githubusercontent.com/wandb/wandb/fc186783c86c33980e5c73f13363c13b2c5508b1/assets/logo-dark.svg" alt="Weights & Biases Logged" width="150" height="24"/>
 
 <img src="https://huggingface.co/justinj92/Qwen2.5-1.5B-Thinking/resolve/main/w%26b_qwen_r1.png" width="1024" height="800"/>
 
+Trained on 1xH100 96GB via Azure Cloud (East US2).
 
 This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://huggingface.co/papers/2402.03300).
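A GRPO run of this shape can be sketched with TRL's `GRPOTrainer` (available in recent TRL releases). The reward function and dataset below are illustrative assumptions; the card does not state the actual reward design or training data for this checkpoint:

```python
# Sketch (assumption): GRPO fine-tuning via TRL. The reward and dataset are
# placeholders, not the ones used to train this checkpoint.

def boxed_answer_reward(completions, **kwargs):
    """Toy reward: 1.0 when a completion contains a \\boxed{...} answer."""
    return [1.0 if "\\boxed{" in c else 0.0 for c in completions]

def train():
    from datasets import load_dataset
    from trl import GRPOConfig, GRPOTrainer

    # Placeholder dataset with a "prompt" column; swap in a maths dataset.
    dataset = load_dataset("trl-lib/tldr", split="train")
    trainer = GRPOTrainer(
        model="Qwen/Qwen2.5-1.5B-Instruct",  # base model per the card
        reward_funcs=boxed_answer_reward,
        args=GRPOConfig(output_dir="Qwen2.5-1.5B-Thinking"),
        train_dataset=dataset,
    )
    trainer.train()
```

Call `train()` to launch a run; on a single H100 as above, batch size and generation settings in `GRPOConfig` would need tuning to fit memory.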