migtissera
/

Tess-R1-Limerick-Llama-3.1-70B

Model card Files Files and versions Community

migtissera commited on Nov 5, 2024

Commit

f81c056

·

verified ·

1 Parent(s): 0523050

Update README.md

Files changed (1) hide show

README.md +11 -10

README.md CHANGED Viewed

@@ -12,17 +12,8 @@ model-index:
 <br>
-# Evaluations
-|              | Tess-R1 Limerick | Claude 3.5 Haiku | GPT-4o mini |
-|--------------|------------------|------------------|-------------|
-| GPQA         | 41.5%            | 41.6%           | 40.2%       |
-| MMLU         | 81.6%            | -               | 82.0%       |
-| MATH         | 64.2%            | 69.4%           | 70.2%       |
-| MMLU-Pro     | 65.6%            | 65.0%           | -           |
-| HumanEval    |             | 88.1%           | 87.2%       |
-| DROP (F1 Score) |         | 83.1%           | 79.7%       |
 Welcome to the Tess-Reasoning-1 (Tess-R1) series of models. Tess-R1 is designed with test-time compute in mind, and has the capabilities to produce a Chain-of-Thought (CoT) reasoning before producing the final output.
@@ -36,6 +27,16 @@ The model is trained to first think step-by-step, and contemplate on its answers
 # Important Note:
 In a multi-turn conversation, only the contents between the `<output>` `</output>` tags (discarding the tags) should be carried forward. Otherwise the model will see out of distribution input data and will fail.
 # Prompt Format

 <br>
+# Introduction
 Welcome to the Tess-Reasoning-1 (Tess-R1) series of models. Tess-R1 is designed with test-time compute in mind, and has the capabilities to produce a Chain-of-Thought (CoT) reasoning before producing the final output.
 # Important Note:
 In a multi-turn conversation, only the contents between the `<output>` `</output>` tags (discarding the tags) should be carried forward. Otherwise the model will see out of distribution input data and will fail.
+# Evaluations
+|              | Tess-R1 Limerick | Claude 3.5 Haiku | GPT-4o mini |
+|--------------|------------------|------------------|-------------|
+| GPQA         | 41.5%            | 41.6%           | 40.2%       |
+| MMLU         | 81.6%            | -               | 82.0%       |
+| MATH         | 64.2%            | 69.4%           | 70.2%       |
+| MMLU-Pro     | 65.6%            | 65.0%           | -           |
+| HumanEval    |             | 88.1%           | 87.2%       |
+| DROP (F1 Score) |         | 83.1%           | 79.7%       |
 # Prompt Format