TIGER-Lab/Qwen2.5-Math-7B-CFT



ubowang committed 13 days ago

Commit da34804 · verified · 1 parent: b18b91d

Update README.md

Files changed (1)

README.md +6 -1


@@ -49,7 +49,7 @@ The model demonstrates that learning to critique is more effective than learning
 ### Training Data
-- Dataset: [WebInstruct-CFT-50K](https://huggingface.co/datasets/TIGER-Lab/WebInstruct-CFT-50K)
+- Dataset: [WebInstruct-CFT-50K](https://huggingface.co/datasets/TIGER-Lab/WebInstruct-CFT)
 - Training format: (input=[query; noisy response], output=critique)
 - Teacher model: GPT-4o for generating critiques
@@ -60,6 +60,11 @@ The model demonstrates that learning to critique is more effective than learning
 - Training time: ~1 hour with DeepSpeed Zero-3
+## Evaluation Results
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/636a35eff8d9af4aea181608/ifPVcA7-aAdzbxX8U6wat.png)
 For more details about the model architecture, methodology, and comprehensive evaluation results, please visit our [project webpage](https://tiger-ai-lab.github.io/CritiqueFineTuning).
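
The training format named in the README, (input=[query; noisy response], output=critique), can be pictured with a small sketch. This is only an illustration under stated assumptions: the `CFTExample` class, the `build_cft_example` helper, the prompt wording, and the toy record are all hypothetical and are not taken from the model's training code or from the dataset.

```python
from dataclasses import dataclass


@dataclass
class CFTExample:
    """One critique fine-tuning pair: the model reads `input` and learns to emit `output`."""
    input: str
    output: str


def build_cft_example(query: str, noisy_response: str, critique: str) -> CFTExample:
    # Hypothetical prompt layout: the README only says the input is the
    # query concatenated with a noisy response; the exact template is assumed.
    prompt = (
        f"Question:\n{query}\n\n"
        f"Candidate solution:\n{noisy_response}\n\n"
        "Critique the candidate solution."
    )
    # The target is the critique itself, not a corrected answer.
    return CFTExample(input=prompt, output=critique)


# Made-up toy record, for illustration only.
example = build_cft_example(
    query="What is 12 * 12?",
    noisy_response="12 * 12 = 124",
    critique="The multiplication is wrong: 12 * 12 = 144, so the answer 124 is incorrect.",
)
```

The point of the layout is that the possibly wrong response sits on the input side, so the supervised target is the judgment about it rather than an imitation of a reference answer.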