Qwen2.5-Math-7B-CFT
Introduction
Qwen2.5-Math-7B-CFT is a 7B-parameter mathematical reasoning model that introduces a paradigm shift in language model training. Rather than using traditional supervised fine-tuning (SFT) to imitate correct responses, the model is trained with our Critique Fine-Tuning (CFT) approach, which teaches it to critique and analyze candidate responses, leading to deeper understanding and stronger reasoning.
The model demonstrates that learning to critique is more effective than learning to imitate. Despite being trained on just 50K samples, it matches or exceeds models trained on 2M+ samples, reaching 79.4% accuracy on MATH and 41.6% on OlympiadBench.
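To make the contrast concrete, the following schematic compares an SFT training target with a CFT training target for the same query. The query, candidate solution, and critique text are invented for illustration; the actual data comes from WebInstruct-CFT-50K (see Training Details below).

```python
query = "Compute 6 * 7."

# SFT: the model is trained to reproduce a correct response.
sft_example = {"input": query, "output": "6 * 7 = 42."}

# CFT: the model sees a candidate (possibly wrong) response and is
# trained to produce a critique of that response instead.
cft_example = {
    "input": query + "\n\nCandidate solution: 6 * 7 = 36.",
    "output": (
        "The candidate confused 6 * 7 with 6 * 6. "
        "The correct value is 6 * 7 = 42, so the solution is incorrect."
    ),
}
```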
Key Features
- Novel training methodology inspired by human learning processes that emphasize critical thinking
- Consistent 4-10% improvement over traditional SFT approaches across six math benchmarks
- Exceptional data efficiency: matches performance of models trained on 40x more data
- Built on the strong foundation of Qwen2.5-Math-7B
Training Details
Training Data
- Dataset: WebInstruct-CFT-50K
- Training format: input = [query; noisy response], output = critique (see the sketch after this list)
- Teacher model: GPT-4o generates the critiques
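Below is a minimal sketch of how a single CFT training example might be assembled, with the critique requested from GPT-4o via the OpenAI Python client. The prompt wording, helper name, and example problem are illustrative assumptions, not the exact pipeline used to build WebInstruct-CFT-50K.

```python
# Illustrative CFT example construction (not the exact WebInstruct-CFT-50K
# pipeline). Requires `pip install openai` and OPENAI_API_KEY in the env.
from openai import OpenAI

client = OpenAI()

def make_cft_example(query: str, noisy_response: str) -> dict:
    """Build one CFT sample: the model sees [query; noisy response] as
    input and is trained to produce a critique of that response."""
    # Ask the teacher model (GPT-4o) to critique the candidate solution.
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": (
                    "Critique the following solution step by step, point "
                    "out any errors, and conclude whether it is correct.\n\n"
                    f"Question: {query}\n\nSolution: {noisy_response}"
                ),
            }
        ],
    )
    critique = completion.choices[0].message.content
    # The supervised target is the critique, not the correct answer.
    return {
        "input": f"Question: {query}\n\nSolution: {noisy_response}",
        "output": critique,
    }

example = make_cft_example(
    query="What is 17 * 24?",
    # Noisy response with an arithmetic slip (17 * 4 = 68, not 64).
    noisy_response="17 * 24 = 17 * 20 + 17 * 4 = 340 + 64 = 404.",
)
print(example["output"])
```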
Training Infrastructure
- Framework: LLaMA-Factory
- Hardware: 8x NVIDIA H100 GPUs
- Training time: ~1 hour with DeepSpeed ZeRO-3 (a launch sketch follows this list)
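A minimal launch sketch under the following assumptions: LLaMA-Factory is installed and exposes the `llamafactory-cli` entry point, and the CFT dataset has been registered in its `dataset_info.json`. The YAML keys mirror LLaMA-Factory's standard SFT config examples; the hyperparameter values and dataset name are assumptions, not the released training recipe.

```python
# Illustrative launch script: writes a LLaMA-Factory style config and starts
# training. Hyperparameters and dataset name are placeholders.
import subprocess
from pathlib import Path

config = """\
model_name_or_path: Qwen/Qwen2.5-Math-7B
stage: sft                      # CFT reuses the SFT objective on critique targets
do_train: true
finetuning_type: full
deepspeed: examples/deepspeed/ds_z3_config.json  # ZeRO-3, as in the card
dataset: webinstruct_cft_50k    # must be registered in dataset_info.json
template: qwen
cutoff_len: 4096
output_dir: saves/qwen2.5-math-7b-cft
per_device_train_batch_size: 4
gradient_accumulation_steps: 4
learning_rate: 5.0e-6
num_train_epochs: 1.0
bf16: true
"""

Path("qwen2.5_math_cft.yaml").write_text(config)
subprocess.run(["llamafactory-cli", "train", "qwen2.5_math_cft.yaml"], check=True)
```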
Evaluation Results
Table 1: Performance comparison of Qwen2.5-Math-7B-CFT against other reasoning-specialized models.
For the full results table, along with details about the model architecture, methodology, and comprehensive evaluation, please visit our project webpage.
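For quick local experimentation, the model can be loaded with Hugging Face `transformers` in the usual way. A minimal sketch follows; the hub id is assumed, so substitute the id shown on this model page if it differs.

```python
# Minimal inference sketch with Hugging Face transformers.
# Requires `pip install transformers accelerate torch`.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TIGER-Lab/Qwen2.5-Math-7B-CFT"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "Solve for x: 2x + 3 = 11. Show your reasoning."}
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens.
print(
    tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
)
```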