---
license: apache-2.0
datasets:
- TIGER-Lab/WebInstruct-CFT
language:
- en
base_model:
- Qwen/Qwen2.5-32B-Instruct
tags:
- cft
- math
- reasoning
pipeline_tag: text-generation
library_name: transformers
---
# Qwen2.5-32B-Instruct-CFT

## Introduction
Qwen2.5-32B-Instruct-CFT is a 32B-parameter model fine-tuned with our Critique Fine-Tuning (CFT) approach. Built on the Qwen2.5-32B-Instruct base model, this variant is trained to critique and analyze responses rather than simply imitate them, which enhances its reasoning capabilities.
## Key Features
- Built on the powerful Qwen2.5-32B-Instruct foundation
- Trained using Critique Fine-Tuning (CFT) methodology
- Highly data-efficient: fine-tuned on only 4K critique examples
- Inherits the strong instruction-following capabilities of the base model
## Training Details

### Training Data
- Dataset: WebInstruct-CFT-4K
- Training format: each input pairs a query with a noisy candidate response; the target output is a critique of that response
- Teacher model: GPT-4o for generating critiques
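The training format above can be sketched in a few lines of Python. The exact prompt wording used in WebInstruct-CFT-4K is not reproduced here; the template below is an illustrative stand-in for how a (query, noisy response) → critique example might be assembled:

```python
# Illustrative sketch of one CFT training example.
# The prompt template is hypothetical, not the dataset's actual wording.

def make_cft_example(query: str, noisy_response: str, critique: str) -> dict:
    """Pack (query, noisy response) as the input and the critique as the target."""
    prompt = (
        f"Question:\n{query}\n\n"
        f"Candidate response:\n{noisy_response}\n\n"
        "Critique the candidate response: point out any errors "
        "and judge whether it is correct."
    )
    return {"input": prompt, "output": critique}

example = make_cft_example(
    query="What is 17 * 24?",
    noisy_response="17 * 24 = 398",
    critique="Incorrect: 17 * 24 = 408, so the final answer should be 408.",
)
```

Unlike standard SFT, the loss is computed on the critique rather than on a reference answer, so the model learns to evaluate responses instead of imitating them.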
### Training Infrastructure
- Framework: LLaMA-Factory
- Hardware: 8x NVIDIA H100 GPUs
- Training time: ~1.5 hours with DeepSpeed ZeRO-3
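The setup above could be expressed as a LLaMA-Factory training config along these lines. This is a hypothetical sketch, not the authors' actual configuration: the dataset registration name, output path, and all hyperparameter values are assumptions.

```yaml
# Hypothetical LLaMA-Factory config; all values are illustrative.
model_name_or_path: Qwen/Qwen2.5-32B-Instruct
stage: sft                       # CFT reuses the standard SFT objective on critique targets
do_train: true
finetuning_type: full
dataset: webinstruct_cft_4k      # assumed dataset registration name
template: qwen
deepspeed: examples/deepspeed/ds_z3_config.json   # ZeRO-3, matching the setup above
output_dir: saves/qwen2.5-32b-instruct-cft        # assumed path
per_device_train_batch_size: 1
num_train_epochs: 1.0
bf16: true
```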
For more details about the model architecture, methodology, and comprehensive evaluation results, please visit our project webpage.