---
license: apache-2.0
datasets:
- TIGER-Lab/WebInstruct-CFT
language:
- en
base_model:
- Qwen/Qwen2.5-32B-Instruct
tags:
- cft
- math
- reasoning
pipeline_tag: text-generation
library_name: transformers
---
# Qwen2.5-32B-Instruct-CFT

## Introduction
Qwen2.5-32B-Instruct-CFT is a 32B-parameter model fine-tuned with our Critique Fine-Tuning (CFT) approach. Built on the Qwen2.5-32B-Instruct base model, this variant is trained to critique and analyze responses rather than simply imitate them, which enhances its reasoning capabilities.
## Key Features
- Built on the powerful Qwen2.5-32B-Instruct foundation
- Trained using Critique Fine-Tuning (CFT) methodology
- Highly data-efficient: fine-tuned on only 4K critique examples
- Inherits the strong instruction-following capabilities of the base model
## Training Details

### Training Data
- Dataset: WebInstruct-CFT-4K
- Training format: each input pairs a query with a noisy candidate response; the target output is a critique of that response
- Teacher model: GPT-4o for generating critiques
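The training format above can be sketched in a few lines of Python. The exact prompt wording used in WebInstruct-CFT-4K is not reproduced here; the template below is an illustrative stand-in for how a (query, noisy response) → critique example might be assembled:

```python
# Illustrative sketch of one CFT training example.
# The prompt template is hypothetical, not the dataset's actual wording.

def make_cft_example(query: str, noisy_response: str, critique: str) -> dict:
    """Pack (query, noisy response) as the input and the critique as the target."""
    prompt = (
        f"Question:\n{query}\n\n"
        f"Candidate response:\n{noisy_response}\n\n"
        "Critique the candidate response: point out any errors "
        "and judge whether it is correct."
    )
    return {"input": prompt, "output": critique}

example = make_cft_example(
    query="What is 17 * 24?",
    noisy_response="17 * 24 = 398",
    critique="Incorrect: 17 * 24 = 408, so the final answer should be 408.",
)
```

Unlike standard SFT, the loss is computed on the critique rather than on a reference answer, so the model learns to evaluate responses instead of imitating them.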
### Training Infrastructure
- Framework: LLaMA-Factory
- Hardware: 8x NVIDIA H100 GPUs
- Training time: ~1.5 hours with DeepSpeed ZeRO-3
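The setup above could be expressed as a LLaMA-Factory training config along these lines. This is a hypothetical sketch, not the authors' actual configuration: the dataset registration name, output path, and all hyperparameter values are assumptions.

```yaml
# Hypothetical LLaMA-Factory config; all values are illustrative.
model_name_or_path: Qwen/Qwen2.5-32B-Instruct
stage: sft                       # CFT reuses the standard SFT objective on critique targets
do_train: true
finetuning_type: full
dataset: webinstruct_cft_4k      # assumed dataset registration name
template: qwen
deepspeed: examples/deepspeed/ds_z3_config.json   # ZeRO-3, matching the setup above
output_dir: saves/qwen2.5-32b-instruct-cft        # assumed path
per_device_train_batch_size: 1
num_train_epochs: 1.0
bf16: true
```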
For more details about the model architecture, methodology, and comprehensive evaluation results, please visit our project webpage.