---
license: apache-2.0
datasets:
  - TIGER-Lab/WebInstruct-CFT
language:
  - en
base_model:
  - Qwen/Qwen2.5-32B-Instruct
tags:
  - cft
  - math
  - reasoning
pipeline_tag: text-generation
library_name: transformers
---

Qwen2.5-32B-Instruct-CFT

Introduction

Qwen2.5-32B-Instruct-CFT is a 32B parameter model fine-tuned using our novel Critique Fine-Tuning (CFT) approach. Built upon the Qwen2.5-32B-Instruct base model, this variant is trained to critique and analyze responses rather than simply imitate them, leading to enhanced reasoning capabilities.

Key Features

  • Built on the powerful Qwen2.5-32B-Instruct foundation
  • Trained using Critique Fine-Tuning (CFT) methodology
  • Data-efficient: fine-tuned on only 4K critique examples (WebInstruct-CFT-4K)
  • Inherits the strong instruction-following capabilities of the base model
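Since the model is a standard `transformers` causal LM, it can be used with the usual chat-template API. The sketch below is illustrative only: the repo id, prompt wording, and generation settings are assumptions, not values from this card.

```python
def build_messages(question, candidate_solution):
    """Build a chat prompt asking the model to critique a candidate
    solution, mirroring the critique task the model was fine-tuned on.
    The prompt wording here is illustrative, not the official template."""
    return [
        {
            "role": "user",
            "content": (
                f"Question: {question}\n\n"
                f"Candidate solution: {candidate_solution}\n\n"
                "Please critique the candidate solution step by step and "
                "state whether its final answer is correct."
            ),
        }
    ]


def generate_critique(question, candidate_solution, max_new_tokens=512):
    """Load the model and generate a critique. The repo id below is an
    assumption -- substitute the actual Hugging Face model id."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "TIGER-Lab/Qwen2.5-32B-Instruct-CFT"  # assumed repo id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    prompt = tokenizer.apply_chat_template(
        build_messages(question, candidate_solution),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )


# Example call (downloads the 32B checkpoint; requires a large GPU):
# print(generate_critique("What is 2 + 2?", "2 + 2 = 5"))
```

Note that a 32B-parameter checkpoint needs multi-GPU or high-memory hardware; `device_map="auto"` lets Accelerate shard it across available devices.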

Training Details

Training Data

  • Dataset: WebInstruct-CFT-4K
  • Training format: input = a query paired with a noisy candidate response; output = a critique of that response
  • Teacher model: GPT-4o for generating critiques
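The training format above can be sketched as a small helper that packs a query and a noisy response into one input and uses the critique as the target. Field names and prompt wording here are illustrative, not the exact WebInstruct-CFT schema.

```python
def build_cft_example(query, noisy_response, critique):
    """Assemble one CFT training pair: the input combines the query with
    a noisy candidate response, and the supervision target is a critique
    (in the released dataset, generated by GPT-4o)."""
    prompt = (
        f"Question: {query}\n\n"
        f"Candidate solution: {noisy_response}\n\n"
        "Critique the candidate solution: identify any errors and state "
        "whether the final answer is correct."
    )
    return {"input": prompt, "output": critique}


# Toy example with a deliberately wrong candidate answer.
example = build_cft_example(
    "What is 17 * 24?",
    "17 * 24 = 398",
    "The multiplication is wrong: 17 * 24 = 408, not 398. "
    "The final answer is incorrect.",
)
```

Unlike standard supervised fine-tuning, the model never learns to reproduce the noisy response; it only learns to critique it, which is the core idea of CFT.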

Training Infrastructure

  • Framework: LLaMA-Factory
  • Hardware: 8x NVIDIA H100 GPUs
  • Training time: ~1.5 hours with DeepSpeed ZeRO-3
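For readers who want to reproduce a similar run, LLaMA-Factory is driven by a YAML config. The fragment below is a hedged sketch: the dataset name, hyperparameters, and paths are illustrative assumptions, not the values actually used for this model.

```yaml
### model
model_name_or_path: Qwen/Qwen2.5-32B-Instruct

### method (full fine-tuning with DeepSpeed ZeRO-3)
stage: sft
do_train: true
finetuning_type: full
deepspeed: examples/deepspeed/ds_z3_config.json

### dataset (assumes the CFT data is registered in dataset_info.json)
dataset: webinstruct_cft_4k   # illustrative name
template: qwen
cutoff_len: 4096

### training (illustrative values, not the authors' hyperparameters)
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 5.0e-6
num_train_epochs: 1.0
bf16: true
output_dir: saves/qwen2.5-32b-cft
```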

For more details about the model architecture, methodology, and comprehensive evaluation results, please visit our project webpage.