CoT-Valve: Length-Compressible Chain-of-Thought Tuning
Abstract
Chain-of-Thought (CoT) significantly enhances a model's reasoning capability, but it also brings a considerable increase in inference cost due to long reasoning chains. Observing that reasoning paths can be easily compressed on easy tasks but resist compression on hard ones, we explore the feasibility of elastically controlling the length of reasoning paths with a single model, thereby reducing the inference overhead of reasoning models dynamically according to task difficulty. We introduce a new tuning and inference strategy named CoT-Valve, designed to allow models to generate reasoning chains of varying lengths. To achieve this, we propose identifying a direction in the parameter space that, when manipulated, effectively controls the length of the generated CoT. Moreover, we show that this property is valuable for compressing the reasoning chain. We construct datasets with chains ranging from long to short for the same questions and explore two enhanced strategies for CoT-Valve: (1) a precise length-compressible CoT tuning method, and (2) a progressive chain-length compression approach. Our experiments show that CoT-Valve successfully enables controllability and compressibility of the chain and outperforms prompt-based control. Applying this method to QwQ-32B-Preview, we reduce reasoning chains on GSM8K from 741 to 225 tokens with a minor performance drop (95.07% to 94.92%), and on AIME from 6827 to 4629 tokens with only one additional incorrect answer.
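The core mechanism described in the abstract is a direction Δθ in parameter space whose scaling controls the length of the generated CoT, i.e. weights of the form θ(α) = θ + α·Δθ. Below is a minimal sketch of that interpolation, assuming the direction is stored as a (possibly partial) state dict of per-parameter deltas, e.g. realized via a LoRA-style branch; the function name, sign, and scale conventions here are illustrative, not the paper's released API.

```python
import torch

def apply_cot_valve(base_state, delta_state, alpha):
    """Shift model weights along a length-controlling direction.

    base_state:  state_dict of the long-CoT model (theta)
    delta_state: parameter-space direction (delta_theta) whose scaling
                 correlates with reasoning-chain length; may cover only
                 a subset of parameters (e.g. LoRA-adapted layers)
    alpha:       interpolation factor; e.g. alpha = 0 keeps the original
                 long chains, larger alpha yields shorter chains
                 (conventions are illustrative, not the paper's exact setup)
    """
    return {
        name: w + alpha * delta_state[name] if name in delta_state else w
        for name, w in base_state.items()
    }

# Toy demo on a two-parameter "model"
base = {"layer.weight": torch.ones(2, 2), "head.bias": torch.zeros(2)}
delta = {"layer.weight": -0.1 * torch.ones(2, 2)}  # direction toward shorter CoT
short_cot_weights = apply_cot_valve(base, delta, alpha=2.0)
print(short_cot_weights["layer.weight"])  # tensor of 0.8s
```

With a single model and a fixed Δθ, sweeping α then trades reasoning-chain length against accuracy at inference time, which is what allows the chain length to be matched to task difficulty.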
Community
Datasets: https://huggingface.co./collections/horseee/cot-valve-67ae3d0b2f6bddd288504839
Code (to be released soon): https://github.com/horseee/CoT-Valve
The following similar papers were recommended by the Semantic Scholar API:
- RedStar: Does Scaling Long-CoT Data Unlock Better Slow-Reasoning Systems? (2025)
- Demystifying Long Chain-of-Thought Reasoning in LLMs (2025)
- Enhancing Generalization in Chain of Thought Reasoning for Smaller Models (2025)
- Unveiling the Mechanisms of Explicit CoT Training: How Chain-of-Thought Enhances Reasoning Generalization (2025)
- Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought (2025)
- Enhancing Auto-regressive Chain-of-Thought through Loop-Aligned Reasoning (2025)
- LLMs Can Easily Learn to Reason from Demonstrations; Structure, not content, is what matters! (2025)