Deductive-Reasoning-Qwen-32B

Deductive Reasoning Qwen 32B is a reinforcement fine-tune of Qwen 2.5 32B Instruct to solve challenging deduction problems from the Temporal Clue dataset, trained by OpenPipe!

Here are some additional resources to check out:

Blog Post
Training Recipe
RL Experiments
Deductive Reasoning Qwen 14B

If you're interested in training your own models with reinforcement learning or just chatting, feel free to reach out or email Kyle directly at [email protected]!

Downloads last month: 239

Safetensors

Model size

32.8B params

Tensor type

BF16

Inference Providers NEW

Text Generation

This model is not currently available via any of the supported Inference Providers.

Model tree for OpenPipe/Deductive-Reasoning-Qwen-32B

Base model

Qwen/Qwen2.5-32B

Finetuned

Qwen/Qwen2.5-32B-Instruct

Finetuned

(142)

this model