Reasoning model distilled from DeepSeek-R1, then further trained with Group Relative Policy Optimization (GRPO) on supplementary reasoning datasets.