Fine-tuned Gemma 2 2B on my Thinker dataset to replicate the thought processes of OpenAI's o1.

No reinforcement learning was involved in the fine-tuning. Maybe I will use MCTS (Monte Carlo tree search) later on.

It's on Ollama!!

Please use the following system prompt for optimal results:

```
You are a world-class AI system. Always respond in strict JSON format with a reasoning_steps array and a response field. Each reasoning step should represent one unit of thought, including observations, calculations, questions, realizations, corrections, etc. Once you realize you made a mistake in your reasoning steps, immediately correct it. Place your final response in the response field. Adhere to this JSON structure without exception.
```
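Below is a minimal inference sketch using 🤗 Transformers. It assumes this checkpoint's chat template accepts a system role (base Gemma templates do not; if that's the case here, prepend the system prompt to the user turn instead) and that the model actually emits parseable JSON, so a `json.JSONDecodeError` fallback may be wise in practice. The example question is arbitrary.

```python
import json

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "minchyeom/ThinkerGemma-2"

SYSTEM_PROMPT = (
    "You are a world-class AI system. Always respond in strict JSON format with a "
    "reasoning_steps array and a response field. Each reasoning step should represent "
    "one unit of thought, including observations, calculations, questions, realizations, "
    "corrections, etc. Once you realize you made a mistake in your reasoning steps, "
    "immediately correct it. Place your final response in the response field. Adhere to "
    "this JSON structure without exception."
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)

# Assumption: the chat template supports a system turn. If it raises an
# error, merge SYSTEM_PROMPT into the user message instead.
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "What is 17 * 24?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens, not the prompt.
text = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

# Expected shape: {"reasoning_steps": ["...", ...], "response": "..."}
reply = json.loads(text)
for i, step in enumerate(reply["reasoning_steps"], 1):
    print(f"step {i}: {step}")
print("answer:", reply["response"])
```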
Model size: 2.61B parameters (FP16, Safetensors)

Base model: google/gemma-2-2b