---
base_model: mistralai/Mistral-Small-24B-Instruct-2501
datasets:
- ServiceNow-AI/R1-Distill-SFT
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
- sft
license: apache-2.0
language:
- en
---

# mistral-small-r1-tensopolis
This model is a reasoning fine-tune of unsloth/mistral-small-24b-instruct-2501-unsloth-bnb-4bit, trained on a single A100 for about 100 hours. Please refer to the base model and dataset pages for more information about the license, prompt format, etc.
- **Base model:** [mistralai/Mistral-Small-24B-Instruct-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501)
- **Dataset:** [ServiceNow-AI/R1-Distill-SFT](https://huggingface.co/datasets/ServiceNow-AI/R1-Distill-SFT)
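
A minimal inference sketch with `transformers` is shown below. The repository id is a placeholder for wherever this fine-tune is hosted, and the prompt and generation settings are illustrative, not a recommended configuration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/mistral-small-r1-tensopolis"  # placeholder repo id or local path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

messages = [
    {"role": "system", "content": "You are a helpful assistant that reasons step by step."},
    {"role": "user", "content": "What is 17 * 24?"},
]

# The chat template bundled with the tokenizer produces the V7-Tekken format described below.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```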
## Basic Instruct Template (V7-Tekken)
`<s>[SYSTEM_PROMPT]<system prompt>[/SYSTEM_PROMPT][INST]<user message>[/INST]<assistant response></s>[INST]<user message>[/INST]`
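
To make the string layout explicit, the sketch below assembles a single-turn prompt in this format by hand. In practice `tokenizer.apply_chat_template` does this for you, so the helper is purely illustrative.

```python
def build_prompt(system_prompt: str, user_message: str) -> str:
    # Expand the V7-Tekken template up to the point where the assistant responds.
    return (
        "<s>"
        f"[SYSTEM_PROMPT]{system_prompt}[/SYSTEM_PROMPT]"
        f"[INST]{user_message}[/INST]"
    )

print(build_prompt("You are a concise assistant.", "Explain what SFT means."))
```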
This Mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
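
The outline below sketches what an Unsloth + TRL SFT run of this kind can look like. The hyperparameters, LoRA settings, dataset subset, and column layout are assumptions, not the exact recipe used for this model.

```python
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

max_seq_length = 8192  # assumption; pick to fit memory and data

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-small-24b-instruct-2501-unsloth-bnb-4bit",
    max_seq_length=max_seq_length,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Subset/split names and the message column are assumptions; adjust to the
# actual schema of ServiceNow-AI/R1-Distill-SFT.
dataset = load_dataset("ServiceNow-AI/R1-Distill-SFT", "v1", split="train")

def to_text(example):
    # Render each conversation into a single training string via the chat template.
    return {"text": tokenizer.apply_chat_template(example["messages"], tokenize=False)}

dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        max_seq_length=max_seq_length,
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```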