---
base_model: unsloth/llama-3.3-70b-instruct-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
datasets:
- codelion/Sky-T1_data_17k
- codelion/optillm-router-dataset
metrics:
- accuracy
---

# Llama-3.3-70B-o1 Thinker Model

This model was fine-tuned on chain-of-thought (CoT) reasoning traces. It responds with a _thinking_ trace between the `<|begin_of_thought|>` and `<|end_of_thought|>` tags, followed by the final answer between the `<|begin_of_solution|>` and `<|end_of_solution|>` tags.
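
For example, the two spans can be pulled out of a raw completion with a little string handling. This is only a minimal sketch based on the tag format described above:

```python
import re

def parse_response(text: str) -> dict:
    """Split a raw completion into its thinking trace and final answer."""
    thought = re.search(
        r"<\|begin_of_thought\|>(.*?)<\|end_of_thought\|>", text, re.DOTALL
    )
    solution = re.search(
        r"<\|begin_of_solution\|>(.*?)<\|end_of_solution\|>", text, re.DOTALL
    )
    return {
        "thinking": thought.group(1).strip() if thought else None,
        "answer": solution.group(1).strip() if solution else None,
    }
```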

Compared to the base Llama model, this thinker model tends to generate far more tokens. If you are benchmarking, make sure the response contains the full generated text, ending with the `<|end_of_solution|>` tag. For most queries you will need to set `max_tokens` to at least 8192.
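
A rough sketch of inference with `transformers` follows; the repo id is assumed from this card, and you should adjust quantization and device settings to your hardware:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codelion/Llama-3.3-70B-o1"  # assumed repo id for this model card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "How many positive divisors does 360 have?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Use a generous budget so generation is not cut off before <|end_of_solution|>.
output = model.generate(input_ids, max_new_tokens=8192)
# Keep special tokens so the thought/solution tags remain visible.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=False))
```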

The GGUF quants for the model are available at [Llama-3.3-70B-o1-gguf](https://huggingface.co./codelion/Llama-3.3-70B-o1-gguf).
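
A minimal sketch for running the GGUF weights locally with `llama-cpp-python`; the quant filename pattern below is a placeholder, so check the repo for the actual file names:

```python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="codelion/Llama-3.3-70B-o1-gguf",
    filename="*Q4_K_M.gguf",  # placeholder pattern; pick an actual quant from the repo
    n_ctx=16384,              # leave room for long thinking traces
)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the 10th prime number?"}],
    max_tokens=8192,
)
print(result["choices"][0]["message"]["content"])
```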

The model was trained with QLoRA fine-tuning. You can find the adapter at [Llama-3.3-70B-o1-lora](https://huggingface.co./codelion/Llama-3.3-70B-o1-lora).
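
If you would rather apply the adapter yourself, here is a minimal sketch with `peft` on top of the 4-bit base model listed above:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "unsloth/llama-3.3-70b-instruct-bnb-4bit"

# Loading the bnb-4bit checkpoint requires the bitsandbytes package.
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, "codelion/Llama-3.3-70B-o1-lora")
tokenizer = AutoTokenizer.from_pretrained(base_id)
```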

## Evaluation results

| Model | AIME 2024 pass@1 (%) |
|-------|----------------------|
| **Llama-3.3-70B-o1** | **46.7** |
| Llama-3.3-70B | 30.0 |
| Sky-T1-32B-Preview | 43.3 |
| o1-preview | 40.0 |
| QwQ | 50.0 |

- **Developed by:** codelion
- **License:** apache-2.0
- **Fine-tuned from model:** unsloth/llama-3.3-70b-instruct-bnb-4bit

This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)