|
---
license: llama3.3
datasets:
- DebateLabKIT/deepa2-conversations
- DebateLabKIT/deep-argmap-conversations
- allenai/tulu-3-sft-mixture
base_model:
- meta-llama/Llama-3.3-70B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- logic
- argumentation
- critical-thinking
- argument-mapping
- trl
- sft
---
|
|
|
# Model Card for Llama-3.3-Argunaut-1-70B-SFT |
|
|
|
This model is a fine-tuned version of [meta-llama/Llama-3.3-70B-Instruct](https://huggingface.co./meta-llama/Llama-3.3-70B-Instruct). |
|
It has been trained using [TRL](https://github.com/huggingface/trl). |
|
|
|
## Quick start |
|
|
|
```python
from transformers import pipeline

question = "Are you familiar with Argdown syntax? What's its purpose?"
generator = pipeline("text-generation", model="DebateLabKIT/Llama-3.3-Argunaut-1-70B-SFT", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```
|
|
|
|
|
## SFT dataset mixture |
|
|
|
|Dataset|Weight (examples)|Weight (tokens)|
|:------|:----:|:----:|
|DebateLabKIT/deepa2-conversations|25%|49%|
|DebateLabKIT/deep-argmap-conversations|25%|18%|
|allenai/tulu-3-sft-mixture|50%|33%|
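
The exact mixing script is not part of this card; as an illustration, a weighted mixture like the one in the table can be sketched with 🤗 Datasets' `interleave_datasets`. The split names, seed, and stopping strategy below are assumptions, not the original preprocessing pipeline.

```python
from datasets import interleave_datasets, load_dataset

# Load the three source datasets (split names are assumptions).
deepa2 = load_dataset("DebateLabKIT/deepa2-conversations", split="train")
argmap = load_dataset("DebateLabKIT/deep-argmap-conversations", split="train")
tulu = load_dataset("allenai/tulu-3-sft-mixture", split="train")

# Interleave according to the example weights from the table above (25/25/50).
mixture = interleave_datasets(
    [deepa2, argmap, tulu],
    probabilities=[0.25, 0.25, 0.5],
    seed=42,
    stopping_strategy="all_exhausted",
)
```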
|
|
|
|
|
## Training procedure |
|
|
|
Trained with SFT on **1M examples** for 1 epoch with:

* context length 8196
* packing (TRL implementation)
* *Spectrum* (top 30 percent)
|
|
|
```yaml
# Training parameters
num_train_epochs: 1
per_device_train_batch_size: 2
gradient_accumulation_steps: 8
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
learning_rate: 2.0e-6  # following _Tülu 3_ recipe
lr_scheduler_type: cosine
warmup_ratio: 0.1
```
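
For orientation, a minimal TRL setup mirroring the parameters above (packing enabled, Spectrum-style selective training) might look roughly like the sketch below. The model-loading details, the Spectrum module list, and the `mixture` dataset variable are placeholders, not the actual training script.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

model_id = "meta-llama/Llama-3.3-70B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Spectrum-style selective fine-tuning: freeze all parameters except the
# top-SNR modules identified by a prior Spectrum scan (list is a placeholder).
unfrozen_modules = ["model.layers.79.mlp.down_proj"]  # hypothetical entry
for name, param in model.named_parameters():
    param.requires_grad = any(m in name for m in unfrozen_modules)

training_args = SFTConfig(
    output_dir="Llama-3.3-Argunaut-1-70B-SFT",
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
    learning_rate=2.0e-6,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    max_seq_length=8196,
    packing=True,
    bf16=True,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=mixture,  # e.g. the interleaved mixture sketched above
    processing_class=tokenizer,
)
trainer.train()
```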
|
|
|
Hardware: 4 x H100 GPUs. |
|
|
|
_This work was performed on the HoreKa supercomputer funded by the Ministry of Science, Research and the Arts Baden-Württemberg and by the Federal Ministry of Education and Research._
|
|
|
### Framework versions |
|
|
|
- TRL: 0.12.1 |
|
- Transformers: 4.46.3 |
|
- Pytorch: 2.4.1 |
|
- Datasets: 3.1.0 |
|
- Tokenizers: 0.20.3 |
|
|
|
## Credits |
|
|
|
This work wouldn't have been possible without all the **great contributions from the open LLM community**. Thank you! Special kudos go to:
|
|
|
- @philschmid for his latest [fine-tuning boilerplate](https://www.philschmid.de/fine-tune-llms-in-2025) |
|
- @lvwerra, @lewtun et al. for building and maintaining [trl](https://github.com/huggingface/trl)
|
- @cognitivecomputations for sharing [spectrum](https://github.com/cognitivecomputations/spectrum/tree/main) |
|
- @allenai for the [Tülu recipe and artifacts](https://huggingface.co./collections/allenai/tulu-3-datasets-673b8df14442393f7213f372) |
|
|