Prompt format (ChatML-style, using the special tokens <|im_start|> and <|im_end|>):
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
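The template above can be assembled by hand when the tokenizer is not available. The following is a minimal sketch, assuming one newline after each role name and after each `<|im_end|>`; the helper name `build_chatml_prompt` is hypothetical, and the chat template shipped with the tokenizer remains the authoritative source for exact whitespace.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts into a ChatML-style string.

    Hypothetical helper for illustration; whitespace is an assumption.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>ROLE ... <|im_end|>
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


example = build_chatml_prompt(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ]
)
print(example)
```

In practice, `tokenizer.apply_chat_template` is the safer choice, since it uses whatever template the model was actually published with.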
penalty_alpha: 0.5
top_k: 5
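In the transformers library, setting `penalty_alpha` above 0 together with `top_k` above 1 has traditionally selected contrastive search decoding. The sketch below illustrates the selection rule as it is usually described: among the top-k candidates, pick the one that balances model probability against similarity to the tokens already in the context. The function names, toy probabilities, and two-dimensional "hidden states" are all made up for illustration and are not taken from the library.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_pick(candidates, context_states, penalty_alpha=0.5):
    """Pick one token from (token, probability, hidden_state) candidates.

    score = (1 - alpha) * probability - alpha * max similarity to context
    """
    def score(candidate):
        _, prob, state = candidate
        degeneration = max(cosine(state, h) for h in context_states)
        return (1 - penalty_alpha) * prob - penalty_alpha * degeneration

    return max(candidates, key=score)[0]


# Toy data: "the" is more probable but identical to the context state.
context_states = [[1.0, 0.0]]
candidates = [("the", 0.6, [1.0, 0.0]), ("quantum", 0.4, [0.0, 1.0])]
choice = contrastive_pick(candidates, context_states, penalty_alpha=0.5)
print(choice)
```

With `penalty_alpha=0` the rule reduces to picking the most probable candidate; larger values penalize repetition more strongly.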
from transformers import pipeline

# Load the chat model as a text-generation pipeline.
generate = pipeline("text-generation", "Felladrin/TinyMistral-248M-Chat-v2")

# Conversation history, oldest message first.
messages = [
    {
        "role": "system",
        "content": "You are a highly knowledgeable and friendly assistant. Your goal is to understand and respond to user inquiries with clarity. Your interactions are always respectful, helpful, and focused on delivering the most accurate information to the user.",
    },
    {
        "role": "user",
        "content": "Hey! Got a question for you!",
    },
    {
        "role": "assistant",
        "content": "Sure! What's it?",
    },
    {
        "role": "user",
        "content": "What are some potential applications for quantum computing?",
    },
]

# Render the messages with the model's chat template and open the assistant turn.
prompt = generate.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate with the inference parameters listed above.
output = generate(
    prompt,
    max_new_tokens=256,
    penalty_alpha=0.5,
    top_k=5,
)

print(output[0]["generated_text"])
This model was trained with SFTTrainer using the following settings:
Hyperparameter | Value |
---|---|
Learning rate | 2e-5 |
Total train batch size | 32 |
Max. sequence length | 2048 |
Weight decay | 0.01 |
Warmup ratio | 0.1 |
NEFTune Noise Alpha | 5 |
Optimizer | Adam with betas=(0.9,0.999) and epsilon=1e-08 |
Scheduler | cosine |
Seed | 42 |
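
To make the learning-rate settings concrete, here is a rough sketch of how a cosine scheduler with a warmup ratio of 0.1 and a peak learning rate of 2e-5 typically behaves: linear warmup over the first 10% of steps, then cosine decay toward zero. The total step count of 1000 is a made-up value (the actual number of training steps is not stated here), and `lr_at_step` is a hypothetical helper, not the trainer's own implementation.

```python
import math


def lr_at_step(step, total_steps, peak_lr=2e-5, warmup_ratio=0.1):
    """Approximate learning rate under linear warmup plus cosine decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000  # assumption for illustration only
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at_step(step, TOTAL_STEPS))
```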
Detailed results can be found here
Metric | Value |
---|---|
Avg. | 27.42 |
AI2 Reasoning Challenge (25-Shot) | 23.29 |
HellaSwag (10-Shot) | 27.39 |
MMLU (5-Shot) | 23.52 |
TruthfulQA (0-shot) | 41.32 |
Winogrande (5-shot) | 49.01 |
GSM8k (5-shot) | 0.00 |
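
The reported average is the plain mean of the six benchmark scores in the table, which can be checked with a few lines:

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 23.29,
    "HellaSwag (10-Shot)": 27.39,
    "MMLU (5-Shot)": 23.52,
    "TruthfulQA (0-shot)": 41.32,
    "Winogrande (5-shot)": 49.01,
    "GSM8k (5-shot)": 0.00,
}

average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 27.42
```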
Base model: Locutusque/TinyMistral-248M