Model Card for Soniox-7B-v1.0

Soniox 7B is a large language model that supports English and code with an 8K context window. It matches GPT-4 performance on some benchmarks. It is built on top of Mistral 7B and enhanced with additional pre-training and fine-tuning for strong problem-solving capabilities. It is released under the Apache 2.0 license. For more details, please read our blog post.

Usage in Transformers

The model is available in transformers and can be used as follows:

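import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model weights in half precision, along with the matching tokenizer.
model_path = "soniox/Soniox-7B-v1.0"
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# This example assumes a CUDA-capable GPU is available.
device = "cuda"
model.to(device)

# Conversation history; the final user turn is the one the model answers.
messages = [
    {"role": "user", "content": "12 plus 21?"},
    {"role": "assistant", "content": "33."},
    {"role": "user", "content": "Five minus one?"},
]
# Render the conversation with the model's chat template and tokenize it.
tok_prompt = tokenizer.apply_chat_template(messages, return_tensors="pt")

model_inputs = tok_prompt.to(device)
# Sample up to 1000 new tokens; the output contains the prompt followed by the reply.
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])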
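
The snippet above prints the prompt together with the reply, because `generate` on a decoder-only model returns the prompt tokens followed by the newly generated ones. If only the reply is wanted, the prompt tokens can be sliced off before decoding. The following is a minimal sketch of that slicing using plain Python lists; `strip_prompt` is a hypothetical helper name and the token IDs are made-up stand-ins, not real tokenizer output.

```python
def strip_prompt(generated_ids, prompt_length):
    """Return only the tokens that follow the prompt, for each sequence in a batch.

    `strip_prompt` is an illustrative helper, not part of transformers.
    """
    return [list(seq[prompt_length:]) for seq in generated_ids]


# Toy stand-in for real token IDs: a 4-token prompt followed by 3 generated tokens.
prompt_ids = [1, 733, 16289, 28793]
generated = [prompt_ids + [28781, 28723, 2]]

reply_ids = strip_prompt(generated, len(prompt_ids))
print(reply_ids)
```

With the tensors from the snippet above, the equivalent slice would be `generated_ids[:, model_inputs.shape[1]:]`, and passing `skip_special_tokens=True` to `tokenizer.batch_decode` should drop markers such as the end-of-sequence token from the decoded text.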

Inference deployment

Refer to our documentation for inference with vLLM and other deployment options.

Model size: 7.24B params (Safetensors). Tensor type: BF16.