Model Card for Soniox-7B-v1.0
Soniox 7B is a large language model that supports English and code with an 8K-token context window. It matches GPT-4 performance on some benchmarks. The model is built on top of Mistral 7B and enhanced with additional pre-training and fine-tuning for strong problem-solving capabilities. It is released under the Apache 2.0 License. For more details, please read our blog post.
Usage in Transformers
The model is available in transformers and can be used as follows:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "soniox/Soniox-7B-v1.0"

# Load the weights in half precision and move the model to the GPU.
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_path)
device = "cuda"
model.to(device)

# Chat history; the final user turn is the one the model answers.
messages = [
    {"role": "user", "content": "12 plus 21?"},
    {"role": "assistant", "content": "33."},
    {"role": "user", "content": "Five minus one?"},
]

# Apply the model's chat template and tokenize the conversation into input IDs.
tok_prompt = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = tok_prompt.to(device)

# Sample up to 1000 new tokens; the output holds the prompt followed by the completion.
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
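The snippet above prints the prompt together with the completion, because generate returns the input IDs followed by the newly generated ones. If only the model's reply is wanted, one option is to drop the prompt tokens before decoding. The following is a minimal sketch, not part of the official usage: the helper name strip_prompt is hypothetical, and it works on plain Python lists of token IDs (for example generated_ids.tolist()) so that it stays independent of any particular tensor library.

```python
from typing import List


def strip_prompt(generated_ids: List[List[int]], prompt_length: int) -> List[List[int]]:
    """Return only the newly generated token IDs for each sequence.

    Hypothetical helper: assumes every sequence in the batch starts with
    the same prompt_length prompt tokens, as is the case when a single
    tokenized conversation is passed to generate().
    """
    if prompt_length < 0:
        raise ValueError("prompt_length must be non-negative")
    return [sequence[prompt_length:] for sequence in generated_ids]


# Toy example with made-up token IDs: three prompt tokens, two new tokens.
example_output = [[101, 102, 103, 7, 8]]
completion_only = strip_prompt(example_output, prompt_length=3)
print(completion_only)
```

With the variables from the usage snippet, this could be applied as tokenizer.batch_decode(strip_prompt(generated_ids.tolist(), model_inputs.shape[1]), skip_special_tokens=True), which would also omit special tokens from the printed text.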
Inference deployment
Refer to our documentation for inference with vLLM and other deployment options.