Model Description

Quasar-1.5 is a next-generation reasoning model designed to improve task-specific performance and contextual understanding. It introduces a novel component, the Token Temperature Mechanism, which dynamically weights user inputs by categorizing tokens as 'hot' (parts of the input critical for solving the task) or 'cold' (less critical parts). This mechanism enables Quasar-1.5 to focus on the most relevant aspects of a task, improving interpretability, accuracy, and reasoning.
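The card does not publish the mechanism itself, so the following is only a minimal sketch of what temperature-based attention modulation could look like: per-token "temperature" weights in (0, 1] scale each key's share of the attention mass. The function names and the use of NumPy are illustrative assumptions, not the model's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def temperature_modulated_attention(q, k, v, temps):
    """Scaled dot-product attention where each key's logit is biased by
    log(temperature): 'hot' tokens (temps near 1) keep their full attention
    share, 'cold' tokens (temps near 0) are down-weighted."""
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)    # (n_queries, n_keys)
    logits = logits + np.log(temps)  # multiplies the softmax weights by temps
    return softmax(logits) @ v
```

With identical keys, the attention weights become exactly proportional to the temperatures, which is the 'hot' vs. 'cold' prioritization described above; with unit temperatures the function reduces to standard attention.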

  • Key Features:

    (1) Token Temperature-Based Attention Modulation: Adjusts the attention given to different parts of the input based on token categorization.

    (2) Guided Sequence of Thought (GSoT): Enhances reasoning by guiding the model through structured, step-by-step problem-solving processes.

  • Developed by: SILX AI.
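The GSoT mechanism is likewise not published; as a rough illustration only, guided step-by-step reasoning can be sketched as a controller that walks a model through fixed stages, conditioning each stage on the problem and all earlier stage outputs. The function name and the `generate` callable are hypothetical stand-ins, not the model's actual mechanism.

```python
from typing import Callable, List

def guided_sequence_of_thought(problem: str, stages: List[str],
                               generate: Callable[[str], str]) -> str:
    """Walk a text-completion callable through explicit reasoning stages.

    `generate` stands in for any model call; each stage sees the problem
    plus the output of every earlier stage.
    """
    context = f"Problem: {problem}\n"
    for stage in stages:
        context += f"\n[{stage}]\n"
        context += generate(context).strip() + "\n"
    return context

# Example with a stub "model" that reports which stage it is answering:
transcript = guided_sequence_of_thought(
    "What is 12 * 13?",
    ["Restate the problem", "Work step by step", "State the final answer"],
    lambda ctx: f"(stage {ctx.count('[')} output)",
)
```

The point of the sketch is the control flow, not the stub: a real GSoT implementation would replace the lambda with actual model inference.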

Note

The Token Temperature Mechanism was incorporated during training to enable dynamic token weighting. Iterative validation was performed using a mix of real-world and synthetic benchmarks.

Compute

Training was completed on 4x H100 GPUs in 8 hours, at a cost of $125. As part of the Lambda Researcher Program, we thank Lambda Cloud for providing the compute that made this training possible.
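As a quick sanity check, those figures (taken from the sentence above) work out to a modest effective rate per GPU-hour:

```python
# Figures stated in the Compute section above.
gpus, hours, total_cost_usd = 4, 8, 125
gpu_hours = gpus * hours                       # 32 GPU-hours in total
usd_per_gpu_hour = total_cost_usd / gpu_hours  # ~3.91 USD per GPU-hour
```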

Evaluation

Dataset                 Quasar-1.5   Qwen-2.5-32B-Instruct   QwQ    o1-preview
Math500                 99.2         76.2                    85.4   81.4
AIME2024                97.3         16.7                    50.0   40.0
LiveCodeBench-Easy      98.3         84.6                    90.7   92.9
LiveCodeBench-Medium    89.8         40.8                    56.3   54.9
LiveCodeBench-Hard      82.9          9.8                    17.1   16.3
GPQA-Diamond            95.8         45.5                    52.5   75.2
Model size: 32.8B parameters (silx-ai/Quasar-1.5-Pro), BF16 tensors, Safetensors format.
