---
license: apache-2.0
language:
- en
- zh
base_model:
- Qwen/Qwen2.5-14B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- text-generation-inference
- trl
- vlm
- sft
- code
- math
model-index:
- name: Gauss-Opus-14B-R999
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: wis-k/instruction-following-eval
      split: train
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 39.07
      name: averaged accuracy
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FGauss-Opus-14B-R999
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: SaylorTwift/bbh
      split: test
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 44.94
      name: normalized accuracy
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FGauss-Opus-14B-R999
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: lighteval/MATH-Hard
      split: test
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 57.55
      name: exact match
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FGauss-Opus-14B-R999
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      split: train
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 18.9
      name: acc_norm
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FGauss-Opus-14B-R999
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 27.83
      name: acc_norm
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FGauss-Opus-14B-R999
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 44.53
      name: accuracy
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FGauss-Opus-14B-R999
      name: Open LLM Leaderboard
---
![ccccccccccccc.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/Ii0oEprS2lm6Zoama7CPe.png)

# **Gauss-Opus-14B-R999**  

> Gauss-Opus-14B-R999 is built on the Qwen 2.5 14B architecture and is designed to strengthen mathematical and constructive reasoning. The model is optimized for advanced problem-solving, logical structuring, and mathematical comprehension, and it excels at numerical reasoning, theorem proving, and multi-step calculation. Fine-tuned on specialized datasets in mathematics, physics, and formal logic, it delivers structured, high-accuracy outputs with a strong emphasis on precision and clarity.

## **Key Improvements**  
1. **Enhanced Mathematical Reasoning**: Optimized for algebra, calculus, number theory, and logical deduction, providing precise and structured solutions.  
2. **Improved Instruction Following**: Capable of interpreting and following complex mathematical proofs, equations, and problem-solving instructions with high accuracy.  
3. **Versatile Adaptability**: Handles diverse reasoning tasks, including step-by-step solutions, mathematical proofs, and constructive problem-solving.  
4. **Long-Context Support**: Handles input contexts of up to 128K tokens and can generate up to 8K tokens in a single output, making it well suited to detailed mathematical derivations (see the configuration sketch after this list).  
5. **Multilingual Proficiency**: Supports over 29 languages, including English, Chinese, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more, ensuring broad accessibility.  
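
For contexts beyond the default window, upstream Qwen2.5 models enable 128K support through YaRN rope scaling. The snippet below is a minimal sketch of that upstream recipe applied at load time; whether this checkpoint already ships with long-context rope scaling pre-configured is an assumption you should verify against its `config.json`.

```python
from transformers import AutoConfig, AutoModelForCausalLM

model_name = "prithivMLmods/Gauss-Opus-14B-R999"

# Upstream Qwen2.5 YaRN recipe for extending the context window toward 128K tokens.
# Assumption: this checkpoint follows the base model's rope-scaling convention.
config = AutoConfig.from_pretrained(model_name)
config.rope_scaling = {
    "type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    config=config,
    torch_dtype="auto",
    device_map="auto",
)
```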

## **Quickstart with transformers**  

The snippet below shows how to load the tokenizer and model and generate content using `apply_chat_template`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Gauss-Opus-14B-R999"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Solve the integral \int x^2 dx and explain the steps."
messages = [
    {"role": "system", "content": "You are a mathematical assistant specialized in problem-solving and theorem proving."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated completion is decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```  
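
For interactive sessions, long derivations read better when streamed token by token. The variant below is a minimal sketch that reuses `model`, `tokenizer`, and `model_inputs` from the snippet above and relies on the standard `transformers` `TextStreamer`; it is an illustration rather than part of the model's documented usage.

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated, skipping the echoed prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    **model_inputs,
    max_new_tokens=512,
    streamer=streamer,
)
```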

## **Intended Use**  
1. **Mathematical Problem-Solving**:  
   Designed for high-precision mathematical reasoning, step-by-step calculation, and structured solutions (a prompt sketch follows this list).  

2. **Theorem Proving and Logical Reasoning**:  
   Useful for verifying mathematical proofs, formal logic derivations, and theorem-based reasoning.  

3. **STEM Education and Research**:  
   Ideal for educators, researchers, and students requiring assistance in complex problem-solving and mathematical modeling.  

4. **Algorithm Development and Optimization**:  
   Supports structured reasoning in algorithmic problem-solving, coding optimizations, and computational logic.  

5. **Long-Form Explanatory Content**:  
   Can generate detailed mathematical articles, research summaries, and explanatory guides with structured step-by-step reasoning.  

6. **Multilingual Mathematical Assistance**:  
   Supports global accessibility for mathematical discussions, translations, and problem explanations across multiple languages.  
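
One way to elicit the structured, stepwise outputs described above is to spell out the expected format in the system prompt. The sketch below reuses the `model` and `tokenizer` loaded in the Quickstart; the prompt wording is illustrative, not a prescribed template.

```python
# Hypothetical system prompt that requests explicitly structured reasoning.
messages = [
    {
        "role": "system",
        "content": (
            "You are a mathematical assistant. Answer with numbered steps, "
            "state any theorem you invoke, and end with a line 'Answer: ...'."
        ),
    },
    {"role": "user", "content": "Prove that the sum of two even integers is even."},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```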

## **Limitations**  
1. **Hardware Requirements**:  
   Requires high-memory GPUs or TPUs due to its large parameter size and long-context support.  

2. **Potential Bias in Training Data**:  
   While optimized for accuracy, the model may inherit biases from training data in certain problem-solving approaches.  

3. **Complexity in Abstract Theories**:  
   May struggle with highly abstract or unsolved mathematical problems that require intuitive leaps beyond computational logic.  

4. **Error Propagation in Extended Proofs**:  
   Small errors in early steps may compound in multi-step proofs and long-form mathematical derivations.  

5. **Prompt Sensitivity**:  
   The quality of responses depends on how well the problem is structured and framed within the input prompt.

## **[Open LLM Leaderboard Evaluation Results](https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard)**
Detailed results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/prithivMLmods__Gauss-Opus-14B-R999-details)!
Summarized results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/contents/viewer/default/train?q=prithivMLmods%2FGauss-Opus-14B-R999&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!

|      Metric       |Value (%)|
|-------------------|--------:|
|**Average**        |    38.80|
|IFEval (0-Shot)    |    39.07|
|BBH (3-Shot)       |    44.94|
|MATH Lvl 5 (4-Shot)|    57.55|
|GPQA (0-shot)      |    18.90|
|MuSR (0-shot)      |    27.83|
|MMLU-PRO (5-shot)  |    44.53|