igitman committed
Commit a9c240c
1 Parent(s): 68fed43

Update README.md

Files changed (1):
  1. README.md +36 -7
README.md CHANGED

@@ -35,9 +35,9 @@ The model outperforms [Llama3.1-8B-Instruct](https://huggingface.co/meta-llama/L
 </style>

 <div class="image-container">
-<img src="scaling_plot.jpg" title="Performance of Llama-3.1-8B-Instruct as it is trained on increasing proportions of OpenMathInstruct-2">
-<img src="math_level_comp.jpg" title="Comparison of OpenMath2-Llama3.1-8B vs. Llama-3.1-8B-Instruct across MATH levels">
-</div>
+<img src="scaling_plot.jpg" title="Performance of Llama-3.1-8B-Instruct as it is trained on increasing proportions of OpenMathInstruct-2">
+<img src="math_level_comp.jpg" title="Comparison of OpenMath2-Llama3.1-8B vs. Llama-3.1-8B-Instruct across MATH levels">
+</div>

 | Model | GSM8K | MATH | AMC 2023 | AIME 2024 | Omni-MATH |
 |:---|:---:|:---:|:---:|:---:|:---:|

@@ -54,13 +54,42 @@ The pipeline we used to produce the data and models is fully open-sourced!
 - [Models](https://huggingface.co/collections/nvidia/openmath-2-66fb142317d86400783d2c7b)
 - [Dataset](https://huggingface.co/datasets/nvidia/OpenMathInstruct-2)

+See our paper to learn more details!

 # How to use the models?

-Our models are fully compatible with Llama3.1-instruct format, so you should be able to just replace an existing Llama3.1 checkpoint and use it in the same way.
-Please note that these models have not been instruction tuned and might not provide good answers outside of math domain.
-
-If you don't know how to use Llama3.1 models, we provide convenient [instructions in our repo](https://github.com/Kipok/NeMo-Skills/blob/main/docs/inference.md).
+Our models are trained with the same "chat format" as the Llama3.1-instruct models (same system/user/assistant tokens).
+Please note that these models have not been instruction tuned on general data and thus might not provide good answers outside of the math domain.
+
+We recommend following the [instructions in our repo](https://github.com/Kipok/NeMo-Skills/blob/main/docs/inference.md) to run inference with these models, but here is an example of how to do it through the Transformers API:
+
+```python
+import transformers
+import torch
+
+model_id = "nvidia/OpenMath2-Llama3.1-8B"
+
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model_id,
+    model_kwargs={"torch_dtype": torch.bfloat16},
+    device_map="auto",
+)
+
+messages = [
+    {
+        "role": "user",
+        "content": "Solve the following math problem. Make sure to put the answer (and only answer) inside \\boxed{}.\n\n" +
+                   "What is the minimum value of $a^2+6a-7$?"},
+]
+
+outputs = pipeline(
+    messages,
+    max_new_tokens=4096,
+)
+print(outputs[0]["generated_text"][-1]["content"])
+```

 # Reproducing our results
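The diff above says the models reuse the Llama3.1-instruct "chat format". As a rough illustration of what that format looks like, here is a hand-rolled sketch; the special tokens (`<|begin_of_text|>`, `<|start_header_id|>`, `<|eot_id|>`) are assumed from the Llama 3.1 template, and in real code `tokenizer.apply_chat_template` from `transformers` builds this string for you:

```python
# Editorial sketch of the Llama 3.1 chat layout, not part of the commit.
# The special tokens below are assumed from the Llama 3.1 template;
# prefer tokenizer.apply_chat_template in real code.
def to_llama31_prompt(messages: list[dict]) -> str:
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # Leave the assistant header open so generation continues from here.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = to_llama31_prompt([{"role": "user", "content": "What is 2+2?"}])
print(prompt)
```

Because the format matches, a chat-formatted prompt built for Llama3.1-instruct can be fed unchanged to the OpenMath2 checkpoint.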
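The example prompt in the diff asks the model to put its final answer inside `\boxed{}`. A small helper along these lines (an editorial sketch, not part of the README) can recover that answer from a generated solution, including answers with one or more levels of nested braces such as `\boxed{\frac{1}{2}}`:

```python
def extract_boxed_answer(solution: str) -> str | None:
    """Return the contents of the last \\boxed{...} in a solution, or None."""
    marker = r"\boxed{"
    start = solution.rfind(marker)
    if start == -1:
        return None
    # Walk forward tracking brace depth so nested braces stay inside the answer.
    i = start + len(marker)
    depth = 1
    out = []
    while i < len(solution):
        ch = solution[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                break
        out.append(ch)
        i += 1
    return "".join(out)

print(extract_boxed_answer(r"The minimum value is \boxed{-16}."))  # -16
```

Applied to the generated text from the pipeline example, this yields just the final answer for scoring.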