---
license: cc-by-nc-4.0
language:
- en
pipeline_tag: text-generation
tags:
- qwen2
- nvidia
- AceMath
- math
- CoT
- pytorch
- inarikami
- nlp
- code
- chat
- conversational
base_model: nvidia/AceMath-72B-Instruct
library_name: transformers
---

## GGUF Conversion

This repository provides llama.cpp GGUF conversions (F16, Q8, and Q4) of the original [nvidia/AceMath-72B-Instruct](https://huggingface.co/nvidia/AceMath-72B-Instruct) model published by NVIDIA.
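
If you run these GGUF files through the llama-cpp-python bindings rather than the raw llama.cpp CLI, loading looks roughly like the sketch below. The quantization filename is an assumption, so point `model_path` at whichever file you actually downloaded.

```python
# Minimal sketch using llama-cpp-python; the GGUF filename below is an
# assumption -- substitute the quantization you downloaded from this repo.
from llama_cpp import Llama

llm = Llama(
    model_path="AceMath-72B-Instruct-Q4.gguf",  # hypothetical filename
    n_ctx=4096,         # context window size
    n_gpu_layers=-1,    # offload all layers to the GPU when one is available
)

messages = [{"role": "user", "content": "What is 37 * 41?"}]
out = llm.create_chat_completion(messages=messages, max_tokens=512)
print(out["choices"][0]["message"]["content"])
```

Note that even at Q4, a 72B model needs on the order of 40 GB of memory, so size your GPU offload (or accept CPU inference) accordingly.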

## Introduction
We introduce AceMath, a family of frontier models designed for mathematical reasoning. The models in the AceMath family, including AceMath-1.5B/7B/72B-Instruct and AceMath-7B/72B-RM, are <b>Improved using Qwen</b>.
The AceMath-1.5B/7B/72B-Instruct models excel at solving English mathematical problems using Chain-of-Thought (CoT) reasoning, while the AceMath-7B/72B-RM models, as outcome reward models, specialize in evaluating and scoring mathematical solutions.

The AceMath-1.5B/7B/72B-Instruct models are developed from the Qwen2.5-Math-1.5B/7B/72B-Base models through a multi-stage supervised fine-tuning (SFT) process: first with general-purpose SFT data, then with math-specific SFT data. We are releasing all training data to support further research in this field.

We recommend using the AceMath models only for solving math problems. To support other tasks, we also release AceInstruct-1.5B/7B/72B, a series of general-purpose SFT models designed to handle code, math, and general-knowledge tasks. These models are built upon the Qwen2.5-1.5B/7B/72B-Base models.

For more information about AceMath, check our [website](https://research.nvidia.com/labs/adlr/acemath/) and [paper](https://arxiv.org/abs/2412.15084).

## All Resources
### AceMath Instruction Models
- [AceMath-1.5B-Instruct](https://huggingface.co/nvidia/AceMath-1.5B-Instruct), [AceMath-7B-Instruct](https://huggingface.co/nvidia/AceMath-7B-Instruct), [AceMath-72B-Instruct](https://huggingface.co/nvidia/AceMath-72B-Instruct)

### AceMath Reward Models
- [AceMath-7B-RM](https://huggingface.co/nvidia/AceMath-7B-RM), [AceMath-72B-RM](https://huggingface.co/nvidia/AceMath-72B-RM)

### Evaluation & Training Data
- [AceMath-RewardBench](https://huggingface.co/datasets/nvidia/AceMath-RewardBench), [AceMath-Instruct Training Data](https://huggingface.co/datasets/nvidia/AceMath-Instruct-Training-Data), [AceMath-RM Training Data](https://huggingface.co/datasets/nvidia/AceMath-RM-Training-Data)

### General Instruction Models
- [AceInstruct-1.5B](https://huggingface.co/nvidia/AceInstruct-1.5B), [AceInstruct-7B](https://huggingface.co/nvidia/AceInstruct-7B), [AceInstruct-72B](https://huggingface.co/nvidia/AceInstruct-72B)

## Benchmark Results (AceMath-Instruct + AceMath-72B-RM)

<p align="center">
  <img src="./acemath-pic.png" alt="AceMath Benchmark Results" width="800">
</p>

We compare AceMath to leading proprietary and open-access math models in the table above. Our AceMath-7B-Instruct largely outperforms the previous best-in-class Qwen2.5-Math-7B-Instruct (average pass@1: 67.2 vs. 62.9) on a variety of math reasoning benchmarks, while coming close to the performance of the 10× larger Qwen2.5-Math-72B-Instruct (67.2 vs. 68.2). Notably, our AceMath-72B-Instruct outperforms the state-of-the-art Qwen2.5-Math-72B-Instruct (71.8 vs. 68.2), GPT-4o (67.4), and Claude 3.5 Sonnet (65.6) by a clear margin. We also report the rm@8 accuracy (best of 8) achieved by our reward model, AceMath-72B-RM, which sets a new record on these reasoning benchmarks. This excludes OpenAI's o1 model, which relies on scaled inference computation.
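
For reference, rm@8 means best-of-8 selection: sample eight CoT solutions from the instruct model, score each with the reward model, and keep the highest-scoring one. The sketch below is schematic only; `generate_solutions` and `score_solution` are hypothetical placeholders, not the actual AceMath-RM interface.

```python
# Schematic best-of-n (rm@8) selection. `generate_solutions` and
# `score_solution` are hypothetical placeholders: the first would sample n
# candidate CoT solutions from an AceMath-Instruct model, the second would
# score a (question, solution) pair with an outcome reward model such as
# AceMath-72B-RM. See the AceMath-RM model cards for the real interface.
from typing import Callable, List

def best_of_n(
    question: str,
    generate_solutions: Callable[[str, int], List[str]],
    score_solution: Callable[[str, str], float],
    n: int = 8,
) -> str:
    candidates = generate_solutions(question, n)          # n sampled solutions
    scores = [score_solution(question, c) for c in candidates]
    return candidates[scores.index(max(scores))]          # keep the top-scored one
```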

## How to use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the instruct model and its tokenizer, sharding across available GPUs.
model_name = "nvidia/AceMath-72B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "Jen enters a lottery by picking $4$ distinct numbers from $S=\\{1,2,3,\\cdots,9,10\\}.$ $4$ numbers are randomly chosen from $S.$ She wins a prize if at least two of her numbers were $2$ of the randomly chosen numbers, and wins the grand prize if all four of her numbers were the randomly chosen numbers. The probability of her winning the grand prize given that she won a prize is $\\tfrac{m}{n}$ where $m$ and $n$ are relatively prime positive integers. Find $m+n$."
messages = [{"role": "user", "content": prompt}]

# Format the conversation with the model's chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to("cuda")

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048
)
# Strip the prompt tokens so only the newly generated solution remains.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

## Correspondence to
Zihan Liu ([email protected]), Yang Chen ([email protected]), Wei Ping ([email protected])

## Citation
If you find our work helpful, we'd appreciate it if you could cite us.
<pre>
@article{acemath2024,
  title={AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling},
  author={Liu, Zihan and Chen, Yang and Shoeybi, Mohammad and Catanzaro, Bryan and Ping, Wei},
  journal={arXiv preprint},
  year={2024}
}
</pre>