Add pipeline tag and link to paper page #1
by nielsr (HF staff) - opened

README.md CHANGED
@@ -1,14 +1,14 @@
 ---
-
+base_model: BitStarWalkin/SuperCorrect-7B
 language:
 - en
+library_name: transformers
+license: apache-2.0
 metrics:
 - accuracy
-
-library_name: transformers
+pipeline_tag: question-answering
 tags:
-- llama
-- gguf-my-repo
+- llama
 ---
 
 # Triangle104/SuperCorrect-7B-Q5_K_S-GGUF
@@ -28,6 +28,8 @@ Introduction
 -
 This repo provides the official implementation of SuperCorrect, a novel two-stage fine-tuning method for improving both the reasoning accuracy and the self-correction ability of LLMs.
 
+[Paper](https://huggingface.co/papers/2410.09008)
+
 Notably, our SuperCorrect-7B model significantly surpasses the powerful DeepSeekMath-7B by 7.8%/5.3% and Qwen2.5-Math-7B by 15.1%/6.3% on the MATH/GSM8K benchmarks, achieving new SOTA performance among all 7B models.
 🚨 Unlike other LLMs, we equip LLMs with our pre-defined hierarchical thought template ([Buffer of Thought (BoT)](https://github.com/YangLing0818/buffer-of-thought-llm)) to conduct more deliberate reasoning than conventional CoT. It should be noted that our evaluation method relies on the pure mathematical reasoning abilities of LLMs, instead of leveraging other programming methods such as PoT and ToRA.
 
@@ -50,6 +52,7 @@ Inference
 -
 🤗 Hugging Face Transformers
 
+```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
 model_name = "BitStarWalkin/SuperCorrect-7B"
@@ -62,7 +65,7 @@ model = AutoModelForCausalLM.from_pretrained(
 )
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 
-prompt = "Find the distance between the foci of the ellipse
+prompt = "Find the distance between the foci of the ellipse \\[9x^2 + \\frac{y^2}{9} = 99.\\]"
 hierarchical_prompt = "Solve the following math problem in a step-by-step XML format, each step should be enclosed within tags like <Step1></Step1>. For each step enclosed within the tags, determine if this step is challenging and tricky, if so, add detailed explanation and analysis enclosed within <Key> </Key> in this step, as helpful annotations to help you thinking and remind yourself how to conduct reasoning correctly. After all the reasoning steps, summarize the common solution and reasoning steps to help you and your classmates who are not good at math generalize to similar problems within <Generalized></Generalized>. Finally present the final answer within <Answer> </Answer>."
 # HT
 messages = [
@@ -87,6 +90,34 @@ generated_ids = [
 
 response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 print(response)
+```
+
+#### 🔥 vLLM
+
+```python
+import os
+from vllm import LLM, SamplingParams
+model_name = 'BitStarWalkin/SuperCorrect-7B'
+hierarchical_prompt = "Solve the following math problem in a step-by-step XML format, each step should be enclosed within tags like <Step1></Step1>. For each step enclosed within the tags, determine if this step is challenging and tricky, if so, add detailed explanation and analysis enclosed within <Key> </Key> in this step, as helpful annotations to help you thinking and remind yourself how to conduct reasoning correctly. After all the reasoning steps, summarize the common solution and reasoning steps to help you and your classmates who are not good at math generalize to similar problems within <Generalized></Generalized>. Finally present the final answer within <Answer> </Answer>."
+prompts = [
+    "For what positive value of $t$ is $|{-4+ti}| = 6$?",
+    "Find the distance between the foci of the ellipse \\[9x^2 + \\frac{y^2}{9} = 99.\\]",
+    "The fourth term of a geometric series is $24$ and the eleventh term is $3072$. What is the common ratio?"
+]
+combined_prompts = [hierarchical_prompt + '\n' + prompt for prompt in prompts]
+sampling_params = SamplingParams(temperature=0, top_p=1, max_tokens=1024)
+llm = LLM(model=model_name, trust_remote_code=True)
+outputs = llm.generate(combined_prompts, sampling_params)
+
+# Print the outputs.
+for output in outputs:
+    prompt = output.prompt
+    generated_text = output.outputs[0].text
+    print(f"Prompt: {prompt}")
+    print(f"Generated text: {generated_text}")
+```
+
+Here we also provide inference code with [vLLM](https://github.com/vllm-project/vllm). vLLM is a fast and easy-to-use library for LLM inference and serving.
 
 Performance
 -
@@ -96,7 +127,7 @@ Citation
 -
 @article{yang2024supercorrect,
 title={SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights},
-author={Yang, Ling and Yu, Zhaochen and Zhang, Tianjun and Xu, Minkai
+author={Yang, Ling and Yu, Zhaochen and Zhang, Tianjun and Xu, Minkai and Gonzalez, Joseph E. and Cui, Bin and Yan, Shuicheng},
 journal={arXiv preprint arXiv:2410.09008},
 year={2024}
 }
@@ -150,4 +181,4 @@ Step 3: Run inference through the main binary.
 or
 ```
 ./llama-server --hf-repo Triangle104/SuperCorrect-7B-Q5_K_S-GGUF --hf-file supercorrect-7b-q5_k_s.gguf -c 2048
-```
+```