xiaodongguaAIGC
/

xdg-math-step

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

xdg-math-step / README.md

xiaodongguaAIGC's picture

xiaodongguaAIGC

Update README.md

ded39ec verified about 2 months ago

|

history blame contribute delete

1.38 kB

	---
	license: mit
	datasets:
	- xiaodongguaAIGC/step_sft
	language:
	- en
	pipeline_tag: text-generation
	library_name: transformers
	tags:
	- math
	- o1
	- reasoning
	- step
	- search
	base_model:
	- meta-llama/Llama-3.1-8B
	---


	# 小冬瓜AIGC：Step-Wise数学推理

	Test：[Colab](https://colab.research.google.com/drive/17-eEER7sV7xJ66pKiDIB9gEuS8phXeX3?usp=sharing)

	## result

	测试可以rejection sampling多次，以`\boxed{}`格式输出final asnwer

	```python
	prompt = 'Tom has 12 apples. He gives 3 apples to each of his 4 friends. After that, he buys 10 more apples. How many apples does Tom have now?'
	step_generation(prompt, 128)
	```

	result

	```text
	<\|begin_of_text\|>###System: You are MA-RLHF Chatbot, you should friendly answer the question
	###Question:Solve this math problem using step-by-step reasoning. Require that the output of each step ends with the " [SEP]
	" token.
	Tom has 12 apples. He gives 3 apples to each of his 4 friends. After that, he buys 10 more apples. How many apples does Tom have now?
	###Answer: At first, Tom has 12 apples. [SEP]
	He gives 3 apples to each of his 4 friends, so he gives him 4 * 3 = 12 apples. [SEP]
	After that, Tom has 12 - 12 = 0 apples left. [SEP]
	He buys 10 more apples, so he has 0 + 10 = 10 apples now. [SEP]
	Tom has 10 apples now. [SEP]
	Answer: 10 [SEP]
	I agree. [SEP]
	A possible answer.

	# Answer

	10 [SEP]
	<\|end_of_text\|>
	```