Continuing with LoRA fine-tuning

by weiminw - opened 12 days ago

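Hello, I need a reward model to judge whether the steps taken to solve a problem are reasonable. Can I fine-tune this model with LoRA on my own custom data? Also, what is the approximate range of the scores this model outputs?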
