Question about training data construction

by zzzzz2023 - opened 3 days ago

zzzzz2023

3 days ago

Hello, I have read the code of your model. I would like to know how the labels are constructed for training, and how best to calculate the loss from the process reward. @Zhenru Thank you for your answer.

zzzzz2023

3 days ago

The loss in the model code is calculated as loss_fct(logits.view(-1, self.num_labels), labels.view(-1)). But here the logits are the probabilities of tokens in the assistant response. How should the labels be constructed so that the cross entropy can be calculated directly from the logits?
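To make the question concrete, here is a minimal sketch of one common way such labels are built for token-level classification. It is an assumption, not the model authors' confirmed recipe: each reasoning step ends with a separator token, the step's class label is placed at that separator position, and every other position gets -100, which is the default ignore_index of torch.nn.CrossEntropyLoss. The helper names build_step_labels and masked_cross_entropy, the separator id 99, and the two-class setup are all hypothetical. The sketch is written in plain Python so it runs without PyTorch; each row of logits corresponds to one row of logits.view(-1, self.num_labels).

```python
import math

IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_step_labels(token_ids, sep_id, step_labels):
    """Place one class label on each separator token; mask all other positions."""
    labels = [IGNORE_INDEX] * len(token_ids)
    sep_positions = [i for i, tok in enumerate(token_ids) if tok == sep_id]
    if len(sep_positions) != len(step_labels):
        raise ValueError("need exactly one label per step separator")
    for pos, lab in zip(sep_positions, step_labels):
        labels[pos] = lab
    return labels


def masked_cross_entropy(logits, labels):
    """Mean cross entropy over positions whose label is not IGNORE_INDEX.

    Returns 0.0 when every position is masked (a choice made for this sketch).
    """
    total, count = 0.0, 0
    for row, lab in zip(logits, labels):
        if lab == IGNORE_INDEX:
            continue  # masked position: contributes nothing to the loss
        m = max(row)
        # log-sum-exp with the max subtracted for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[lab]
        count += 1
    return total / count if count else 0.0


# Hypothetical sequence: two steps, each closed by separator id 99.
token_ids = [11, 12, 99, 13, 99]
labels = build_step_labels(token_ids, sep_id=99, step_labels=[1, 0])
print(labels)  # [-100, -100, 1, -100, 0]

# One row of logits per token, two classes (hypothetical num_labels = 2).
logits = [[0.0, 0.0] for _ in token_ids]
loss = masked_cross_entropy(logits, labels)
print(loss)
```

With this layout only the separator positions contribute to the loss, so the flattened logits and labels can be passed to the cross entropy directly. Whether the released model was trained this way is for the authors to confirm.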
