Why does the HF-format model not have an RM head, when the original-format model does?
Thank you for your interest in our model!
The model at https://huggingface.co./nvidia/Llama-3.1-Nemotron-70B-Reward was trained using NeMo Aligner.
When we trained it, we had a separate RM_head (i.e. a linear layer).
To allow more people to run inference on the model easily, we converted it to an HF-compatible checkpoint by copying the useful weights into the output_embedding layer (which is also a linear layer) - specifically into the row for the first vocabulary token (note, this RM outputs only one scalar value, 'reward', rather than five as suggested in your email - details in https://arxiv.org/abs/2410.01257). To the best of my understanding, this lets others easily convert the model for inference in frameworks that don't support additional layers out of the box.
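To make the copy concrete, here is a dependency-free toy sketch of the idea described above, not the actual conversion script. The sizes, the random weights, the hidden state, and the names `rm_head`, `output_embedding` and `linear` are all made up for illustration. It only shows that once the reward weights sit in the row for the first vocabulary token, the logit of that token equals the reward.

```python
import random


def linear(weight, hidden):
    """Apply a bias-free linear layer: one dot product per weight row."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in weight]


random.seed(0)
hidden_size, vocab_size = 4, 6  # toy sizes, not the real model's

# Hypothetical trained reward head: a single row mapping a hidden state to one scalar.
rm_head = [[random.uniform(-1, 1) for _ in range(hidden_size)]]

# Stand-in for the output embedding (vocab_size x hidden_size) of the converted checkpoint.
output_embedding = [
    [random.uniform(-1, 1) for _ in range(hidden_size)] for _ in range(vocab_size)
]

# The conversion step as described: overwrite the row of the first vocabulary token.
output_embedding[0] = list(rm_head[0])

# A made-up hidden state standing in for the model's output on some input.
hidden = [random.uniform(-1, 1) for _ in range(hidden_size)]

reward_from_head = linear(rm_head, hidden)[0]
reward_from_logits = linear(output_embedding, hidden)[0]  # logit of token 0

print(reward_from_head, reward_from_logits)
```

The two printed values match because both are the same dot product; the remaining logits are simply ignored when the model is used as a reward model.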
Hence, for inference we recommend following the usage guide at https://huggingface.co./nvidia/Llama-3.1-Nemotron-70B-Reward-HF#usage
If you want to fine-tune further, you can consider using NeMo Aligner, or apply a trick similar to ours in reverse: initialize the sequence classifier layer (note - it should be just one layer with no activation) with the corresponding output embedding weights.
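A minimal PyTorch sketch of that initialization, using tiny stand-in layers rather than the real 70B checkpoint. The layer names `lm_head` and `score` are chosen to mirror what, to my knowledge, Hugging Face Transformers uses for the Llama causal-LM and sequence-classification heads; the sizes and the random hidden states are assumptions for illustration only.

```python
import torch
from torch import nn

torch.manual_seed(0)
hidden_size, vocab_size = 8, 16  # toy sizes, not the real model's

# Stand-in for the converted checkpoint's output embedding (bias-free linear layer).
lm_head = nn.Linear(hidden_size, vocab_size, bias=False)

# Classifier head: a single linear layer with one output and no activation.
score = nn.Linear(hidden_size, 1, bias=False)

with torch.no_grad():
    # Row 0 of the output embedding holds the reward weights.
    # Slicing with 0:1 keeps the 2-D shape (1, hidden_size) that `score.weight` expects.
    score.weight.copy_(lm_head.weight[0:1])

# Made-up hidden states for a batch of 3 inputs.
hidden = torch.randn(3, hidden_size)

reward_from_logits = lm_head(hidden)[:, 0]     # logit of the first vocabulary token
reward_from_score = score(hidden).squeeze(-1)  # output of the new classifier head

print(torch.allclose(reward_from_logits, reward_from_score, atol=1e-6))
```

With a real checkpoint the same copy would be applied between the loaded causal-LM model's output layer and the classification model's head before fine-tuning starts; how you load and shard the 70B weights is outside the scope of this sketch.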