Safetensors
llama

Trained with the tokenizer of OpenRLHF/Llama-3-8b-sft-mixture.

Model size: 1.24B params
Tensor type: BF16
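As a rough guide to what the parameter count and tensor type above imply, here is a minimal sketch of the weight-memory arithmetic. It assumes every parameter is stored in BF16 (2 bytes each) and uses the rounded figure of 1.24B parameters, so the result is approximate. It covers weights only, not activations or any training state. The function name `weight_memory_bytes` is a hypothetical helper, not part of any library.

```python
def weight_memory_bytes(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate the memory needed to hold the weights alone."""
    return num_params * bytes_per_param


# Figures taken from the model card; 1.24B is a rounded count.
NUM_PARAMS = 1.24e9
BF16_BYTES = 2  # bfloat16 uses 16 bits = 2 bytes per value

total_bytes = weight_memory_bytes(NUM_PARAMS, BF16_BYTES)

print(f"{total_bytes / 1e9:.2f} GB")     # decimal gigabytes
print(f"{total_bytes / 2**30:.2f} GiB")  # binary gibibytes
```

Under these assumptions the weights come to about 2.48 GB (roughly 2.31 GiB), which is the minimum to budget before accounting for inputs and intermediate activations.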

Dataset used to train RTO-RL/Llama3.2-1B-RewardModel