RLHF-PPO-RewardModel-LLama3-1B-v2 / adapter_model.safetensors

Commit History

bikalnetomi/RLHF-PPO-RewardModel-LLama3-1B-v2
ddf9334
verified

bikalnetomi committed on