Can you provide some training details about this model (like learning rate)?
#2, opened by iseesaw
Thanks.
Hi,
We primarily follow RLHFlow's recipe, except that we train for 2 epochs.
chrisliu298 changed discussion status to closed