yueqin yin

yyqoni

AI & ML interests

None yet

Recent Activity

updated a collection 18 days ago

DenseRewardRLHF-PPO

updated a model 18 days ago

yyqoni/Phi-3-mini-4k-bandit-ppo-60k

upvoted a paper 19 days ago

Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model

Organizations

yyqoni's activity

commented on a paper 19 days ago

Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model

Paper • 2501.02790 • Published 22 days ago • 9

New activity in nvidia/HelpSteer2 7 months ago

Averaging GT Overall Scores in Bradley-Terry Model with HelpSteer2

#3 opened 7 months ago by

New activity in wandb/mistral-7b-zephyr-dpo 11 months ago

The format of chat template

#2 opened 11 months ago by
