raft_study

Recent Activity

hendrydong authored a paper 2 days ago

BOLT: Bootstrap Long Chain-of-Thought in Language Models without Distillation

hendrydong authored a paper 6 days ago

Reward-Guided Speculative Decoding for Efficient LLM Reasoning

hendrydong authored a paper about 2 months ago

Offline Reinforcement Learning for LLM Multi-Step Reasoning

Models (4)

raftrsf/sfr_raft_iter5_2epoch

Text Generation • Updated Jun 17, 2024 • 6

raftrsf/sfr_raft_iter4_2epoch

Text Generation • Updated Jun 13, 2024 • 10

raftrsf/sfr_raft_iter4

Text Generation • Updated Jun 13, 2024 • 5

raftrsf/pair_pref

Text Generation • Updated May 18, 2024 • 8

Datasets (8)

raftrsf/sfr_concise_iter5_top1

Viewer • Updated Jun 14, 2024 • 20k • 38

raftrsf/sfr_concise_iter5_k32_with_rewards

Viewer • Updated Jun 14, 2024 • 20k • 45

raftrsf/sfr_concise_iter4_top1

Viewer • Updated Jun 12, 2024 • 20k • 35

raftrsf/sfr_concise_iter4_k32_with_rewards

Viewer • Updated Jun 12, 2024 • 20k • 40

raftrsf/ipo_eval_data_baseline.json

Viewer • Updated May 18, 2024 • 7.62k • 33

raftrsf/zephyr_pi0_gen_57k_for_offline_dpo_ipo

Viewer • Updated May 7, 2024 • 57.5k • 36

raftrsf/iterative_ipo_pm_iter1_n4

Viewer • Updated Apr 25, 2024 • 13.5k • 35

raftrsf/iterative_ipo_pm_iter1

Viewer • Updated Apr 24, 2024 • 13.5k • 34