arxiv:2405.07863
Wei Xiong
weqweasdas
AI & ML interests
Machine learning, RLHF
Recent Activity
updated a dataset about 19 hours ago: selfcorrexp2/balanced_model_as_rm_2prompt
published a dataset about 19 hours ago: selfcorrexp2/balanced_model_as_rm_2prompt
updated a dataset about 20 hours ago: selfcorrexp2/balanced_model_as_rm
Models (23)
weqweasdas/zephyr-7b-dpo-full • Text Generation • 7
weqweasdas/zephyr-7b-gemma-dpo
weqweasdas/zephyr-7b-sft-full
weqweasdas/zephyr-7b-dpo-qlora
weqweasdas/gpt2-cpt-dutch • Text Generation • 65
weqweasdas/zephyr-7b-gemma-sft
weqweasdas/raft_baseline_zephyr_packing_model6_1_4_e6_weight085 • Text Generation • 4
weqweasdas/raft_baseline_zephyr_packing_model6_1_4_e6 • Text Generation • 5
weqweasdas/raft_baseline_zephyr_packing_model6 • Text Generation • 4
weqweasdas/raft_baseline_openchat_llama13b_model1 • Text Generation • 6
Datasets (171)
weqweasdas/llama31_70b_chosen_type12_mix • Viewer • 21.5k • 12
weqweasdas/prompt_math_test • Viewer • 15k • 15
weqweasdas/fixed05_llasft_math_7ktype2_7ktype3_ver2_150_tmp10_generation_with_rewards • Viewer • 30k • 28
weqweasdas/filtered_numia_prompt15k • Viewer • 15k • 16
weqweasdas/filtered_numia_prompt30k • Viewer • 30.6k • 14
weqweasdas/prompt_numinamath • Viewer • 119k • 17
weqweasdas/prompt_numinamath_with_gts • Viewer • 168k • 14
weqweasdas/fixed05_llasft_math_3ktype2_7ktype3_ver2_250_tmp10_generation_with_rewards • Viewer • 50k • 15
weqweasdas/fixed05_llasft_math_3ktype2_7ktype3_ver2_250_more_datatmp10_vllmexp_retest2_generation • Viewer • 50k • 13
weqweasdas/fixed05_llasft_math_3ktype2_7ktype3_ver2_100_tmp10_generation_with_rewards • Viewer • 50k • 16