arxiv:2412.04653
Benjamin Feuer PRO
penfever
AI & ML interests
Deep learning, computer vision, large language models, large vision language models
Recent Activity
updated a model about 16 hours ago: penfever/dpo-rewild-8b-v0.06
published a model about 16 hours ago: penfever/dpo-rewild-8b-v0.06
updated a model about 16 hours ago: penfever/dpo-rewild-8b-v0.05
Models (10)
penfever/dpo-rewild-8b-v0.06
penfever/dpo-rewild-8b-v0.05
penfever/dpo-q2572b-a70b-jllm3-Harmlessness-A
penfever/dpo-rewild-8b-v0.03 • 5
penfever/dpo-rewild-8b-v0.02 • 5
penfever/dpo-rewild-8b-v0.04 • 5
penfever/rewild_sft_tulu_dpo_8b • 7
penfever/dpo-q2572b-a70b-jllm3-Readability-A • 5
penfever/dpo-q2572b-a70b-jllm3-Factuality-A • 7
penfever/tulu3-dpo-repro • 14
Datasets (84)
penfever/llama-3.1-tulu-3-8b-preference-mixture-tulu-3-sft-reused-if • 65.5k • 19
penfever/t3-8b-t3uf-on-policy • 41.6k • 8
penfever/t3-8b-wc-onpolicy • 17.2k • 6
penfever/llama-3.1-tulu-3-8b-preference-mixture-tulu-3-wildchat-if • 10.8k • 17
penfever/t3-8b-t3sft-onpolicy • 19.4k • 7
penfever/llama-3.1-tulu-3-8b-preference-mixture-tulu-3-sft-reused-off-policy • 96.9k • 15
penfever/llama-3.1-tulu-3-8b-preference-mixture-tulu-3-persona-if • 19.9k • 29
penfever/dpo-q2572b-a70b-jllm3-Harmlessness-A • 270k • 13
penfever/dpo-qalfac • 360k • 20
penfever/dpo-q2572b-a70b-jllm3-Readability-A • 272k • 40