Active filters: dpo
SimaFarazi/gpt2-dpo • Text Generation • 5
sumitxenon/HW2-dpo • Text Generation • 3
NicholasCorrado/uf-tulu-2-7b-dpo • Text Generation • 5
mertgulexe/HW2-dpo
mradermacher/tulu-2-7b-hh-dpo-GGUF
NicholasCorrado/tinyllama-1.1b-chat-v1.0-hh-dpo • Text Generation • 4
NicholasCorrado/tinyllama-1.1b-chat-v1.0-arena-hh-dpo • Text Generation • 3
SongTonyLi/SFT_D1chosenThenDPO_D2a_top3KSamples • Text Generation • 4
NicholasCorrado/tinyllama-1.1b-chat-v1.0-arena-dpo • Text Generation • 4
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-math-dpo • Text Generation • 13
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-coding-dpo • Text Generation • 4
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-logic-dpo • Text Generation • 6
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-math-coding-dpo • Text Generation • 4
NicholasCorrado/uf-rlced-conifer_tulu-2-7b-group-dpo-no-clip • Text Generation • 4
mradermacher/uf-tulu-2-7b-dpo-GGUF
mradermacher/zephyr-7b-hh-dpo-GGUF
sfulay/zephyr-7b-dpo-full-gpt-reward-scale-05
sfulay/zephyr-7b-dpo-full-gpt_consistent-reward-scale-1-rpo-gamma-2
mradermacher/uf-tulu-2-7b-dpo-i1-GGUF • 189
sfulay/zephyr-7b-dpo-full-gpt_consistent-reward-scale-05
sfulay/zephyr-7b-dpo-full-gpt_consistent-reward-scale-1-rpo-gamma-05
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-math-coding-group-dpo • Text Generation • 5
mradermacher/zephyr-7b-hh-dpo-i1-GGUF
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-math-dpo-2 • Text Generation • 4
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-coding-dpo-2 • Text Generation • 5
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-math-coding-dpo-2 • Text Generation • 23
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-logic-dpo-2 • Text Generation • 4
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-dpo-2 • Text Generation • 4
tsavage68/Na_L3_1000steps_1e6rate_03beta_cSFTDPO • Text Generation • 5
NicholasCorrado/tinyllama-1.1b-chat-v1.0-rlced-conifer-3-1-dpo • Text Generation • 4