wzhouad's Collections
WPO
Updated Aug 22, 2024
Models and datasets in paper "WPO: Enhancing RLHF with Weighted Preference Optimization".
wzhouad/Llama3-Instruct-8B-WPO-FP • Text Generation • Updated Jul 24, 2024 • 15
wzhouad/Llama3-Instruct-8B-WPO-HB • Text Generation • Updated Aug 22, 2024 • 18 • 1
wzhouad/zephyr-7B-WPO-FP • Text Generation • Updated Jul 24, 2024 • 13
wzhouad/zephyr-7B-WPO-HB • Text Generation • Updated Aug 21, 2024 • 16
wzhouad/Llama3-Instruct-8B-WPO-HB-v2 • Text Generation • Updated Aug 22, 2024 • 18 • 5
wzhouad/gemma-2-9b-it-WPO-FP • Text Generation • Updated Aug 8, 2024 • 14
wzhouad/gemma-2-9b-it-WPO-HB • Text Generation • Updated Aug 21, 2024 • 54 • 34
wzhouad/zephyr-ultrafeedback-hybrid • Viewer • Updated Aug 21, 2024 • 64.7k • 71 • 2
wzhouad/gemma-2-ultrafeedback-hybrid • Viewer • Updated Aug 21, 2024 • 61.6k • 83 • 7
wzhouad/llama3-ultrafeedback-hybrid • Viewer • Updated Aug 22, 2024 • 64.5k • 75 • 2
wzhouad/llama3-ultrafeedback-hybrid-v2 • Viewer • Updated Aug 22, 2024 • 64.5k • 78 • 5