Hugging Face
Zikang Shan
zkshan2002
AI & ML interests
Reinforcement Learning
Recent Activity
published a dataset 3 days ago: zkshan2002/hh-rlhf_preprocessed
updated a dataset 3 days ago: zkshan2002/hh-rlhf_preprocessed
updated a model about 2 months ago: zkshan2002/ppo-0.44
Organizations
None yet
Models (6)
zkshan2002/ppo-0.44 • Updated Dec 1, 2024 • 2
zkshan2002/r1B-sft_tokenizer • Updated Nov 18, 2024 • 378
zkshan2002/RewardModel-uf-llama3.2-1B-OpenRLHF • Updated Oct 24, 2024 • 2
zkshan2002/DPO-uf-llama3-8B-OpenRLHF • Updated Oct 14, 2024 • 221
zkshan2002/PPO-uf-llama3-8B-OpenRLHF • Updated Oct 11, 2024 • 5
zkshan2002/RewardModel-uf-llama3-8B-OpenRLHF • Updated Oct 11, 2024 • 356
Datasets (1)
zkshan2002/hh-rlhf_preprocessed • Viewer • Updated 3 days ago • 46.1k • 10