Nguyễn Minh Phúc
DatPySci
AI & ML interests
Reinforcement learning, NLP
Organizations
Collections: 1
Models: 89
DatPySci/EleutherAI_pythia-2.8b-deduped__ipo_pythia-2.8b_beta-0.1__tldr
DatPySci/EleutherAI_pythia-2.8b-deduped__dpo_pythia-2.8b_beta-0.05__tldr
DatPySci/EleutherAI_pythia-2.8b-deduped__length_IS_pythia-2.8b_beta-0.05__tldr
DatPySci/EleutherAI_pythia-2.8b-deduped__length_IS_pythia-2.8b_beta-0.1__tldr
DatPySci/EleutherAI_pythia-2.8b-deduped__dpo_pythia-2.8b_beta-0.1__tldr
DatPySci/llama3-1b_reward_tldr • Text Classification • 43
DatPySci/EleutherAI_pythia-2.8b-deduped__dpo_pythia-2.8b_beta-0.01__tldr
DatPySci/EleutherAI_pythia-2.8b-deduped__length_IS_pythia-2.8b_beta-0.01__tldr
DatPySci/EleutherAI_pythia-410m-deduped__ipo_ipo_pythia-1b_beta-0.03__tldr
DatPySci/EleutherAI_pythia-410m-deduped__length_IS_ipo_pythia-1b_beta-0.03__tldr
Datasets: 8
DatPySci/HH-RLHF-preprocessed • 119k • 44
DatPySci/tldr_preference_dataset • 179k • 39
DatPySci/tldr_sft_dataset • 130k • 181
DatPySci/policy_shift_dataset • 150k • 34
DatPySci/shift_dataset • 156k • 38
DatPySci/summarize_from_feedback_oai_preprocessing • 179k • 32
DatPySci/anthropic_hh_rlhf_filtered_oai_preprocessing • 169k • 56
DatPySci/summarize_from_feedback_oai_preprocessing_pythia-6.9b-gold • 115k • 73