# nhn_dpo_v3_nox-solar-10.7b-v4_DPO
## Our Team
- Youjin Chung
- Jingyeom Kim
## Model

### Base Model

### Hardware and Software
- Hardware: 8 × NVIDIA A100 GPUs for training our model
- Software: DeepSpeed and the Hugging Face TRL trainer
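The card does not include the DeepSpeed configuration that was used, so the fragment below is only an illustrative assumption of the kind of ZeRO config commonly paired with multi-GPU TRL training; the actual stage and values may differ:

```json
{
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": { "device": "cpu" }
  },
  "bf16": { "enabled": true },
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto"
}
```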
## Dataset

- DPO dataset
  - Self-built DPO dataset (built using AI-hub datasets)
  - Translations of English datasets such as OpenOrca DPO (ENERGY-DRINK-LOVE/translate_share_gpt_dedup_llama_SFT_1024, translated with our own model)
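The card does not show the exact schema of these datasets. A DPO preference record conventionally carries a prompt plus a chosen and a rejected response (the field names below follow the common `prompt`/`chosen`/`rejected` convention used by TRL and are an assumption, not the card's documented format):

```python
# Illustrative DPO preference record; field names are assumed
# (prompt/chosen/rejected convention), not taken from the card.
example = {
    "prompt": "대한민국의 수도는 어디인가요?",   # "What is the capital of South Korea?"
    "chosen": "대한민국의 수도는 서울입니다.",    # preferred answer
    "rejected": "잘 모르겠습니다.",              # dispreferred answer
}

def is_valid_dpo_record(rec):
    """Check that a record has the three non-empty string fields DPO training expects."""
    return all(
        isinstance(rec.get(key), str) and rec[key]
        for key in ("prompt", "chosen", "rejected")
    )
```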
## Training Method
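The card does not spell out the objective, but DPO training minimizes the standard preference loss from the original DPO formulation: the negative log-sigmoid of the scaled margin between the policy's and the frozen reference model's log-probability ratios on chosen versus rejected responses. A minimal pure-Python sketch of the per-example loss:

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-example DPO loss: -log sigmoid(beta * log-ratio margin).

    Inputs are the summed token log-probabilities of the chosen and
    rejected responses under the trained policy and the frozen
    reference model; beta controls how far the policy may drift.
    """
    margin = ((policy_logp_chosen - ref_logp_chosen)
              - (policy_logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

When the policy still matches the reference the margin is zero and the loss is log 2; moving probability mass toward chosen responses drives it toward zero.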
## Benchmark

### 0-shot (macro F1)

| kobest_boolq | kobest_copa | kobest_hellaswag | kobest_sentineg |
|---|---|---|---|
| 0.931613 | 0.740751 | 0.468602 | 0.488465 |