---
# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
# Doc / guide: https://hf.co/docs/hub/model-cards
datasets:
- mxz/CValues_DPO
language:
- zh
- en
metrics:
- perplexity
pipeline_tag: text-generation
tags:
- DPO
- finetune
- alignment
- LoRA
- Llama-3
---

# About mxz-llama-3-8B-sft

This model was trained with SFT and DPO. It supports coding, reasoning, and Chinese QA.

# Evaluation

Results:

| Model               | MMLU | C-EVAL | C-MMLU |
| ------------------- | ---- | ------ | ------ |
| Llama-3-8B          | 55.5 | 47.0   | 48.0   |
| Llama-3-8B-Instruct | 60.1 | 49.7   | 49.3   |
| Llama-3-8B-dpo      | 62.2 | 49.9   | 49.4   |

- Llama-3-8B evaluation results are from [ymcui/Chinese-LLaMA-Alpaca-3](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3)
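
# Usage

Since the card tags the model for `text-generation`, a minimal inference sketch with `transformers` follows. The repo id `mxz/mxz-llama-3-8B-sft`, the dtype, and the generation settings are assumptions for illustration; the card does not state the published checkpoint path.

```python
# A minimal sketch, assuming the checkpoint is published on the Hugging Face Hub
# under "mxz/mxz-llama-3-8B-sft" (hypothetical repo id, not confirmed by this card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mxz/mxz-llama-3-8B-sft"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype; halves memory vs. fp32
    device_map="auto",           # place layers on available devices automatically
)

# A Chinese QA prompt, matching the card's zh/en language tags
prompt = "用中文简要介绍一下大语言模型。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```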