zhichen commited on
Commit
1eca4fe
1 Parent(s): 5a65290
Files changed (4) hide show
  1. README.md +147 -3
  2. README_CN.md +152 -0
  3. images/logo.png +0 -0
  4. images/vllm_web_demo.png +0 -0
README.md CHANGED
@@ -1,3 +1,147 @@
1
- ---
2
- license: apache-2.0
3
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <p align="left">
2
+ <a href="README_CN.md">中文</a>&nbsp | &nbspEnglish
3
+ </p>
4
+ <br><br>
5
+
6
+ <p align="center">
7
+ <a href='https://huggingface.co/spaces/zhichen'>
8
+ <img src='./images/logo.png'>
9
+ </a>
10
+ </p>
11
+
12
+ <div align="center">
13
+ <p align="center">
14
+ <h3> Llama3-Chinese </h3>
15
+
16
+ <p align="center">
17
+ <a href='https://huggingface.co/zhichen'>
18
+ <img src='https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-Llama3%20Chinese-yellow'>
19
+ </a>
20
+ <a href='https://modelscope.cn/profile/seanzhang'>
21
+ <img src='https://img.shields.io/badge/🤖 ModelScope-Llama3%20Chinese-blue'>
22
+ </a>
23
+ <br>
24
+ <a href=href="https://github.com/seanzhang-zhichen/llama3-chinese/stargazers">
25
+ <img src="https://img.shields.io/github/stars/seanzhang-zhichen/llama3-chinese?color=ccf">
26
+ </a>
27
+ <a href="https://github.com/seanzhang-zhichen/llama3-chinese/blob/main/LICENSE">
28
+ <img alt="GitHub Contributors" src="https://img.shields.io/badge/license-Apache%202.0-blue.svg" />
29
+ </a>
30
+ </p>
31
+ </div>
32
+
33
+ ## Introduce
34
+
35
+ **Llama3-Chinese** is a large model trained on 500k high-quality Chinese multi-turn SFT data, 100k English multi-turn SFT data, and 2k single-turn self-cognition data, using the training methods of [DORA](https://arxiv.org/pdf/2402.09353.pdf) and [LORA+](https://arxiv.org/pdf/2402.12354.pdf) based on **Meta-Llama-3-8B** as the base.
36
+
37
+ ![DEMO](./images/vllm_web_demo.png)
38
+
39
+
40
+ ## Download Model
41
+
42
+
43
+ | Model | Download |
44
+ |:-------------------:|:-----------:|
45
+ | Meta-Llama-3-8B |[ 🤗 HuggingFace](https://huggingface.co/meta-llama/Meta-Llama-3-8B) [ 🤖 ModelScope](https://modelscope.cn/models/LLM-Research/Meta-Llama-3-8B)|
46
+ | Llama3-Chinese-Lora |[ 🤗 HuggingFace](https://huggingface.co/zhichen/Llama3-Chinese-Lora) [ 🤖 ModelScope](https://modelscope.cn/models/seanzhang/Llama3-Chinese-Lora)|
47
+ | Llama3-Chinese (merged model) |[ 🤗 HuggingFace](https://huggingface.co/zhichen/Llama3-Chinese) [ 🤖 ModelScope](https://modelscope.cn/models/seanzhang/Llama3-Chinese)|
48
+
49
+
50
+ ## Merge LORA Model (Skippable)
51
+
52
+ 1、Download [Meta-Llama-3-8B](https://modelscope.cn/models/LLM-Research/Meta-Llama-3-8B)
53
+
54
+ ```bash
55
+ git clone https://www.modelscope.cn/LLM-Research/Meta-Llama-3-8B.git
56
+ ```
57
+
58
+ 2、Download [Llama3-Chinese-Lora](https://www.modelscope.cn/models/seanzhang/Llama3-Chinese-Lora)
59
+
60
+ **From ModelScope**
61
+ ```bash
62
+ git lfs install
63
+ git clone https://www.modelscope.cn/seanzhang/Llama3-Chinese-Lora.git
64
+
65
+ ```
66
+
67
+ **From HuggingFace**
68
+ ```bash
69
+ git lfs install
70
+ git clone https://huggingface.co/zhichen/Llama3-Chinese-Lora
71
+ ```
72
+
73
+ 3、Merge Model
74
+
75
+ ```bash
76
+ python merge_lora.py \
77
+ --base_model path/to/Meta-Llama-3-8B \
78
+ --lora_model path/to/lora/Llama3-Chinese-Lora \
79
+ --output_dir ./Llama3-Chinese
80
+ ```
81
+
82
+
83
+ ## Download Llama3-Chinese (Merged Model)
84
+
85
+ **From ModelScope**
86
+ ```bash
87
+ git lfs install
88
+ git clone https://www.modelscope.cn/seanzhang/Llama3-Chinese.git
89
+ ```
90
+
91
+ **From HuggingFace**
92
+ ```bash
93
+ git lfs install
94
+ git clone https://huggingface.co/zhichen/Llama3-Chinese
95
+ ```
96
+
97
+
98
+ ## VLLM WEB DEMO
99
+
100
+ 1、Use [vllm](https://github.com/vllm-project/vllm) deploy model
101
+
102
+ ```bash
103
+ python -m vllm.entrypoints.openai.api_server --served-model-name Llama3-Chinese --model ./Llama3-Chinese(Replace it with your own merged model path)
104
+ ```
105
+
106
+ 2、This command is executed on the CLI
107
+
108
+ ```bash
109
+ python vllm_web_demo.py --model Llama3-Chinese
110
+ ```
111
+
112
+ ## Train Dataset
113
+
114
+ [deepctrl-sft-data](https://modelscope.cn/datasets/deepctrl/deepctrl-sft-data)
115
+
116
+
117
+ ## LICENSE
118
+
119
+ This project can only be used for research purposes, and the project developer shall not bear any harm or loss caused by the use of this project (including but not limited to data, models, codes, etc.). For details, please refer to [DISCLAIMER](https://github.com/seanzhang-zhichen/Llama3-Chinese/blob/main/DISCLAIMER)。
120
+
121
+ The License agreement of the Llama3-Chinese project code is the [Apache License 2.0](./LICENSE). The code is free for commercial use, and the model weights and data can only be used for research purposes. Please attach a link to Llama3-Chinese and the licensing agreement in the product description.
122
+
123
+
124
+ ## Citation
125
+
126
+ If you used Llama3-Chinese in your research, cite it in the following format:
127
+
128
+
129
+ ```latex
130
+ @misc{Llama3-Chinese,
131
+ title={Llama3-Chinese},
132
+ author={Zhichen Zhang},
133
+ year={2024},
134
+ howpublished={\url{https://github.com/seanzhang-zhichen/llama3-chinese}},
135
+ }
136
+ ```
137
+
138
+ ## Acknowledgement
139
+
140
+ [meta-llama/llama3](https://github.com/meta-llama/llama3)
141
+ <br>
142
+ [hiyouga/LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)
143
+
144
+
145
+ ## Star History
146
+
147
+ [![Star History Chart](https://api.star-history.com/svg?repos=seanzhang-zhichen/Llama3-Chinese&type=Date)](https://star-history.com/#seanzhang-zhichen/Llama3-Chinese&Date)
README_CN.md ADDED
@@ -0,0 +1,152 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <p align="left">
2
+ 中文</a>&nbsp | &nbsp<a href="README.md">English</a>
3
+ </p>
4
+ <br><br>
5
+
6
+ <p align="center">
7
+ <a href='https://huggingface.co/spaces/zhichen'>
8
+ <img src='./images/logo.png'>
9
+ </a>
10
+ </p>
11
+
12
+ <div align="center">
13
+ <p align="center">
14
+ <h3> Llama3-Chinese </h3>
15
+
16
+ <p align="center">
17
+ <a href='https://huggingface.co/zhichen'>
18
+ <img src='https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-Llama3%20Chinese-yellow'>
19
+ </a>
20
+ <a href='https://modelscope.cn/profile/seanzhang'>
21
+ <img src='https://img.shields.io/badge/🤖 ModelScope-Llama3%20Chinese-blue'>
22
+ </a>
23
+ <br>
24
+ <a href=href="https://github.com/seanzhang-zhichen/llama3-chinese/stargazers">
25
+ <img src="https://img.shields.io/github/stars/seanzhang-zhichen/llama3-chinese?color=ccf">
26
+ </a>
27
+ <a href="https://github.com/seanzhang-zhichen/llama3-chinese/blob/main/LICENSE">
28
+ <img alt="GitHub Contributors" src="https://img.shields.io/badge/license-Apache%202.0-blue.svg" />
29
+ </a>
30
+ </p>
31
+ </div>
32
+
33
+
34
+ ## 介绍
35
+
36
+ **Llama3-Chinese**是**以Meta-Llama-3-8B为底座**,使用 [DORA](https://arxiv.org/pdf/2402.09353.pdf) + [LORA+](https://arxiv.org/pdf/2402.12354.pdf) 的训练方法,在50w高质量中文多轮SFT数据 + 10w英文多轮SFT数据 + 2000单轮自我认知数据训练而来的大模型。
37
+
38
+ ![DEMO](./images/vllm_web_demo.png)
39
+
40
+
41
+ ## 模型下载
42
+
43
+ | Model | Download |
44
+ |:-------------------:|:-----------:|
45
+ | Meta-Llama-3-8B |[ 🤗 HuggingFace](https://huggingface.co/meta-llama/Meta-Llama-3-8B) [ 🤖 ModelScope](https://modelscope.cn/models/LLM-Research/Meta-Llama-3-8B)|
46
+ | Llama3-Chinese-Lora |[ 🤗 HuggingFace](https://huggingface.co/zhichen/Llama3-Chinese-Lora) [ 🤖 ModelScope](https://modelscope.cn/models/seanzhang/Llama3-Chinese-Lora)|
47
+ | Llama3-Chinese (合并好的模型) |[ 🤗 HuggingFace](https://huggingface.co/zhichen/Llama3-Chinese) [ 🤖 ModelScope](https://modelscope.cn/models/seanzhang/Llama3-Chinese)|
48
+
49
+
50
+
51
+ ## 合并LORA模型(可跳过)
52
+
53
+ 1、下载 [Meta-Llama-3-8B](https://modelscope.cn/models/LLM-Research/Meta-Llama-3-8B)
54
+
55
+ ```bash
56
+ git clone https://www.modelscope.cn/LLM-Research/Meta-Llama-3-8B.git
57
+ ```
58
+
59
+ 2、下载[Llama3-Chinese-Lora](https://www.modelscope.cn/models/seanzhang/Llama3-Chinese-Lora)
60
+
61
+ **From ModelScope**
62
+ ```bash
63
+ git lfs install
64
+ git clone https://www.modelscope.cn/seanzhang/Llama3-Chinese-Lora.git
65
+ ```
66
+
67
+ **From HuggingFace**
68
+ ```bash
69
+ git lfs install
70
+ git clone https://huggingface.co/zhichen/Llama3-Chinese-Lora
71
+ ```
72
+
73
+ 3、合并模型
74
+
75
+ ```bash
76
+ python merge_lora.py \
77
+ --base_model path/to/Meta-Llama-3-8B \
78
+ --lora_model path/to/lora/Llama3-Chinese-Lora \
79
+ --output_dir ./Llama3-Chinese
80
+ ```
81
+
82
+ ## 下载 Llama3-Chinese(合并好的模型)
83
+
84
+ **From ModelScope**
85
+ ```bash
86
+ git lfs install
87
+ git clone https://www.modelscope.cn/seanzhang/Llama3-Chinese.git
88
+ ```
89
+
90
+ **From HuggingFace**
91
+ ```bash
92
+ git lfs install
93
+ git clone https://huggingface.co/zhichen/Llama3-Chinese
94
+ ```
95
+
96
+
97
+
98
+
99
+ ## vllm web 推理
100
+
101
+ 1、使用[vllm](https://github.com/vllm-project/vllm)部署模型
102
+
103
+ ```bash
104
+ python -m vllm.entrypoints.openai.api_server --served-model-name Llama3-Chinese --model ./Llama3-Chinese(换成你自己的合并后的模型路径)
105
+ ```
106
+
107
+ 2、在命令行执行
108
+
109
+ ```bash
110
+ python vllm_web_demo.py --model Llama3-Chinese
111
+ ```
112
+
113
+
114
+
115
+
116
+ ## 训练数据集
117
+
118
+ [匠数科技大模型sft数据集](https://modelscope.cn/datasets/deepctrl/deepctrl-sft-data)
119
+
120
+
121
+ ## LICENSE
122
+
123
+ 本项目仅可应用于研究目的,项目开发者不承担任何因使用本项目(包含但不限于数据、模型、代码等)导致的危害或损失。详细请参考[免责声明](https://github.com/seanzhang-zhichen/Llama3-Chinese/blob/main/DISCLAIMER)。
124
+
125
+ Llama3-Chinese项目代码的授权协议为 [The Apache License 2.0](./LICENSE),代码可免费用做商业用途,模型权重和数据只能用于研究目的。请在产品说明中附加Llama3-Chinese的链接和授权协议。
126
+
127
+ ## Citation
128
+
129
+ 如果你在研究中使用了Llama3-Chinese,请按如下格式引用:
130
+
131
+ ```latex
132
+ @misc{Llama3-Chinese,
133
+ title={Llama3-Chinese},
134
+ author={Zhichen Zhang},
135
+ year={2024},
136
+ howpublished={\url{https://github.com/seanzhang-zhichen/llama3-chinese}},
137
+ }
138
+ ```
139
+
140
+
141
+ ## Acknowledgement
142
+
143
+ [meta-llama/llama3](https://github.com/meta-llama/llama3)
144
+ <br>
145
+ [hiyouga/LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)
146
+
147
+
148
+
149
+ ## Star History
150
+
151
+ [![Star History Chart](https://api.star-history.com/svg?repos=seanzhang-zhichen/Llama3-Chinese&type=Date)](https://star-history.com/#seanzhang-zhichen/Llama3-Chinese&Date)
152
+
images/logo.png ADDED
images/vllm_web_demo.png ADDED