s3nh/AlpachinoNLP-Baichuan-7B-Instruction-GGML

+---
+license: openrail
+pipeline_tag: text-generation
+library_name: transformers
+language:
+- zh
+---
+## Original model card
+Buy me a coffee if you like this project ;)
+<a href="https://www.buymeacoffee.com/s3nh"><img src="https://www.buymeacoffee.com/assets/img/guidelines/download-assets-sm-1.svg" alt=""></a>
+#### Description
+GGML-format model files for [AlpachinoNLP/Baichuan-7B-Instruction](https://huggingface.co/AlpachinoNLP/Baichuan-7B-Instruction).
+### Inference
+```python
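+from ctransformers import AutoModelForCausalLM
+
+# Placeholders: point these at your local model directory (or Hub repo id)
+# and at the GGML file inside it.
+output_dir = "path/to/model_dir"
+ggml_file = "model.ggml.bin"
+
+# model_file must be passed by keyword; gpu_layers sets how many layers run on the GPU.
+llm = AutoModelForCausalLM.from_pretrained(
+    output_dir, model_file=ggml_file, gpu_layers=32, model_type="llama"
+)
+
+manual_input: str = "Tell me about your last dream, please."
+print(llm(manual_input, max_new_tokens=256, temperature=0.9, top_p=0.7))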
+```
+# Original model card
+# Baichuan-7B-Instruction
+![](./alpachino.png)
+## Introduction
+Baichuan-7B-Instruction is the instruction-tuned version of the Baichuan-7B model series. The pretrained base model is available at [Baichuan-7B](https://huggingface.co/baichuan-inc/Baichuan-7B).
+## Demo
+Below is a model demo built with Gradio:
+```python
+import gradio as gr
+from transformers import AutoTokenizer, AutoModelForCausalLM
+tokenizer = AutoTokenizer.from_pretrained("AlpachinoNLP/Baichuan-7B-Instruction",trust_remote_code=True,use_fast=False)
+model = AutoModelForCausalLM.from_pretrained("AlpachinoNLP/Baichuan-7B-Instruction",trust_remote_code=True ).half()
+model.cuda()
+def generate(histories,  max_new_tokens=2048, do_sample = True, top_p = 0.95, temperature = 0.35, repetition_penalty=1.1):
+    prompt = ""
+    for history in histories:
+        history_with_identity = "\nHuman:" + history[0] + "\n\nAssistant:" + history[1]
+        prompt += history_with_identity
+    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
+    outputs = model.generate(
+                    input_ids = input_ids,
+                    max_new_tokens=max_new_tokens,
+                    early_stopping=True,
+                    do_sample=do_sample,
+                    top_p=top_p,
+                    temperature=temperature,
+                    repetition_penalty=repetition_penalty,
+        )
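+    # Decode only the newly generated tokens; stripping the prompt with
+    # str.replace is fragile because decoding may not reproduce it exactly.
+    new_tokens = outputs[0][input_ids.shape[1]:]
+    generate_text = tokenizer.decode(new_tokens, skip_special_tokens=True)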
+    return generate_text
+with gr.Blocks() as demo:
+    chatbot = gr.Chatbot()
+    msg = gr.Textbox()
+    clear = gr.Button("clear")
+    def user(user_message, history):
+        return "", history + [[user_message, ""]]
+    def bot(history):
+        print(history)
+        bot_message = generate(history)
+        history[-1][1] = bot_message
+        return history
+    msg.submit(user, [msg, chatbot], [msg, chatbot], queue=False).then(
+        bot, chatbot, chatbot
+    )
+    clear.click(lambda: None, None, chatbot, queue=False)
+if __name__ == "__main__":
+    demo.launch(server_name="0.0.0.0")
+```
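
The prompt format that `generate` assembles in the demo above can be isolated into a small helper for inspection. This is a minimal sketch: `build_prompt` is a hypothetical name that is not part of the original demo, and it only mirrors the string concatenation shown there.

```python
from typing import List, Sequence


def build_prompt(histories: Sequence[Sequence[str]]) -> str:
    """Concatenate (user, assistant) turns the same way the demo's generate() does."""
    prompt = ""
    for user_message, assistant_message in histories:
        prompt += "\nHuman:" + user_message + "\n\nAssistant:" + assistant_message
    return prompt


# The last turn has an empty assistant reply, so the prompt ends with
# "Assistant:" and the model continues from that point.
history: List[List[str]] = [["Hi", "Hello!"], ["What is GGML?", ""]]
prompt = build_prompt(history)
print(repr(prompt))
```

Printing the `repr` makes the newlines around each `Human:` and `Assistant:` marker visible, which helps when reusing the same template with the GGML files.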
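+## Quantized deployment
+Baichuan-7B supports int8 and int4 quantization, which takes only a two-line change to the inference code. Note that if you quantize in order to save GPU memory, you should load the full-precision model on the CPU first and quantize it there. Avoid passing `device_map='auto'` to `from_pretrained`, or any other argument that would load the full-precision model directly onto the GPU.
+To use int8 quantization: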
+```python
+import torch
+from transformers import AutoModelForCausalLM
+model = AutoModelForCausalLM.from_pretrained("AlpachinoNLP/Baichuan-7B-Instruction", torch_dtype=torch.float16, trust_remote_code=True)
+model = model.quantize(8).cuda()
+```
+Similarly, to use int4 quantization:
+```python
+import torch
+from transformers import AutoModelForCausalLM
+model = AutoModelForCausalLM.from_pretrained("AlpachinoNLP/Baichuan-7B-Instruction", torch_dtype=torch.float16, trust_remote_code=True)
+model = model.quantize(4).cuda()
+```
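
As a rough guide to why quantization saves GPU memory, the size of the weights alone can be estimated from the parameter count and the bits per parameter. The sketch below is back-of-the-envelope arithmetic, not a measurement: it assumes a round 7 billion parameters and ignores activations, the KV cache, and any per-layer overhead that a real quantization scheme adds. `weight_memory_gib` is an illustrative helper name.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


NUM_PARAMS = 7e9  # assumption: a round 7 billion parameters

for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{name}: ~{weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
```

Under these assumptions, int8 halves the fp16 weight footprint and int4 quarters it. Actual usage will be higher once runtime buffers are included.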
+## Training details
+Dataset: https://huggingface.co/datasets/shareAI/ShareGPT-Chinese-English-90k
+
+Hardware: 8 × A40 GPUs
+## Evaluation results
+## [CMMLU](https://github.com/haonan-li/CMMLU)
+| Model 5-shot                                               |   STEM    | Humanities | Social Sciences |  Others  | China Specific | Average  |
+| ---------------------------------------------------------- | :-------: | :--------: | :-------------: | :------: | :------------: | :------: |
+| Baichuan-7B |   34.4    |    47.5    |      47.6       |   46.6   |      44.3      |   44.0   |
+| Vicuna-13B                                                 |   31.8    |    36.2    |      37.6       |   39.5   |      34.3      |   36.3   |
+| Chinese-Alpaca-Plus-13B                                    |   29.8    |    33.4    |      33.2       |   37.9   |      32.1      |   33.4   |
+| Chinese-LLaMA-Plus-13B                                     |   28.1    |    33.1    |      35.4       |   35.1   |      33.5      |   33.0   |
+| Ziya-LLaMA-13B-Pretrain                                    |   29.0    |    30.7    |      33.8       |   34.4   |      31.9      |   32.1   |
+| LLaMA-13B                                                  |   29.2    |    30.8    |      31.6       |   33.0   |      30.5      |   31.2   |
+| moss-moon-003-base (16B)                                   |   27.2    |    30.4    |      28.8       |   32.6   |      28.7      |   29.6   |
+| Baichuan-13B-Base                                          |   41.7    |    61.1    |      59.8       |   59.0   |      56.4      |   55.3   |
+| Baichuan-13B-Chat                                          |   42.8    |  62.6  |    59.7  | 59.0 |    56.1    | 55.8 |
+| Baichuan-13B-Instruction                              | 44.50 |   61.16    |      59.07      |  58.34   |     55.55      |  55.61   |
+| **Baichuan-7B-Instruction**                                  | **34.68** | **47.38**  |    **47.13**    | **45.11** |   **44.51**    | **43.57** |
+
+| Model zero-shot                                              |   STEM    | Humanities | Social Sciences |  Others   | China Specific |  Average  |
+| ------------------------------------------------------------ | :-------: | :--------: | :-------------: | :-------: | :------------: | :-------: |
+| [ChatGLM2-6B](https://huggingface.co/THUDM/chatglm2-6b)      |   41.28   |   52.85    |      53.37      |   52.24   |     50.58      |   49.95   |
+| [Baichuan-7B](https://github.com/baichuan-inc/baichuan-7B)   |   32.79   |   44.43    |      46.78      |   44.79   |     43.11      |   42.33   |
+| [ChatGLM-6B](https://github.com/THUDM/GLM-130B)              |   32.22   |   42.91    |      44.81      |   42.60   |     41.93      |   40.79   |
+| [BatGPT-15B](https://arxiv.org/abs/2307.00360)               |   33.72   |   36.53    |      38.07      |   46.94   |     38.32      |   38.51   |
+| [Chinese-LLaMA-7B](https://github.com/ymcui/Chinese-LLaMA-Alpaca) |   26.76   |   26.57    |      27.42      |   28.33   |     26.73      |   27.34   |
+| [MOSS-SFT-16B](https://github.com/OpenLMLab/MOSS)            |   25.68   |   26.35    |      27.21      |   27.92   |     26.70      |   26.88   |
+| [Chinese-GLM-10B](https://github.com/THUDM/GLM)              |   25.57   |   25.01    |      26.33      |   25.94   |     25.81      |   25.80   |
+| [Baichuan-13B](https://github.com/baichuan-inc/Baichuan-13B)  |   42.04   |   60.49    |      59.55      |   56.60   |     55.72      |   54.63   |
+| [Baichuan-13B-Chat](https://github.com/baichuan-inc/Baichuan-13B) |   37.32   |   56.24    |      54.79      |   54.07   |     52.23      |   50.48   |
+| Baichuan-13B-Instruction                                 | 42.56 | 62.09  |    60.41   | 58.97 |   56.95    | 55.88 |
+| **Baichuan-7B-Instruction**                                  | **33.94** | **46.31**  |    **47.73**    | **45.84** |   **44.88**    | **43.53** |
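+> Note: CMMLU is a comprehensive Chinese evaluation benchmark designed to assess the knowledge and reasoning abilities of language models in a Chinese-language context. We evaluated the models directly with its official [evaluation script](https://github.com/haonan-li/CMMLU). In the zero-shot table, the score for [Baichuan-13B-Chat](https://github.com/baichuan-inc/Baichuan-13B) comes from our own run of the official CMMLU evaluation script; the scores for the other models are taken from the official [CMMLU](https://github.com/haonan-li/CMMLU/tree/master) results.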
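
To make the comparison with the base model easier to read, the zero-shot rows for Baichuan-7B and Baichuan-7B-Instruction can be subtracted column by column. The sketch below only copies the numbers from the zero-shot table above; the variable names are illustrative.

```python
columns = ["STEM", "Humanities", "Social Sciences", "Others", "China Specific", "Average"]
base = [32.79, 44.43, 46.78, 44.79, 43.11, 42.33]      # Baichuan-7B (zero-shot)
instruct = [33.94, 46.31, 47.73, 45.84, 44.88, 43.53]  # Baichuan-7B-Instruction (zero-shot)

# Positive values mean the instruction-tuned model scores higher than the base model.
deltas = {col: round(i - b, 2) for col, b, i in zip(columns, base, instruct)}
for col, delta in deltas.items():
    print(f"{col}: {delta:+.2f}")
```

The differences are small in every category, so they are best read as a rough indication, not a definitive ranking.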
+### English evaluation
+In addition to the Chinese benchmarks, we also evaluated the model on the English benchmark MMLU.
+#### MMLU
+[MMLU](https://arxiv.org/abs/2009.03300) is an English evaluation dataset covering 57 tasks.
+We used the open-source [evaluation scheme](https://github.com/hendrycks/test); the results are as follows:
+
+| Model                                  | Humanities | Social Sciences | STEM | Other | Average |
+|----------------------------------------|-----------:|:---------------:|:----:|:-----:|:-------:|
+| LLaMA-7B<sup>2</sup>                   |       34.0 |      38.3       | 30.5 | 38.1  |  35.1   |
+| Falcon-7B<sup>1</sup>                  |          - |        -        |  -   |   -   |  35.0   |
+| mpt-7B<sup>1</sup>                     |          - |        -        |  -   |   -   |  35.6   |
+| ChatGLM-6B<sup>0</sup>                 |       35.4 |      41.0       | 31.3 | 40.5  |  36.9   |
+| BLOOM 7B<sup>0</sup>                   |       25.0 |      24.4       | 26.5 | 26.4  |  25.5   |
+| BLOOMZ 7B<sup>0</sup>                  |       31.3 |      42.1       | 34.4 | 39.0  |  36.1   |
+| moss-moon-003-base (16B)<sup>0</sup>   |       24.2 |      22.8       | 22.4 | 24.4  |  23.6   |
+| moss-moon-003-sft (16B)<sup>0</sup>    |       30.5 |      33.8       | 29.3 | 34.4  |  31.9   |
+| Baichuan-7B<sup>0</sup>                |       38.4 |      48.9       | 35.6 | 48.1  |  42.3   |
+| **Baichuan-7B-Instruction(5-shot)**            |       **38.9** |      **49.0**       | **35.3** | **48.8**  |  **42.6**   |
+| **Baichuan-7B-Instruction(0-shot)**            |       **38.7** |      **47.9**       | **34.5** | **48.2**  |  **42.0**   |