---
language:
- zh
- en
pipeline_tag: text-generation
inference: false
---

See the original project at [https://huggingface.co./baichuan-inc/Baichuan-13B-Chat](https://huggingface.co./baichuan-inc/Baichuan-13B-Chat).

Changes: the original model was quantized to 8-bit and saved as 2 GB shards.
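
The exact conversion commands are not part of this repo; below is a minimal sketch of how such shards could be produced with `load_in_8bit` and `save_pretrained(max_shard_size=...)` from `transformers`/`bitsandbytes`. The output directory name is hypothetical.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the original fp16 checkpoint with on-the-fly int8 quantization
# (requires bitsandbytes and a CUDA GPU).
model = AutoModelForCausalLM.from_pretrained(
    "baichuan-inc/Baichuan-13B-Chat",
    device_map="auto",
    load_in_8bit=True,
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "baichuan-inc/Baichuan-13B-Chat", use_fast=False, trust_remote_code=True
)

# Save the quantized weights in 2 GB slices (hypothetical output directory).
model.save_pretrained("Baichuan-13B-Chat-8bit", max_shard_size="2GB")
tokenizer.save_pretrained("Baichuan-13B-Chat-8bit")
```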

## Usage (int8)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation.utils import GenerationConfig

# Load the tokenizer and the pre-quantized int8 model; device_map="auto"
# places the weights on the available GPU(s).
tokenizer = AutoTokenizer.from_pretrained(
    "trillionmonster/Baichuan-13B-Chat-8bit", use_fast=False, trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "trillionmonster/Baichuan-13B-Chat-8bit", device_map="auto", trust_remote_code=True
)
model.generation_config = GenerationConfig.from_pretrained(
    "trillionmonster/Baichuan-13B-Chat-8bit"
)

# Single-turn chat: the model's custom chat() helper takes a list of
# {"role": ..., "content": ...} messages and returns the reply text.
messages = []
messages.append({"role": "user", "content": "世界上第二高的山峰是哪座"})
response = model.chat(tokenizer, messages)
print(response)
```
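
The `chat()` helper is stateless, so a multi-turn conversation works by appending the previous reply to the message list before the next user turn, as in the upstream Baichuan-13B-Chat card. A minimal sketch (the follow-up question is illustrative):

```python
# Append the assistant's reply and the next user turn, then call chat()
# again with the full history.
messages.append({"role": "assistant", "content": response})
messages.append({"role": "user", "content": "它位于哪个国家?"})
print(model.chat(tokenizer, messages))
```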

Similarly, to use int4 quantization:
```python
# Load the checkpoint with 4-bit quantization via bitsandbytes instead.
model = AutoModelForCausalLM.from_pretrained(
    "trillionmonster/Baichuan-13B-Chat-8bit",
    device_map="auto",
    load_in_4bit=True,
    trust_remote_code=True,
)
```
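
The boolean flag above is the shorthand; recent `transformers` versions expose the same option through `BitsAndBytesConfig`. A sketch of the equivalent explicit form, not something this repo documents:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Equivalent 4-bit load using the explicit quantization config object.
quant_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "trillionmonster/Baichuan-13B-Chat-8bit",
    device_map="auto",
    quantization_config=quant_config,
    trust_remote_code=True,
)
```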