---
base_model: bigscience/bloom-3b
inference: false
model_creator: bigscience
model_name: bloom-3b
model_type: bloom
pipeline_tag: text-generation
quantized_by: iproskurina
tags:
- pretrained
license: bigscience-bloom-rail-1.0
language:
- ak
- ar
- as
- bm
- bn
- ca
- code
- en
- es
- eu
- fon
- fr
- gu
- hi
- id
- ig
- ki
- kn
- lg
- ln
- ml
- mr
- ne
- nso
- ny
- or
- pa
- pt
- rn
- rw
- sn
- st
- sw
- ta
- te
- tn
- ts
- tum
- tw
- ur
- vi
- wo
- xh
- yo
- zh
- zhs
- zht
- zu
datasets:
- c4
---

# 🌸 BLOOM 3B - GPTQ
- Model creator: [BigScience](https://huggingface.co/bigscience)
- Original model: [BLOOM 3B](https://huggingface.co/bigscience/bloom-3b)

The model published in this repo was quantized to 4-bit using [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ).

**Quantization details**

**All quantization parameters were taken from the [GPTQ paper](https://arxiv.org/abs/2210.17323).**

GPTQ calibration data consisted of 128 random 2048-token segments from the [C4 dataset](https://huggingface.co/datasets/c4).

The group size used for quantization is 128.
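
For reference, a minimal sketch of how such a run looks with AutoGPTQ's standard quantization API. The bit width and group size come from this card; `desc_act=False` and the single toy calibration example are assumptions for brevity (the actual calibration used 128 random 2048-token C4 segments):

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

# Settings stated in this card: 4-bit weights, group size 128.
# desc_act=False is an assumption; the card does not record it.
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-3b", use_fast=True)
model = AutoGPTQForCausalLM.from_pretrained("bigscience/bloom-3b", quantize_config)

# Calibration data: one toy example stands in here for the
# 128 random 2048-token C4 segments used for this repo.
examples = [tokenizer("auto-gptq is an easy-to-use model quantization library.")]
model.quantize(examples)
model.save_quantized("bloom-3b-gptq-4bit", use_safetensors=True)
```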

## How to use this GPTQ model from Python code

### Install the necessary packages

```shell
pip install accelerate==0.26.1 datasets==2.16.1 dill==0.3.7 gekko==1.0.6 multiprocess==0.70.15 peft==0.7.1 rouge==1.0.1 sentencepiece==0.1.99
git clone https://github.com/upunaprosk/AutoGPTQ
cd AutoGPTQ
pip install -v .
```

Recommended `transformers` version: 4.35.2.
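
A quick sanity check that the environment matches what the usage example below assumes (a CUDA device and the recommended `transformers` version):

```python
import torch
import transformers
import auto_gptq  # raises ImportError if the AutoGPTQ build above failed

print(transformers.__version__)   # expected: 4.35.2
print(torch.cuda.is_available())  # the example below loads the model on cuda:0
```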

### You can then use the following code

```python
from transformers import AutoTokenizer, TextGenerationPipeline
from auto_gptq import AutoGPTQForCausalLM

pretrained_model_dir = "iproskurina/bloom-3b-gptq-4bit"

# Load the tokenizer and the 4-bit quantized weights onto the first GPU.
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_dir, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(pretrained_model_dir, device="cuda:0", model_basename="model")

# Run a sample prompt through a standard text-generation pipeline.
pipeline = TextGenerationPipeline(model=model, tokenizer=tokenizer)
print(pipeline("auto-gptq is")[0]["generated_text"])
```
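
The pipeline forwards standard `generate()` keyword arguments, so decoding can be tuned in the same call; the values below are illustrative, not recommended settings:

```python
# Sampled generation instead of the default greedy decoding.
output = pipeline(
    "auto-gptq is",
    max_new_tokens=64,  # cap the number of newly generated tokens
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(output[0]["generated_text"])
```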