---
license: apache-2.0
language:
- fr
- it
- de
- es
- en
inference: false
---
# Model Card for Mixtral-Extraction-4x7B-Instruct-v0.1
This is an experimental model created by extracting a subset of the experts from [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).

# How we extracted experts
A subset of the experts is selected from the original model and copied into a smaller Mixture-of-Experts checkpoint. This model keeps 4 of the 8 original experts.

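For illustration, here is a rough sketch of what this kind of extraction could look like with the public Mixtral classes in `transformers`. This is not the author's conversion notebook: the chosen expert indices, the state-dict key layout, the router (`gate`) row slicing, and the output directory name are assumptions based on the `MixtralForCausalLM` implementation.

~~~python
# Illustrative sketch only -- not the author's conversion notebook.
import torch
from transformers import MixtralConfig, MixtralForCausalLM

src_name = "mistralai/Mixtral-8x7B-Instruct-v0.1"
keep = [0, 1, 2, 3]  # example: keep the first 4 of the 8 original experts

# loading the full 8x7B model and building the 4x7B model needs a high-memory runtime
src = MixtralForCausalLM.from_pretrained(src_name, torch_dtype=torch.bfloat16)

cfg = MixtralConfig.from_pretrained(src_name)
cfg.num_local_experts = len(keep)                 # 8 -> 4 experts per MoE layer
dst = MixtralForCausalLM(cfg).to(torch.bfloat16)

src_sd = src.state_dict()
dst_sd = dst.state_dict()
for name in dst_sd:
    if ".block_sparse_moe.experts." in name:
        # remap the kept experts: dst expert i <- src expert keep[i]
        prefix, rest = name.split(".block_sparse_moe.experts.", 1)
        new_idx, tail = rest.split(".", 1)
        old_name = f"{prefix}.block_sparse_moe.experts.{keep[int(new_idx)]}.{tail}"
        dst_sd[name] = src_sd[old_name]
    elif name.endswith("block_sparse_moe.gate.weight"):
        # keep only the router rows of the selected experts (assumption)
        dst_sd[name] = src_sd[name][keep]
    else:
        # embeddings, attention, norms and lm_head are copied unchanged
        dst_sd[name] = src_sd[name]
dst.load_state_dict(dst_sd)

dst.save_pretrained("Mixtral-Extraction-4x7B-Instruct-v0.1")
~~~

The actual notebook may select experts and handle the router differently; the sketch only shows the general shape of the operation.
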
# How To Convert
Use a Colab runtime with high CPU memory.
You can extract from 1 to 7 of the experts by specifying the selected experts as a bit string.

~~~python
experts_extract_bit = "11110000"
~~~
[convert_mixtral_8x7b_to_4x7b_extract.ipynb](https://huggingface.co/mmnga/Mixtral-Extraction-4x7B-Instruct-v0.1/new/main/?filename=README.md)
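
As a guess at how the bit string is interpreted (one bit per original expert, `1` meaning keep; this convention is an assumption, not taken from the notebook), it maps to a list of expert indices like this:

~~~python
experts_extract_bit = "11110000"  # assumed convention: one bit per original expert, 1 = keep
keep = [i for i, bit in enumerate(experts_extract_bit) if bit == "1"]
print(keep)  # [0, 1, 2, 3] -> 4 experts are extracted
~~~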

# Usage
~~~bash
pip install git+https://github.com/huggingface/transformers --upgrade
pip install torch accelerate bitsandbytes flash_attn
~~~

~~~python
from transformers import AutoTokenizer, MixtralForCausalLM
import torch

model_name_or_path = "mmnga/Mixtral-Extraction-4x7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
# 8-bit loading via bitsandbytes places the model on the available GPU
model = MixtralForCausalLM.from_pretrained(model_name_or_path, load_in_8bit=True)

text = "[INST] What was John Holt's vision on education? [/INST] "
# text = "[INST] What is the best anime? [/INST] "
inputs = tokenizer("<s> " + text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
~~~