bjoernp committed on
Commit a6aa6c8
1 Parent(s): 16bc0f9

Update README.md

Files changed (1)
  1. README.md +17 -0
README.md CHANGED
@@ -6,4 +6,21 @@ This is a preliminary HuggingFace implementation of the newly released MoE model
 
 Thanks to @dzhulgakov for his early implementation (https://github.com/dzhulgakov/llama-mistral) that helped me find a working setup.
 
+ # Basic Inference Setup
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model = AutoModelForCausalLM.from_pretrained("DiscoResearch/mixtral-7b-8expert", low_cpu_mem_usage=True, device_map="auto", trust_remote_code=True)
+ tok = AutoTokenizer.from_pretrained("DiscoResearch/mixtral-7b-8expert")
+ x = tok.encode("The mistral wind is a phenomenon ", return_tensors="pt").cuda()
+ x = model.generate(x, max_new_tokens=128).cpu()
+ print(tok.batch_decode(x))
+ ```
+
+ # Conversion
+
+ Use `convert_mistral_moe_weights_to_hf.py --input_dir ./input_dir --model_size 7B --output_dir ./output` to convert the original consolidated weights to this HF setup.
+
  Come chat about this in our [Disco(rd)](https://discord.gg/S8W8B5nz3v)! :)
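
After conversion, the checkpoint can be loaded straight from the local output directory instead of the Hub. The snippet below is a minimal sketch, assuming the script writes both the model weights and the tokenizer files into `./output` (adjust the path if your layout differs) and that a CUDA GPU is available:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the locally converted checkpoint rather than pulling from the Hub.
# Assumes ./output is the --output_dir passed to convert_mistral_moe_weights_to_hf.py
# and that tokenizer files were saved there as well.
model = AutoModelForCausalLM.from_pretrained(
    "./output",
    low_cpu_mem_usage=True,
    device_map="auto",
    trust_remote_code=True,
)
tok = AutoTokenizer.from_pretrained("./output")

# Same quick generation check as in the inference example above.
x = tok.encode("The mistral wind is a phenomenon ", return_tensors="pt").cuda()
x = model.generate(x, max_new_tokens=128).cpu()
print(tok.batch_decode(x))
```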