
This is the OpenNMT-py converted version of Mixtral 8x7b, 4-bit AWQ quantized.

The safetensors file is 24 GB, so it needs either two 24 GB GPUs (e.g. 3090 or 4090) or one 48 GB GPU (e.g. A6000).

To run the model on 2 GPUs, the config file needs to contain:
world_size: 2
gpu_ranks: [0, 1]
parallel_mode: "tensor_parallel"

If you are lucky enough to have an A6000 (or a V100/A100/H100 with more than 32 GB), use:
world_size: 1
gpu_ranks: [0]
#parallel_mode: "tensor_parallel"
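
As a rough sketch, the two cases above can be expressed as a small helper that returns the settings to put in the config file. The function name `parallel_settings` and its arguments are hypothetical and not part of OpenNMT-py; the memory thresholds are simply the ones quoted in this README.

```python
def parallel_settings(num_gpus, gb_per_gpu):
    """Pick the config values described above (illustrative helper, not an OpenNMT-py API)."""
    # One large card: the README quotes "more than 32GB" as the single-GPU threshold.
    if num_gpus >= 1 and gb_per_gpu > 32:
        return {"world_size": 1, "gpu_ranks": [0]}
    # Two 24 GB cards: split the model with tensor parallelism.
    if num_gpus >= 2 and gb_per_gpu >= 24:
        return {
            "world_size": 2,
            "gpu_ranks": [0, 1],
            "parallel_mode": "tensor_parallel",
        }
    raise ValueError("not enough GPU memory for the 24 GB safetensors file")


if __name__ == "__main__":
    # Print the settings as YAML-style lines for two 24 GB cards.
    for key, value in parallel_settings(2, 24).items():
        print(f"{key}: {value}")
```

The single-GPU case is checked first so that a machine with several large cards still uses the simpler setup.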

The command line to run is:

`python onmt/bin/translate.py --config /pathto/mixtral-inference-awq.yaml --src /pathto/input-vicuna.txt --output /pathto/mistral-output.txt`

Here, for instance, `input-vicuna.txt` contains:

`USER:⦅newline⦆Show me some attractions in Boston.⦅newline⦆⦅newline⦆ASSISTANT:⦅newline⦆`

The output will be:

`Here are some attractions in Boston:⦅newline⦆⦅newline⦆1. Boston Common: This is a historic park located in the heart of Boston. It features a variety of attractions, including the Boston Common Fountain, the Boston Common Bandstand, and the Boston Common Carousel.⦅newline⦆⦅newline⦆2. Boston Public Garden: This is a historic park located in the heart of Boston. It features a variety of attractions, including the Boston Public Garden Fountain, the Boston Public Garden Bandstand, and the Boston Public Garden Carousel.⦅newline⦆⦅newline⦆3. Boston Museum of Fine Arts: This is a world-renowned art museum located in the heart of Boston. It features a variety of attractions, including the Boston Museum of Fine Arts Fountain, the Boston Museum of Fine Arts Bandstand, and the Boston Museum of Fine Arts Carousel.⦅newline⦆⦅newline⦆4. Boston Museum of Science: This is a world-renowned science museum located in the heart of Boston. It features a variety of attractions, including the Boston Museum of Science Fountain, the Boston Museum of Science Bandstand, and the Boston Museum of Science Carousel.⦅newline⦆⦅newline⦆5. Boston Museum of History: This is a world-renowned history museum located in the heart of Boston`
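
Both the input and the output keep one example per line and encode line breaks as the `⦅newline⦆` placeholder. The following is a minimal sketch of how such a prompt line could be built and how an output line could be turned back into plain text. The helper names `build_prompt` and `decode_output` are hypothetical, and the prompt layout is inferred only from the sample above.

```python
NEWLINE = "⦅newline⦆"  # placeholder used in the sample input and output above


def build_prompt(user_message):
    """Wrap a message in the USER/ASSISTANT layout shown in the sample input."""
    text = f"USER:\n{user_message}\n\nASSISTANT:\n"
    # Each example must fit on one line, so real newlines become placeholders.
    return text.replace("\n", NEWLINE)


def decode_output(line):
    """Turn one output line back into multi-line text."""
    return line.replace(NEWLINE, "\n")


if __name__ == "__main__":
    print(build_prompt("Show me some attractions in Boston."))
    print(decode_output("Here are some attractions in Boston:⦅newline⦆⦅newline⦆1. Boston Common"))
```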


Installation instructions:

Visit https://github.com/OpenNMT/OpenNMT-py and make sure you install flash-attn and autoawq.

Enjoy

Detailed MMLU scoring:
```
ACC-abstract_algebra: 0.3600
ACC-anatomy: 0.6444
ACC-astronomy: 0.7303
ACC-business_ethics: 0.6400
ACC-clinical_knowledge: 0.7283
ACC-college_biology: 0.8056
ACC-college_chemistry: 0.5300
ACC-college_computer_science: 0.5900
ACC-college_mathematics: 0.3700
ACC-college_medicine: 0.6936
ACC-college_physics: 0.4510
ACC-computer_security: 0.7900
ACC-conceptual_physics: 0.6468
ACC-econometrics: 0.5614
ACC-electrical_engineering: 0.6414
ACC-elementary_mathematics: 0.4630
ACC-formal_logic: 0.4524
ACC-global_facts: 0.4600
ACC-high_school_biology: 0.8000
ACC-high_school_chemistry: 0.5320
ACC-high_school_computer_science: 0.7400
ACC-high_school_european_history: 0.8121
ACC-high_school_geography: 0.8081
ACC-high_school_government_and_politics: 0.9275
ACC-high_school_macroeconomics: 0.6923
ACC-high_school_mathematics: 0.3667
ACC-high_school_microeconomics: 0.7731
ACC-high_school_physics: 0.4636
ACC-high_school_psychology: 0.8569
ACC-high_school_statistics: 0.5278
ACC-high_school_us_history: 0.8431
ACC-high_school_world_history: 0.8650
ACC-human_aging: 0.7175
ACC-human_sexuality: 0.7710
ACC-international_law: 0.8347
ACC-jurisprudence: 0.7778
ACC-logical_fallacies: 0.7791
ACC-machine_learning: 0.5357
ACC-management: 0.7767
ACC-marketing: 0.9145
ACC-medical_genetics: 0.7100
ACC-miscellaneous: 0.8404
ACC-moral_disputes: 0.7775
ACC-moral_scenarios: 0.4112
ACC-nutrition: 0.7876
ACC-philosophy: 0.7492
ACC-prehistory: 0.7963
ACC-professional_accounting: 0.5177
ACC-professional_law: 0.5111
ACC-professional_medicine: 0.7390
ACC-professional_psychology: 0.7304
ACC-public_relations: 0.6727
ACC-security_studies: 0.7061
ACC-sociology: 0.8706
ACC-us_foreign_policy: 0.9100
ACC-virology: 0.5060
ACC-world_religions: 0.8538
ACC-all: 0.6707
[2023-12-22 16:35:03,999 INFO] total run time 7156.16

```
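
To work with these numbers programmatically, a log in the format above can be parsed into a dictionary. This is only a sketch: the helper names `parse_scores` and `weakest` are hypothetical, and the sample text reuses a few lines copied from the log above.

```python
def parse_scores(log_text):
    """Map subject name -> accuracy for every 'ACC-<subject>: <value>' line."""
    scores = {}
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("ACC-"):
            continue  # skip timing lines and blanks
        name, _, value = line[len("ACC-"):].partition(":")
        scores[name] = float(value)
    return scores


def weakest(scores, n=3):
    """Return the n lowest-scoring subjects, ignoring the overall 'all' entry."""
    subjects = {k: v for k, v in scores.items() if k != "all"}
    return sorted(subjects, key=subjects.get)[:n]


if __name__ == "__main__":
    sample = """ACC-abstract_algebra: 0.3600
ACC-college_mathematics: 0.3700
ACC-marketing: 0.9145
ACC-all: 0.6707
[2023-12-22 16:35:03,999 INFO] total run time 7156.16"""
    parsed = parse_scores(sample)
    print(weakest(parsed, n=2))
```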