
This is the OpenNMT-py converted version of Mixtral 8x7b, 4-bit AWQ quantized.

The safetensors file is 24GB, so it needs either 2x24GB GPUs (3090 or 4090) or 1x48GB GPU (A6000).

To run the model on 2 GPUs, the config file needs to have:

world_size: 2
gpu_ranks: [0, 1]
parallel_mode: "tensor_parallel"

If you are lucky enough to have an A6000 (or a V/A/H100 with more than 32GB), then use:

world_size: 1
gpu_ranks: [0]
#parallel_mode: "tensor_parallel"
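The two GPU settings above can also be generated from the number of GPUs. This is only a minimal sketch: `parallel_settings` is a hypothetical helper, not part of OpenNMT-py, and it only covers the three keys shown in this card. Applying the same pattern to more than 2 GPUs is an assumption that has not been tested here.

```python
def parallel_settings(num_gpus: int) -> str:
    """Return the YAML lines for the GPU settings described in this card.

    Hypothetical helper: it only formats the keys world_size, gpu_ranks
    and parallel_mode; the rest of the config file is not touched.
    """
    if num_gpus < 1:
        raise ValueError("at least one GPU is required")
    ranks = ", ".join(str(i) for i in range(num_gpus))
    lines = [f"world_size: {num_gpus}", f"gpu_ranks: [{ranks}]"]
    # Tensor parallelism is only requested when the model is split across GPUs.
    if num_gpus > 1:
        lines.append('parallel_mode: "tensor_parallel"')
    return "\n".join(lines)


print(parallel_settings(2))
```

The printed lines can be pasted into the inference config file in place of the existing values.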

The command line to run is:

python onmt/bin/translate.py --config /pathto/mixtral-inference-awq.yaml --src /pathto/input-vicuna.txt --output /pathto/mistral-output.txt

For instance, input-vicuna.txt contains:

USER:⦅newline⦆Show me some attractions in Boston.⦅newline⦆⦅newline⦆ASSISTANT:⦅newline⦆
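Each prompt sits on a single line of the input file, with line breaks written as the ⦅newline⦆ placeholder. A minimal sketch of building such a line in Python follows; `build_vicuna_prompt` is a hypothetical helper name, and the template is simply copied from the example above.

```python
NEWLINE = "⦅newline⦆"  # placeholder used in this card's example for line breaks


def build_vicuna_prompt(user_message: str) -> str:
    """Format one user message as a single-line prompt (hypothetical helper)."""
    text = f"USER:\n{user_message}\n\nASSISTANT:\n"
    # Replace real line breaks so the whole prompt stays on one line of the file.
    return text.replace("\n", NEWLINE)


line = build_vicuna_prompt("Show me some attractions in Boston.")
print(line)
```

Writing `line` followed by a real line break to input-vicuna.txt gives one prompt per line.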

Output will be:

Here are some attractions in Boston:⦅newline⦆⦅newline⦆1. Boston Common: This is a historic park located in the heart of Boston. It features a variety of attractions, including the Boston Common Fountain, the Boston Common Bandstand, and the Boston Common Carousel.⦅newline⦆⦅newline⦆2. Boston Public Garden: This is a historic park located in the heart of Boston. It features a variety of attractions, including the Boston Public Garden Fountain, the Boston Public Garden Bandstand, and the Boston Public Garden Carousel.⦅newline⦆⦅newline⦆3. Boston Museum of Fine Arts: This is a world-renowned art museum located in the heart of Boston. It features a variety of attractions, including the Boston Museum of Fine Arts Fountain, the Boston Museum of Fine Arts Bandstand, and the Boston Museum of Fine Arts Carousel.⦅newline⦆⦅newline⦆4. Boston Museum of Science: This is a world-renowned science museum located in the heart of Boston. It features a variety of attractions, including the Boston Museum of Science Fountain, the Boston Museum of Science Bandstand, and the Boston Museum of Science Carousel.⦅newline⦆⦅newline⦆5. Boston Museum of History: This is a world-renowned history museum located in the heart of Boston
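The output file uses the same ⦅newline⦆ placeholder, so it can be turned back into readable text with a plain string replacement. This is a minimal sketch; `decode_output` is a hypothetical helper and assumes one generated answer per line of the output file.

```python
NEWLINE = "⦅newline⦆"  # placeholder seen in the output above


def decode_output(line: str) -> str:
    """Restore real line breaks in one line of model output (hypothetical helper)."""
    # Drop the trailing line break that separates records in the file,
    # then expand the placeholders.
    return line.rstrip("\n").replace(NEWLINE, "\n")


sample = "Here are some attractions in Boston:⦅newline⦆⦅newline⦆1. Boston Common\n"
print(decode_output(sample))
```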

Installation instructions:

Visit https://github.com/OpenNMT/OpenNMT-py and make sure you install flash-attn and autoawq.

Enjoy

Detailed MMLU scoring:

ACC-abstract_algebra: 0.3600
ACC-anatomy: 0.6444
ACC-astronomy: 0.7303
ACC-business_ethics: 0.6400
ACC-clinical_knowledge: 0.7283
ACC-college_biology: 0.8056
ACC-college_chemistry: 0.5300
ACC-college_computer_science: 0.5900
ACC-college_mathematics: 0.3700
ACC-college_medicine: 0.6936
ACC-college_physics: 0.4510
ACC-computer_security: 0.7900
ACC-conceptual_physics: 0.6468
ACC-econometrics: 0.5614
ACC-electrical_engineering: 0.6414
ACC-elementary_mathematics: 0.4630
ACC-formal_logic: 0.4524
ACC-global_facts: 0.4600
ACC-high_school_biology: 0.8000
ACC-high_school_chemistry: 0.5320
ACC-high_school_computer_science: 0.7400
ACC-high_school_european_history: 0.8121
ACC-high_school_geography: 0.8081
ACC-high_school_government_and_politics: 0.9275
ACC-high_school_macroeconomics: 0.6923
ACC-high_school_mathematics: 0.3667
ACC-high_school_microeconomics: 0.7731
ACC-high_school_physics: 0.4636
ACC-high_school_psychology: 0.8569
ACC-high_school_statistics: 0.5278
ACC-high_school_us_history: 0.8431
ACC-high_school_world_history: 0.8650
ACC-human_aging: 0.7175
ACC-human_sexuality: 0.7710
ACC-international_law: 0.8347
ACC-jurisprudence: 0.7778
ACC-logical_fallacies: 0.7791
ACC-machine_learning: 0.5357
ACC-management: 0.7767
ACC-marketing: 0.9145
ACC-medical_genetics: 0.7100
ACC-miscellaneous: 0.8404
ACC-moral_disputes: 0.7775
ACC-moral_scenarios: 0.4112
ACC-nutrition: 0.7876
ACC-philosophy: 0.7492
ACC-prehistory: 0.7963
ACC-professional_accounting: 0.5177
ACC-professional_law: 0.5111
ACC-professional_medicine: 0.7390
ACC-professional_psychology: 0.7304
ACC-public_relations: 0.6727
ACC-security_studies: 0.7061
ACC-sociology: 0.8706
ACC-us_foreign_policy: 0.9100
ACC-virology: 0.5060
ACC-world_religions: 0.8538
ACC-all: 0.6707
[2023-12-22 16:35:03,999 INFO] total run time 7156.16
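To work with these scores programmatically, the `ACC-<subject>: <value>` lines can be parsed into a dictionary. The sketch below is illustrative only: `parse_mmlu` is a hypothetical helper, and the sample text is a three-score excerpt of the log above rather than the full listing.

```python
import re

# Matches lines such as "ACC-anatomy: 0.6444"; other log lines are ignored.
ACC_LINE = re.compile(r"^ACC-(\w+):\s*([0-9.]+)$")


def parse_mmlu(log_text: str) -> dict[str, float]:
    """Map each subject name to its accuracy (hypothetical helper)."""
    scores = {}
    for raw in log_text.splitlines():
        match = ACC_LINE.match(raw.strip())
        if match:
            scores[match.group(1)] = float(match.group(2))
    return scores


sample = """ACC-abstract_algebra: 0.3600
ACC-marketing: 0.9145
ACC-all: 0.6707
[2023-12-22 16:35:03,999 INFO] total run time 7156.16"""

scores = parse_mmlu(sample)
overall = scores.pop("all")  # keep the aggregate apart from per-subject scores
weakest = min(scores, key=scores.get)
print(overall, weakest, scores[weakest])
```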