Update README.md
README.md
CHANGED
@@ -17,8 +17,8 @@ Please be sure to set experts per token to 4 for the best results! Context lengt
 # Quanitized versions
 
 EXL2 (for fast GPU-only inference): <br />
-8_0bpw: https://huggingface.co/Skylaude/WizardLM-2-4x7B-MoE-exl2-8_0bpw (25+ GB vram) <br />
-6_0bpw: https://huggingface.co/Skylaude/WizardLM-2-4x7B-MoE-exl2-6_0bpw (20+ GB vram) <br />
+8_0bpw: https://huggingface.co/Skylaude/WizardLM-2-4x7B-MoE-exl2-8_0bpw (~ 25+ GB vram) <br />
+6_0bpw: https://huggingface.co/Skylaude/WizardLM-2-4x7B-MoE-exl2-6_0bpw (~ 20+ GB vram) <br />
 5_0bpw: [coming soon] (16+ GB vram) <br />
 4_25bpw: https://huggingface.co/Skylaude/WizardLM-2-4x7B-MoE-exl2-4_25bpw (14+ GB vram) <br />
 3_5bpw: https://huggingface.co/Skylaude/WizardLM-2-4x7B-MoE-exl2-3_5bpw (12+ GB vram) <br />