ibnzterrell
committed on
Update compatibility in README.md
README.md CHANGED
@@ -28,7 +28,7 @@ base_model:
 
 This model was quantized using [AutoAWQ](https://github.com/casper-hansen/AutoAWQ) from FP16 down to INT4 using GEMM kernels, with zero-point quantization and a group size of 128.
 
-Hardware: Intel Xeon CPU E5-2699A v4 @ 2.40GHz, 256GB of RAM, and 2x NVIDIA RTX 3090.
+Hardware: Intel Xeon CPU E5-2699A v4 @ 2.40GHz, 256GB of RAM, and 2x NVIDIA RTX 3090. This should work on any platform that supports Llama 3.1 70B Instruct AWQ INT4.
 
 Model usage (inference) information for Transformers, AutoAWQ, Text Generation Inference (TGI), and vLLM, as well as quantization reproduction details, are below.
 
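The settings named in the diff (INT4 weights, GEMM kernels, zero-point quantization, group size 128) map directly onto AutoAWQ's `quant_config` dictionary. A minimal reproduction sketch under that assumption; the model and output paths are illustrative, since the commit itself does not include this script:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

# Illustrative paths (assumptions, not taken from the commit).
model_path = "meta-llama/Meta-Llama-3.1-70B-Instruct"
quant_path = "Meta-Llama-3.1-70B-Instruct-AWQ-INT4"

# Matches the settings described in the README: 4-bit weights,
# zero-point quantization, group size 128, GEMM kernels.
quant_config = {
    "zero_point": True,
    "q_group_size": 128,
    "w_bit": 4,
    "version": "GEMM",
}

# Load the FP16 base model and its tokenizer.
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Quantize, then persist the INT4 model and tokenizer.
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```

Per the AutoAWQ documentation, the GEMM kernel variant used here generally performs better at larger batch sizes, while the alternative GEMV variant can be faster at batch size 1.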