Update README.md
Original model: https://huggingface.co/meta-llama/Meta-Llama-3.1-8B
## Download a file (not the whole branch) from below:

| Filename | Quant type | File Size | Perplexity (wikitext-2-raw-v1.test) |
| -------- | ---------- | --------- | ----------------------------------- |
| [Meta-Llama-3.1-8B.BF16.gguf](https://huggingface.co/fedric95/Meta-Llama-3.1-8B-GGUF/blob/main/Meta-Llama-3.1-8B.BF16.gguf) | BF16 | 16.07GB | 6.4006 +/- 0.03938 |
| [Meta-Llama-3.1-8B.FP16.gguf](https://huggingface.co/fedric95/Meta-Llama-3.1-8B-GGUF/blob/main/Meta-Llama-3.1-8B.FP16.gguf) | FP16 | 16.07GB | 6.4016 +/- 0.03939 |
| [Meta-Llama-3.1-8B-Q8_0.gguf](https://huggingface.co/fedric95/Meta-Llama-3.1-8B-GGUF/blob/main/Meta-Llama-3.1-8B-Q8_0.gguf) | Q8_0 | 8.54GB | 6.4070 +/- 0.03941 |
| [Meta-Llama-3.1-8B-Q6_K.gguf](https://huggingface.co/fedric95/Meta-Llama-3.1-8B-GGUF/blob/main/Meta-Llama-3.1-8B-Q6_K.gguf) | Q6_K | 6.60GB | 6.4231 +/- 0.03957 |
| [Meta-Llama-3.1-8B-Q5_K_M.gguf](https://huggingface.co/fedric95/Meta-Llama-3.1-8B-GGUF/blob/main/Meta-Llama-3.1-8B-Q5_K_M.gguf) | Q5_K_M | 5.73GB | 6.4623 +/- 0.03987 |
| [Meta-Llama-3.1-8B-Q5_K_S.gguf](https://huggingface.co/fedric95/Meta-Llama-3.1-8B-GGUF/blob/main/Meta-Llama-3.1-8B-Q5_K_S.gguf) | Q5_K_S | 5.60GB | 6.5161 +/- 0.04028 |
| [Meta-Llama-3.1-8B-Q4_K_M.gguf](https://huggingface.co/fedric95/Meta-Llama-3.1-8B-GGUF/blob/main/Meta-Llama-3.1-8B-Q4_K_M.gguf) | Q4_K_M | 4.92GB | 6.5837 +/- 0.04068 |
| [Meta-Llama-3.1-8B-Q4_K_S.gguf](https://huggingface.co/fedric95/Meta-Llama-3.1-8B-GGUF/blob/main/Meta-Llama-3.1-8B-Q4_K_S.gguf) | Q4_K_S | 4.69GB | 6.6751 +/- 0.04125 |
| [Meta-Llama-3.1-8B-Q3_K_L.gguf](https://huggingface.co/fedric95/Meta-Llama-3.1-8B-GGUF/blob/main/Meta-Llama-3.1-8B-Q3_K_L.gguf) | Q3_K_L | 4.32GB | 6.9458 +/- 0.04329 |
| [Meta-Llama-3.1-8B-Q3_K_M.gguf](https://huggingface.co/fedric95/Meta-Llama-3.1-8B-GGUF/blob/main/Meta-Llama-3.1-8B-Q3_K_M.gguf) | Q3_K_M | 4.02GB | 7.0488 +/- 0.04384 |
| [Meta-Llama-3.1-8B-Q3_K_S.gguf](https://huggingface.co/fedric95/Meta-Llama-3.1-8B-GGUF/blob/main/Meta-Llama-3.1-8B-Q3_K_S.gguf) | Q3_K_S | 3.66GB | 7.8823 +/- 0.04920 |
| [Meta-Llama-3.1-8B-Q2_K.gguf](https://huggingface.co/fedric95/Meta-Llama-3.1-8B-GGUF/blob/main/Meta-Llama-3.1-8B-Q2_K.gguf) | Q2_K | 3.18GB | 9.7262 +/- 0.06393 |
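As a rough way to compare rows, the size/quality tradeoff can be computed directly from the table; a minimal sketch (the sizes and perplexities below are copied from the table above, with BF16 as the reference):

```python
# File size (GB) and wikitext-2 perplexity per quant, copied from the table above.
quants = {
    "BF16":   (16.07, 6.4006),
    "Q8_0":   (8.54, 6.4070),
    "Q6_K":   (6.60, 6.4231),
    "Q5_K_M": (5.73, 6.4623),
    "Q4_K_M": (4.92, 6.5837),
    "Q3_K_M": (4.02, 7.0488),
    "Q2_K":   (3.18, 9.7262),
}

ref_size, ref_ppl = quants["BF16"]
for name, (size, ppl) in quants.items():
    ppl_increase = 100 * (ppl / ref_ppl - 1)   # % worse perplexity than BF16
    size_saving = 100 * (1 - size / ref_size)  # % smaller than BF16
    print(f"{name:7s} {size_saving:5.1f}% smaller, perplexity +{ppl_increase:.2f}%")
```

For example, Q4_K_M is roughly 69% smaller than BF16 at under a 3% perplexity increase, which is why a mid-range K-quant is a common default choice.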
## Downloading using huggingface-cli

First, make sure you have huggingface-cli installed:

```
pip install -U "huggingface_hub[cli]"
```

Then, you can target the specific file you want:

```
huggingface-cli download fedric95/Meta-Llama-3.1-8B-GGUF --include "Meta-Llama-3.1-8B-Q4_K_M.gguf" --local-dir ./
```

If the model is bigger than 50GB, it will have been split into multiple files. To download them all to a local folder, run:

```
huggingface-cli download fedric95/Meta-Llama-3.1-8B-GGUF --include "Meta-Llama-3.1-8B-Q8_0.gguf/*" --local-dir Meta-Llama-3.1-8B-Q8_0
```

You can either specify a new local-dir (Meta-Llama-3.1-8B-Q8_0) or download them all in place (./).
## Reproducibility

### Download original weights

[...]

```
./llama-perplexity -m ../Meta-Llama-3.1-8B-Q3_K_M.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../Meta-Llama-3.1-8B-Q3_K_S.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../Meta-Llama-3.1-8B-Q2_K.gguf -f ../wikitext-2-raw/wiki.test.raw
```
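The table's perplexity figures come from runs like the ones above. `llama-perplexity` reports its result on a `Final estimate: PPL = … +/- …` line near the end of its output; a minimal sketch for pulling that number out of a saved log (the line format is an assumption based on current llama.cpp output and may change between releases):

```python
import re

# Matches e.g. "Final estimate: PPL = 6.5837 +/- 0.04068"
PPL_RE = re.compile(r"Final estimate: PPL = ([0-9.]+) \+/- ([0-9.]+)")

def parse_ppl(log: str):
    """Return (perplexity, uncertainty) from llama-perplexity output, or None."""
    m = PPL_RE.search(log)
    if m is None:
        return None
    return float(m.group(1)), float(m.group(2))

sample = "Final estimate: PPL = 6.5837 +/- 0.04068"
print(parse_ppl(sample))  # (6.5837, 0.04068)
```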