roleplaiapp committed · verified · Commit 5f5bb00 · 1 Parent(s): 333a262

Upload README.md with huggingface_hub

---
language:
- en
inference: false
fine-tuning: false
tags:
- llama-cpp
- Llama-3.1-Nemotron-70B-Instruct-HF
- gguf
- Q3_K_M
- 70b
- 3-bit
- nemotron
- nvidia
- code
- math
- chat
- roleplay
- text-generation
- safetensors
- nlp
datasets:
- nvidia/HelpSteer2
base_model: meta-llama/Llama-3.1-70B-Instruct
pipeline_tag: text-generation
library_name: transformers
---

# roleplaiapp/Llama-3.1-Nemotron-70B-Instruct-HF-Q3_K_M-GGUF

**Repo:** `roleplaiapp/Llama-3.1-Nemotron-70B-Instruct-HF-Q3_K_M-GGUF`
**Original Model:** `Llama-3.1-Nemotron-70B-Instruct-HF`
**Organization:** `nvidia`
**Quantized File:** `llama-3.1-nemotron-70b-instruct-hf-q3_k_m.gguf`
**Quantization:** `GGUF`
**Quantization Method:** `Q3_K_M`
**Use Imatrix:** `False`
**Split Model:** `False`

## Overview
This is a GGUF Q3_K_M quantized version of [Llama-3.1-Nemotron-70B-Instruct-HF](https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct-HF).

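As a rough usage sketch (not part of the original card): recent llama.cpp builds can pull a GGUF file straight from the Hugging Face Hub via `llama-cli`. The repo and file names below come from the fields above; the flags assume a current llama.cpp build and may differ in older versions.

```shell
# Sketch: run the Q3_K_M quantized model with llama.cpp's llama-cli,
# fetching the GGUF file directly from the Hub.
# Flag names assume a recent llama.cpp build; adjust for your version.
llama-cli \
  --hf-repo roleplaiapp/Llama-3.1-Nemotron-70B-Instruct-HF-Q3_K_M-GGUF \
  --hf-file llama-3.1-nemotron-70b-instruct-hf-q3_k_m.gguf \
  -p "Write a short poem about GPUs." \
  -n 256
```

Note that a 70B model at 3-bit quantization is still roughly 30 GB on disk, so expect a long first download and plan RAM/VRAM accordingly.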
## Quantization By
I often have idle A100 GPUs while building, testing, and training the RP app, so I put them to use quantizing models.
I hope the community finds these quantizations useful.

Andrew Webby @ [RolePlai](https://roleplai.app/)