Update README.md
README.md CHANGED
@@ -23,14 +23,16 @@ Recent advancements in extending context by RoPE scaling ([kaiokendev](https://k
 | **bhenrym14/airoboros-33b-gpt4-1.4.1-PI-8192-GPTQ** | **2048** | **4.32** |
 | **bhenrym14/airoboros-33b-gpt4-1.4.1-PI-8192-GPTQ** | **3072** | **4.26** |
 
-- How does this reduction in perplexity translate into actual performance lift on downstream tasks? I'm not sure yet.
+- How does this reduction in perplexity translate into actual performance lift on downstream tasks? I'm not sure yet. I've done a few experiments and have been happy with the performance, but I haven't used models with the SuperHOT LoRA enough to have any sense of performance differences.
 - This comparison isn't perfect. I did use the 1.4.1 dataset and the quantization method is slightly different.
 
 ## Quantization:
 
 The merged model was quantized with AutoGPTQ (bits = 4, group_size = 128, desc_act = True). The adapter weights and config are also uploaded.
 
+## Prompting:
 
+See original model card below.
 
 # Original model card: Jon Durbin's Airoboros 33B GPT4 1.4
 
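For anyone wanting to reproduce perplexity numbers like those in the table above, a chunked evaluation loop along the following lines is one way to do it. This is only a sketch under stated assumptions: wikitext-2 as the evaluation text, non-overlapping fixed-length chunks, and generic AutoGPTQ loading arguments. It is not necessarily the setup behind the numbers in the table, and any RoPE/position-interpolation patching needed at load time is not shown.

```python
import torch
from datasets import load_dataset
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_id = "bhenrym14/airoboros-33b-gpt4-1.4.1-PI-8192-GPTQ"
ctx_len = 3072  # one of the context lengths reported in the table

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
# Loading arguments (model_basename, use_safetensors, etc.) depend on how the
# quantized files are named in the repo; adjust as needed.
model = AutoGPTQForCausalLM.from_quantized(model_id, device="cuda:0", use_safetensors=True)
model.eval()

# Evaluation corpus: wikitext-2 test split, concatenated into one long string.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tokenizer(text, return_tensors="pt").input_ids[0]

nlls = []
with torch.no_grad():
    # Non-overlapping chunks of ctx_len tokens.
    for start in range(0, ids.numel() - ctx_len, ctx_len):
        chunk = ids[start : start + ctx_len].unsqueeze(0).to(model.device)
        # With labels == input_ids, the forward pass returns the mean
        # next-token cross-entropy over the chunk.
        nlls.append(model(chunk, labels=chunk).loss.float())

print(f"perplexity @ {ctx_len}: {torch.exp(torch.stack(nlls).mean()).item():.2f}")
```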
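The quantization settings named in the README (bits = 4, group_size = 128, desc_act = True) map onto AutoGPTQ roughly as in the minimal sketch below. The local paths and calibration examples are placeholders, not the author's actual script.

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

merged_dir = "path/to/merged-airoboros-33b"        # placeholder: merged fp16 checkpoint
out_dir = "airoboros-33b-gpt4-1.4.1-PI-8192-GPTQ"  # output directory for quantized weights

quantize_config = BaseQuantizeConfig(
    bits=4,          # 4-bit weights
    group_size=128,  # quantization group size
    desc_act=True,   # activation-order ("act-order") quantization
)

tokenizer = AutoTokenizer.from_pretrained(merged_dir, use_fast=False)
model = AutoGPTQForCausalLM.from_pretrained(merged_dir, quantize_config)

# Calibration data: a handful of tokenized texts (a real run would use a
# larger, more representative sample).
calibration = [
    tokenizer("Below is an instruction that describes a task. Write a response.",
              return_tensors="pt")
]

model.quantize(calibration)
model.save_quantized(out_dir, use_safetensors=True)
tokenizer.save_pretrained(out_dir)
```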