Update README.md
README.md CHANGED
@@ -23,14 +23,16 @@ Recent advancements in extending context by RoPE scaling ([kaiokendev](https://k
 | **bhenrym14/airoboros-33b-gpt4-1.4.1-PI-8192-GPTQ** | **2048** | **4.32** |
 | **bhenrym14/airoboros-33b-gpt4-1.4.1-PI-8192-GPTQ** | **3072** | **4.26** |
 
-- How does this reduction in perplexity translate into actual performance lift on downstream tasks? I'm not sure yet.
+- How does this reduction in perplexity translate into actual performance lift on downstream tasks? I'm not sure yet. I've done a few experiments and have been happy with the performance, but I haven't used models with the SuperHOT LoRA enough to have any sense of performance differences.
 - This comparison isn't perfect. I did use the 1.4.1 dataset and the quantization method is slightly different.
 
 ## Quantization:
 
 The merged model was quantized with AutoGPTQ (bits = 4, group_size = 128, desc_act = True). The adapter weights and config are also uploaded.
 
+## Prompting:
 
+See original model card below.
 
 # Original model card: Jon Durbin's Airoboros 33B GPT4 1.4
 
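For anyone wanting to reproduce perplexity numbers like those in the table above, a chunked evaluation loop along the following lines is one way to do it. This is only a sketch under stated assumptions: wikitext-2 as the evaluation text, non-overlapping fixed-length chunks, and generic AutoGPTQ loading arguments. It is not necessarily the setup behind the numbers in the table, and any RoPE/position-interpolation patching needed at load time is not shown.

```python
import torch
from datasets import load_dataset
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_id = "bhenrym14/airoboros-33b-gpt4-1.4.1-PI-8192-GPTQ"
ctx_len = 3072  # one of the context lengths reported in the table

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
# Loading arguments (model_basename, use_safetensors, etc.) depend on how the
# quantized files are named in the repo; adjust as needed.
model = AutoGPTQForCausalLM.from_quantized(model_id, device="cuda:0", use_safetensors=True)
model.eval()

# Evaluation corpus: wikitext-2 test split, concatenated into one long string.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tokenizer(text, return_tensors="pt").input_ids[0]

nlls = []
with torch.no_grad():
    # Non-overlapping chunks of ctx_len tokens.
    for start in range(0, ids.numel() - ctx_len, ctx_len):
        chunk = ids[start : start + ctx_len].unsqueeze(0).to(model.device)
        # With labels == input_ids, the forward pass returns the mean
        # next-token cross-entropy over the chunk.
        nlls.append(model(chunk, labels=chunk).loss.float())

print(f"perplexity @ {ctx_len}: {torch.exp(torch.stack(nlls).mean()).item():.2f}")
```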
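The quantization settings named in the README (bits = 4, group_size = 128, desc_act = True) map onto AutoGPTQ roughly as in the minimal sketch below. The local paths and calibration examples are placeholders, not the author's actual script.

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

merged_dir = "path/to/merged-airoboros-33b"        # placeholder: merged fp16 checkpoint
out_dir = "airoboros-33b-gpt4-1.4.1-PI-8192-GPTQ"  # output directory for quantized weights

quantize_config = BaseQuantizeConfig(
    bits=4,          # 4-bit weights
    group_size=128,  # quantization group size
    desc_act=True,   # activation-order ("act-order") quantization
)

tokenizer = AutoTokenizer.from_pretrained(merged_dir, use_fast=False)
model = AutoGPTQForCausalLM.from_pretrained(merged_dir, quantize_config)

# Calibration data: a handful of tokenized texts (a real run would use a
# larger, more representative sample).
calibration = [
    tokenizer("Below is an instruction that describes a task. Write a response.",
              return_tensors="pt")
]

model.quantize(calibration)
model.save_quantized(out_dir, use_safetensors=True)
tokenizer.save_pretrained(out_dir)
```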