---
language:
- en
license: llama2
library_name: transformers
tags:
- merge
base_model:
- sophosympatheia/Midnight-Rose-70B-v2.0.3
- codellama/CodeLlama-70b-Python-hf
pipeline_tag: text-generation
model-index:
- name: CodeRosa-70B-AB1
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: AI2 Reasoning Challenge (25-Shot)
type: ai2_arc
config: ARC-Challenge
split: test
args:
num_few_shot: 25
metrics:
- type: acc_norm
value: 65.53
name: normalized accuracy
source:
url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=altomek/CodeRosa-70B-AB1
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: HellaSwag (10-Shot)
type: hellaswag
split: validation
args:
num_few_shot: 10
metrics:
- type: acc_norm
value: 83.16
name: normalized accuracy
source:
url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=altomek/CodeRosa-70B-AB1
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU (5-Shot)
type: cais/mmlu
config: all
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 59.87
name: accuracy
source:
url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=altomek/CodeRosa-70B-AB1
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: TruthfulQA (0-shot)
type: truthful_qa
config: multiple_choice
split: validation
args:
num_few_shot: 0
metrics:
- type: mc2
value: 49.85
source:
url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=altomek/CodeRosa-70B-AB1
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: Winogrande (5-shot)
type: winogrande
config: winogrande_xl
split: validation
args:
num_few_shot: 5
metrics:
- type: acc
value: 81.29
name: accuracy
source:
url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=altomek/CodeRosa-70B-AB1
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: GSM8k (5-shot)
type: gsm8k
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 44.5
name: accuracy
source:
url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=altomek/CodeRosa-70B-AB1
name: Open LLM Leaderboard
---
<img src=https://huggingface.co./altomek/CodeRosa-70B-AB1/resolve/main/CodeRosa.png>
<a href="https://www.youtube.com/watch?v=DfXLf402I94" title="Dust of the Saturn - Dynatron" target="_blank">intro music...</a>
## CodeRosa-70B-AB1
I wanted a model that could serve as an everyday helpful companion with some coding skills.
The idea was that Llama's censorship implies a deeper understanding of human emotions, and I wanted that part of Llama to carry over into this merge.
The model adopted a task-oriented approach from CodeLlama Python and thus requires precise prompting. It can produce longer texts as well as shorter responses. It tends to avoid happy endings, instead surprising with open-ended scenarios that invite further interaction. It prefers spelling numbers out over writing them as digits, but YMMV.
I created this model for personal exploration and found it highly successful, so I chose to share it with the community. I would like to build a next iteration of this model in the future. The mission stays the same: a very nice bot, able to talk about a variety of topics in a very emotional way, with a knack for programming and an ability to teach, and on top of all that a good text summarizer, ideally with Polish as an available language. That is the purpose. Did I succeed with this merge? I need to experiment more with the two models below. I like this result and love how it approaches problems; this iteration was worth publishing even though it has not been tested much!
Example interactions:
<img src=https://huggingface.co./altomek/CodeRosa-70B-AB1/resolve/main/CodeRosaTalk1.png>
<br>
<img src=https://huggingface.co./altomek/CodeRosa-70B-AB1/resolve/main/CodeRosaTalk2.png>
<br>
Some topics are best explored with as few additional instructions as possible.
<img src=https://huggingface.co./altomek/CodeRosa-70B-AB1/resolve/main/CodeRosaTalk3.png>
<br>
This model has empathy.
<img src=https://huggingface.co./altomek/CodeRosa-70B-AB1/resolve/main/CodeRosaWow.png>
<br>
It is creative
<img src=https://huggingface.co./altomek/CodeRosa-70B-AB1/resolve/main/CodeRosaTables1png.png>
<br>
<img src=https://huggingface.co./altomek/CodeRosa-70B-AB1/resolve/main/CodeRosaTables2png.png>
<br>
It makes mistakes but is still useful.
<img src=https://huggingface.co./altomek/CodeRosa-70B-AB1/resolve/main/CodeRosaInfernces.png>
<br>
Context size of 11K did not yield satisfactory results... :P
<img src=https://huggingface.co./altomek/CodeRosa-70B-AB1/resolve/main/CodeRosaNuts1.png>
<br>
but it can question its own actions.
<img src=https://huggingface.co./altomek/CodeRosa-70B-AB1/resolve/main/CodeRosaNuts2.png>
<br>
Please note that all demo inferences are run on CodeRosa-70B-AB1-3.92bpw-EXL2.
### Ingredients
- [Midnight-Rose-70B-v2.0.3](https://huggingface.co./sophosympatheia/Midnight-Rose-70B-v2.0.3)
- [CodeLlama-70b-Python-hf](https://huggingface.co./codellama/CodeLlama-70b-Python-hf)
### Settings
Settings from Midnight-Rose should work in SillyTavern; they are almost the same as what I use for testing. The model works fine with almost all samplers disabled for more deterministic outputs, but temperature should be set to a non-zero value.
I use max_seq_len 8K with alpha_value 2.65. The model also works with an 11K context when alpha_value is set to 5.5. The best outputs, however, come with a context around 6K.
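The alpha_value above is the NTK-aware RoPE scaling factor used by ExLlamaV2-style loaders. As a rough sketch of what it does (assuming the commonly used formula `base' = base * alpha^(d/(d-2))` with head dimension 128 for Llama 2 70B; this is an illustration, not the loader's exact code), alpha stretches the rotary embedding base to fit more context:

```python
def ntk_scaled_rope_base(alpha: float, base: float = 10000.0, head_dim: int = 128) -> float:
    """NTK-aware RoPE base scaling (assumed formula used by ExLlamaV2-style loaders)."""
    return base * alpha ** (head_dim / (head_dim - 2))

# alpha_value 2.65 for ~8K context, 5.5 for ~11K (values from this card)
for alpha in (1.0, 2.65, 5.5):
    print(f"alpha={alpha}: scaled RoPE base ~= {ntk_scaled_rope_base(alpha):,.0f}")
```

A larger alpha gives a larger base, which slows the rotary frequencies and lets positions beyond the 4K training window stay distinguishable.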
### Terms and Conditions of Use
The following table outlines the primary characteristics and intended uses of my CodeRosa-70B-AB1 models:
| Model Type | Purpose | Target Users | Key Features |
| --- | --- | --- | --- |
| **Censored** | Suitable for general audiences and sensitive topics | Educational institutions, families, and individuals seeking age-appropriate content | Restricts explicit or mature material |
| **Neutral** (<u>**this one</u>) | Balances accessibility with openness | Universities, researchers, and curious minds | Encourages exploration and intellectual exchange |
| **Uncensored** | Ideal for adults and specialized fields | Professionals, experts, and advanced scholars | Offers unfiltered access to diverse viewpoints and knowledge |
Please remember that all CodeRosa-70B-AB1 models operate under the llama2 license, so familiarize yourself with its terms and conditions before using their content.
### Quants
- [GGUF quants](https://huggingface.co./altomek/CodeRosa-70B-AB1-GGUF)
- [6bpw](https://huggingface.co./altomek/CodeRosa-70B-AB1-6bpw-EXL2)
- [5bpw](https://huggingface.co./altomek/CodeRosa-70B-AB1-5bpw-EXL2)
- [4.9bpw](https://huggingface.co./altomek/CodeRosa-70B-AB1-4.9bpw-EXL2)
- [4.5bpw](https://huggingface.co./altomek/CodeRosa-70B-AB1-4.5bpw-EXL2)
- [4bpw](https://huggingface.co./altomek/CodeRosa-70B-AB1-4bpw-EXL2)
- [3.92bpw](https://huggingface.co./altomek/CodeRosa-70B-AB1-3.92bpw-EXL2) --> 40GB VRAM
- [3.5bpw](https://huggingface.co./altomek/CodeRosa-70B-AB1-3.5bpw-EXL2)
- [3bpw](https://huggingface.co./altomek/CodeRosa-70B-AB1-3bpw-EXL2) --> this and lower quants do not represent the model's full potential!
- [2.4bpw](https://huggingface.co./altomek/CodeRosa-70B-AB1-2.4bpw-EXL2) --> 24GB VRAM
- [measurements](https://huggingface.co./altomek/measurements/resolve/main/CodeRosa-AB1_measurement.json) --> ExLlamaV2 measurements
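The VRAM notes above follow from the quant bitrate: the weights take roughly parameters × bits-per-weight / 8 bytes, and the KV cache and activations add a few GiB on top. A back-of-the-envelope sketch (the 70B parameter count is nominal and the overhead is not included, so treat these as lower bounds):

```python
def approx_weight_gib(params_billion: float, bpw: float) -> float:
    """Approximate size of quantized weights in GiB: params * bits-per-weight / 8 bytes."""
    return params_billion * 1e9 * bpw / 8 / 2**30

# Weight footprint for a nominal 70B model at selected EXL2 bitrates
for bpw in (2.4, 3.92, 6.0):
    print(f"{bpw} bpw -> ~{approx_weight_gib(70, bpw):.1f} GiB weights (plus KV cache/overhead)")
```

This is why 2.4bpw squeezes into 24GB VRAM while 3.92bpw wants around 40GB once the cache is accounted for.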
### [Open LLM Leaderboard Evaluation Results](https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/details_altomek__CodeRosa-70B-AB1)
| Metric |Value|
|---------------------------------|----:|
|Avg. |64.04|
|AI2 Reasoning Challenge (25-Shot)|65.53|
|HellaSwag (10-Shot) |83.16|
|MMLU (5-Shot) |59.87|
|TruthfulQA (0-shot) |49.85|
|Winogrande (5-shot) |81.29|
|GSM8k (5-shot) |44.50|
### PS
I welcome your comments about this model.
Made with CodeRosa-70B-AB1 :P |