cgus committed
Commit b7a2ae6
1 Parent(s): 8a639a2

Upload 6 files

README.md ADDED
---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
---

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6550b16f7490049d6237f200/zYBXSewLbIxWHZdB3oEHs.jpeg)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6550b16f7490049d6237f200/eRwPcd9Ox03hn_WRnsotj.png)

# Information
## Details

Okay, I tried really hard to improve my ChatML merges, but that has gone terribly wrong. Everyone is adding special tokens with different IDs, so I can't even make a proper union tokenizer for them, damn. Not to mention, I made some... interesting discoveries in regards to some models' context lengths. You can watch the breakdown of how it went down here: https://www.captiongenerator.com/v/2303039/marinaraspaghetti's-merging-experience.

This one feels a bit different from my previous attempts and seems less prone to repetition, especially at higher contexts, which is great for me! I'll probably improve on it even further, but for now, it feels rather nice. Great for RP and storytelling. All credits and thanks go to the amazing MistralAI, Intervitens, Sao10K, and Nbeerbower for their amazing models! Plus, special shoutouts to Parasitic Rogue for ideas and to Prodeus Unity and Statuo for cool exl2 quants of my previous merges. Cheers to the folks over at the Drummer's server! Have a good one, everyone.

## Instruct

![image/gif](https://cdn-uploads.huggingface.co/production/uploads/6550b16f7490049d6237f200/JtOSIRNnMdGNycWACobO2.gif)

*Sigh,* Mistral Instruct, I'm afraid.

```
<s>[INST] {system} [/INST]{response}</s>[INST] {user's message} [/INST]{response}</s>
```
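
For convenience, here is a minimal sketch of rendering this template with `apply_chat_template` from transformers. It assumes the bundled `tokenizer_config.json` ships a matching chat template, and the repo id below is a placeholder; some Mistral-style templates also reject a standalone system role, in which case fold the system prompt into the first user message.

```python
# Minimal sketch, assuming the bundled tokenizer carries a
# Mistral-Instruct-style chat template. The repo id is a placeholder.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("MarinaraSpaghetti/NemoMix-Unleashed-12B")

messages = [
    {"role": "user", "content": "Continue the scene from where we left off."},
]

# Renders the conversation into the <s>[INST] ... [/INST] format shown above
# and leaves the prompt open for the model's next turn.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```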

## Parameters

I recommend running Temperature 1.0-1.25 with 0.1 Top A or 0.01-0.1 Min P, plus DRY at 0.8/1.75/2/0 (multiplier/base/allowed length/penalty range). Temperatures below 1.0 also work. Nothing more is needed.

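If your backend exposes these samplers directly (SillyTavern with koboldcpp, or recent llama.cpp builds, which implement DRY), just copy the values above. As a rough reference, here's a hedged sketch of the Temperature/Min P part with plain transformers; DRY isn't implemented there, and `min_p` needs a recent transformers release.

```python
# Hedged sketch: only temperature and min_p are available in transformers'
# generate(); DRY must come from the serving backend. Repo id is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MarinaraSpaghetti/NemoMix-Unleashed-12B"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("[INST] Tell me a short story. [/INST]", return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.0,   # recommended range: 1.0-1.25
    min_p=0.05,        # recommended range: 0.01-0.1
    max_new_tokens=256,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
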
### Settings

You can use my exact settings from here (use the ones from the Mistral Base/Customized folder; I also recommend checking the Mistral Improved folder): https://huggingface.co/MarinaraSpaghetti/SillyTavern-Settings/tree/main.

## GGUF

https://huggingface.co/bartowski/NemoMix-Unleashed-12B-GGUF

## EXL2

https://huggingface.co/Statuo/NemoMix-Unleashed-EXL2-8bpw

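To fetch a single quant rather than cloning a whole repo, `hf_hub_download` from huggingface_hub works well; the filename below is an assumed example, so check the repo's file list for the quants actually offered.

```python
# Download one GGUF quant file. The filename is a hypothetical example;
# consult the repo's file listing for the actual quant names.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="bartowski/NemoMix-Unleashed-12B-GGUF",
    filename="NemoMix-Unleashed-12B-Q4_K_M.gguf",  # assumed filename
)
print(path)
```
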
# NemoMix-Unleashed-12B

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the della_linear merge method, with E:\mergekit\mistralaiMistral-Nemo-Base-2407 as the base.

### Models Merged

The following models were included in the merge:
* E:\mergekit\intervitens_mini-magnum-12b-v1.1
* E:\mergekit\nbeerbower_mistral-nemo-bophades-12B
* E:\mergekit\Sao10K_MN-12B-Lyra-v1
* E:\mergekit\nbeerbower_mistral-nemo-gutenberg-12B
* E:\mergekit\mistralaiMistral-Nemo-Instruct-2407

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: E:\mergekit\mistralaiMistral-Nemo-Instruct-2407
    parameters:
      weight: 0.1
      density: 0.4
  - model: E:\mergekit\nbeerbower_mistral-nemo-bophades-12B
    parameters:
      weight: 0.12
      density: 0.5
  - model: E:\mergekit\nbeerbower_mistral-nemo-gutenberg-12B
    parameters:
      weight: 0.2
      density: 0.6
  - model: E:\mergekit\Sao10K_MN-12B-Lyra-v1
    parameters:
      weight: 0.25
      density: 0.7
  - model: E:\mergekit\intervitens_mini-magnum-12b-v1.1
    parameters:
      weight: 0.33
      density: 0.8
merge_method: della_linear
base_model: E:\mergekit\mistralaiMistral-Nemo-Base-2407
parameters:
  epsilon: 0.05
  lambda: 1
dtype: bfloat16
tokenizer_source: base
```
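
To reproduce a merge like this, save the YAML (with your own model paths) and feed it to mergekit. The usual route is the `mergekit-yaml` CLI; the Python sketch below is hedged, with API names as they appear in recent mergekit versions.

```python
# Hedged sketch of driving mergekit from Python; most users instead run:
#   mergekit-yaml config.yml ./NemoMix-Unleashed-12B --cuda
# API names (MergeConfiguration, MergeOptions, run_merge) per recent mergekit.
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yml") as f:  # the YAML above, with local model paths
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    merge_config,
    out_path="./NemoMix-Unleashed-12B",
    options=MergeOptions(cuda=True, copy_tokenizer=True),
)
```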

# Ko-fi
## Enjoying what I do? Consider donating here, thank you!

https://ko-fi.com/spicy_marinara
config.json ADDED
{
  "_name_or_path": "E:\\mergekit\\mistralaiMistral-Nemo-Base-2407",
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 5120,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 1024000,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 40,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-05,
  "rope_theta": 1000000.0,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.44.0",
  "use_cache": true,
  "vocab_size": 131072,
  "quantization_config": {
    "quant_method": "exl2",
    "version": "0.1.8",
    "bits": 4.0,
    "head_bits": 6,
    "calibration": {
      "rows": 115,
      "length": 2048,
      "dataset": "(default)"
    }
  }
}
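
The `quantization_config` block marks this upload as an EXL2 quant at 4.0 bits per weight, so it should be loaded with an ExLlamaV2-aware backend rather than plain transformers. A hedged sketch, with class names as they appear in recent exllamav2 releases and a placeholder local path:

```python
# Hedged sketch: load this EXL2 quant with exllamav2 (API per recent releases).
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2DynamicGenerator

config = ExLlamaV2Config("/path/to/local/NemoMix-Unleashed-12B-exl2")  # placeholder path
model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)  # split layers across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)
print(generator.generate(prompt="[INST] Hello! [/INST]", max_new_tokens=64))
```
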
output.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:dd01244fe1070690dd301a4243cee4f2180223fdeb810433a40877aa7a81762b
size 7326533868
special_tokens_map.json ADDED
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
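
A quick sanity check that a downloaded copy exposes these special tokens (the path is a placeholder; the expected ids for `<s>` and `</s>` come from the config.json above):

```python
# Verify the special tokens declared in special_tokens_map.json.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("/path/to/local/NemoMix-Unleashed-12B-exl2")  # placeholder
print(tok.bos_token, tok.bos_token_id)  # expect <s>, id 1 per config.json
print(tok.eos_token, tok.eos_token_id)  # expect </s>, id 2 per config.json
print(tok.unk_token)                    # expect <unk>
```
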
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff