LoneStriker committed on
Commit f59a077
1 Parent(s): 8849c84

Upload 3 files

Files changed (3)
  1. README.md +22 -13
  2. config.json +1 -1
  3. mergekit_config.yml +2 -7
README.md CHANGED

@@ -1,11 +1,27 @@
 ---
-base_model: []
+inference: false
+language:
+- en
+library_name: transformers
+pipeline_tag: text-generation
 tags:
+- mixtral
 - mergekit
 - merge
+license: apache-2.0
+datasets:
+- jondurbin/airoboros-3.2
+---
+
+# Air-Striker-Mixtral-8x7B-Instruct-ZLoss
+
+Experimental model, trained using config and [Transformers/Axolotl](https://github.com/DocShotgun/axolotl) forks provided by [Doctor-Shotgun](https://huggingface.co/Doctor-Shotgun)
+
+Model was fine-tuned from [Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) with airoboros-3.2 dataset, for 4 epochs, ChatML prompt format at 8K context length.
+
+Additionally, model was then merged with [Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1):
 
 ---
-# airoboros-3.2-mixtral-zloss-merged
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
@@ -17,8 +33,8 @@ This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge
 ### Models Merged
 
 The following models were included in the merge:
-* /home/hien/models/Mixtral-8x7B-Instruct-v0.1
-* /home/hien/models/airoboros-3.2-mixtral-zloss
+* mistralai/Mixtral-8x7B-Instruct-v0.1
+* LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss
 
 ### Configuration
 
@@ -26,19 +42,12 @@ The following YAML configuration was used to produce this model:
 
 ```yaml
 models:
-  - model: /home/hien/models/Mixtral-8x7B-Instruct-v0.1
+  - model: mistralai/Mixtral-8x7B-Instruct-v0.1
     parameters:
       weight: 0.5
-  - model: /home/hien/models/airoboros-3.2-mixtral-zloss
+  - model: LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss
     parameters:
       weight: 0.5
 merge_method: linear
-#merge_method: dare_ties
-#base_model: ./extra_hdd/Mixtral-8x7B-v0.1
-parameters:
-#normalize: false
-#int8_mask: true
 dtype: bfloat16
-
-
 ```
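The `linear` merge method used in the README's config is, at its core, a weighted average of corresponding parameters across models (here both weights are 0.5, i.e. a plain mean). A minimal pure-Python sketch of the idea — the dicts of lists stand in for real state dicts of tensors, and this is an illustration of the arithmetic, not mergekit's actual implementation:

```python
def linear_merge(state_dicts, weights):
    """Weighted average of corresponding parameters across models.

    Each state dict maps a parameter name to a flat list of values;
    weights are normalized by their sum, as in mergekit's linear method.
    """
    assert len(state_dicts) == len(weights)
    total = sum(weights)
    merged = {}
    for name in state_dicts[0]:
        n = len(state_dicts[0][name])
        merged[name] = [
            sum(w * sd[name][i] for sd, w in zip(state_dicts, weights)) / total
            for i in range(n)
        ]
    return merged

# Two toy "models", merged with weight 0.5 each as in the config above.
a = {"layer.weight": [1.0, 2.0]}
b = {"layer.weight": [3.0, 4.0]}
print(linear_merge([a, b], [0.5, 0.5]))  # {'layer.weight': [2.0, 3.0]}
```

With equal weights this reduces to the element-wise mean of the two checkpoints; unequal weights bias the merge toward one parent.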
config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "/home/hien/models/Mixtral-8x7B-Instruct-v0.1",
+  "_name_or_path": "/home/ubuntu/models/Mixtral-8x7B-Instruct-v0.1",
   "architectures": [
     "MixtralForCausalLM"
   ],
mergekit_config.yml CHANGED

@@ -1,15 +1,10 @@
 models:
-  - model: /home/hien/models/Mixtral-8x7B-Instruct-v0.1
+  - model: mistralai/Mixtral-8x7B-Instruct-v0.1
     parameters:
       weight: 0.5
-  - model: /home/hien/models/airoboros-3.2-mixtral-zloss
+  - model: LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss
     parameters:
       weight: 0.5
 merge_method: linear
-#merge_method: dare_ties
-#base_model: ./extra_hdd/Mixtral-8x7B-v0.1
-parameters:
-#normalize: false
-#int8_mask: true
 dtype: bfloat16
 
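Assuming a standard mergekit install, a merge like the one in this commit can be reproduced by pointing mergekit's `mergekit-yaml` entry point at the config file; the output directory name here is illustrative:

```shell
# Install mergekit, then run the merge described by mergekit_config.yml.
# The output directory receives the merged weights along with a copy of the config.
pip install mergekit
mergekit-yaml mergekit_config.yml ./merged-model
```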