elinas committed · Commit f285971 · verified · 1 Parent(s): c568000

Update README.md

Files changed (1): README.md (+6 −4)
README.md CHANGED
@@ -5,16 +5,18 @@ library_name: transformers
 tags:
 - mergekit
 - merge
-
+license: llama3
 ---
-# double_stuff_instruct
+# Llama-3-15B-Instruct-zeroed
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
 ## Merge Details
 ### Merge Method
 
-This model was merged using the passthrough merge method.
+This model was merged using the passthrough merge method while zeroing `o_proj` and `down_proj`, which led to a decrease in perplexity (good)
+compared to similar 15B merges. This was a recommendation from [Charles Goddard](https://huggingface.co/chargoddard) - thank you for sharing the merge method, and thanks as well to Toasty
+Pigeon for bringing it to my attention!
 
 ### Models Merged
 
@@ -55,4 +57,4 @@ slices:
 - sources:
   - layer_range: [24, 32]
     model: meta-llama/Meta-Llama-3-8B-Instruct
-```
+```
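For context, the zeroing described in the updated Merge Method section is typically expressed in a mergekit passthrough config using per-tensor `scale` filters. A minimal sketch follows; the layer ranges and the duplicated middle block are assumptions for illustration (only the final `[24, 32]` slice is visible in this diff), not the exact config from this commit:

```yaml
# Hypothetical mergekit passthrough config: the duplicated layer block has
# o_proj and down_proj scaled to 0.0, so the repeated layers initially
# contribute a near-identity transform. Layer ranges are illustrative only.
slices:
- sources:
  - layer_range: [0, 24]
    model: meta-llama/Meta-Llama-3-8B-Instruct
- sources:
  - layer_range: [8, 24]
    model: meta-llama/Meta-Llama-3-8B-Instruct
    parameters:
      scale:
        - filter: o_proj
          value: 0.0
        - filter: down_proj
          value: 0.0
        - value: 1.0
- sources:
  - layer_range: [24, 32]
    model: meta-llama/Meta-Llama-3-8B-Instruct
merge_method: passthrough
dtype: bfloat16
```

Zeroing the attention output and MLP down projections of the duplicated slice is what lets the stacked 15B model start close to the original 8B's behavior, which is consistent with the lower perplexity the author reports.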