trollek committed on
Commit
d3c84dc
1 Parent(s): 59b13b7

Update README.md

  1. README.md +48 -3
README.md CHANGED
@@ -1,3 +1,48 @@
- ---
- license: apache-2.0
- ---
+ ---
+ base_model:
+ - cognitivecomputations/dolphin-2.9.3-qwen2-1.5b
+ - trollek/Qwen2-1.5B-Instruct-Abliterated
+ - M4-ai/Hercules-5.0-Qwen2-1.5B
+ - Replete-AI/Replete-Coder-Qwen2-1.5b
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ license: apache-2.0
+ language:
+ - en
+ ---
+ # CleverQwen2-1.5B-GGUF
+
+ This repo contains GGUF quants of [CleverQwen2-1.5B](https://huggingface.co/trollek/CleverQwen2-1.5B).
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ The merged model has about 300M more parameters than its source models, and I don't know why, though I would like to. It works as expected (**amazing**); I just can't see any reason for the Qwen2 models to gain parameters when merged.
+
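A hedged aside on the parameter growth described above. One possible explanation, which the README does not confirm, is that the source models share a single matrix for the input embedding and the output projection, while the merged checkpoint stores the output projection as a separate tensor. The sketch below only does the arithmetic for that hypothesis. `VOCAB_SIZE`, `HIDDEN_SIZE`, and `untied_head_params` are assumptions introduced here; the two sizes are taken from memory of the published Qwen2-1.5B configuration and should be checked against the model's `config.json`.

```python
# Hypothesis check, not a diagnosis: how many parameters would a separately
# stored output projection add to a model that normally ties it to the
# input embedding?

# Assumed values for Qwen2-1.5B; verify against the model's config.json.
VOCAB_SIZE = 151_936
HIDDEN_SIZE = 1_536


def untied_head_params(vocab_size: int, hidden_size: int) -> int:
    """Return the size of a standalone (vocab_size x hidden_size) output matrix."""
    return vocab_size * hidden_size


extra = untied_head_params(VOCAB_SIZE, HIDDEN_SIZE)
print(f"{extra:,} extra parameters (~{extra / 1e6:.0f}M)")
```

Under these assumptions the extra tensor accounts for roughly 233M parameters, which is the same order of magnitude as the reported growth but not an exact match. Comparing the tensor names in the source and merged checkpoints would confirm or rule this out.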
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged with the [Model Stock](https://arxiv.org/abs/2403.19522) merge method, using [trollek/Qwen2-1.5B-Instruct-Abliterated](https://huggingface.co/trollek/Qwen2-1.5B-Instruct-Abliterated) as the base.
+
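For readers unfamiliar with Model Stock, the following is a minimal sketch of the interpolation rule from the linked paper, applied to a single flat weight vector in plain Python. It is an illustration under simplifying assumptions, not mergekit's implementation: real merges operate on tensors, layer by layer, and the function names `model_stock_merge` and `_dot` are hypothetical. The idea is to average the fine-tuned weights, then pull that average back toward the base by a ratio `t` derived from the angle between the fine-tuned deltas.

```python
import math
from itertools import combinations


def _dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def model_stock_merge(base, finetuned):
    """Merge one weight vector using the Model Stock interpolation ratio.

    base:      weights of the base model (list of floats)
    finetuned: list of k >= 2 weight vectors fine-tuned from that base
    """
    k = len(finetuned)
    if k < 2:
        raise ValueError("Model Stock needs at least two fine-tuned models")

    # Task vectors: how far each fine-tuned model moved away from the base.
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]

    # Average pairwise cosine similarity between the task vectors.
    cosines = []
    for d1, d2 in combinations(deltas, 2):
        n1, n2 = math.sqrt(_dot(d1, d1)), math.sqrt(_dot(d2, d2))
        if n1 == 0.0 or n2 == 0.0:
            raise ValueError("a fine-tuned model is identical to the base")
        cosines.append(_dot(d1, d2) / (n1 * n2))
    cos_theta = sum(cosines) / len(cosines)

    # Interpolation ratio: t = k*cos(theta) / (1 + (k - 1)*cos(theta)).
    t = k * cos_theta / (1 + (k - 1) * cos_theta)

    # Blend the plain average of the fine-tuned weights with the base.
    average = [sum(column) / k for column in zip(*finetuned)]
    return [t * a + (1 - t) * b for a, b in zip(average, base)]
```

With three fine-tuned models, as in this merge, `k` is 3. When the deltas all point the same way, `t` approaches 1 and the result is close to the plain average; when they are orthogonal, `t` is 0 and the result stays at the base.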
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [cognitivecomputations/dolphin-2.9.3-qwen2-1.5b](https://huggingface.co/cognitivecomputations/dolphin-2.9.3-qwen2-1.5b)
+ * [M4-ai/Hercules-5.0-Qwen2-1.5B](https://huggingface.co/M4-ai/Hercules-5.0-Qwen2-1.5B)
+ * [Replete-AI/Replete-Coder-Qwen2-1.5b](https://huggingface.co/Replete-AI/Replete-Coder-Qwen2-1.5b)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ models:
+ - model: Replete-AI/Replete-Coder-Qwen2-1.5b
+ - model: M4-ai/Hercules-5.0-Qwen2-1.5B
+ - model: cognitivecomputations/dolphin-2.9.3-qwen2-1.5b
+ merge_method: model_stock
+ base_model: trollek/Qwen2-1.5B-Instruct-Abliterated
+ architecture: qwen2
+ dtype: bfloat16
+ ```