asiansoul committed on
Commit 60a635f · verified · 1 Parent(s): ca90705

Update README.md

Files changed (1)
  1. README.md +49 -10
README.md CHANGED
@@ -5,14 +5,53 @@ license_link: LICENSE
 ---
 
 ```
-Brainstorming ...
-1. base model change : winglian/llama-3-8b-1m-PoSE: this model uses PoSE to potentially extend Llama's context length from 8k to 1M and beyond @ rope_theta: 500000.0. For this model, we build upon the 64k and subsequent 256k models with an additional 225M tokens of?
-2. Korean mix (instruct by beomi + base Korean by beomi) : asiansoul/Llama-3-Open-Ko-Linear-8B
-   -> method : task arithmetic
-3. translation en -> ko : nayohan/llama3-8b-it-translation-general-en-ko-1sent (style = "written", "colloquial")
-   this fine-tuning is needed for users who require precise, reliable translations in the general domain and who may be dealing with complex information that general-purpose models might not translate with the necessary accuracy or nuance.
-4. MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3 (32k) : 1M or 32k, which to choose?
-others
-
-method : dare_ties
+models:
+  - model: NousResearch/Meta-Llama-3-8B
+    # Base model providing a general foundation without specific parameters
+
+  - model: NousResearch/Meta-Llama-3-8B-Instruct
+    parameters:
+      density: 0.60
+      weight: 0.30
+
+  - model: winglian/llama-3-8b-1m-PoSE
+    parameters:
+      density: 0.15
+      weight: 0.15
+
+  - model: MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
+    parameters:
+      density: 0.10
+      weight: 0.15
+
+  - model: asiansoul/Llama-3-Open-Ko-Linear-8B
+    parameters:
+      density: 0.20
+      weight: 0.20
+
+  - model: nayohan/llama3-8b-it-translation-general-en-ko-1sent
+    parameters:
+      density: 0.55
+      weight: 0.10
+
+  - model: cognitivecomputations/dolphin-2.9-llama3-8b
+    parameters:
+      density: 0.55
+      weight: 0.10
+
+  - model: Danielbrdz/Barcenas-Llama3-8b-ORPO
+    parameters:
+      density: 0.55
+      weight: 0.05
+
+  - model: vicgalle/Configurable-Llama-3-8B-v0.3
+    parameters:
+      density: 0.55
+      weight: 0.05
+
+merge_method: dare_ties
+base_model: NousResearch/Meta-Llama-3-8B
+parameters:
+  int8_mask: true
+dtype: bfloat16
  ```
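
For reference, a dare_ties config like the one added above is typically executed with mergekit. Below is a minimal sketch using mergekit's documented Python API; the paths `./config.yaml` and `./merged` are placeholder assumptions, not files in this repo.

```python
# Minimal sketch: running the dare_ties merge above with mergekit's Python API.
# Assumptions: mergekit is installed (pip install mergekit), the YAML above is
# saved as ./config.yaml, and ./merged is a writable output directory.
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Parse the YAML merge recipe into a validated configuration object
with open("./config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Execute the merge; the merged checkpoint is written to out_path
run_merge(
    merge_config,
    out_path="./merged",
    options=MergeOptions(
        cuda=False,           # set True to run the merge on GPU
        copy_tokenizer=True,  # copy the base model's tokenizer into the output
        lazy_unpickle=True,   # reduce peak RAM while loading checkpoints
    ),
)
```

The usual CLI equivalent is roughly `mergekit-yaml ./config.yaml ./merged --copy-tokenizer`.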