asiansoul committed on
Commit 60a635f · verified · 1 Parent(s): ca90705

Update README.md

Files changed (1)
  1. README.md +49 -10
README.md CHANGED
@@ -5,14 +5,53 @@ license_link: LICENSE
 ---
 
 ```
-Brainstorming ...
-1. base model change : winglian/llama-3-8b-1m-PoSE: this model uses PoSE to potentially extend Llama's context length from 8k to 1M and beyond @ rope_theta: 500000.0. For this model, we build upon the 64k and subsequent 256k models with an additional 225M tokens of?
-2. Korean mix (instruct by beomi + base Korean by beomi) : asiansoul/Llama-3-Open-Ko-Linear-8B
-   -> method : task arithmetic
-3. translation en -> ko : nayohan/llama3-8b-it-translation-general-en-ko-1sent (style = "written", "colloquial")
-   this fine-tuning is needed for users who require precise, reliable translations in the general domain and who may be dealing with complex information that general-purpose models might not translate with the necessary accuracy or nuance.
-4. MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3 (32k) : 1M or 32k, which to choose?
-others
-
-method : dare_ties
+models:
+  - model: NousResearch/Meta-Llama-3-8B
+    # Base model providing a general foundation without specific parameters
+
+  - model: NousResearch/Meta-Llama-3-8B-Instruct
+    parameters:
+      density: 0.60
+      weight: 0.30
+
+  - model: winglian/llama-3-8b-1m-PoSE
+    parameters:
+      density: 0.15
+      weight: 0.15
+
+  - model: MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
+    parameters:
+      density: 0.10
+      weight: 0.15
+
+  - model: asiansoul/Llama-3-Open-Ko-Linear-8B
+    parameters:
+      density: 0.20
+      weight: 0.20
+
+  - model: nayohan/llama3-8b-it-translation-general-en-ko-1sent
+    parameters:
+      density: 0.55
+      weight: 0.10
+
+  - model: cognitivecomputations/dolphin-2.9-llama3-8b
+    parameters:
+      density: 0.55
+      weight: 0.10
+
+  - model: Danielbrdz/Barcenas-Llama3-8b-ORPO
+    parameters:
+      density: 0.55
+      weight: 0.05
+
+  - model: vicgalle/Configurable-Llama-3-8B-v0.3
+    parameters:
+      density: 0.55
+      weight: 0.05
+
+merge_method: dare_ties
+base_model: NousResearch/Meta-Llama-3-8B
+parameters:
+  int8_mask: true
+dtype: bfloat16
  ```
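
For reference, a dare_ties config like the one added above is typically executed with mergekit. Below is a minimal sketch using mergekit's documented Python API; the paths `./config.yaml` and `./merged` are placeholder assumptions, not files in this repo.

```python
# Minimal sketch: running the dare_ties merge above with mergekit's Python API.
# Assumptions: mergekit is installed (pip install mergekit), the YAML above is
# saved as ./config.yaml, and ./merged is a writable output directory.
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Parse the YAML merge recipe into a validated configuration object
with open("./config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Execute the merge; the merged checkpoint is written to out_path
run_merge(
    merge_config,
    out_path="./merged",
    options=MergeOptions(
        cuda=False,           # set True to run the merge on GPU
        copy_tokenizer=True,  # copy the base model's tokenizer into the output
        lazy_unpickle=True,   # reduce peak RAM while loading checkpoints
    ),
)
```

The usual CLI equivalent is roughly `mergekit-yaml ./config.yaml ./merged --copy-tokenizer`.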