Triangle104 committed on
Commit
5976cb9
1 Parent(s): 0e2fd6b

Update README.md

Files changed (1)
  1. README.md +76 -0
README.md CHANGED
@@ -12,6 +12,82 @@ base_model: NeverSleep/Lumimaid-v0.2-8B
  This model was converted to GGUF format from [`NeverSleep/Lumimaid-v0.2-8B`](https://huggingface.co/NeverSleep/Lumimaid-v0.2-8B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/NeverSleep/Lumimaid-v0.2-8B) for more details on the model.
 
+ ---
+ Model details:
+ -
+ This model is based on: Meta-Llama-3.1-8B-Instruct
+
+ Wandb: https://wandb.ai/undis95/Lumi-Llama-3-1-8B?nw=nwuserundis95
+
+ Lumimaid 0.1 -> 0.2 is a HUGE step up dataset-wise.
+
+ Since some people told us our models are sloppy, Ikari decided to say fuck it and literally nuke all the chats with the most slop.
+
+ Our dataset has stayed the same since day one: we added data over time, cleaned it, and repeated. After not releasing a model for a while because we were never satisfied, we think it's time to come back!
+ Prompt template: Llama-3-Instruct
+
+ <|begin_of_text|><|start_header_id|>system<|end_header_id|>
+
+ {system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>
+
+ {input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
+
+ {output}<|eot_id|>
+
+ Credits:
+ -
+ Undi
+ IkariDev
+
+ Training data we used to make our dataset:
+ -
+ Epiculous/Gnosis
+ ChaoticNeutrals/Luminous_Opus
+ ChaoticNeutrals/Synthetic-Dark-RP
+ ChaoticNeutrals/Synthetic-RP
+ Gryphe/Sonnet3.5-SlimOrcaDedupCleaned
+ Gryphe/Opus-WritingPrompts
+ meseca/writing-opus-6k
+ meseca/opus-instruct-9k
+ PJMixers/grimulkan_theory-of-mind-ShareGPT
+ NobodyExistsOnTheInternet/ToxicQAFinal
+ Undi95/toxic-dpo-v0.1-sharegpt
+ cgato/SlimOrcaDedupCleaned
+ kalomaze/Opus_Instruct_25k
+ Doctor-Shotgun/no-robots-sharegpt
+ Norquinal/claude_multiround_chat_30k
+ nothingiisreal/Claude-3-Opus-Instruct-15K
+ All the Aesirs dataset, cleaned, unslopped
+ All le luminae dataset, cleaned, unslopped
+ Small part of Airoboros reduced
+
+ Sadly, we couldn't find the sources of the following; DM us if you recognize your set!
+ -
+ Opus_Instruct-v2-6.5K-Filtered-v2-sharegpt
+ claude_sharegpt_trimmed
+ CapybaraPure_Decontaminated-ShareGPT_reduced
+
+ Datasets credits:
+ -
+ Epiculous
+ ChaoticNeutrals
+ Gryphe
+ meseca
+ PJMixers
+ NobodyExistsOnTheInternet
+ cgato
+ kalomaze
+ Doctor-Shotgun
+ Norquinal
+ nothingiisreal
+
+ Others
+ -
+ Undi: If you want to support us, you can here.
+
+ IkariDev: Visit my retro/neocities style website please kek
+
+ ---
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)
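
As a rough illustration of the Llama-3-Instruct prompt template added in this commit, the sketch below fills in the `{system_prompt}` and `{input}` slots and stops at the assistant header, so the model generates `{output}` followed by `<|eot_id|>`. The function name `build_llama3_prompt` and its parameter names are hypothetical and not part of the README or llama.cpp. It also assumes the runtime does not already prepend `<|begin_of_text|>` on its own; if yours does, drop that token from the string.

```python
def build_llama3_prompt(system_prompt: str, user_input: str) -> str:
    """Fill the Llama-3-Instruct template up to the assistant header.

    Hypothetical helper for illustration only. The special tokens and the
    blank line after each header are copied from the README template.
    """
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )


if __name__ == "__main__":
    # The model's reply is expected to end with <|eot_id|>, which can be
    # used as a stop sequence by the calling code.
    print(build_llama3_prompt("You are a helpful assistant.", "Hello!"))
```

The resulting string can be passed as a raw prompt to whatever inference front end is in use; chat-style front ends that apply the model's built-in chat template should not need it.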