Update README.md
README.md
CHANGED
@@ -22,7 +22,7 @@ and [tiny-llama-1.1b-chat-medical](https://huggingface.co/SumayyaAli/tiny-llama-

OpenOrca experts have been given the task of creating responses for simple questions about things like pop culture, history, and science...step-1195k experts have been chosen to provide warmth and a positive environment, while chat-medical experts have been chosen to provide further detail about human subjects, and to give small little bits of medical advice: I.E. "how do I get rid of this headache I gave myself from making you?"

-# [What is a Mixture of Experts (MoE)?](https://huggingface.co/blog/moe)
+# "[What is a Mixture of Experts (MoE)?](https://huggingface.co/blog/moe)"
### (from the MistralAI papers...click the quoted question above to navigate to it directly.)

The scale of a model is one of the most important axes for better model quality. Given a fixed computing budget, training a larger model for fewer steps is better than training a smaller model for more steps.
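
For readers who don't follow the linked blog post, the routing idea the README points at can be sketched in a few lines. The snippet below is only an assumption-laden illustration: the expert count, hidden size, and random weights are made up, and it is not the routing code of this model or its merge tooling. It just shows how a router can send each token to a small top-k subset of experts, so total capacity grows without a matching growth in per-token compute.

```python
# Minimal sketch of sparse MoE routing: a router scores every expert for each
# token and only the top-k experts actually run. The three experts loosely
# stand in for the OpenOrca, step-1195k, and chat-medical sources named above;
# the tiny random weights are placeholders, not the real model.
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 3          # e.g. OpenOrca, step-1195k chat, chat-medical
HIDDEN = 8               # toy hidden size
TOP_K = 2                # only k experts are evaluated per token

router_w = rng.normal(size=(HIDDEN, NUM_EXPERTS))           # gating weights
expert_w = rng.normal(size=(NUM_EXPERTS, HIDDEN, HIDDEN))   # one toy FFN per expert

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token (row of x) to its top-k experts and mix their outputs."""
    logits = x @ router_w                                    # (tokens, experts)
    out = np.zeros_like(x)
    for t, row in enumerate(logits):
        top = np.argsort(row)[-TOP_K:]                       # chosen expert indices
        weights = np.exp(row[top]) / np.exp(row[top]).sum()  # softmax over chosen only
        for w, e in zip(weights, top):
            out[t] += w * (x[t] @ expert_w[e])               # weighted expert output
    return out

tokens = rng.normal(size=(4, HIDDEN))                        # 4 toy "tokens"
print(moe_layer(tokens).shape)                               # (4, 8)
```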